jishengpeng / WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
MIT License
717 stars 40 forks

Why is the grad norm so high? #38

Open necrophagists opened 1 month ago

necrophagists commented 1 month ago

[Image: 微信截图_20240923214125 (WeChat screenshot)]

jishengpeng commented 1 month ago

[Image: 微信截图_20240923214125 (WeChat screenshot)]

We did not place significant emphasis on this aspect in WavTokenizer; we plan to implement engineering optimizations related to the loss function in WavTokenizer2. Thank you for pointing this out.