jishengpeng / WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
MIT License
801 stars 44 forks

why grad norm is so high? #38

Open necrophagists opened 2 months ago

necrophagists commented 2 months ago

[Image: WeChat screenshot 20240923214125]

jishengpeng commented 2 months ago

We did not place significant emphasis on this aspect in WavTokenizer; we plan to implement certain engineering optimizations related to the loss function in WavTokenizer2. Thank you for pointing this out.