Something wrong when I use the “soundstream” repo

lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

MIT License

Thank you for your excellent code! When I trained SoundStream from scratch on my own data, the training loss became "nan" at around 4k steps. After I reduced the initial learning rate by a factor of 10, the "soundstream total loss" dropped from 40 to 13 by 17k steps, but the generated ".flac" file still contains obvious noise and the voice is not clear at all, as in the picture below. I have no idea what is wrong, since I kept the same parameters as yours. Do I need to train for more steps? Approximately how many steps does it take to get good audio? Do I need to adjust any parameters? Thanks!
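
To illustrate why lowering the learning rate by 10x can remove a "nan" loss, here is a toy sketch in plain Python. It is not SoundStream and not this repo's trainer: the loss `f(w) = w**4`, the function name `run_gradient_descent`, and the learning rates 1.0 and 0.1 are all assumptions chosen for the example. It only shows the general pattern of a step size that is too large making the loss blow up, plus a simple finite-loss check that stops the run when that happens.

```python
import math

def run_gradient_descent(lr, steps=50, w0=2.0):
    """Minimize the toy loss f(w) = w**4 with plain gradient descent.

    Returns (bad_step, loss): bad_step is the first step whose loss is
    inf or nan, or None if the loss stayed finite for all steps.
    """
    w = w0
    for step in range(steps):
        # Repeated multiplication overflows to inf instead of raising,
        # which mimics how a diverging training loss shows up as inf/nan.
        loss = w * w * w * w
        if not math.isfinite(loss):
            return step, loss  # stop as soon as the loss is no longer finite
        grad = 4.0 * w * w * w
        w = w - lr * grad
    return None, w * w * w * w

# Hypothetical learning rates: the second is 10x smaller than the first.
bad_step, bad_loss = run_gradient_descent(lr=1.0)
ok_step, ok_loss = run_gradient_descent(lr=0.1)
print("lr=1.0 -> first non-finite step:", bad_step)
print("lr=0.1 -> first non-finite step:", ok_step, "final loss:", ok_loss)
```

In a real training loop the same kind of check can be applied to the loss before each optimizer step. A finite loss only shows that training is stable, though, and says nothing about whether the reconstructed audio is clean yet.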

Something wrong when I use the “soundstream” repo #184