lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
MIT License
2.44k stars 264 forks

i get the noise! #163

Open iamliulong opened 1 year ago

iamliulong commented 1 year ago

I used audiolm_pytorch_demo.ipynb to train on .wav files and changed num_train_steps to 10000 (in README.md the number is written as 1_000_000; would 1000000 work as well?), but the output is just noise.

I trained the model on the LibriSpeech dataset: "https://us.openslr.org/resources/12/dev-clean.tar.gz". When I save the .wav file and listen to it, I hear nothing, so I plotted generated_wav in this picture: [image: plot of generated_wav]

This is my notebook: https://colab.research.google.com/drive/1UFAjekN0Tp9gGbZRT980ND93LoIPAgX3?usp=sharing. Could you share some examples of audio generated by the code?
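On the two concrete points in the question above, a minimal sketch may help. In Python, underscores in numeric literals are only visual separators, so 1_000_000 and 1000000 are the same integer. The helpers `waveform_stats` and `looks_silent` below are hypothetical names that are not part of audiolm-pytorch, and the 1e-3 threshold is an arbitrary illustrative value. They assume the generated audio has already been flattened to a plain list of floats, for example with `generated_wav.flatten().tolist()` on a torch tensor.

```python
import math

# Underscores in numeric literals are visual separators only (Python 3.6+),
# so both spellings give the same integer.
steps_readme = 1_000_000
steps_plain = 1000000
assert steps_readme == steps_plain


def waveform_stats(samples):
    """Return (peak, rms) for a flat sequence of float samples."""
    if not samples:
        raise ValueError("empty waveform")
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, rms


def looks_silent(samples, threshold=1e-3):
    """Hypothetical check: True if the peak amplitude is below threshold.

    The threshold is an assumption chosen for illustration only.
    """
    peak, _ = waveform_stats(samples)
    return peak < threshold
```

A near-zero peak would be consistent with hearing nothing on playback, while a normal peak with no audible structure points to noise rather than silence.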

xtluo commented 1 year ago

@iamliulong I have tried training the SoundStream audio codec, and the trained model cannot reconstruct audio well. Maybe that's the cause. cc @lucidrains
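One rough way to put a number on "cannot reconstruct audio well" is a signal-to-noise ratio between an input clip and the codec's reconstruction of it. The sketch below is an assumption on my part, not a metric this repository reports: `reconstruction_snr_db` is a hypothetical helper, it works on plain lists of floats, and it assumes both waveforms share a sample rate and are time-aligned.

```python
import math


def reconstruction_snr_db(original, reconstructed):
    """Signal-to-noise ratio in dB of a reconstruction against its input.

    Hypothetical helper. Compares only the overlapping prefix when the two
    sequences differ in length.
    """
    n = min(len(original), len(reconstructed))
    if n == 0:
        raise ValueError("empty waveform")
    signal = sum(x * x for x in original[:n])
    noise = sum((x - y) ** 2 for x, y in zip(original[:n], reconstructed[:n]))
    if noise == 0:
        return math.inf   # perfect reconstruction
    if signal == 0:
        return -math.inf  # silent input, non-silent output
    return 10 * math.log10(signal / noise)


# Example: a reconstruction at 90% of the original amplitude gives about 20 dB.
snr = reconstruction_snr_db([1.0, -1.0], [0.9, -0.9])
```

Sample-wise SNR is a crude proxy for perceptual quality, but a value near or below 0 dB on held-out clips would suggest the codec stage is the bottleneck before the downstream transformers are trained at all.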