chrisdonahue / wavegan

WaveGAN: Learn to synthesize raw audio with generative adversarial networks
MIT License

Generate sound of more than 1 second as output #21

Closed SahajRana closed 5 years ago

SahajRana commented 5 years ago

Hello @chrisdonahue, this is a really amazing library and a great implementation of a GAN for audio. I've successfully generated piano sounds with the dataset and model provided, but the output is only 1 second long. Is there a way to increase this duration? Please let me know. Thanks!

chrisdonahue commented 5 years ago

Thanks for your interest! I am currently preparing v2 of WaveGAN, which will allow training on up to 4 seconds of audio and will hopefully also solve other outstanding issues. It should be ready within a couple of weeks.

chrisdonahue commented 5 years ago

I just merged v2. See the updated README for more information. Thanks again for your interest!

SahajRana commented 5 years ago

Thanks a lot for such a fast response! I will give it a shot this week and let you know how it goes.

chrisdonahue commented 5 years ago

Great! As an FYI, you can enable generation of longer samples by specifying --data_slice_len 65536, which will generate ~4 seconds at 16 kHz. If you want to use a different sampling rate, pass it using --data_sample_rate.
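
For reference, the ~4 second figure follows from dividing the slice length (in samples) by the sample rate (in Hz). Below is a minimal sketch of that arithmetic. The helper name `slice_duration_seconds` is hypothetical and is not part of the WaveGAN codebase; it only illustrates how the two values passed to --data_slice_len and --data_sample_rate relate to output duration.

```python
def slice_duration_seconds(slice_len: int, sample_rate: int) -> float:
    """Return the duration in seconds of `slice_len` samples at `sample_rate` Hz.

    Hypothetical helper for illustration only; not a WaveGAN function.
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return slice_len / sample_rate


# Values quoted in the comment above: 65536 samples at 16 kHz.
print(slice_duration_seconds(65536, 16000))  # 4.096
```

By the same arithmetic, keeping the slice length fixed and raising the sample rate shortens the audio: 65536 samples at 44100 Hz would last roughly 1.5 seconds.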

chrisdonahue commented 5 years ago

Please do let me know about your experience with the new codebase. Especially let me know if you encounter any issues.