Longer sample produce very noisy output

tomlepaine / fast-wavenet

Speedy Wavenet generation using dynamic programming :zap:

GNU General Public License v3.0


Longer sample produce very noisy output #13

Open ianni67 opened 7 years ago

ianni67 commented 7 years ago

Yes, I'm kinda new to TF and still... training, so bear with me for my lame questions. I'm experimenting with the demo. It trained and generated correctly with the very short audio sample provided with the code, but then I wanted to try something different. I ran the demo on a short (about 20 seconds) sample from a well-known Beethoven symphony and then generated 300000 samples. Well, something strange happened: only the first half second is fine; the rest of the generated sound is extremely noisy and barely recognizable. In the code, I changed only the path of the input audio and the duration of the generated audio. What am I doing wrong? Thank you for your patience in reading my post (and answering, if possible!)

tomlepaine commented 7 years ago

Hi @ianni67, what are your goals exactly?

This code is designed to demonstrate the fast wavenet generation algorithm.

If you want to learn the structure of music and generate novel samples, that is not what this repo is designed for. Instead try tensorflow-wavenet, which allows you to train on a large body of data.

If you want to memorize a single audio sample this code should work. Though I might have made some assumptions about audio size that bust it. If you can fix it, please make a pull request :smile_cat:.

Best, Tom
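For readers following along, here is a minimal sketch of the caching idea behind the fast generation that this repo demonstrates. It is not the repo's code: it assumes a toy scalar network with kernel size 2, made-up dilations `[1, 2, 4]`, random weights, and a deterministic feedback loop instead of sampling. All function names are hypothetical. The point is only that keeping one queue of past activations per dilated layer gives the same output as recomputing the whole history at every step.

```python
from collections import deque

import numpy as np

DILATIONS = [1, 2, 4]  # assumed toy configuration, not taken from the repo


def make_weights(seed=0):
    """Random (w_past, w_cur) pair for each dilated layer."""
    rng = np.random.default_rng(seed)
    return rng.normal(scale=0.5, size=(len(DILATIONS), 2))


def naive_step(history, weights):
    """Recompute every layer over the full history; return the newest output."""
    h = np.asarray(history, dtype=float)
    for (w_past, w_cur), d in zip(weights, DILATIONS):
        # past[t] = h[t - d], with zeros before the start of the signal
        past = np.concatenate([np.zeros(d), h])[: len(h)]
        h = np.tanh(w_past * past + w_cur * h)
    return h[-1]


def naive_generate(first_sample, n_steps, weights):
    """Autoregressive loop that reruns the whole network at every step."""
    history = [first_sample]
    out = []
    for _ in range(n_steps):
        y = naive_step(history, weights)
        out.append(y)
        history.append(y)  # feed the output back in as the next input
    return np.array(out)


def cached_generate(first_sample, n_steps, weights):
    """Same loop, but each layer keeps a queue of its last d inputs."""
    queues = [deque([0.0] * d) for d in DILATIONS]
    x = first_sample
    out = []
    for _ in range(n_steps):
        h = x
        for (w_past, w_cur), q in zip(weights, queues):
            past = q.popleft()  # this layer's input from d steps ago
            q.append(h)         # store the current input for later steps
            h = np.tanh(w_past * past + w_cur * h)
        out.append(h)
        x = h
    return np.array(out)


if __name__ == "__main__":
    w = make_weights()
    slow = naive_generate(0.3, 50, w)
    fast = cached_generate(0.3, 50, w)
    print("outputs match:", np.allclose(slow, fast))
```

In this sketch the cached version does a constant amount of work per layer per step, while the naive version reprocesses the full history each time.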

ianni67 commented 7 years ago

My short-term goal is to experiment with WaveNet. The long-term goal is to train a net for music generation. Indeed, I also tried tensorflow-wavenet and got similar results. The output is very, very noisy, while the input is not. Probably I'm pushing the wrong buttons. Could you please give me some hints about how the input should be pre-conditioned, or about the kind of output I can expect? Or (even better) some initial pointers on how to fiddle with the parameters?
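On the pre-conditioning question, one step commonly used with WaveNet-style models is mu-law companding of the waveform into 256 discrete levels. The sketch below shows that transform and its inverse in NumPy. It assumes audio already scaled to the range [-1, 1]; the function names are hypothetical, and whether this repo's demo applies exactly this step is not confirmed in the thread.

```python
import numpy as np


def mu_law_encode(audio, channels=256):
    """Map float audio in [-1, 1] to integer codes in [0, channels - 1]."""
    mu = channels - 1
    audio = np.clip(audio, -1.0, 1.0)
    # Logarithmic compression: more resolution for quiet samples
    compressed = np.sign(audio) * np.log1p(mu * np.abs(audio)) / np.log1p(mu)
    # Shift from [-1, 1] to [0, mu] and round to the nearest integer
    return ((compressed + 1) / 2 * mu + 0.5).astype(np.int32)


def mu_law_decode(codes, channels=256):
    """Approximate inverse of mu_law_encode; returns floats in [-1, 1]."""
    mu = channels - 1
    compressed = 2 * (codes.astype(np.float64) / mu) - 1
    return np.sign(compressed) * np.expm1(np.abs(compressed) * np.log1p(mu)) / mu


if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 16000, endpoint=False)
    wave = 0.8 * np.sin(2 * np.pi * 440.0 * t)  # synthetic test tone
    codes = mu_law_encode(wave)
    restored = mu_law_decode(codes)
    print("code range:", codes.min(), codes.max())
    print("max round-trip error:", np.abs(wave - restored).max())
```

If the input is not normalized to [-1, 1] before encoding, the clipping step discards the loud parts of the signal, so it may be worth checking the amplitude range of a new recording first.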