chrisdonahue / wavegan

WaveGAN: Learn to synthesize raw audio with generative adversarial networks
MIT License

Issue generating tfrecords #10

Closed anoojpatel closed 6 years ago

anoojpatel commented 6 years ago

I've had trouble generating proper tfrecords using make_tfrecord.py.

The main issue arises when I try to create tfrecords that can be both previewed and trained on. I have installed all the prerequisites: ffmpeg, TensorFlow 1.9.0 (testing on CPU for now), Python 3, etc. The model appears to train normally on the tfrecords you provide through your download link, but not on records created by the script. To test this, I downloaded the piano wav files, generated tfrecords from them, and trained on those; the model finished training within a minute. When I trained on the piano records you provide, the model trained for the expected amount of time.

I have tried commenting out the audio_labels and modifying the script. Currently, I get this error when I run the script without modification:

Traceback (most recent call last):
  File "data/make_tfrecord.py", line 121, in <module>
    'id': tf.train.Feature(bytes_list=tf.train.BytesList(value=audio_id)),
TypeError: 'U' has type str, but expected one of: bytes

Would anyone know how to solve this issue?

Vaakapallo commented 6 years ago

Did you use a sh script like https://github.com/chrisdonahue/wavegan/blob/master/data/ljspeech.sh ?

I had this problem until I managed to set the parameters in that sort of sh script correctly, including nshards. Setting nshards to the number of audio clips per folder worked well, at least for me.

I noticed that training seems to fail silently if the tfrecords are not set up and named correctly. What worked for me was organizing the audio files into an 80-10-10 split in folders named 'train', 'valid', and 'test', then running a script like ljspeech.sh pointing to those folders, and finally running training on the folder with the newly created tfrecords.
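In case it helps anyone, here is a rough sketch of the 80-10-10 split described above. It is not part of the wavegan repo. The function names, the '*.wav' pattern, and the directory arguments are my own assumptions, so adjust them to your data.

```python
import random
import shutil
from pathlib import Path


def split_80_10_10(paths, seed=0):
    """Shuffle paths deterministically and split them 80/10/10."""
    paths = sorted(paths)
    random.Random(seed).shuffle(paths)
    n_train = int(0.8 * len(paths))
    n_valid = int(0.1 * len(paths))
    return {
        'train': paths[:n_train],
        'valid': paths[n_train:n_train + n_valid],
        # Whatever is left over goes to the test split.
        'test': paths[n_train + n_valid:],
    }


def copy_split(src_dir, dst_dir, pattern='*.wav', seed=0):
    """Copy audio files from src_dir into dst_dir/train, /valid and /test.

    src_dir, dst_dir and pattern are hypothetical; nothing here is
    prescribed by the wavegan scripts.
    """
    splits = split_80_10_10(Path(src_dir).glob(pattern), seed)
    for name, members in splits.items():
        out = Path(dst_dir) / name
        out.mkdir(parents=True, exist_ok=True)
        for f in members:
            shutil.copy2(f, out / f.name)
    # Report how many files landed in each folder.
    return {name: len(members) for name, members in splits.items()}
```

After something like copy_split('piano_wavs', 'piano_split'), the three folders can be passed to a script modeled on ljspeech.sh.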

leberknecht commented 5 years ago

For the logs: this seems to be a TF 1.6 side effect. Changing

'id': tf.train.Feature(bytes_list=tf.train.BytesList(value=audio_id)),

to

'id': tf.train.Feature(bytes_list=tf.train.BytesList(value=[bytes(audio_id, 'utf8')])),

did the trick for me
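A note on why this probably works, as far as I can tell: BytesList(value=...) expects a sequence of bytes objects. Under Python 3, passing a plain str makes it iterate over single characters, which would explain why the error complains about 'U' rather than the whole id. Here is a small sketch without TensorFlow. The helper name encode_id is made up for illustration and is not in the repo.

```python
def encode_id(audio_id, encoding='utf8'):
    """Return a one-element list of bytes, the shape BytesList(value=...) expects.

    Hypothetical helper, not part of make_tfrecord.py.
    """
    if isinstance(audio_id, bytes):
        # Already bytes (e.g. under Python 2, where str is bytes): just wrap it.
        return [audio_id]
    return [audio_id.encode(encoding)]


# Iterating over a str yields one-character strs, so the unmodified call
# ends up handing a single character to the bytes check.
print(list('Uab')[0])
print(encode_id('Uab'))
```

The result of encode_id(audio_id) could then be passed as value= in place of the inline bytes(...) expression.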