chrisdonahue / wavegan

WaveGAN: Learn to synthesize raw audio with generative adversarial networks
MIT License

Issue after shuffle buffer is filled #31

Closed: saraalemadi closed this issue 5 years ago

saraalemadi commented 5 years ago

Hi,

I have been trying to reproduce the results of your paper on the provided datasets (drums, SC09, piano), but I have run into an issue (screenshot attached): after the shuffle buffer is filled, everything hangs. I left the algorithm running for 10 days and nothing changed, nor did the process complete. I am using the following command to run the code: "python3 wavegan/train_wavegan.py train ./train --data_dir sc09/train". The versions of TensorFlow and the other libraries match what was recommended, and I am running on an Nvidia Titan V GPU.

How can I verify that training has completed, given that nothing is printed to the screen while the model is training?

Any help would be appreciated.

Thanks

[Screenshot: training console output after the shuffle buffer is filled]
chrisdonahue commented 5 years ago

The training script trains indefinitely; it will keep going until you cancel it. To monitor training in progress, please run TensorBoard on the training directory: tensorboard --logdir=./train

saraalemadi commented 5 years ago

Thanks, Chris, for your feedback. I have two questions about the above:

  1. CTRL+C does not terminate the running training process. Does force termination have any effect on the trained model?

  2. Which function should I use to generate audio files once I cancel training? Would it produce a batch of audio files or a single audio file per run?

Thanks -Sara

chrisdonahue commented 5 years ago

Hi Sara,

In response to your questions:

1) I am not sure why CTRL+C does not terminate the process, but force termination should not affect the trained model. If you happen to force terminate while the script is saving, the latest checkpoint may be corrupted, but the training script saves every few minutes by default, so you could use the second most recent checkpoint for generation.
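As a minimal sketch of that fallback (illustrative, not from the repo; it assumes the default ./train training directory and TF1's checkpoint-state API):

import tensorflow as tf

# List the checkpoints currently retained in the training directory.
ckpt_state = tf.train.get_checkpoint_state('./train')
print(ckpt_state.model_checkpoint_path)       # most recent checkpoint
print(ckpt_state.all_model_checkpoint_paths)  # ordered oldest to newest

# If the newest checkpoint is corrupted, fall back to the previous one;
# this path can then be passed to saver.restore() in place of the latest.
fallback_path = ckpt_state.all_model_checkpoint_paths[-2]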

2) The infer method in train_wavegan.py produces a TensorFlow MetaGraph in the training directory called infer.meta. You can load this graph and use it to generate as many audio clips as you want (either one or a batch of any size). Here is a minimal code snippet suitable for an IPython notebook:

import numpy as np
import tensorflow as tf
from IPython.display import display, Audio

# Load the graph
tf.reset_default_graph()
saver = tf.train.import_meta_graph('infer.meta')
graph = tf.get_default_graph()
sess = tf.InteractiveSession()
saver.restore(sess, 'model.ckpt')

# Create 50 random latent vectors z
_z = (np.random.rand(50, 100) * 2.) - 1

# Synthesize G(z)
z = graph.get_tensor_by_name('z:0')
G_z = graph.get_tensor_by_name('G_z:0')
_G_z = sess.run(G_z, {z: _z})

# Play audio in notebook
display(Audio(_G_z[0, :, 0], rate=16000))

You can also see this Colab notebook for more examples. This pull request also provides a script for doing this if notebooks aren't your thing.
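If you would rather write the generated batch to disk than play it in a notebook, here is a minimal sketch using scipy's WAV writer (it assumes _G_z from the snippet above and 16 kHz audio; it is illustrative, not part of the repo):

import numpy as np
from scipy.io import wavfile

# Assumes _G_z with shape [50, 16384, 1], as produced above.
for i in range(_G_z.shape[0]):
    audio = _G_z[i, :, 0]
    # wavfile.write accepts int16 PCM; scale the float waveform
    # from [-1, 1] to the int16 range before writing.
    pcm = (audio * 32767.).astype(np.int16)
    wavfile.write('gen_{}.wav'.format(i), 16000, pcm)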

yumath commented 5 years ago

How long should I train on an Nvidia GTX 1080 Ti GPU? I have also been waiting for a long time.

chrisdonahue commented 5 years ago

@yumath you need to use TensorBoard to monitor training. Our training script trains forever; you should stop it when you are satisfied with the results. Please see the README.

saraalemadi commented 5 years ago

Hi Chris,

When I run the script from the pull request you mentioned in your comment above to generate audio files, all I get is a one-second WAV file containing a single tone, and every time I run it I get the exact same tone (working with the drums dataset). I tried changing the code to produce multiple clips, similar to the Colab example, but they all seem to be exactly the same. Any insight on how to solve this would be greatly appreciated.

chrisdonahue commented 5 years ago

Hi Sara,

Apologies for the delay. Are you using different latent (z) vectors every time you run it? If so, the only reason the sounds would be the same is that the model has become extremely overfit to a single example. How big is your training dataset? Can you compare the waveforms (e.g. sum(abs(waveformA - waveformB))) to see if they're actually identical?

Cheers, Chris
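For reference, a minimal sketch of the comparison Chris suggests (the two waveforms here are random stand-ins; in practice you would use e.g. _G_z[0, :, 0] from two runs with different latent vectors):

import numpy as np

# Stand-ins for two generated waveforms.
waveform_a = np.random.uniform(-1., 1., 16384)
waveform_b = np.random.uniform(-1., 1., 16384)

# L1 distance between the two waveforms.
diff = np.sum(np.abs(waveform_a - waveform_b))
print('L1 difference:', diff)
# A value at or near zero means the outputs are effectively identical,
# which would point to severe overfitting or a reused z vector.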