chrisdonahue / wavegan

WaveGAN: Learn to synthesize raw audio with generative adversarial networks
MIT License

About the seed #36

Closed: spagliarini closed this issue 5 years ago

spagliarini commented 5 years ago

Hi,

I already asked you about some details, but I have a few more questions. I am pretty new to TensorFlow, so they may stem from that.

1) Is the same seed used throughout training?
2) In preview mode, I see that the first time the preview is generated, since there's no z, it is defined from the infer_metagraph. Is it then a z that was used during training?

One more question: if my dataset consists of WAV audio files longer than 1 s, is there anything in particular I should be aware of? For example, are there parameters (other than the WAV format options, which I have already applied) that might need to be changed?

chrisdonahue commented 5 years ago

> Is the same seed used throughout training?

I'm not sure what you mean. The code doesn't actually set the random seed (it probably should; my bad). Can you clarify if this doesn't answer your question?
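
To illustrate what setting the seed would change, here is a minimal sketch using only Python's standard library rather than the repository's TensorFlow code. The latent size of 100, the uniform range [-1, 1], and the `sample_z` helper are assumptions made for illustration; in TensorFlow 1.x the analogous graph-level call would be `tf.set_random_seed`.

```python
import random

LATENT_DIM = 100  # assumed latent size; check the training script's own default


def sample_z(rng, dim=LATENT_DIM):
    """Draw one latent vector with components uniform in [-1, 1] (hypothetical helper)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


# Two generators built from the same seed produce the same sequence of draws.
z_a = sample_z(random.Random(1234))
z_b = sample_z(random.Random(1234))

# An unseeded generator is initialised from OS entropy, so it differs from run to run.
z_c = sample_z(random.Random())

print(z_a == z_b)  # seeded draws are reproducible
print(z_a == z_c)  # an unseeded draw will not reproduce them
```

Without a fixed seed, as in the current code, every run behaves like `z_c`: the latent vectors drawn are different each time.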

> In preview mode, I see that the first time the preview is generated, since there's no z, it is defined from the infer_metagraph. Is it then a z that was used during training?

It is extremely unlikely (nearly impossible) that the z vector randomly selected in the preview routine will be identical to any used during training.
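
A rough way to see why a coincidence is so unlikely is sketched below, again in plain Python rather than the repository's code. The latent size of 100, the uniform range [-1, 1], the count of 1000 training draws, and the helper names are all assumptions chosen for illustration.

```python
import math
import random

LATENT_DIM = 100           # assumed latent size
NUM_TRAINING_DRAWS = 1000  # hypothetical number of z vectors seen during training


def sample_z(rng, dim=LATENT_DIM):
    """Draw one latent vector with components uniform in [-1, 1] (hypothetical helper)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Separate generators stand in for the independent training and preview draws.
train_rng = random.Random(0)
preview_rng = random.Random(1)

training_zs = [sample_z(train_rng) for _ in range(NUM_TRAINING_DRAWS)]
preview_z = sample_z(preview_rng)

exact_matches = sum(1 for z in training_zs if z == preview_z)
nearest = min(distance(z, preview_z) for z in training_zs)

print("exact matches:", exact_matches)
print("nearest training z is %.2f away" % nearest)
```

With 100 independent components, even the closest training vector ends up far from the preview vector, so an exact match is not something to expect in practice.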

> if my dataset consists of WAV audio files longer than 1 s, is there anything in particular I should be aware of? For example, are there parameters (other than the WAV format options, which I have already applied) that might need to be changed?

The training script should be configured out of the box to be appropriate for wav files >1s. If you are having issues, look at all of the command line args that start with --data. See the README explanation of these args for details: https://github.com/chrisdonahue/wavegan#data-considerations

spagliarini commented 5 years ago

Thanks for the reply!!!!

> It is extremely unlikely (nearly impossible) that the z vector randomly selected in the preview routine will be identical to any used during training.

I meant for the generation of the preview, not for the training.

> The training script should be configured out of the box to be appropriate for wav files >1s. If you are having issues, look at all of the command line args that start with --data. See the README explanation of these args for details: https://github.com/chrisdonahue/wavegan#data-considerations.

OK, that's my bad. I'm actually training with audio files shorter than 1 s (on the order of milliseconds), and I'm already using the --data_first_slice option. I was just wondering whether any other parameters should be changed, but I guess that's not the case.
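
For readers with similarly short clips, the sketch below shows conceptually what taking only the first slice of a file means when the clip is shorter than one slice. It is not the repository's loader: the slice length of 16384 samples, the `first_slice` helper, and the `pad_end` behaviour are assumptions for illustration, so the linked README section remains the reference for the actual --data options, including any that zero-pad short clips.

```python
SLICE_LEN = 16384  # assumed slice length in samples; confirm against the script's --data args


def first_slice(samples, slice_len=SLICE_LEN, pad_end=True):
    """Return the first slice of a clip, zero-padded to slice_len if requested.

    Returns None when the clip is too short and padding is disabled,
    mimicking a loader that would drop such a clip (hypothetical helper).
    """
    if len(samples) >= slice_len:
        return list(samples[:slice_len])
    if not pad_end:
        return None
    return list(samples) + [0.0] * (slice_len - len(samples))


# A hypothetical 50 ms clip at 16 kHz is 800 samples long.
short_clip = [0.1] * 800

padded = first_slice(short_clip)
skipped = first_slice(short_clip, pad_end=False)

print(len(padded), skipped)
```

The point of the sketch is that a clip of a few milliseconds only fills a small fraction of one slice, so whether short clips are padded or dropped matters more than it does for files longer than 1 s.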