chrisdonahue / wavegan

WaveGAN: Learn to synthesize raw audio with generative adversarial networks
MIT License

much longer training times with phaseshuffle on #94

Open markhanslip opened 4 years ago

markhanslip commented 4 years ago

Hi,

Thanks for the fantastic model.

On my current setup (Titan RTX / Ryzen 2700X / 32GB RAM), training WaveGAN takes far longer with phase shuffle on than off (the difference is huge, 10x or more). GPU usage is also much lower with phase shuffle on. Is this normal? I'm guessing that phase shuffle requires much more CPU intervention during training.

Best wishes,

Mark

chrisdonahue commented 4 years ago

Huh, that's odd. I don't remember this being the case when I trained models with phase shuffle in the past, but it's definitely possible I overlooked it. What version of TF are you using? I wonder if more recent versions of TensorFlow rely on CPU operations for the padding?
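For reference, phase shuffle shifts a layer's activations along the time axis by a random number of samples and fills the resulting gap by padding. Below is a minimal pure-Python sketch of that idea, assuming reflection padding and a radius `rad`. The function names are made up for illustration. This is not the repository's TensorFlow code, so it says nothing about whether the real op runs on the CPU or the GPU.

```python
import random


def shift_with_reflection(xs, phase):
    """Shift a 1-D list by `phase` samples, keeping its length.

    Positive `phase` shifts right, negative shifts left. The gap is filled
    by mirroring around the edge sample (the edge itself is not repeated).
    Assumes abs(phase) < len(xs).
    """
    pad_left = max(phase, 0)
    pad_right = max(-phase, 0)
    left = [xs[i] for i in range(pad_left, 0, -1)]   # xs[pad_left], ..., xs[1]
    right = [xs[-2 - i] for i in range(pad_right)]   # xs[-2], xs[-3], ...
    padded = left + list(xs) + right
    start = pad_right                                 # drop samples shifted out
    return padded[start:start + len(xs)]


def phase_shuffle(xs, rad, rng=random):
    """Apply a random shift drawn uniformly from [-rad, rad]."""
    phase = rng.randint(-rad, rad)  # with rad == 0 this is always 0
    return shift_with_reflection(xs, phase)


signal = [0, 1, 2, 3, 4, 5]
print(shift_with_reflection(signal, 2))   # [2, 1, 0, 1, 2, 3]
print(shift_with_reflection(signal, -2))  # [2, 3, 4, 5, 4, 3]
print(phase_shuffle(signal, 0))           # [0, 1, 2, 3, 4, 5]
```

In this sketch a radius of 0 leaves the input unchanged, which is the behaviour one would expect from turning phase shuffle off.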

markhanslip commented 4 years ago

Thanks for responding. I'm using TF 1.14 in the env I set up for WaveGAN. Maybe I should downgrade to 1.12?

chrisdonahue commented 4 years ago

You could try that, yeah, though I think I've trained on 1.14 in the past without issue. The only other thing I can think of is the CUDA/cuDNN versions having an impact.

mikemech83 commented 3 years ago

Hi, I just wanted to confirm something. To turn phase shuffle off, I believe the correct parameter is --wavegan_disc_phaseshuffle 0 rather than --phaseshuffle 0, contrary to other posts I've seen in the issues here. I tried to confirm this as best I could by looking through the Python scripts. However, when I start the script with --wavegan_disc_phaseshuffle 0, I still get the messages about filling up the shuffle buffer. Is this normal?

markhanslip commented 3 years ago

Hi,

I believe the correct parameter is --wavegan_disc_phaseshuffle 0.

I've also experienced the same behaviour as you (still filling up the shuffle buffer when phase shuffle is supposedly turned off).
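A note on the "filling up the shuffle buffer" messages: they are most likely unrelated to phase shuffle. TensorFlow's tf.data prints that message while a dataset shuffle stage fills its buffer of training examples, which randomizes the order in which examples are read. It would therefore be expected to appear whatever the phase shuffle setting is, assuming the training input pipeline shuffles its data this way. The pure-Python sketch below illustrates how such a buffer behaves. `shuffle_buffer` is a hypothetical helper written for illustration, not TensorFlow's implementation.

```python
import random


def shuffle_buffer(items, buffer_size, rng=None):
    """Yield `items` in a randomized order using a fixed-size buffer."""
    rng = rng or random.Random()
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # "Filling up" stage: nothing is emitted until the buffer is full.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit whatever is left, in random order.
    rng.shuffle(buffer)
    yield from buffer


examples = list(range(10))
shuffled = list(shuffle_buffer(examples, buffer_size=4, rng=random.Random(0)))
print(sorted(shuffled) == examples)  # True: same examples, different order
```

Nothing in this procedure touches the audio samples inside an example, which is where phase shuffle operates.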

In my experience, turning phase shuffle off works well with data_sample_rate 22050, data_slice_len 32768 and data_num_channels 1. Training is stable and much quicker with these params too. I haven't managed to turn it off with other sample rates, slice lengths or with stereo audio data.
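To put those parameters in context, the length of audio covered by each training slice is the slice length divided by the sample rate. A small worked example, using only the values quoted above (the helper name is hypothetical):

```python
def slice_duration_seconds(slice_len, sample_rate):
    """Seconds of audio in one slice of `slice_len` samples."""
    return slice_len / sample_rate


# data_slice_len 32768 at data_sample_rate 22050
duration = slice_duration_seconds(32768, 22050)
print(round(duration, 3))  # 1.486
```

So with these settings each mono example is roughly 1.5 seconds long.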

Hopefully Chris Donahue can shed more light :)