Training the new wavegan on speech dataset

chrisdonahue / wavegan

WaveGAN: Learn to synthesize raw audio with generative adversarial networks

MIT License

Training the new wavegan on speech dataset #47

Closed spagliarini closed 5 years ago

spagliarini commented 5 years ago

Hi Chris,

I wanted to train the model with the speech dataset you used. I would like to look at the latent space properties.

I already trained the model with the old version and had no issues during training. However, the .pkl files I got cannot be decoded with a plain pickle open/load.

Since I trained on another dataset with the new version and could decode the latent vectors, I would like to use the new version with the speech dataset as well. When I try, I run into the error "audioread.NoBackendError". Have you tried this training and/or experienced problems while reading the dataset? Should I add some particular option?

Thanks a lot for the help!

shin777kk commented 5 years ago

Can you tell me how to run WaveGAN and SpecGAN? I am following the README but can't get it to execute. Do I need to import a dataset? What is the command?

chrisdonahue commented 5 years ago

Sorry for the late reply; I was on vacation. This error is thrown by librosa when you don't have a suitable audio I/O backend installed. If you are training on the spoken digits dataset, you should be able to add the flag --data_fast_wav, which will both get rid of this error and make training faster.

spagliarini commented 5 years ago

Sorry to bother you again. Using that option, I run into a format error. I am pasting it here to make the error clear:

"File format b'\x80\x00\x01\x00'... not understood."

I also tried to use the option --data_first_slice but it does not solve the error.

chrisdonahue commented 5 years ago

Hi there. This looks like perhaps a Python 2/3 compatibility issue. Or perhaps an issue with the way you are passing the file format via the command line. Can you tell me the line of code where this error is raised?
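One way to narrow this down is to inspect the first bytes of the dataset files. The wording of the message matches what `scipy.io.wavfile` reports when a file does not begin with a RIFF header, so the sketch below assumes the fast WAV path reads files that way. The helper names `looks_like_riff_wav` and `find_non_riff_files` are hypothetical and not part of the wavegan code.

```python
import os


def looks_like_riff_wav(path):
    """Return True if the file starts with a RIFF/RIFX header tagged WAVE."""
    with open(path, "rb") as f:
        header = f.read(12)
    return (
        len(header) == 12
        and header[:4] in (b"RIFF", b"RIFX")
        and header[8:12] == b"WAVE"
    )


def find_non_riff_files(data_dir):
    """List .wav files under data_dir whose header is not RIFF/WAVE."""
    suspects = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.lower().endswith(".wav"):
                path = os.path.join(root, name)
                if not looks_like_riff_wav(path):
                    suspects.append(path)
    return sorted(suspects)
```

Calling `find_non_riff_files` on the directory passed as the data directory returns the files a plain WAV reader is likely to reject. An empty list would suggest the problem lies elsewhere, for example in how the arguments are passed on the command line.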

spagliarini commented 4 years ago

Hi, sorry for the delay.

Here are the lines where the error is raised.

in loader.py, line 128:
    fast_wav=decode_fast_wav)

line 36, in decode_audio:
    _wav, _fs = librosa.core.load(fp, sr=fs, mono=False)

line 112, in load:
    with audioread.audio_open(os.path.realpath(path)) as input_file:

line 116, in audio_open:
    raise NoBackendError()

audioread.NoBackendError

chrisdonahue commented 4 years ago

You don't have a program installed that librosa can use to read audio files. I usually install ffmpeg. More details: https://github.com/librosa/librosa/issues/589
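A quick way to confirm whether a decoder is visible to Python is to look for it on the PATH. This is a minimal sketch using only the standard library (`shutil.which`, Python 3.3+). The function name `find_audio_decoders` is hypothetical, and the candidate list assumes the command-line tools that audioread's FFmpeg backend looks for are `ffmpeg` and `avconv`.

```python
import shutil


def find_audio_decoders(candidates=("ffmpeg", "avconv")):
    """Map each candidate executable name to its full path, or None if absent."""
    return {name: shutil.which(name) for name in candidates}


if __name__ == "__main__":
    for name, path in find_audio_decoders().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If every entry comes back as not found, installing ffmpeg through the system package manager and rerunning the check in the same environment used for training is a reasonable next step.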