andabi / deep-voice-conversion

Deep neural networks for voice conversion (voice style transfer) in Tensorflow
MIT License

Parameters Used #3

Open Coice opened 6 years ago

Coice commented 6 years ago

How many epochs was the sample 'Kate' voice trained for?

Did you change any parameters (such as learning rate) while training?

Can you post a picture of your TensorBoard net/train/loss and net/eval/loss for the 'Kate' voice?

Thank you.

andabi commented 6 years ago

@coice

[Image: screenshot of the loss chart, 2017-11-09]

I trained the model for over 20K steps with Adam and a learning rate of 0.0005. It took more than a day.
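For readers unfamiliar with the optimizer named above, here is a minimal pure-Python sketch of the Adam update rule using the quoted learning rate (0.0005) and step count (20K). It is not the repository's training code: the toy quadratic loss, the function name `adam_minimize`, and the values of `beta1`, `beta2`, and `eps` (the commonly used defaults) are assumptions for illustration only.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.0005, steps=20000,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a scalar function with Adam, given its gradient function.

    lr and steps mirror the numbers quoted in the thread; beta1, beta2 and
    eps are assumed defaults, not values confirmed by the author.
    """
    x = x0
    m = 0.0  # running mean of gradients (first moment)
    v = 0.0  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy loss (x - 3)^2 with gradient 2 * (x - 3); purely illustrative.
x_final = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print("final x:", x_final)
```

Because each Adam step moves a parameter by roughly the learning rate at most, a small rate such as 0.0005 needs many thousands of steps to travel far, which is consistent with long training runs.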

Coice commented 6 years ago

Thank you for posting your chart.

Did you apply any other post processing to the audio?

I am unable to get my loss as low as your Kate dataset (0.5 loss). I am currently at around 2.1 loss with the arctic female dataset. Were you getting similar results with the arctic female dataset? Also, do you have any samples of the audio produced for the arctic dataset?

This is a sample of my model with the arctic female voice at 10K steps: arctic_sample.zip

Again, thank you for your time.

andabi commented 6 years ago

@coice My loss curve was from a simplified model in which one CBHG module (the mel part) is removed, so only one CBHG is used in train2. I think your result comes from a model that has not been trained enough. You can train the model longer or try the simplified model I mentioned.

Xiyor commented 6 years ago

@coice I have a question about running the code. In CPU TensorFlow mode I can run the code, but it is very slow. The train1.py result is about 70%, and train2.py is very slow. When I run the code in a container with tensorflow-gpu, it hangs forever, and I cannot figure out the cause. It is the same code, which is strange. What is your running environment: native OS or Docker? If you use Docker, could you provide the Dockerfile? Thank you.

Coice commented 6 years ago

@Xiyor My tests were done without Docker on CentOS 7.1 with tensorflow-gpu 1.2.0, using a GTX 1070 or a GTX Titan Pascal. I personally would not even attempt to train the model on a CPU because of the amount of time needed. I didn't make many changes on my first successful run. I would suggest double-checking the batch_size variable and your file paths in hparams.py.
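A rough sketch of the kind of sanity check meant here, in case it helps. It does not read the real hparams.py: the dictionary layout, the key name `data_path`, the example path, and the helper `check_hparams` are all hypothetical stand-ins, so adapt them to whatever your copy of hparams.py actually defines.

```python
import glob
import os


def check_hparams(hparams, path_keys=("data_path",)):
    """Return a list of human-readable problems found in a settings dict.

    `hparams` is a plain dict standing in for the values in hparams.py;
    the key names used here are assumptions for illustration.
    """
    problems = []

    batch_size = hparams.get("batch_size")
    if not isinstance(batch_size, int) or batch_size <= 0:
        problems.append("batch_size must be a positive integer, got %r" % (batch_size,))

    for key in path_keys:
        path = hparams.get(key)
        if not path:
            problems.append("%s is not set" % key)
        elif not glob.glob(os.path.expanduser(path)):
            # glob handles both plain paths and wildcard patterns like *.wav
            problems.append("%s matches no files: %s" % (key, path))

    return problems


# Example values only; replace with the settings you are actually using.
example = {"batch_size": 32, "data_path": "/path/to/your/dataset/*.wav"}
for problem in check_hparams(example):
    print("WARNING:", problem)
```

Running a check like this before training catches an empty dataset glob immediately instead of after the job has started.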

Xiyor commented 6 years ago

@coice Thank you for your suggestion. I will try it again.

d0030253 commented 6 years ago

@coice Hello, I have recently done a voice conversion experiment. Could you please give me more arctic samples to do the comparison?

caicaibins commented 5 years ago

@coice Hey coice, did you improve the conversion result, or does it still sound like a machine?

Coice commented 5 years ago

@caicaibins I gave up on this project a long time ago. I was never able to make it sound much better than the samples I provided. Have you found anything else that gives better results?