Open Coice opened 6 years ago
@coice
I trained the model over 20K steps with Adam and a learning rate of 0.0005. It took more than a day.
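The run length above is given in steps rather than epochs. A rough conversion between the two can be sketched as follows. The batch size and dataset size used here are placeholders chosen for illustration, not values reported in this thread.

```python
def steps_to_epochs(steps, batch_size, num_examples):
    """Approximate number of passes over the dataset after `steps` updates."""
    if batch_size <= 0 or num_examples <= 0:
        raise ValueError("batch_size and num_examples must be positive")
    return steps * batch_size / num_examples


if __name__ == "__main__":
    # Hypothetical values: batch size 32 and 1000 training utterances.
    print(steps_to_epochs(20000, batch_size=32, num_examples=1000))
```

Substitute the batch size from your own hparams.py and the number of utterances in your dataset to get a comparable figure.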
Thank you for posting your chart.
Did you apply any other post processing to the audio?
I am unable to get my loss as low as your Kate dataset (0.5 loss). I am currently at around 2.1 loss with the arctic female dataset. Were you getting similar results with the arctic female dataset? Also, do you have any samples of the audio produced for the arctic dataset?
This is a sample of my model with the arctic female voice at 10K steps: arctic_sample.zip
Again, thank you for your time.
@coice My loss curve was from the simplified model, in which one CBHG module (the mel part) is removed, so only one CBHG is used in train2. I think the result you got comes from a model that has not been trained enough. You can train the model longer or try the simplified model that I mentioned.
@coice I have a question about running the procedure. With CPU TensorFlow I can run the code, but it is very slow. The train1.py result is about 70%, and train2.py is very slow. When I run the same code in a container with tensorflow-gpu, it hangs forever, and I could not figure out the cause. What is your running environment: native OS or Docker? If you use Docker, could you provide the Dockerfile? Thank you.
@Xiyor My tests were done without Docker on CentOS 7.1 with tensorflow-gpu 1.2.0, using a GTX 1070 or a GTX Titan Pascal. I personally would not even attempt to train the model on a CPU because of the amount of time needed. I didn't make many changes for my first successful run. My suggestion is to double-check the batch_size variable and your file paths in hparams.py.
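A quick check along those lines could look like the sketch below. It is not part of the project: `check_training_config` and the example glob pattern are hypothetical, and you would pass in whatever path pattern and batch size your own hparams.py actually defines.

```python
import glob
import os


def check_training_config(data_glob, batch_size):
    """Return a list of problems found; an empty list means nothing obvious is wrong."""
    problems = []
    batch_ok = isinstance(batch_size, int) and batch_size > 0
    if not batch_ok:
        problems.append("batch_size must be a positive integer, got %r" % (batch_size,))

    # Expand "~" and resolve the pattern the same way a data loader typically would.
    files = glob.glob(os.path.expanduser(data_glob), recursive=True)
    if not files:
        problems.append("no files match %r" % data_glob)
    elif batch_ok and batch_size > len(files):
        problems.append(
            "batch_size (%d) is larger than the number of files (%d)"
            % (batch_size, len(files))
        )
    return problems


if __name__ == "__main__":
    # Hypothetical path pattern and batch size; replace with the values from hparams.py.
    for problem in check_training_config("datasets/arctic/slt/*.wav", 32):
        print("WARNING:", problem)
```

An empty dataset path or an oversized batch is worth ruling out before investigating the GPU setup itself, since either one can stall a data pipeline.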
@coice Thank you for your suggestion. I will try it again.
@coice Hello, I have recently done a voice conversion experiment. Could you please give me more arctic samples to do the comparison?
@coice Hey coice, did you improve the conversion result, or does it still sound like a machine?
@caicaibins I gave up on this project a long time ago. I was never able to make it sound much better than the samples I provided. Have you found anything else that gives better results?
How many epochs was the sample 'Kate' voice trained for?
Did you change any parameters (such as learning rate) while training?
Can you post a picture of your TensorBoard net/train/loss and net/eval/loss for the 'Kate' voice?
Thank you.