Question on train1.py's and train2.py's runtimes.

andabi / deep-voice-conversion

Deep neural networks for voice conversion (voice style transfer) in Tensorflow

MIT License


Question on train1.py's and train2.py's runtimes. #9

Open rolanvc opened 6 years ago

rolanvc commented 6 years ago

Hi,

I'm trying this out, but I only have a GTX 960 GPU. What kind of training run times do you get for these two scripts? Will it take me weeks, days, or hours? When I started python train1.py, a few lines of logging came out, but it seems to have frozen since then, with no sign of life and no error.

I hope you can give me an idea what to expect.

Thank you very much, =)

jswilson commented 6 years ago

I actually don't know how it compares to the GTX 960, but I was using a p2.xlarge on AWS (which is half of an NVIDIA Tesla K80), and train1 took about 72 hours to converge at around 70% accuracy.

VictoriaBentell commented 6 years ago

@jswilson would you mind sharing how long train2 took you, if you tried it on the arctic dataset?

VictoriaBentell commented 6 years ago

I've been running train2 for over a day now and my results are more erratic than a mountain goat on LSD, so I'm a tad concerned that it'll never converge.

0i0 commented 6 years ago

@VictoriaBentell Any news? Did it converge?

VictoriaBentell commented 6 years ago

No, I assume that it's similar to this issue. It seems like the arctic dataset itself might be the problem, and so I'm going to try and find another dataset to replace that.

Super1ZC commented 6 years ago

@VictoriaBentell May I ask: when you were using train2, did you use your own voice corpus? If so, how did you set it up?

VictoriaBentell commented 6 years ago

Originally I was using the arctic dataset, but I haven't done anything with it since my last response, so I'll try making my own corpus and get back to you on the results. As far as I'm aware, it should be as straightforward as replacing each wav file in bdl and slt with any source and target respectively. As long as the source and target are saying the same things for 3 or 4 seconds each, the result should be fine... I hope.
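Before swapping in a custom corpus like that, a quick sanity check on the two folders might save a wasted training run. The sketch below is only an illustration and not part of the repository: it assumes the source and target utterances are paired by identical file names (as in the arctic bdl and slt folders), that the files are plain PCM wav, and it uses the 3-second figure mentioned above as a minimum length. The function names and the `min_seconds` parameter are made up for this example.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM wav file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_parallel_corpus(source_dir, target_dir, min_seconds=3.0):
    """Pair wav files by file name and flag likely problems.

    Assumption (not confirmed by the repository): an utterance in
    source_dir corresponds to the file with the same name in target_dir.

    Returns three sorted lists of file names:
      paired    - present in both folders
      unpaired  - present in only one folder
      too_short - paired, but at least one side is under min_seconds
    """
    source = {p.name: p for p in Path(source_dir).glob("*.wav")}
    target = {p.name: p for p in Path(target_dir).glob("*.wav")}

    paired = sorted(source.keys() & target.keys())
    unpaired = sorted(source.keys() ^ target.keys())

    too_short = []
    for name in paired:
        shortest = min(
            wav_duration_seconds(source[name]),
            wav_duration_seconds(target[name]),
        )
        if shortest < min_seconds:
            too_short.append(name)

    return paired, unpaired, too_short
```

Calling check_parallel_corpus("bdl", "slt") on the two folders and printing the three lists would show which recordings are missing a counterpart or are shorter than intended. It does not check that both speakers actually say the same sentence; that still has to be verified by ear.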

iamxiaoyubei commented 5 years ago

@VictoriaBentell May I ask whether your model converges now? Did it work out with the arctic dataset? If you have a train2 result, could you please share it with me so I can try it?