Closed: rgzn-aiyun closed this issue 4 years ago.
@rgzn-aiyun Because the new version calculates f0/energy for FastSpeech2, and that's slow :D
Half an hour has passed and the preprocessing still hasn't finished. Is there any way to speed up the generation?
@rgzn-aiyun The preprocessing already uses multiprocessing; for LJSpeech it needs 10-15 minutes to calculate. I think that's normal, and you only need to calculate it once :D.
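For context, the parallel preprocessing mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the repo's actual code; `math.sqrt` stands in for the real per-utterance feature extraction:

```python
import math
from concurrent.futures import ProcessPoolExecutor

def preprocess(items, workers=4):
    # Each utterance is independent, so the per-item work maps cleanly
    # onto a pool of worker processes. In the real pipeline each task
    # would extract mel/f0/energy for one audio file; math.sqrt is a
    # trivially runnable placeholder here.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(math.sqrt, items))

print(preprocess([4, 9, 16]))  # [2.0, 3.0, 4.0]
```

Since every utterance is processed independently, throughput scales roughly with the number of worker processes, up to the CPU core count.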
I estimate it will take at least an hour to generate, which is an unacceptable speed!
@rgzn-aiyun Why is it unacceptable, since you only need to calculate it once? BTW, the slowness is caused by https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder, which is used to calculate F0. I think it's slow by nature.
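For reference, the cost comes from running a pitch estimator over every frame of every utterance. As a rough, self-contained illustration (my own naive autocorrelation sketch, not WORLD/DIO's actual algorithm), something like this runs once per frame, which is why a full dataset takes a while:

```python
import numpy as np

def naive_f0(x, fs, frame_len=1024, hop=256, fmin=70.0, fmax=400.0):
    """Very rough per-frame F0 via autocorrelation (illustration only)."""
    lag_min, lag_max = int(fs / fmax), int(fs / fmin)
    f0 = []
    for start in range(0, len(x) - frame_len, hop):
        frame = x[start:start + frame_len]
        frame = frame - frame.mean()
        # Full autocorrelation of the frame, positive lags only.
        ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
        if ac[0] <= 0:  # silent frame
            f0.append(0.0)
            continue
        # Best lag in the plausible pitch range -> fundamental period.
        lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
        f0.append(fs / lag)
    return np.array(f0)
```

On a one-second 220 Hz sine at 16 kHz this recovers roughly 219 Hz per frame; real extractors like DIO/StoneMask add refinement passes on top of this kind of search, which presumably accounts for much of the runtime.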
Yes; because it's running on cloud compute, I can't waste too much time.
@rgzn-aiyun So maybe you need to pre-calculate locally and then upload to the cloud? Or you can skip calculating f0/energy (in case you aren't training FastSpeech2) by using the old preprocessing files.
This is indeed a solution.
Something went wrong?

```
Traceback (most recent call last):
  File "train_fastspeech2.py", line 515, in
```
@dathudeptrai Doesn't FastSpeech2 need to extract the duration?
@rgzn-aiyun Yes, FastSpeech2 needs to extract durations; the difference between FastSpeech and FastSpeech2 is that FastSpeech2 uses f0/energy.
The authors' paper does not seem to require extracting them.
@rgzn-aiyun They use MFA to extract durations; FastSpeech needs a duration file, this is a "MUST".
@rgzn-aiyun As @dathudeptrai said, they use MFA to get rid of the teacher model and extract durations. You can either extract durations from Tacotron or head over to this version, which supports MFA and phonetic training. To see its performance, you can play with my model in this notebook, although that one is only at 40k steps since I just turned on rounded durations.
@dathudeptrai @ZDisket Can we use a CTC-decoder-like concept to get rid of durations? :thinking:
@manmay-nakhashi Yes, any ASR algorithm can be used to extract durations.
@dathudeptrai So DeepSpeech uses CTC loss for alignment; if we use CTC loss and integrate it with FastSpeech, can we eliminate the duration-calculation step from Tacotron 2?
@manmay-nakhashi Yes, you can use MFA too; I will integrate MFA into the repo soon.
@dathudeptrai sure :smile:
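To sketch the CTC idea from the exchange above: given a frame-level best path from any CTC-trained ASR model, per-token durations fall out by counting frames. This toy function is my own illustration (assumptions: blank id is 0, blank frames are folded into the preceding token, leading blanks are dropped); it is not how the repo, MFA, or DeepSpeech actually does it:

```python
def ctc_path_to_durations(path, blank=0):
    """Collapse a frame-level CTC best path into (token_id, n_frames) pairs."""
    durations = []
    prev = blank
    for tok in path:
        if tok == blank:
            if durations:
                durations[-1][1] += 1   # fold blank frames into previous token
        elif tok == prev:
            durations[-1][1] += 1       # repeated frame of the same token
        else:
            durations.append([tok, 1])  # new token starts here
        prev = tok
    return [tuple(d) for d in durations]

# Frames: blank, three 3's, two blanks, two 5's, blank.
print(ctc_path_to_durations([0, 3, 3, 3, 0, 0, 5, 5, 0]))  # [(3, 5), (5, 3)]
```

Note that a repeated token separated by a blank (e.g. `[1, 1, 0, 1]`) correctly starts a new segment, matching CTC's collapse rule.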
@dathudeptrai

> Yes, you can use MFA too; I will integrate MFA into the repo soon.

When?
@ZDisket After merging the multi-GPU branch. BTW, do you want to make a PR?
@dathudeptrai I will once I finish training the current model and if I like it. Keep in mind my code is messy and I have no idea what I'm doing 90% of the time.
@dathudeptrai
Resuming training failed: `Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?`
@rgzn-aiyun Can you share your command line for resuming training?
```
CUDA_VISIBLE_DEVICES=0 python examples/fastspeech2/train_fastspeech2.py \
  --train-dir ./dump/train/ \
  --dev-dir ./dump/valid/ \
  --outdir ./model/ \
  --config ./examples/fastspeech2/conf/fastspeech2.v2.yaml \
  --use-norm 1 \
  --f0-stat ./dump/stats_f0.npy \
  --energy-stat ./dump/stats_energy.npy \
  --mixed_precision 1 \
  --resume ./model/checkpoints/model-15000.h5
```
@rgzn-aiyun Use the checkpoint prefix instead: `--resume ./model/checkpoints/ckpt-15000`
I will update the README so it doesn't cause confusion anymore :D
ok, I will try.
@dathudeptrai An error is reported when saving the model a while after resuming training:
/checkpoints/checkpoint.tmp0030f793c7724ccaa4e3bed038d04f81; Permission denied
@rgzn-aiyun `chmod -R 777 checkpoints/`. Also, the preprocessing time calculation is bugged; I'll make a PR with a fix for the time calculation and a lot more stuff in the next few days.
Hi, I hadn't updated for a while; I just pulled the latest code and ran it, and found that preprocessing now takes longer. It's been 10 minutes and it still hasn't finished generating. Why?
```
tensorflow-tts-preprocess --rootdir ./datasets/ --outdir ./dump/ --conf preprocess/ljspeech_preprocess.yaml
```

```
[Preprocessing]:   6% 625/10000 [10:48<2:42:07, 1.04s/it]
[Preprocessing]:   6% 625/10000 [11:29<2:52:16, 1.10s/it]
[Preprocessing]:   6% 625/10000 [11:47<2:56:51, 1.13s/it]
[Preprocessing]:   6% 625/10000 [12:29<3:07:17, 1.20s/it]
```