NVIDIA / radtts

Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, Diverse Synthesis, and Generative Modeling and Fine-Grained Control over of Low Dimensional (F0 and Energy) Speech Attributes.
MIT License
280 stars 40 forks source link

train decatndur #25

Open 11721206 opened 1 year ago

11721206 commented 1 year ago

image

in readme, when I finished train attention and decoder, next it to train duration. the code is radtts.py if 'dpm' in include_modules: dur_model_config['hparams']['n_speaker_dim'] = n_speaker_dim self.dur_pred_layer = get_attribute_prediction_model( dur_model_config)

I just want to ask if i want train duration, the include_modules should be decatndur ? Thanks

unilight commented 1 year ago

@11721206 I think it's either a typo or the authors changed the code. You should use decatndpm instead of decatndur.

egorsmkv commented 1 year ago

@unilight is right. We need to see loss_duration: 3.021 in training logs.

By the way, don't forget to use appropriate config file during the inference step.