Style not being applied

kannadaraj commented 4 years ago

Hi. I have trained a Mellotron model on single-speaker data with multiple speaking styles, like a story audiobook. The data has a high degree of intonation and pitch variation. Total duration is about 19 hours. Training goes well, and the curves and alignment also look good.

But during inference, when I try to use a style file to impart style, nothing is applied. The output is synthesized normally, as if no style were available. I tried both variants: simple GST, and GST + f0 + pitch variation, i.e. inference and inference_noattn. Neither the style nor the pitch/f0 variation is applied. The duration of the audio is similar to that of the synthesized audio.

Can you please suggest what might be the problem? Thanks.

rafaelvalle commented 4 years ago

The pitch contour should definitely be applied. Can you share mel-spectrograms, alignments, and your pitch contour? What code are you using for running inference?

kannadaraj commented 4 years ago

@rafaelvalle: thanks a lot for replying.

Here I am using a style file to mimic the style. The sentence in the style file is not the same as the sentence being synthesized. I am using GST-only mode, via:

mel_outputs, mel_outputs_postnet, gate_outputs, _ = mellotron.inference((text_encoded, mel, speaker_id, pitch_contour))

The input text is "I am spending time with the family." The style file is a highly emotive (happy) sentence. But I see just normal synthesis. It doesn't apply any of the style from the file. Can you please help?

[Image: plot_mel_f0_alignment]

rafaelvalle commented 4 years ago

Can you pull from master and try again?

kannadaraj commented 4 years ago

@rafaelvalle thanks for the update. I am retraining with your length inclusion modification. Will keep you posted 👍

NVIDIA / mellotron

Style not being applied #64