NVIDIA / flowtron

Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
https://nv-adlr.github.io/Flowtron
Apache License 2.0

Inverted 2nd flow alignment #48

Closed kurbobo closed 4 years ago

kurbobo commented 4 years ago

I trained the 1st flow until the alignment image looked reasonably good, then trained the second flow as described in previous issues. After a while, the alignment of the second flow started to look like this: [image]

This seems strange to me because I expected the same picture as for the 1st flow. Nevertheless, the sound is good. Could anybody explain why the alignment looks like this?

rafaelvalle commented 4 years ago

All looks good: in our default setup, odd steps of flow attend to mels in the forward direction while even steps of flow attend to the mel in the backward direction. Keep training while validation loss keeps decreasing!
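To illustrate why the second flow's alignment plot appears mirrored, here is a minimal toy sketch in plain Python. It is not Flowtron's actual code: the function names (`flip_valid_frames`, `step_direction`, `attended_token_per_frame`) and the linear toy alignment are assumptions made purely for illustration. The idea is that a backward step sees the mel frames in reversed time order, so a monotonic alignment plotted against processing position runs from the last text token to the first.

```python
def flip_valid_frames(frames, length):
    """Reverse the first `length` frames and keep trailing padding in place.

    Hypothetical helper: stands in for reversing a padded mel sequence in time.
    """
    return frames[:length][::-1] + frames[length:]


def step_direction(step_index):
    """1-based flow step index: odd steps read forward, even steps backward."""
    return "forward" if step_index % 2 == 1 else "backward"


def attended_token_per_frame(n_frames, n_tokens, step_index):
    """Toy monotonic alignment (an assumption, not learned attention).

    Returns, for each processing position, the index of the text token
    that position attends to.
    """
    forward = [min(t * n_tokens // n_frames, n_tokens - 1)
               for t in range(n_frames)]
    if step_direction(step_index) == "backward":
        # The step processes frames last-to-first, so the plot is mirrored.
        return forward[::-1]
    return forward


if __name__ == "__main__":
    print(attended_token_per_frame(6, 3, step_index=1))  # rising alignment
    print(attended_token_per_frame(6, 3, step_index=2))  # falling alignment
    print(flip_valid_frames([1, 2, 3, 0, 0], length=3))  # padding untouched
```

In this sketch the first flow step yields a rising diagonal and the second a falling one, which matches the mirrored picture described in the question while the underlying text-to-frame correspondence is unchanged.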

kurbobo commented 4 years ago

And what is the reason for this rule: "odd steps of flow attend to mels in the forward direction while even steps of flow attend to the mel in the backward direction"?

rafaelvalle commented 4 years ago

In our experiments we found that inverting the direction at every flow step provides better likelihood and audio quality.