NVIDIA / flowtron

Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
https://nv-adlr.github.io/Flowtron
Apache License 2.0

Custom trained model and dataset problem #152

Open sickdyingdead opened 2 years ago

sickdyingdead commented 2 years ago

Hi,

When I run inference in Jupyter, my trained model (90 speakers, 45k iterations) works, but not exactly how it should. I tested it with suprised.txt and its dataset, and I do get voices, but a lot of speakers sound like the speaker four IDs before them. For example, ID 86 sounds like ID 82, while ID 0 or 1 sounds good.

I think the issue is with the dataset, because I know I should use my train.txt with all my IDs. But what about the dataset itself? Is there some code to use my model as a dataset? Colab Flowtron Speech doesn't need any wavs; they're only used for training, which I have already done. How do I set my flowtron.pt as the dataset in Jupyter? Is there any code for this?

Of course, I tried putting all my wavs in the data folder together with train.txt containing all the IDs, and the dataloader sees the correct number of speakers. But it never finishes loading: the kernel just stays busy the whole time, because the dataset is gigabytes in size.
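To illustrate what I mean by checking the IDs without loading any audio, here is a minimal sketch that reads only the speaker column of a filelist and maps each raw ID to a contiguous index. It assumes every line of train.txt is pipe-separated with the speaker ID as the last field, and that the model indexes speakers by their position in the sorted list of training IDs. Both are assumptions, not something I have confirmed. `build_speaker_lookup` is a hypothetical helper, not part of Flowtron, and the sample lines are made up.

```python
import io


def build_speaker_lookup(filelist_lines, sep="|"):
    """Map each raw speaker ID to a contiguous index (hypothetical helper).

    Assumes the speaker ID is the last separator-delimited field on each line.
    Only text is read; no wav file is opened.
    """
    raw_ids = set()
    for line in filelist_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # rsplit so that separators inside the text field do not matter
        raw_ids.add(int(line.rsplit(sep, 1)[-1]))
    # Sort numerically, then number the IDs 0..N-1
    return {raw_id: index for index, raw_id in enumerate(sorted(raw_ids))}


# Tiny in-memory stand-in for train.txt (made-up entries)
sample = io.StringIO(
    "wavs/a.wav|hello there|0\n"
    "wavs/b.wav|good morning|82\n"
    "wavs/c.wav|good night|86\n"
    "wavs/d.wav|hello again|0\n"
)
lookup = build_speaker_lookup(sample)
print(lookup)
```

With a real file, the same function could be called as `build_speaker_lookup(open("train.txt", encoding="utf-8"))`. If the indices printed here differ from the IDs passed at inference, that mismatch might explain one speaker sounding like another.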

Please help me.