WelkinYang / GradTTS

Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"
MIT License
182 stars, 19 forks

Grad TTS in multispeaker setting #2

Open ajinkyakulkarni14 opened 3 years ago

ajinkyakulkarni14 commented 3 years ago

I noticed that in model.py, "gin_channels" is passed to DiffusionGenerator.

Does Grad-TTS support multi-speaker TTS training?

Could you also provide a pretrained model trained on the LJSpeech (LJS) dataset?

I had some difficulty installing Horovod on the GPU clusters on our server side, so I changed train.py from Horovod to torch.distributed.

Thank you for the repo.

WelkinYang commented 3 years ago

I have not tried a multi-speaker dataset, because LJSpeech and our internal Mandarin dataset are both single-speaker. I think multi-speaker training is feasible the same way Glow-TTS does it, by setting gin_channels and passing g.
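For readers unfamiliar with the Glow-TTS convention, here is a minimal sketch of what conditioning on gin_channels and g typically looks like. It is not code from this repository: the class name SpeakerConditionedLayer, the layer sizes, and the emb_g / cond attribute names are assumptions chosen for illustration, and the real modules may inject g differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpeakerConditionedLayer(nn.Module):
    """Hypothetical layer that adds a projected speaker embedding to hidden features."""

    def __init__(self, n_speakers, hidden_channels, gin_channels):
        super().__init__()
        # Speaker lookup table; gin_channels is the size of the speaker vector.
        self.emb_g = nn.Embedding(n_speakers, gin_channels)
        # 1x1 convolution maps the speaker vector to the hidden size.
        self.cond = nn.Conv1d(gin_channels, hidden_channels, kernel_size=1)
        self.conv = nn.Conv1d(hidden_channels, hidden_channels, kernel_size=3, padding=1)

    def forward(self, x, speaker_ids):
        # x: [batch, hidden_channels, time]; speaker_ids: [batch]
        g = F.normalize(self.emb_g(speaker_ids)).unsqueeze(-1)  # [batch, gin_channels, 1]
        # The projected speaker vector is broadcast over the time axis.
        return self.conv(x + self.cond(g))


layer = SpeakerConditionedLayer(n_speakers=4, hidden_channels=8, gin_channels=16)
x = torch.randn(2, 8, 50)
speaker_ids = torch.tensor([0, 3])
y = layer(x, speaker_ids)
```

With gin_channels set to 0 and g left as None, a model written this way would skip the conditioning branch and behave as a single-speaker model, which is presumably the current setup here.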

As for the pre-trained model, I would like to provide a checkpoint and expect to do so in a few months.

I trained my model on a compute cluster that supports Horovod but not torch.distributed, so I changed the Glow-TTS code to use Horovod.
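For anyone who needs to go the other way, as described in the question, the following is a rough sketch of how the usual Horovod calls map onto torch.distributed. It is not the train.py from this repository: the helper names init_distributed, make_loader and wrap_model are hypothetical, and it assumes the job is started with a launcher such as torchrun that sets RANK, WORLD_SIZE and LOCAL_RANK.

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


def init_distributed():
    """Stands in for hvd.init(); reads rank information from the environment."""
    backend = "nccl" if torch.cuda.is_available() else "gloo"
    dist.init_process_group(backend=backend)  # default init method is env://
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    if torch.cuda.is_available():
        torch.cuda.set_device(local_rank)
    return dist.get_rank(), dist.get_world_size(), local_rank


def make_loader(dataset, batch_size, rank, world_size):
    """Stands in for a sampler built from hvd.rank() and hvd.size()."""
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank, shuffle=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler), sampler


def wrap_model(model, local_rank):
    """Stands in for hvd.DistributedOptimizer plus hvd.broadcast_parameters."""
    if torch.cuda.is_available():
        return DDP(model.cuda(local_rank), device_ids=[local_rank])
    return DDP(model)


# The data sharding can be checked without starting a process group.
dataset = TensorDataset(torch.arange(10))
loader, sampler = make_loader(dataset, batch_size=2, rank=0, world_size=2)
```

In a real run, init_distributed() would be called once at startup and sampler.set_epoch(epoch) at the start of each epoch so that shuffling differs between epochs.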