How to use multiple persons voice datasets for training and inference

NVIDIA / tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference



How to use multiple persons voice datasets for training and inference #607

Open satwiksunnam19 opened 9 months ago

satwiksunnam19 commented 9 months ago

Hello @cobr123 @sih4sing5hong5 @taras-sereda

I have several datasets, each arranged in the LJ Speech dataset format.

For example, say the datasets are named "Dhoni", "Virat", and "shewag", each containing audio clips of the corresponding person. Can we train a single model on all of these datasets, or do we have to train a separate model for each one?

Does tacotron2 support this type of training?
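
To make the setup concrete, here is a rough sketch of how per-person folders like these could be merged into one combined filelist with a speaker index on every line. It is only an illustration under assumptions: `merge_ljspeech_style` is a hypothetical helper, each folder is assumed to follow the LJ Speech layout (`metadata.csv` plus a `wavs/` directory), and the extra speaker-ID column is my own addition that a model would need its own speaker-conditioning code to make use of.

```python
import csv
from pathlib import Path


def merge_ljspeech_style(root, speakers):
    """Return 'wav_path|text|speaker_id' lines for every speaker folder under root.

    Hypothetical helper: assumes root/<speaker>/metadata.csv and
    root/<speaker>/wavs/<clip_id>.wav, as in the LJ Speech layout.
    """
    lines = []
    for speaker_id, name in enumerate(speakers):
        metadata = Path(root) / name / "metadata.csv"
        with open(metadata, encoding="utf-8", newline="") as f:
            # LJ Speech rows are pipe-separated: clip id | raw text | normalized text
            for row in csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE):
                if not row:
                    continue  # skip blank lines
                clip_id, text = row[0], row[-1]  # last column = normalized text
                wav = Path(root) / name / "wavs" / f"{clip_id}.wav"
                # Assumed output format: audio path, transcript, integer speaker index
                lines.append(f"{wav.as_posix()}|{text}|{speaker_id}")
    return lines
```

Calling `merge_ljspeech_style("data", ["Dhoni", "Virat", "shewag"])` would tag every clip with index 0, 1, or 2 according to the order of the names, so a single training list covers all three voices.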