MoonInTheRiver / DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
MIT License

How to train a model from scratch, with new data set? #18

Closed (duheyu closed this issue 1 year ago)

duheyu commented 2 years ago

The training steps given in the README.md for DiffSinger require your saved checkpoints and your training data. Could you please explain how to train a model from scratch with a new dataset?

MoonInTheRiver commented 2 years ago

This repo already supports several types of datasets:

  1. LJSpeech (open-source dataset)
  2. PopCS (our dataset)
  3. Opencpop (dataset from the Opencpop team)

Also, we found that DiffGAN-TTS re-implemented our DiffSpeech on their own multi-speaker dataset. We think people can read and understand our data pipeline and then train a model with a new dataset, as the DiffGAN-TTS team did.

duheyu commented 2 years ago

We've been struggling with this for a while now, without success. Can someone please advise on how to train DiffSinger from scratch?

We have a dataset prepared and ready, but we don't understand which commands to run to train without the given checkpoint, or whether anything needs to be modified in the code. Any help is appreciated, thank you.

ghost commented 2 years ago

Yes, please add instructions on how to run the training steps of DiffSinger from scratch. You can use the PopCS dataset as an example; just explain how to start training from the beginning, without any saved checkpoints (creating new ones).