Closed Tortoise17 closed 1 month ago
@Tortoise17 Of course.
You can follow https://github.com/SpeechColab/GigaSpeech2/blob/main/pipeline/crawler/README.md.
Make sure you assign the right language id.
For example, the ISO 639-1 language code of Thai is th and the ISO 639-2 language code of Thai is tha.
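As a rough illustration of the language id point, here is a minimal sketch of a lookup between language names and their ISO 639-1 (two-letter) and ISO 639-2 (three-letter) codes. The Thai entry comes from the reply above; the other entries, the `LANGUAGE_CODES` table and the `language_id` helper are hypothetical and not part of the GigaSpeech2 crawler. Check the crawler README for which code form it actually expects.

```python
# Hypothetical lookup table, not part of the GigaSpeech2 pipeline.
# ISO 639-1 codes have two letters, ISO 639-2 codes have three.
LANGUAGE_CODES = {
    "Thai": {"iso639_1": "th", "iso639_2": "tha"},
    "Indonesian": {"iso639_1": "id", "iso639_2": "ind"},
    "Vietnamese": {"iso639_1": "vi", "iso639_2": "vie"},
}


def language_id(name: str, standard: str = "iso639_1") -> str:
    """Return the language code for `name` in the requested ISO standard."""
    try:
        return LANGUAGE_CODES[name][standard]
    except KeyError as exc:
        raise ValueError(f"No {standard} code known for {name!r}") from exc


if __name__ == "__main__":
    print(language_id("Thai"))               # two-letter code
    print(language_id("Thai", "iso639_2"))   # three-letter code
```

Failing loudly on an unknown language is deliberate here: a wrong or missing language id would otherwise only show up later, after the crawl has already run.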
Great, and is the format one that can be used for training audio LDM2, e.g. for a speech model? I guess.
Just one question: as a reference estimate, how much time did it take to generate the prepared 30,000-hour dataset, and on which GPU did you prepare it?
> Great, and is the format one that can be used for training audio LDM2, e.g. for a speech model? I guess.

@Tortoise17 Yes. The original file type is probably .webm and it will be converted into .wav.
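For reference, the .webm to .wav step could look roughly like the sketch below, which shells out to the `ffmpeg` CLI. This is an assumption about how one might do the conversion, not the pipeline's own code: the helper names are hypothetical, and the 16 kHz mono target is a common choice for speech models, not something stated in this thread.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(src: Path, dst: Path, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that extracts mono audio at `sample_rate` Hz."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),           # input file (e.g. a downloaded .webm)
        "-vn",                    # drop any video stream
        "-ac", "1",               # downmix to one channel (assumed target)
        "-ar", str(sample_rate),  # resample (16 kHz is an assumed default)
        str(dst),
    ]


def convert_to_wav(src: Path, sample_rate: int = 16000) -> Path:
    """Convert `src` to a .wav file next to it; requires ffmpeg on PATH."""
    dst = src.with_suffix(".wav")
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
    return dst


if __name__ == "__main__":
    # Prints the command without running it.
    print(" ".join(build_ffmpeg_command(Path("clip.webm"), Path("clip.wav"))))
```

Separating command construction from execution makes it easy to inspect or log the exact command before converting a large batch.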
> Just one question: as a reference estimate, how much time did it take to generate the prepared 30,000-hour dataset, and on which GPU did you prepare it?

@Tortoise17 You can refer to the following:
We suggest using faster-whisper and multiple GPUs to parallelize the transcription.
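One possible way to follow that suggestion is sketched below: split the audio files round-robin across GPUs and run one faster-whisper model per GPU in its own process. This is not the GigaSpeech2 pipeline's actual implementation. The helper names, the `large-v3` model size, the `float16` compute type and the `th` language default are all assumptions for illustration.

```python
import multiprocessing as mp
from typing import Dict, List, Sequence


def shard_round_robin(paths: Sequence[str], num_gpus: int) -> List[List[str]]:
    """Split `paths` into `num_gpus` shards, assigning files in turn."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [list(paths[i::num_gpus]) for i in range(num_gpus)]


def transcribe_shard(gpu_index: int, paths: List[str],
                     language: str = "th",
                     model_size: str = "large-v3") -> Dict[str, str]:
    """Transcribe one shard on one GPU and return {path: text}."""
    # Imported lazily so the sharding helper works without faster-whisper.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cuda",
                         device_index=gpu_index, compute_type="float16")
    results = {}
    for path in paths:
        # transcribe() returns a lazy iterator of segments plus metadata.
        segments, _info = model.transcribe(path, language=language)
        results[path] = " ".join(seg.text.strip() for seg in segments)
    return results


def transcribe_all(paths: Sequence[str], num_gpus: int,
                   language: str = "th") -> Dict[str, str]:
    """Run one worker process per GPU and merge the transcripts."""
    shards = shard_round_robin(paths, num_gpus)
    ctx = mp.get_context("spawn")  # avoid forking a CUDA-initialized process
    with ctx.Pool(processes=num_gpus) as pool:
        parts = pool.starmap(
            transcribe_shard,
            [(i, shard, language) for i, shard in enumerate(shards)],
        )
    merged: Dict[str, str] = {}
    for part in parts:
        merged.update(part)
    return merged
```

When using this as a script, call `transcribe_all` from inside an `if __name__ == "__main__":` guard, since the `spawn` start method re-imports the module in each worker. Round-robin sharding is the simplest option; if file durations vary a lot, balancing shards by total duration would likely use the GPUs more evenly.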
By giving the crawler the paths of other channels, is it possible to generate my own dataset in another language?