Open mutonix opened 7 months ago
Many thanks for your great work! I have a question about how you collected the subtitles. Did you download them directly from YouTube, or did you generate them with an ASR model?
Hi @mutonix, thanks for your interest in this dataset! The subtitles come directly from YouTube; we do not use a separate ASR model to produce them. If you use the script in this repo to download the dataset, you will also get the YouTube subtitles.