Open karim23657 opened 6 months ago
Sure, I would like to bring in more contributors to help create a close-to-perfect Persian/Farsi TTS. It is not easy to explain everything end to end.
How did you get the transcription of the rokhpodcast.ir audio tracks? For voice transcription, I am using the Subtitle Edit tool to align the text with the voice, and then I split the audio based on the aligned subtitle, which is in XML format.
How did you align the voice with the transcription? I am using the auto-generated subtitles, and also its YouTube channel.
Size of dataset? 10 hours.
Tools you used for creating your dataset? A Python script and Subtitle Edit.
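For anyone who wants to try the splitting step described above, here is a minimal sketch of the idea, not the actual script used for this dataset. It assumes the aligned subtitle is an XML file with `Paragraph` elements containing `StartMilliseconds`, `EndMilliseconds`, and `Text` children (a hypothetical layout; adjust the tag names to whatever your export really contains), and that the audio is an uncompressed WAV file. The function names `read_segments` and `split_wav` and the `file|text` metadata format are also just illustrative choices.

```python
import wave
import xml.etree.ElementTree as ET
from pathlib import Path


def read_segments(xml_text):
    """Return (start_ms, end_ms, text) tuples from the ASSUMED XML layout."""
    root = ET.fromstring(xml_text)
    segments = []
    for p in root.iter("Paragraph"):  # hypothetical tag name
        start_ms = int(p.findtext("StartMilliseconds"))
        end_ms = int(p.findtext("EndMilliseconds"))
        text = (p.findtext("Text") or "").strip()
        # Skip empty lines and zero-length or inverted spans.
        if text and end_ms > start_ms:
            segments.append((start_ms, end_ms, text))
    return segments


def split_wav(wav_path, segments, out_dir):
    """Write one WAV clip per segment plus a metadata.csv of 'file|text' rows."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = []
    with wave.open(str(wav_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        for i, (start_ms, end_ms, text) in enumerate(segments):
            # Convert millisecond timestamps to frame offsets.
            start_frame = start_ms * rate // 1000
            n_frames = (end_ms - start_ms) * rate // 1000
            src.setpos(min(start_frame, src.getnframes()))
            frames = src.readframes(n_frames)
            name = f"clip_{i:04d}.wav"
            with wave.open(str(out_dir / name), "wb") as dst:
                dst.setparams(params)  # frame count is fixed up on close
                dst.writeframes(frames)
            rows.append(f"{name}|{text}")
    (out_dir / "metadata.csv").write_text("\n".join(rows) + "\n", encoding="utf-8")
    return rows
```

Calling `split_wav("episode.wav", read_segments(xml_text), "clips")` would then produce `clips/clip_0000.wav`, `clips/clip_0001.wav`, and so on, together with a `metadata.csv` pairing each clip with its text. Auto-generated subtitles usually need a manual pass afterwards to fix wrong words and clipped boundaries.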
I'll try to record a video on how to gather data and train. We can also set up online meetings with other contributors who are willing to help, and start creating something close to perfect.
Please join https://discord.gg/JerZTvVK if you'd like to get in touch and collaborate more :)
Thank you @SadeghKrmi, I wanted to know the following about your dataset creation process: