huggingface / dataspeech

MIT License
222 stars 23 forks

Dataset not from datasets lib #4

Open netagl opened 2 months ago

netagl commented 2 months ago

Hi, is it possible to run this pipeline on a dataset that is not in the datasets library? Thanks!

ylacombe commented 2 months ago

Hey @netagl, not currently. What dataset and dataset format do you have in mind? Note that it's quite easy to add a dataset to the library (and you can keep it private if you want, of course)!

netagl commented 2 months ago

How can I add a dataset to the library? I don't have a specific dataset yet. Right now I'm working on collecting relevant audio and transcriptions for my task, and I would like to add them as a dataset to the datasets library in order to run your pipeline (and eventually run Parler-TTS). @ylacombe

ylacombe commented 2 months ago

You should be able to do it by following the instructions in the datasets docs here. Let me know if that helps.
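As a rough illustration of what those instructions can look like for local audio plus transcriptions, here is a minimal sketch, not an official dataspeech recipe. It assumes your clips sit in one folder and relies on the `audiofolder` loader from the datasets library, which reads a `metadata.csv` with a `file_name` column. The `text` column name, the helper names `write_audiofolder_metadata` and `load_and_push`, and the repo id are hypothetical choices for this example.

```python
import csv
from pathlib import Path


def write_audiofolder_metadata(root, transcripts):
    """Write metadata.csv mapping each audio file under `root` to its transcription.

    `transcripts` maps a file name relative to `root` (e.g. "clip_001.wav")
    to its text. Returns the path of the written metadata file.
    """
    root = Path(root)
    # Fail early if a transcription points at a file that does not exist.
    missing = [name for name in transcripts if not (root / name).is_file()]
    if missing:
        raise FileNotFoundError(f"audio files not found under {root}: {missing}")

    metadata_path = root / "metadata.csv"
    with metadata_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # `file_name` is the column the audiofolder loader looks for;
        # `text` is an assumed name for the transcription column.
        writer.writerow(["file_name", "text"])
        for name, text in sorted(transcripts.items()):
            writer.writerow([name, text])
    return metadata_path


def load_and_push(root, repo_id, private=True):
    """Load the folder with the audiofolder loader and upload it to the Hub.

    Requires `pip install datasets` and a logged-in Hub account.
    """
    from datasets import load_dataset  # third-party import, kept local on purpose

    dataset = load_dataset("audiofolder", data_dir=str(root))
    dataset.push_to_hub(repo_id, private=private)
    return dataset
```

Usage would be something like `write_audiofolder_metadata("my_audio", {"clip_001.wav": "hello there"})` followed by `load_and_push("my_audio", "your-username/your-dataset")`, where both the folder and the repo id are placeholders. Passing `private=True` keeps the uploaded dataset private.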

ylacombe commented 2 months ago

Additionally, I've added a FAQ with your question answered here: https://github.com/huggingface/dataspeech?tab=readme-ov-file#how-do-i-use-datasets-that-i-have-with-this-repository