NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

How to use new audio+text dataset #304

Closed by saj1919 5 years ago

saj1919 commented 5 years ago

Hi, I have 500+ hours of audio data as mp3 files with corresponding text, already segmented into chunks of about 1 minute. I am not sure what data format is used here. Can someone suggest how I can create a dataset to train with OpenSeq2Seq? The only relevant thing I found was https://github.com/klintan/swedish-asr-dataset, but it does not explain how to run the training.

saj1919 commented 5 years ago

I went through the code. I think I can build my own dataset by modifying the following files, so I am closing the issue.

  1. Config - https://github.com/NVIDIA/OpenSeq2Seq/blob/master/example_configs/speech2text/ds2_small_1gpu.py
  2. Data Preprocessing - https://github.com/NVIDIA/OpenSeq2Seq/blob/master/scripts/import_librivox.py
  3. Training - https://github.com/NVIDIA/OpenSeq2Seq/blob/master/run.py
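
For anyone landing here later, the data preprocessing step can be sketched roughly as below. This is a minimal sketch, not the project's own script. It assumes the speech2text data layer reads a CSV manifest with the columns `wav_filename`, `wav_filesize` and `transcript` (the layout the LibriSpeech import script appears to produce), and that the mp3 files have already been converted to WAV with an external tool. The helper names `normalize_transcript` and `write_manifest` are hypothetical, and the a-z, apostrophe and space character set is an assumption about the vocabulary.

```python
import csv
import os
import re


def normalize_transcript(text):
    """Lowercase the text and keep only a-z, apostrophe and space.

    The allowed character set is an assumption; adjust it to match
    the vocabulary file used by your config.
    """
    text = text.lower()
    text = re.sub(r"[^a-z' ]+", " ", text)
    return re.sub(r" +", " ", text).strip()


def write_manifest(pairs, csv_path):
    """Write (wav_path, transcript) pairs to a CSV manifest.

    Pairs whose audio file does not exist are skipped.
    Returns the number of data rows written.
    """
    rows = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Assumed header, mirroring the LibriSpeech-style manifest.
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav_path, transcript in pairs:
            if not os.path.isfile(wav_path):
                continue
            writer.writerow([
                os.path.abspath(wav_path),
                os.path.getsize(wav_path),
                normalize_transcript(transcript),
            ])
            rows += 1
    return rows
```

The resulting CSV path would then go into the data layer parameters of a copy of the config file (the `dataset_files` entry, if I read the example config correctly), and training is started through `run.py` with that config.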