How to use new audio+text dataset

NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

https://nvidia.github.io/OpenSeq2Seq

Apache License 2.0


How to use new audio+text dataset #304

Status: Closed (closed by saj1919 5 years ago)

saj1919 commented 5 years ago

Hi, I have 500+ hours of MP3 audio files with corresponding text, already segmented into chunks of about 1 minute, but I am not sure which data format is used here. Can someone suggest how I can create a dataset to train with OpenSeq2Seq? The only relevant thing I found was https://github.com/klintan/swedish-asr-dataset, but it does not explain how to run the training.

saj1919 commented 5 years ago

I went through the code. I think I can build my own dataset by modifying the following files, so I am closing the issue.

Config - https://github.com/NVIDIA/OpenSeq2Seq/blob/master/example_configs/speech2text/ds2_small_1gpu.py
Data Preprocessing - https://github.com/NVIDIA/OpenSeq2Seq/blob/master/scripts/import_librivox.py
Training - https://github.com/NVIDIA/OpenSeq2Seq/blob/master/run.py
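For anyone landing here later, a rough sketch of the preprocessing step, not taken from the repository. It assumes the MP3 files have already been converted to WAV (for example with sox or ffmpeg), that each `name.wav` sits next to a `name.txt` transcript, and that the speech2text data layer reads the DeepSpeech-style CSV that `import_librivox.py` produces, with columns `wav_filename`, `wav_filesize`, `transcript`. The helper names `normalize_transcript` and `build_manifest` are made up for illustration.

```python
import csv
import os


def normalize_transcript(text):
    """Lowercase and keep only letters, apostrophes, and single spaces."""
    kept = [c if (c.isalpha() or c in "' ") else " " for c in text.lower()]
    return " ".join("".join(kept).split())


def build_manifest(audio_dir, csv_path):
    """Write one CSV row per WAV file that has a same-named .txt transcript.

    Column names are an assumption based on the DeepSpeech-style CSV layout.
    Returns the number of rows written (header excluded).
    """
    rows = []
    for name in sorted(os.listdir(audio_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".wav":
            continue
        txt_path = os.path.join(audio_dir, stem + ".txt")
        if not os.path.isfile(txt_path):
            continue  # skip audio that has no transcript
        wav_path = os.path.abspath(os.path.join(audio_dir, name))
        with open(txt_path, encoding="utf-8") as f:
            transcript = normalize_transcript(f.read())
        rows.append((wav_path, os.path.getsize(wav_path), transcript))

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        writer.writerows(rows)
    return len(rows)
```

Usage would be something like `build_manifest("data/my_corpus/train", "data/my_corpus/train.csv")`, with the resulting CSV path then placed wherever the example config lists its dataset files. If the transcripts contain characters outside the English alphabet, the vocabulary file referenced by the config probably needs to be extended to match.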