NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0
1.54k stars, 371 forks

Audio captioning #436

Closed. flassTer closed this issue 5 years ago

flassTer commented 5 years ago

Hello team, does OpenSeq2Seq support audio captioning (detecting the start time and end time of a predicted word)? Thanks.

vsl9 commented 5 years ago

Hello George, there is an experimental tool for that. We tested it with Jasper models. First of all, you'll need to install Baidu's beam search decoder:

scripts/install_decoders.sh

Then please follow these instructions: https://nvidia.github.io/OpenSeq2Seq/html/speech-recognition/speech-to-text-align.html
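For intuition only, here is a minimal sketch of how word start and end times can be read off frame-level CTC output. It is not the OpenSeq2Seq alignment tool and does not use its API. The function name `words_with_times`, the blank symbol `_`, and the 20 ms frame duration are assumptions made for illustration; use the label set and output frame duration of your own model.

```python
from typing import List, Tuple

BLANK = "_"  # hypothetical blank symbol; real models use their own index


def words_with_times(
    frame_labels: List[str], frame_sec: float
) -> List[Tuple[str, float, float]]:
    """Collapse per-frame CTC labels into (word, start_sec, end_sec) tuples.

    frame_labels holds one best label per output frame (greedy decoding).
    frame_sec is the duration of one output frame in seconds (an assumption
    the caller must supply for their model).
    """
    words = []
    chars = []    # characters of the word currently being built
    start = None  # first frame index of the current word
    end = None    # last non-blank frame index of the current word
    prev = BLANK

    for i, label in enumerate(frame_labels):
        if label == " ":
            # A space closes the current word, if there is one.
            if chars:
                words.append(
                    ("".join(chars), start * frame_sec, (end + 1) * frame_sec)
                )
                chars, start, end = [], None, None
        elif label != BLANK:
            # CTC rule: repeated labels merge unless a blank separates them.
            if label != prev:
                chars.append(label)
                if start is None:
                    start = i
            end = i
        prev = label

    # Flush the last word when the sequence does not end with a space.
    if chars:
        words.append(("".join(chars), start * frame_sec, (end + 1) * frame_sec))
    return words


# Toy input: 12 frames, assumed to be 20 ms each.
frames = ["_", "h", "h", "_", "i", "_", " ", "_", "y", "o", "o", "_"]
result = words_with_times(frames, frame_sec=0.02)
for word, t0, t1 in result:
    print(f"{word}: {t0:.2f}s to {t1:.2f}s")
```

Greedy frame labels give only rough boundaries. The experimental tool linked above works with the beam search decoder, so its timestamps may differ from this sketch.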

flassTer commented 5 years ago

Thank you @vsl9