vadimkantorov opened 4 years ago
We are trying to reproduce some of your results on the newly available Russian speech-to-text dataset: https://github.com/snakers4/open_stt . The key questions are model capacity, model depth, and compute requirements for training.
Could you please share the training learning curves (loss, CER, WER) for wav2letter++ and Jasper (5x3, 10x5)? They would be a great addition to the paper or to https://nvidia.github.io/OpenSeq2Seq/html/speech-recognition/jasper.html and would make the trade-offs Jasper makes clearer.
Thank you very much!
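For anyone comparing runs on open_stt: a minimal sketch of how CER and WER can be computed from reference/hypothesis pairs via Levenshtein edit distance. The helper names here are illustrative, not from the wav2letter++ or OpenSeq2Seq codebases.

```python
# Minimal sketch: CER / WER via dynamic-programming Levenshtein distance.
# Function names are illustrative, not from any of the toolkits discussed.

def edit_distance(ref, hyp):
    # Single-row DP over token sequences: dp[j] holds the distance between
    # the first i tokens of ref and the first j tokens of hyp.
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # Deletion, insertion, or substitution/match.
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
    return dp[-1]

def cer(ref, hyp):
    # Character error rate: character-level edits over reference length.
    return edit_distance(list(ref), list(hyp)) / max(len(ref), 1)

def wer(ref, hyp):
    # Word error rate: word-level edits over reference word count.
    ref_words, hyp_words = ref.split(), hyp.split()
    return edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)

print(wer("the cat sat", "the cat sat down"))  # one insertion over 3 words
```

Plotting these per epoch against training loss would give exactly the kind of learning curves requested above.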