NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

Jasper / wav2letter receptive field #338

Open fminkin opened 5 years ago

fminkin commented 5 years ago

Hello! The Jasper / Wav2Letter+ models consist of blocks with dilated convolutions. It seems that they can't work online, because the overall time context (the receptive field) is usually wider than the piece of audio available at inference time. One could limit the future context, padding where it is missing, as long as the quality remains comparable, I guess?

Am I correct, or did I miscalculate the receptive field? Thanks in advance.
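
For reference, the receptive field of a stack of 1D convolutions can be estimated with the standard recurrence: each layer adds `(kernel_size - 1) * dilation` steps, scaled by the product of the strides of all earlier layers. The sketch below is only an illustration of that arithmetic. The function name `receptive_field`, the layer list, and the 10 ms frame shift are assumptions made for the example, not the actual Jasper or Wav2Letter+ configuration.

```python
def receptive_field(layers):
    """Receptive field, in input frames, of stacked 1D convolutions.

    layers: list of (kernel_size, stride, dilation) tuples, ordered
    from the input towards the output.
    """
    rf = 1    # a single output step sees one input frame before any layer
    jump = 1  # spacing, in input frames, between adjacent steps at this depth
    for kernel_size, stride, dilation in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical layer stack (NOT a real model config): (kernel, stride, dilation)
layers = [
    (11, 2, 1),
    (11, 1, 1),
    (13, 1, 1),
    (29, 1, 2),  # dilated layer
    (1, 1, 1),
]

frames = receptive_field(layers)
frame_shift_s = 0.01  # assumed 10 ms feature frame shift

# With symmetric ("same") padding, about half of the context lies in the future.
future_frames = (frames - 1) // 2

print(f"receptive field: {frames} frames (~{frames * frame_shift_s:.2f} s)")
print(f"future context:  {future_frames} frames (~{future_frames * frame_shift_s:.2f} s)")
```

The future context is the part that matters for online use: a streaming decoder has to wait for that many frames, or substitute padding for them, before it can emit an output for the current frame.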

borisgin commented 5 years ago

The current implementation is targeted at offline ASR.