Closed: abhinavkulkarni closed this issue 2 years ago
Hi,
We described these limitations in the accompanying article (https://habr.com/ru/post/581960/):
We had to put a full stop somewhere (pun intended), so the following ideas were left for future work:
Support inputs consisting of several sentences;
Try model factorization and pruning (i.e. attention head pruning);
Add some relevant meta-data from the spoken utterances, i.e. pauses or intonations (or any other embedding);
Support for paragraphs consisting of several sentences will be added in the next version.
Hi,
I fed this short audio clip to an STT engine and obtained the following transcription:
Feeding this as is to the text enhancer model in example.ipynb produces the following output:
You can see it misses almost all the periods.
Thanks!