Open ralf3u opened 1 year ago
We have postprocessing network for that. See https://alphacephei.com/vosk/models#punctuation-models
When I click on the link https://github.com/uhh-lt/vosk-model-tuda-de on the webpage, I get a "Page not found".
@ralf3u Many links, many outdated links :-(
But this is not a punctuation / recasing model; the one you might be interested in is named vosk-recasepunc-de-0.21
@svenha
... the one you might be interested in is named vosk-recasepunc-de-0.21
Yes, you are right. Thank you for the hint. When I clicked on the link in https://github.com/alphacep/vosk-api/issues/1204#issuecomment-1321274192, the headline "Punctuation models" was not visible. I didn't see it because it was hidden. Just try it yourself. I think it would be good to move the anchor further up on the webpage, so that the headline is visible.
"For punctuation and case restoration ...". What does that mean?
Does that mean that if I say "colon" it will write ":"?
Does that mean that words which start with a capital letter will be respected?
@nshmyrev
We have postprocessing network for that.
I don't understand "postprocessing network". So first I use a model from the normal "Model list", and then afterwards a model from "Punctuation models"?
Model list
English:
default: does not respect capital letters
vosk-model-en-us-0.22: does not respect capital letters
vosk-model-en-us-0.42-gigaspeech: does not respect capital letters
German:
vosk-model-de-0.21: does not respect capital letters
vosk-model-de-tuda-0.6-900k: does respect capital letters (see also https://github.com/alphacep/vosk-api/issues/1208).
So, the English models from above don't respect capital letters, but the German model vosk-model-de-tuda-0.6-900k does respect capital letters.
Some words in English start with a capital letter, like Monday or January. Vosk does not respect this at the moment. In German, many words start with a capital letter, which means that a lot of corrections are necessary.
Would it not be possible to check the text automatically against a dictionary before Vosk writes it to a text document, so that the capital letters of such words are respected?
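To illustrate the kind of dictionary check meant here, a minimal sketch in plain Python follows. It is not part of Vosk; the word list `CASE_DICTIONARY` and the function `restore_case` are hypothetical names made up for this example, and the input string stands in for the lowercase text a recognizer would return.

```python
import re

# Hypothetical word list: lowercase form -> correctly capitalized form.
# A real list would be much larger and language-specific.
CASE_DICTIONARY = {
    "monday": "Monday",
    "january": "January",
    "berlin": "Berlin",
}


def restore_case(text, dictionary=CASE_DICTIONARY):
    """Replace every word found in the dictionary with its capitalized form."""

    def fix(match):
        word = match.group(0)
        # Words that are not in the dictionary are returned unchanged.
        return dictionary.get(word.lower(), word)

    return re.sub(r"\w+", fix, text)


# Stand-in for lowercase recognizer output.
print(restore_case("we meet on monday in january"))
```

A plain lookup like this cannot handle words whose capitalization depends on context (for example German "essen" versus "Essen"), which is presumably one reason a separate recasing model is offered instead.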