microsoft / vscode

Visual Studio Code
https://code.visualstudio.com
MIT License

Difficulties with German speech to text and English words #213664

Open chrmarti opened 1 month ago

chrmarti commented 1 month ago

Testing #213355

It seems to autocorrect "Bash" to something else in: "Liste alle Textdateien in Bash." (German for "List all text files in Bash.")

bpasero commented 1 month ago

Yeah, in general these models are optimised for the words of a single language. I doubt they are optimised for a mix of languages, at least not in the configuration we use, where we explicitly pick one language.

I think a multi-language speech model would work better in this case, but that would probably come at the cost of a much larger size on disk and maybe also worse performance.

chrmarti commented 1 month ago

Makes sense. It might make speech to text difficult to use in any language other than English, though. Maybe the speech models could come with English as a second language, to compensate for the fact that so many software development terms are in English even when the user otherwise works in their local language? (As opposed to using a speech model with all supported languages combined, which might be too large.)