jdittrich closed this issue 5 years ago
Thanks for your report! Actually, I wasn't aware that the English download section doesn't provide a full setup. I suggest using the model used by Pocketsphinx, the library that does the speech recognition. It has only one (English) model, which I decided not to ship with Parlatype because of its size. I also guess not everybody wants an English model, and some users may not want this feature at all, for any language.
Pocketsphinx is on GitHub, and you can download the whole project as a zip file: https://github.com/cmusphinx/pocketsphinx/archive/master.zip
Extract the directory model/en_us/ and save it somewhere in your home directory. (The Flathub version of Parlatype has read-only permissions, and only for your home directory.) Choose this directory on the first page of the assistant. On the second page, choose the language model without "phone" in the name. Confirm, and this should actually work.
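The extraction step can also be scripted. The following is only a minimal sketch, not something Parlatype or Pocketsphinx provides: the function name `extract_subdir` is made up for illustration, and it assumes the zip has a single top-level folder (as GitHub archive downloads usually do) and that the model directory is named as described above.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_subdir(zip_path, subdir, dest):
    """Copy every file below `subdir` into `dest` and return the new paths.

    `subdir` is given relative to the archive's single top-level folder
    (assumption: the archive has exactly one, e.g. "pocketsphinx-master/").
    """
    wanted = PurePosixPath(subdir).parts
    dest = Path(dest)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Drop the top-level folder name from the member path.
            parts = PurePosixPath(info.filename).parts[1:]
            # Skip members outside `subdir` and anything with ".." in it.
            if parts[:len(wanted)] != wanted or ".." in parts:
                continue
            relative = parts[len(wanted):]
            if not relative:
                continue
            target = dest.joinpath(*relative)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            extracted.append(target)
    return extracted
```

For example, `extract_subdir("master.zip", "model/en_us", Path.home() / "parlatype-models" / "en_us")` would place the model files in a (hypothetical) `parlatype-models` folder in your home directory, which you can then select in the assistant. If the function returns an empty list, the directory name inside the archive probably differs from the one given here.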
In the CMU Sphinx download section the German models are complete. You can download, for example, cmusphinx-de-voxforge-5.2.tar.gz, and it is fully set up.
I have to admit finding a suitable model isn't always easy, and even then the results are not always usable. First of all, you need a good-quality recording. I'm thinking of marking this feature as experimental, as it's also below my own expectations. A more serious approach would have to include some adapting/training to improve accuracy, but that's out of scope at the moment.
I have to admit finding a suitable model isn't always easy and then the results are not always usable.
I really liked that I could try this out, but yes, the results were not great.
I'm thinking of marking this feature as experimental as it's also below my own expectations
Probably makes sense. It is a fun feature to play with, but for non-hacker-ish purposes it is probably more of a distraction, currently.
A more serious approach…
I have high hopes for Mozilla's Common Voice/DeepSpeech project, but so far there are no easy-to-integrate results.
Installation and version
Parlatype 1.6 from dl.flathub.org
Your desktop environment
Issue
I tried to set up speech recognition. I used several setups; none has shown any effect. Currently I tried with
(but I also tried with others, namely the pruned and ptm files)
What I do: I load an mp3 (or wav) file, set the marker to the beginning, set transcription to automatic, and play.
Result: It does not put out any text in the textpad, as far as I can see (I can type in there, though).
Note: Possibly my setup is not correct. In this case it could help to suggest combinations of files to use in the settings.