Open termx88 opened 3 years ago
Thanks for the report! I may need to adjust the default decoding parameters to avoid this error. It is curious that it happens for you on Linux but not on Windows. I wonder whether a difference in audio volume or noise between the two could be affecting it. Regardless, you can adjust this parameter by setting the engine parameter decoder_init_config={'lattice_beam':6}, but there currently isn't an easy way to set it from the standard Caster loader. I submitted a pull request to dragonfly that should make this easier soon. If you are comfortable editing your Python site-packages, I can tell you where to modify it temporarily.
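For anyone loading the engine directly through dragonfly rather than through the Caster loader, the setting above could be passed roughly as in the sketch below. This is only an illustration under assumptions: the helper names kaldi_engine_options and create_engine are hypothetical, the model_dir value "kaldi_model" is a placeholder for wherever your model lives, and it assumes your installed dragonfly version forwards decoder_init_config to the Kaldi backend as described above.

```python
def kaldi_engine_options(lattice_beam=6, model_dir="kaldi_model"):
    """Build keyword arguments for dragonfly's Kaldi engine.

    Hypothetical helper: model_dir is a placeholder path, and
    lattice_beam is the decoder parameter discussed in this thread.
    """
    return {
        "model_dir": model_dir,
        "decoder_init_config": {"lattice_beam": lattice_beam},
    }


def create_engine(lattice_beam=6, model_dir="kaldi_model"):
    """Create the Kaldi engine with a custom lattice beam.

    Assumes dragonfly is installed with its Kaldi backend; the import
    is kept local so the options helper works without it.
    """
    from dragonfly import get_engine

    options = kaldi_engine_options(lattice_beam, model_dir)
    return get_engine("kaldi", **options)
```

Calling create_engine(lattice_beam=4) would then request a narrower beam, which should reduce lattice size at some possible cost in recognition accuracy.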
I tried setting the 'lattice_beam' parameter to 6, and the change took effect, since the error now reports a requested beam of 6:
INFO:engine:Listening... [KALDI severity=-1] Did not reach requested beam in determinize-lattice: size exceeds maximum 50000000 bytes; (repo,arcs,elems) = (33837632,921856,15265440), after rebuilding, repo size was 27490176, effective beam was 5.33964 vs. requested beam 6
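When trying several beam values, it can help to pull the effective and requested beam out of the warning automatically. The following is a minimal sketch that assumes the message format shown in the log line above; the function name parse_beam_warning is hypothetical and is not part of Kaldi or dragonfly.

```python
import re

# Matches the tail of the determinize-lattice warning quoted above.
BEAM_RE = re.compile(
    r"effective beam was (?P<effective>\d+(?:\.\d+)?) "
    r"vs\. requested beam (?P<requested>\d+(?:\.\d+)?)"
)


def parse_beam_warning(line):
    """Return (effective_beam, requested_beam) as floats, or None if absent."""
    match = BEAM_RE.search(line)
    if match is None:
        return None
    return float(match.group("effective")), float(match.group("requested"))
```

A result whose effective beam is below the requested beam indicates that the lattice hit the size limit before the requested beam was reached.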
I should have emphasized that the real problem is that no commands are executed until after the error.
I also tried setting it to 4 and 2. With those values it doesn't throw the error, but recognition still doesn't work until I have quickly said ~20 commands (such as "numb one", which works fine afterwards). At that point it executes all the commands said so far, and commands said afterwards are executed practically immediately, as they should be. It is as if it doesn't start processing commands until the voice buffer(?) reaches a certain size. The output right after those ~20 commands is more or less normal and looks like this:
INFO:engine:Listening... Numbers: [
] numb , , 10 10Numbers: [ ] numb , , 110 110Numbers: [ ] numb , , 4110 4110Numbers: [ ] numb , , 110 110Numbers: [ ] numb , , 101 101Numbers: [ ] numb , , 1 1Numbers: [ ] numb , , 1
To add/clarify what else I tried:
Confirming that it's still a problem in 2.1 with my laptop microphone. I also tested with an external microphone, and it works more or less properly with that, though I think recognition is slower than on Windows. I noticed that muting the laptop microphone causes the last few commands to be recognized, though it doesn't start recognizing new commands after those are executed. Also, using the external microphone for one command and then unplugging it makes the laptop's microphone work properly.
I'm running Kaldi through Caster on Linux (Kubuntu). After listening starts, no commands are executed until I have spoken very roughly 20 commands, at which point I get the error below. After that, everything seems to work as expected. It happens every time on Linux with both the small and big models (the 20200905_1ep ones). On Windows 10 (on the same laptop) it works as expected.
log of Caster; log of Caster with debug logging mode