artiso-solutions / CoVoX

MIT License

Test multiple ContinuousRecognition performance #17

Closed kczornik closed 3 years ago

kczornik commented 3 years ago

Create a demo to test the performance of multiple-language ContinuousRecognition.

kczornik commented 3 years ago

A test was run with logging of every step in the continuous recognition demo (poc/demos/multiple-continuous-translation). Four SpeechRecognizer instances were started, each recognizing one language (en, de, it, es). The recognized text was sent to the English translator endpoint (without a from language). Two sentences were spoken into the mic:

| Lang | State | Result | Delay (ms) |
| --- | --- | --- | --- |
| en-US | Listening... | | 235.1733 |
| en-US | SpeechStartDetected | | 5916.0466 |
| en-US | Recognizing... | shut | 6124.5357 |
| en-US | Recognizing... | shelter | 6124.8716 |
| en-US | Recognizing... | shelter asleep | 6339.7879 |
| en-US | Recognizing... | charlatan | 6342.2823 |
| en-US | Recognized | Shelterless leeton | 6577.6358 |
| en-US | Translated | Shelterless leeton (en) | 8471.5025 |
| en-US | Recognizing... | turn | 9768.5601 |
| en-US | Recognizing... | turn on | 9971.6659 |
| en-US | Recognizing... | turn on the light | 10186.9086 |
| en-US | Recognized | turn on the light. | 10793.993 |
| en-US | Translated | turn on the light. (en) | 11703.9361 |

| Lang | State | Result | Delay (ms) |
| --- | --- | --- | --- |
| de-DE | Listening... | | 218.6579 |
| de-DE | SpeechStartDetected | | 5905.3204 |
| de-DE | Recognizing... | schalte | 6120.3961 |
| de-DE | Recognizing... | schalte das licht | 6123.7986 |
| de-DE | Recognizing... | schalte das licht an | 6125.3948 |
| de-DE | Recognized | Schalte das Licht an. | 6363.5486 |
| de-DE | Translated | Turn on the light. (de) | 8253.2676 |
| de-DE | Recognizing... | turn on | 9549.6543 |
| de-DE | Recognizing... | turn on the light | 10156.2045 |
| de-DE | Recognized | Turn on the light. | 10979.5403 |
| de-DE | Translated | Turn on the light. (en) | 11460.2578 |

| Lang | State | Result | Delay (ms) |
| --- | --- | --- | --- |
| it-IT | Listening... | | 223.1901 |
| it-IT | SpeechStartDetected | | 6138.3569 |
| it-IT | Recognizing... | scelte da | 6138.7632 |
| it-IT | Recognizing... | scelte da slitta | 6339.4466 |
| it-IT | Recognizing... | scelte da slatan | 6339.6535 |
| it-IT | Recognized | Scelte da slatan? | 6542.7231 |
| it-IT | Translated | Choices from slatan? (it) | 8241.0058 |
| it-IT | Recognizing... | tu non | 9740.9549 |
| it-IT | Recognizing... | tu non l'hai | 9956.3479 |
| it-IT | Recognizing... | tu non delight | 10158.6159 |
| it-IT | Recognized | Tu non delight. | 10563.6193 |
| it-IT | Translated | You're not delighting. (fr) | 11514.1887 |

| Lang | State | Result | Delay (ms) |
| --- | --- | --- | --- |
| es-ES | Listening... | | 204.5828 |
| es-ES | SpeechStartDetected | | 5890.0355 |
| es-ES | Recognizing... | saldrá | 5890.8628 |
| es-ES | Recognizing... | sal de las | 6094.6438 |
| es-ES | Recognizing... | sal de las listas | 6094.9853 |
| es-ES | Recognizing... | saldrás listan | 6305.1977 |
| es-ES | Recognized | Saldrás. Listan. | 6329.3167 |
| es-ES | Translated | You'll get out. Listed. (es) | 8222.6616 |
| es-ES | Recognizing... | yo no | 9519.7894 |
| es-ES | Recognizing... | yo no del ait | 10328.8732 |
| es-ES | Recognized | Yo no del ait. | 10731.4245 |
| es-ES | Translated | I don't like ait. (es) | 11252.5544 |
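
A log like the one above can be collected with a small helper that stamps each recognizer event with the time elapsed since the test started. The following is a minimal Python sketch, not the demo's actual code: it assumes the Delay column is the elapsed time since the recognizers were started, and `RecognitionLog`, `LogEntry`, and `record` are hypothetical names. A lock guards the list because speech SDK callbacks may arrive on worker threads.

```python
import threading
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    lang: str        # e.g. "en-US"
    state: str       # e.g. "Recognizing..."
    result: str      # recognized or translated text, may be empty
    delay_ms: float  # elapsed time since start(), in milliseconds


class RecognitionLog:
    """Collects recognizer events with the elapsed time since start()."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock            # injectable so the log can be tested
        self._lock = threading.Lock()  # callbacks may run on several threads
        self._start = None
        self.entries = []

    def start(self):
        """Mark time zero; call this right before starting the recognizers."""
        self._start = self._clock()

    def record(self, lang, state, result=""):
        """Append one event and return the stored entry."""
        if self._start is None:
            raise RuntimeError("call start() before record()")
        delay_ms = (self._clock() - self._start) * 1000.0
        entry = LogEntry(lang, state, result, delay_ms)
        with self._lock:
            self.entries.append(entry)
        return entry

    def for_language(self, lang):
        """Return the entries of one recognizer in arrival order."""
        with self._lock:
            return [e for e in self.entries if e.lang == lang]
```

Each recognizer's SpeechStartDetected, Recognizing, and Recognized handlers would call `log.record(lang, state, text)`, and the translation callback would add the Translated row.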

Although the first sentence was spoken immediately after the "Listening..." message appeared, there is a delay of over 5 seconds (roughly 5.7 to 5.9 s in the logs above) before the SpeechStartDetected event is triggered. After that, it takes about 2 seconds (2.1 to 2.6 s) for the translated text to be returned. For the second sentence, no SpeechStartDetected event is raised, and the translation process again takes about 2 seconds (roughly 1.7 to 1.9 s from the first Recognizing event to the Translated event).
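
The figures quoted above can be checked directly against the logged timestamps. The sketch below copies the relevant values from the log and computes the gaps per language. The tuple layout and function names are illustrative assumptions, and the second-sentence gap is measured from the first Recognizing event because no SpeechStartDetected event was logged for it.

```python
# Timestamps in milliseconds, copied from the log above.
# Per language: (Listening, SpeechStartDetected, first Translated,
#                first Recognizing of sentence 2, second Translated)
TIMESTAMPS_MS = {
    "en-US": (235.1733, 5916.0466, 8471.5025, 9768.5601, 11703.9361),
    "de-DE": (218.6579, 5905.3204, 8253.2676, 9549.6543, 11460.2578),
    "it-IT": (223.1901, 6138.3569, 8241.0058, 9740.9549, 11514.1887),
    "es-ES": (204.5828, 5890.0355, 8222.6616, 9519.7894, 11252.5544),
}


def gaps_ms(listening, speech_start, translated_1, recognizing_2, translated_2):
    """Return the three gaps discussed in the text, in milliseconds."""
    return {
        "listening_to_speech_start": speech_start - listening,
        "speech_start_to_translation": translated_1 - speech_start,
        "second_sentence_to_translation": translated_2 - recognizing_2,
    }


def summarize(timestamps=TIMESTAMPS_MS):
    """Compute the gaps for every language in the table."""
    return {lang: gaps_ms(*values) for lang, values in timestamps.items()}


if __name__ == "__main__":
    for lang, gaps in summarize().items():
        # Print the gaps in seconds, rounded for readability.
        print(lang, {name: round(ms / 1000, 2) for name, ms in gaps.items()})
```

Keeping the raw millisecond values in one place makes it easy to rerun the comparison after changing the recognizer setup.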