synesthesiam / rhasspy

Rhasspy voice assistant for offline home automation
https://rhasspy.readthedocs.io
MIT License

Snowboy multiple wake words firing more than once #135

whentotrade closed this issue 4 years ago

whentotrade commented 4 years ago

I am using the Snowboy wake word system with multiple wake words. The multiple wake words work in Snowboy when tested with the demo script:

models = [
'/home/datapool/snowboy/resources/lars1.pmdl', 
'/home/datapool/snowboy/resources/lars2.pmdl', 
'/home/datapool/snowboy/resources/lars3.pmdl', 
'/home/datapool/snowboy/resources/lars4.pmdl', 
'/home/datapool/snowboy/resources/frauke1.pmdl', 
'/home/datapool/snowboy/resources/frauke2.pmdl', 
'/home/datapool/snowboy/resources/frauke3.pmdl' ]

However, the problem appears when I enter multiple wake words in the Rhasspy settings JSON, like so:

"snowboy": {
    "model": "frauke1.pmdl,frauke2.pmdl,frauke3.pmdl,lars1.pmdl,lars2.pmdl,lars3.pmdl,lars4.pmdl",
    "model_settings": {
        "frauke1.pmdl": {
            "apply_frontend": false,
            "audio_gain": 1,
            "sensitivity": "0.4"
        },
        "frauke2.pmdl": {
            "apply_frontend": false,
            "audio_gain": 1,
            "sensitivity": "0.4"
        },
        "frauke3.pmdl": {
            "apply_frontend": false,
            "audio_gain": 1,
            "sensitivity": "0.4"
        },
        "lars1.pmdl": {
            "apply_frontend": false,
            "audio_gain": 1,
            "sensitivity": "0.4"
        },
        "lars2.pmdl": {
            "apply_frontend": false,
            "audio_gain": 1,
            "sensitivity": "0.4"
        },
        "lars3.pmdl": {
            "apply_frontend": false,
            "audio_gain": 1,
            "sensitivity": "0.4"
        },
        "lars4.pmdl": {
            "apply_frontend": false,
            "audio_gain": 1,
            "sensitivity": "0.4"
        }
    },
    "sensitivity": "0.4,0.4,0.4,0.4,0.4,0.4,0.4"
},
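
The settings above list the models and their sensitivities as two parallel comma-separated strings, so a quick sanity check is that both lists have the same length. The following is a minimal sketch of such a check, not Rhasspy's own parsing code; the function name `pair_models_with_sensitivities` is hypothetical, and how Rhasspy reconciles the top-level `sensitivity` string with the per-model `model_settings` entries is not stated in this thread.

```python
import json


def pair_models_with_sensitivities(snowboy_cfg):
    """Pair each comma-separated model name with its sensitivity (hypothetical helper)."""
    models = [m.strip() for m in snowboy_cfg["model"].split(",") if m.strip()]
    sensitivities = [
        float(s) for s in snowboy_cfg["sensitivity"].split(",") if s.strip()
    ]
    if len(models) != len(sensitivities):
        raise ValueError(
            f"{len(models)} model(s) but {len(sensitivities)} sensitivity value(s)"
        )
    return dict(zip(models, sensitivities))


# A shortened version of the "snowboy" section shown above.
cfg = json.loads('{"model": "frauke1.pmdl,lars1.pmdl", "sensitivity": "0.4,0.4"}')
pairs = pair_models_with_sensitivities(cfg)
print(pairs)
```

A mismatch in list lengths raises a `ValueError` here, which is one way to catch a copy-and-paste slip before the wake word listener starts.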

The wake word is detected more than once. The first detection is registered correctly and intent handling follows. Afterwards, however, the wake word detection fires again once processing has finished, without any additional audio or speech. Here is a log extract (newest entries first):

[DEBUG:168328] SnowboyWakeListener: loaded -> listening
[DEBUG:168327] DialogueManager: ready -> asleep
[INFO:168326] DialogueManager: Automatically listening for wake word
[DEBUG:168321] DialogueManager: handling -> ready
[DEBUG:168310] WebSocketObserver: {"text": "im", "intent": {"name": "", "confidence": 0}, "entities": [], "raw_text": "im", "speech_confidence": 0.0027951163298217873, "wakeId": "lars1.pmdl", "siteId": "default", "slots": {}}
[DEBUG:168309] DialogueManager: recognizing -> handling
[DEBUG:168308] APlayAudioPlayer: ['aplay', '-q', '-D', 'default:CARD=ArrayUAC10', '/usr/share/rhasspy/etc/wav/beep_error.wav']
[DEBUG:168307] DialogueManager: {'text': 'im', 'intent': {'name': '', 'confidence': 0}, 'entities': [], 'raw_text': 'im', 'speech_confidence': 0.0027951163298217873, 'wakeId': 'lars1.pmdl', 'siteId': 'default'}
[ERROR:168303] FsticuffsRecognizer: in_loaded
Traceback (most recent call last):
  File "/usr/share/rhasspy/rhasspy/intent.py", line 179, in in_loaded
    assert recognitions, "No intent recognized"
AssertionError: No intent recognized
[DEBUG:168301] DialogueManager: decoding -> recognizing
[DEBUG:168298] DialogueManager: im (confidence=0.0027951163298217873)
[DEBUG:168297] PocketsphinxDecoder: im
[DEBUG:168296] PocketsphinxDecoder: Transcription confidence: 0.0027951163298217873
[DEBUG:168292] PocketsphinxDecoder: Decoded WAV in 1.3914434909820557 second(s)
[DEBUG:166894] APlayAudioPlayer: ['aplay', '-q', '-D', 'default:CARD=ArrayUAC10', '/usr/share/rhasspy/etc/wav/beep_lo.wav']
[DEBUG:166891] PocketsphinxDecoder: rate=16000, width=2, channels=1.
[DEBUG:166889] DialogueManager: awake -> decoding
[DEBUG:166883] WebrtcvadCommandListener: listening -> loaded
[DEBUG:166880] WebrtcvadCommandListener: Voice command finished
[DEBUG:165051] APlayAudioPlayer: ['aplay', '-q', '-D', 'default:CARD=ArrayUAC10', '/usr/share/rhasspy/etc/wav/beep_hi.wav']
[DEBUG:164017] APlayAudioPlayer: ['aplay', '-q', '-D', 'default:CARD=ArrayUAC10', '/usr/share/rhasspy/etc/wav/beep_error.wav']
[DEBUG:163790] WebrtcvadCommandListener: Voice command started
[DEBUG:163336] WebrtcvadCommandListener: loaded -> listening
[DEBUG:163334] SnowboyWakeListener: listening -> loaded
[DEBUG:163333] WebrtcvadCommandListener: Will timeout in 30 second(s)
[DEBUG:163330] DialogueManager: asleep -> awake
[DEBUG:163328] DialogueManager: Awake!
[DEBUG:163327] SnowboyWakeListener: Hotword(s) detected: ['lars1.pmdl']
[DEBUG:163277] SnowboyWakeListener: loaded -> listening
[DEBUG:163274] DialogueManager: ready -> asleep
[INFO:163270] DialogueManager: Automatically listening for wake word
[DEBUG:163264] DialogueManager: handling -> ready
[DEBUG:163262] WebSocketObserver: {"text": "", "intent": {"name": "", "confidence": 0}, "entities": [], "raw_text": "", "speech_confidence": 0, "wakeId": "frauke2.pmdl", "siteId": "default", "slots": {}}
[DEBUG:163260] DialogueManager: recognizing -> handling
[DEBUG:163256] DialogueManager: {'text': '', 'intent': {'name': '', 'confidence': 0}, 'entities': [], 'raw_text': '', 'speech_confidence': 0, 'wakeId': 'frauke2.pmdl', 'siteId': 'default'}
[ERROR:163247] FsticuffsRecognizer: in_loaded
Traceback (most recent call last):
  File "/usr/share/rhasspy/rhasspy/intent.py", line 179, in in_loaded
    assert recognitions, "No intent recognized"
AssertionError: No intent recognized
[DEBUG:163242] DialogueManager: decoding -> recognizing
[DEBUG:163240] DialogueManager: (confidence=0)
[DEBUG:163234] PocketsphinxDecoder:
[DEBUG:163229] PocketsphinxDecoder: Decoded WAV in 0.05382347106933594 second(s)
[DEBUG:163171] APlayAudioPlayer: ['aplay', '-q', '-D', 'default:CARD=ArrayUAC10', '/usr/share/rhasspy/etc/wav/beep_lo.wav']
[DEBUG:163168] PocketsphinxDecoder: rate=16000, width=2, channels=1.
[DEBUG:163167] DialogueManager: awake -> decoding
[DEBUG:163163] WebrtcvadCommandListener: listening -> loaded
[DEBUG:163160] WebrtcvadCommandListener: Voice command finished
[DEBUG:161810] WebrtcvadCommandListener: Voice command started
[DEBUG:161334] WebrtcvadCommandListener: loaded -> listening
[DEBUG:161330] APlayAudioPlayer: ['aplay', '-q', '-D', 'default:CARD=ArrayUAC10', '/usr/share/rhasspy/etc/wav/beep_hi.wav']
[DEBUG:161329] WebrtcvadCommandListener: Will timeout in 30 second(s)
[DEBUG:161329] SnowboyWakeListener: listening -> loaded
[WARNING:161328] DialogueManager: Unhandled message: <rhasspy.events.WakeWordDetected object at 0x70f2d710>
[DEBUG:161326] DialogueManager: asleep -> awake
[DEBUG:161324] DialogueManager: Awake!
[DEBUG:161322] SnowboyWakeListener: Hotword(s) detected: ['frauke2.pmdl', 'lars4.pmdl']
[DEBUG:161301] SnowboyWakeListener: loaded -> listening
[DEBUG:161299] DialogueManager: ready -> asleep
[INFO:161298] DialogueManager: Automatically listening for wake word
[DEBUG:161296] DialogueManager: handling -> ready
....
[DEBUG:159600] PocketsphinxDecoder: rate=16000, width=2, channels=1.
[DEBUG:159597] DialogueManager: awake -> decoding
[DEBUG:159594] WebrtcvadCommandListener: listening -> loaded
[DEBUG:159590] WebrtcvadCommandListener: Voice command finished
[DEBUG:156170] WebrtcvadCommandListener: Voice command started
[DEBUG:155687] WebrtcvadCommandListener: loaded -> listening
[DEBUG:155685] SnowboyWakeListener: listening -> loaded
[DEBUG:155685] APlayAudioPlayer: ['aplay', '-q', '-D', 'default:CARD=ArrayUAC10', '/usr/share/rhasspy/etc/wav/beep_hi.wav']
[DEBUG:155684] WebrtcvadCommandListener: Will timeout in 30 second(s)
[DEBUG:155682] DialogueManager: asleep -> awake
[DEBUG:155682] DialogueManager: Awake!
[DEBUG:155679] SnowboyWakeListener: Hotword(s) detected: ['frauke3.pmdl']
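
To see the repeated detections at a glance, the "Hotword(s) detected" entries can be pulled out of a log like the one above. This is a small sketch that assumes the line format shown in the extract; the helper name `hotword_detections` is made up for illustration, and the number after the log level is treated as an opaque, increasing counter because its unit is not stated in the log.

```python
import ast
import re

# Matches e.g. "[DEBUG:163327] SnowboyWakeListener: Hotword(s) detected: ['lars1.pmdl']"
DETECTION = re.compile(
    r"\[DEBUG:(\d+)\] SnowboyWakeListener: Hotword\(s\) detected: (\[[^\]]*\])"
)


def hotword_detections(log_text):
    """Return (counter, [model names]) for every detection, oldest first."""
    found = [
        (int(counter), ast.literal_eval(names))
        for counter, names in DETECTION.findall(log_text)
    ]
    return sorted(found)


# The three detection entries from the extract above.
sample = (
    "[DEBUG:163327] SnowboyWakeListener: Hotword(s) detected: ['lars1.pmdl'] "
    "[DEBUG:161322] SnowboyWakeListener: Hotword(s) detected: ['frauke2.pmdl', 'lars4.pmdl'] "
    "[DEBUG:155679] SnowboyWakeListener: Hotword(s) detected: ['frauke3.pmdl']"
)
detections = hotword_detections(sample)
for counter, models in detections:
    print(counter, models)
```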

You can see that the Snowboy wake listener fires three times, each time with different hotwords detected, although the hotword was spoken only ONCE. The different personal models all contain the SAME hotword, spoken by two different people in different background environments. This works very well with the Snowboy demo script, but not in Rhasspy.

Any idea?

synesthesiam commented 4 years ago

> Any idea?

Yes, I had assumed that you'd want each wake word to fire independently, but I can see why that wouldn't be desired in your case. I can fix this, so it will only fire once for a given session.
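
The "fire once for a given session" idea can be sketched as a small gate that lets the first detection through and ignores further ones until the session ends. This is only an illustration under stated assumptions, not the change that went into Rhasspy: the class `WakeSessionGate` and its method names are hypothetical, and whether the real fix also discards detections that were queued while a session was running is not shown in this thread.

```python
class WakeSessionGate:
    """Let one wake word detection open a session; ignore others until it ends.

    Hypothetical helper for illustration, not part of Rhasspy.
    """

    def __init__(self):
        self.active_wake_id = None

    def on_detected(self, model_names):
        """Return the model that opens a session, or None if the detection is ignored."""
        if self.active_wake_id is not None or not model_names:
            return None
        # Several models may match the same utterance; keep only the first one.
        self.active_wake_id = model_names[0]
        return self.active_wake_id

    def on_session_finished(self):
        """Re-arm the gate once intent handling is done."""
        self.active_wake_id = None


gate = WakeSessionGate()
first = gate.on_detected(["frauke3.pmdl"])                 # opens the session
second = gate.on_detected(["frauke2.pmdl", "lars4.pmdl"])  # ignored, session still open
gate.on_session_finished()
third = gate.on_detected(["lars1.pmdl"])                   # a new session may start
print(first, second, third)
```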

synesthesiam commented 4 years ago

Fixed in master. Will go into the next Docker image.