rhasspy / wyoming-satellite

Remote voice satellite using Wyoming protocol
MIT License

Device wakes twice with custom Alfred wake word. #23

Open that1guy opened 8 months ago

that1guy commented 8 months ago

I'm observing that when I load a custom wake word (specifically "Alfred"), the wake word is triggered twice about 50% of the time.

  1. "Alfred, do something"
  2. Alfred responds and executes my command
  3. Device automatically wakes back up as if I had uttered the wake word again, then goes back to sleep, and Alfred responds a second time with the same response.

Anybody else observing this?

I'm finding the default "hey jarvis" only reproduces this issue about 1% of the time. Almost never.

I'm running openWakeWord on the Raspberry Pi. No VAD being used.

Thanks for the help.

adoreparler commented 8 months ago

I am having the same issue. I will post a video of the behavior here in a little while.

Running wyoming-openwakeword and wyoming-satellite on a Raspberry Pi Zero 2 W.

I feel like it just does not stop listening after the TTS starts, so it re-triggers itself; or, if there is no TTS, it just keeps listening. You will see this second scenario in my video.

adoreparler commented 8 months ago

https://youtu.be/-LtBLQsTdzI

adoreparler commented 8 months ago

I kind of like this "feature" (bug?) of it continuing to take commands after the initial command. But I would need it to not listen to itself when TTS is talking lol

Functionality would be:
  1. Trigger with wake word
  2. Respond to initial command
  3. Keep listening for x seconds for another command
  4. Loop 2 and 3 until x seconds have passed, or "never mind" (or the word "stop") was said

For some reason TTS stopped working when I added scripts to the different events, so in the video above there is no TTS. When there is TTS, it responds to the initial command, but then hears itself talking and responds with "I do not understand" or whatever, up to 2-3 times.

snootched commented 8 months ago

I've been playing around with this for the last little while, and I've found a few things. I'm using the computer_v2 model, which is quite good, but it doesn't like my pronunciation (compewder).

What worked for me:
  - enable debug probability temporarily in openwakeword
  - find the probability threshold that seems to work for when you are speaking (and what is showing for the false activation)
  - see how many samples you are getting (i.e. threshold) for when you speak and for the false activation

With this you can play with the threshold and trigger-level parameters. It's key to note that I had to use different values when I was trying different models. For example, I had to go up to a threshold of 2-3 on some, whereas now I'm on threshold=1 again.

adoreparler commented 8 months ago

Will try this today. For others, the flags are --debug-probability and --threshold:

    parser.add_argument(
        "--threshold",
        type=float,
        default=0.5,
        help="Wake word model threshold (0.0-1.0, default: 0.5)",
    )

snootched commented 8 months ago

Sorry, yes, I wrote them incorrectly: --threshold=0.9 --trigger-level=1. When trying some other wake words, I did have to increase trigger-level to 2 or 3, as I saw it was catching multiple activations consecutively.

So far so good. I do get some random activations still - but very minimal now.
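
For anyone trying to picture how --threshold and --trigger-level interact, here is a minimal sketch. It is not the wyoming-openwakeword code: the TriggerCounter class, the count_detections helper, and the "reset the streak on a low frame" policy are all assumptions made for illustration. The idea is only that a detection fires after trigger_level consecutive frames score at or above threshold.

```python
from dataclasses import dataclass


@dataclass
class TriggerCounter:
    """Hypothetical helper (not from wyoming-openwakeword).

    Fires once `trigger_level` consecutive frames have a
    probability at or above `threshold`.
    """

    threshold: float = 0.5
    trigger_level: int = 1
    _activations: int = 0

    def update(self, probability: float) -> bool:
        if probability >= self.threshold:
            self._activations += 1
            if self._activations >= self.trigger_level:
                self._activations = 0  # start counting again after firing
                return True
        else:
            # Assumed policy: any frame below threshold resets the streak.
            self._activations = 0
        return False


def count_detections(probabilities, threshold, trigger_level):
    """Count how many detections a sequence of frame probabilities produces."""
    counter = TriggerCounter(threshold=threshold, trigger_level=trigger_level)
    return sum(counter.update(p) for p in probabilities)


# Made-up probabilities, similar in spirit to what --debug-probability logs.
frames = [0.2, 0.95, 0.6, 0.95, 0.97, 0.96, 0.1]
print(count_detections(frames, threshold=0.9, trigger_level=1))
print(count_detections(frames, threshold=0.9, trigger_level=3))
```

With trigger level 1 every high frame counts as its own detection, which is one way a single utterance could wake the device more than once; raising the trigger level collapses a run of high frames into fewer detections.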

adoreparler commented 8 months ago

Looks like you need both the --debug and --debug-probability flags to see the probability in the logs.

pcwii commented 8 months ago

I had the same problem and changed --threshold=0.9 and --trigger-level=3 and this has helped substantially.

adoreparler commented 8 months ago

Changing those values made no difference: it still does not stop listening during/after responding.

pcwii commented 8 months ago

It wakes twice less often but it definitely still wakes twice.

vhsdream commented 8 months ago

Perhaps unlikely but what does everyone's Assist pipeline in HA look like? Do you have Openwakeword configured there as well?

jmthompson commented 8 months ago

I do have OWW in my pipeline, since I also have an ESPBox running in another room that uses it. So, I just made a copy of the pipeline except without a wake word, and it's still exhibiting the repeating behavior.

I am doing some poking around with strace and tcpdump now, trying to make sense of what's happening. Here are some snippets from the wyoming-openwakeword process during a command (I've clipped out all the epolls and other things that don't really matter here).

Audio comes in:

recvfrom(17, "{\"type\": \"audio-chunk\", \"version"..., 262144, 0, NULL, NULL) = 2066

Detection event sent:

sendto(17, "{\"type\": \"detection\", \"version\":"..., 59, 0, NULL, 0) = 59
sendto(17, "{\"name\": \"wintermute\", \"timestam"..., 51, 0, NULL, 0) = 51

Detection event sent again almost immediately, even though no additional audio was received:

sendto(17, "{\"type\": \"detection\", \"version\":"..., 59, 0, NULL, 0) = 59
sendto(17, "{\"name\": \"wintermute\", \"timestam"..., 51, 0, NULL, 0) = 51

Not sure what's up with this. If it's a bug it must be fairly recent, because the version of wyoming-openwakeword I have running in a docker container on my GPU server doesn't seem to have this issue when speaking to my ESPBox.

pcwii commented 8 months ago

I too have OWW configured on my HA device. Looking forward to a solution to this.

synesthesiam commented 8 months ago

I've addressed this in a recent commit. There is now a "refractory period" per wake word (default: 5 seconds) that will prevent the same wake word from triggering multiple times in quick succession.

You can adjust this with --wake-refractory-seconds <SECONDS>
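
As a rough illustration of the refractory-period idea described above (not the actual commit), the sketch below suppresses a detection when the same wake word was already accepted less than refractory_seconds ago. The RefractoryFilter class and its accept method are hypothetical names, and the injectable clock is only there to make the example easy to test.

```python
import time
from typing import Callable, Dict, Optional


class RefractoryFilter:
    """Hypothetical per-wake-word refractory filter (illustration only)."""

    def __init__(
        self,
        refractory_seconds: Optional[float] = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.refractory_seconds = refractory_seconds
        self._clock = clock
        # Wake word name -> time of the last accepted detection.
        self._last_accepted: Dict[str, float] = {}

    def accept(self, name: str) -> bool:
        """Return True if a detection of `name` should be passed on."""
        now = self._clock()
        last = self._last_accepted.get(name)
        if (
            self.refractory_seconds is not None
            and last is not None
            and (now - last) < self.refractory_seconds
        ):
            return False  # same wake word fired too recently, drop it
        self._last_accepted[name] = now
        return True


# Usage with a fake clock so the timing is deterministic.
fake_now = [0.0]
wake_filter = RefractoryFilter(5.0, clock=lambda: fake_now[0])
print(wake_filter.accept("wintermute"))  # first detection
fake_now[0] = 0.1
print(wake_filter.accept("wintermute"))  # duplicate 100 ms later
```

Tracking the timestamp per name means a different wake word can still trigger during another word's refractory window.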

pcwii commented 8 months ago

Just to confirm, if we do not add the --wake-refractory-seconds <SECONDS> to our run command it will default to 5 seconds? So we do not need to add this if 5 seconds is good?

adoreparler commented 8 months ago

I think it's broken even more lol. I will upload a video.

adoreparler commented 8 months ago

https://youtu.be/BhyaeAqvqC4

pcwii commented 8 months ago

Mine seems to be working much better.

KeithSBB commented 3 months ago

I set --wake-refractory-seconds 8 and it helped, unless the TTS response was over 8 seconds. I wonder if it's possible for HA to reject any STT that matches the last TTS, or perhaps there is a way to mute the mic while TTS is playing?
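
The "mute the mic while TTS is playing" idea could look roughly like the sketch below. This is purely hypothetical: MicGate, tts_started, tts_finished, and should_forward are made-up names, not part of wyoming-satellite or Home Assistant, and the short tail after playback is an assumed allowance for echo.

```python
import time
from typing import Callable


class MicGate:
    """Hypothetical gate that drops mic audio while TTS is playing."""

    def __init__(
        self,
        tail_seconds: float = 0.5,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.tail_seconds = tail_seconds  # extra mute time after TTS ends
        self._clock = clock
        self._playing = False
        self._unmute_at = 0.0

    def tts_started(self) -> None:
        self._playing = True

    def tts_finished(self) -> None:
        self._playing = False
        self._unmute_at = self._clock() + self.tail_seconds

    def should_forward(self) -> bool:
        """True if a mic chunk should be sent on to wake word detection."""
        return (not self._playing) and self._clock() >= self._unmute_at


# Usage with a fake clock: audio is dropped for the whole response,
# however long it is, plus a short tail.
fake_now = [0.0]
gate = MicGate(tail_seconds=0.5, clock=lambda: fake_now[0])
gate.tts_started()
fake_now[0] = 9.0  # a 9 second response, longer than an 8 second refractory
print(gate.should_forward())
gate.tts_finished()
```

Unlike a fixed refractory period, a gate like this follows the actual playback length, so a response longer than the configured number of seconds would not matter.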