Picovoice / porcupine

On-device wake word detection powered by deep learning
https://picovoice.ai/
Apache License 2.0

sensitivity 0.1 - 0.9 seems to have no difference #72

Closed ghost closed 6 years ago

ghost commented 6 years ago

So I created a ppn file for Mac x86_64 for "hey janet".

Even with the sensitivity set to [0.1], it picks up odd words when I say things like "hey jant" or "hey jan Bert" (and I really make sure the Bert sound comes out strong).

I even tried setting the sensitivity to [0.00001] and it still fires on odd sounds.

I would have thought a sensitivity closer to 1 would produce lots of false positives, but even at 0.001 it false-positives on close-sounding words. "jan BERT" really has a B sound, which I would have expected a low sensitivity to reject.

kenarsa commented 6 years ago

Hmmm, that is odd. What is the miss rate, roughly?
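For context, the two numbers discussed in this thread are the standard wake-word metrics that the benchmark output below reports. A minimal sketch of how they are computed (hypothetical helper names, not part of Porcupine itself):

```python
# Hypothetical helpers illustrating the two wake-word benchmark metrics.

def miss_rate(missed: int, total_keyword_utterances: int) -> float:
    """Fraction of true keyword utterances the engine failed to detect."""
    return missed / total_keyword_utterances

def false_alarms_per_hour(false_alarms: int, audio_hours: float) -> float:
    """Spurious detections normalized by hours of background audio."""
    return false_alarms / audio_hours

print(miss_rate(1, 10))                               # 0.1
print(round(false_alarms_per_hour(3, 77.685579), 6))  # 0.038617
```

The example numbers mirror the benchmark log later in this thread (1 miss out of 10 keyword clips, 3 false alarms over ~77.7 hours of background audio).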

ghost commented 6 years ago

I don't know the miss rate, or even how it is calculated :)

I was just trying to think of sounds and words that sound like "hey janet" but with other words: "hey jan vant", "hey jan bert", "hey john et". These make it false-positive, and they have very different vowel sounds. I made sure the B, V, and H sounds were very pronounced so they would not sound like "janet".

"hey vanet" and things like this do not false-positive.

I will leave it running in the TV room with the kids to see how often it fires, as I know they won't say things like "hey janet" in there.

ghost commented 6 years ago

I would say it falsely fires about 3 times every hour with just background noise and talking, even with the sensitivity set to 0.1.

kenarsa commented 6 years ago

Thanks. Did you have a chance to measure the miss rate? I'll take whatever information you can provide and use it to improve the next version.

ghost commented 6 years ago

It's slow.. I cut it down to only 61000 "excel word" files and stopped it early, but here it is testing just for "JANET":

INFO:loaded background speech dataset with 61444 examples
INFO:loaded keyword dataset with 10 examples
INFO:PocketSphinx (1e-10):
INFO:false alarms per hour: 29.683759 (2306 / 77.685579)
INFO:miss detection rate: 0.000000 (0 / 10)
INFO:[Pocketsphinx][1e-10] took 180.30022630213332 minutes to finish
INFO:Porcupine (0.0):
INFO:false alarms per hour: 0.038617 (3 / 77.685579)
INFO:miss detection rate: 0.100000 (1 / 10)
INFO:[Porcupine][0.0] took 194.5665965698333 minutes to finish
INFO:PocketSphinx (2.1544346900318867e-07):
INFO:false alarms per hour: 14.494325 (1126 / 77.685579)
INFO:miss detection rate: 0.000000 (0 / 10)
INFO:[Pocketsphinx][2.1544346900318867e-07] took 178.7614756802167 minutes to finish
INFO:Porcupine (0.1111111111111111):
INFO:false alarms per hour: 0.064362 (5 / 77.685579)
INFO:miss detection rate: 0.100000 (1 / 10)
INFO:[Porcupine][0.1111111111111111] took 198.77709024325 minutes to finish
INFO:PocketSphinx (0.0004641588833612782):
INFO:false alarms per hour: 7.092693 (551 / 77.685579)
INFO:miss detection rate: 0.000000 (0 / 10)
INFO:[Pocketsphinx][0.0004641588833612782] took 169.37341605933332 minutes to finish
INFO:Porcupine (0.2222222222222222):
INFO:false alarms per hour: 0.090107 (7 / 77.685579)
INFO:miss detection rate: 0.100000 (1 / 10)
INFO:[Porcupine][0.2222222222222222] took 196.99930387758334 minutes to finish
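Benchmark logs like the one above are easier to compare when collapsed into one record per engine/sensitivity pair. A hedged sketch of such a parser (the regexes match the log format shown here; `parse_log` is a hypothetical helper, not part of the benchmark):

```python
import re

# Per-run header lines look like "INFO:Porcupine (0.0):".
HEADER = re.compile(r"INFO:(\w+) \(([\deE.+-]+)\):")
FA = re.compile(r"INFO:false alarms per hour: ([\d.]+)")
MISS = re.compile(r"INFO:miss detection rate: ([\d.]+)")

def parse_log(lines):
    """Group metric lines under the most recent engine/sensitivity header."""
    results, current = [], None
    for line in lines:
        if m := HEADER.match(line):
            current = {"engine": m.group(1), "sensitivity": float(m.group(2))}
            results.append(current)
        elif (m := FA.match(line)) and current:
            current["fa_per_hour"] = float(m.group(1))
        elif (m := MISS.match(line)) and current:
            current["miss_rate"] = float(m.group(1))
    return results
```

With the log above this yields one dict per run, e.g. `{"engine": "Porcupine", "sensitivity": 0.0, "fa_per_hour": 0.038617, "miss_rate": 0.1}`.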

ghost commented 6 years ago

Testing on just Porcupine with the same 10 example wave files I created with the word JANET in them.

I ran this with and without --add_noise, and both had the exact same results.

This was run without the wave files that don't contain the word, so the false_alarm_per_hour column is null and void:

sensitivity false_alarm_per_hour miss_rate
0 0 0.1
0.11111111 0 0.1
0.22222222 0 0.1
0.33333333 0 0.1
0.44444444 0 0.1
0.55555556 0 0.1
0.66666667 0 0.1
0.77777778 0 0.1
0.88888889 0 0.1
1 0 0.1
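The sensitivity column in the table above is just ten evenly spaced points across [0, 1]. A generic sketch of how such a sweep can be generated (not the benchmark's actual code):

```python
# Ten evenly spaced sensitivity values in [0, 1], matching the
# left-hand column of the table above.
sensitivities = [i / 9 for i in range(10)]
print([round(s, 8) for s in sensitivities])
# [0.0, 0.11111111, 0.22222222, ..., 0.88888889, 1.0]
```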
aqiank commented 6 years ago

I seem to be having the same experience on Linux x86_64 (multiple false alarms with mostly background noise). Let me know if you want the audio data. I can provide them (although I'm not sure how long the audio needs to be before the detection).

kenarsa commented 6 years ago

@oziee this is really good information. I believe I have identified the problem. Let me work on the fix and I will get back to you in a few days time. Thanks!

kenarsa commented 6 years ago

@aqiank could you please just provide the information @oziee did? I will look into that as well.

kenarsa commented 6 years ago

hey janet_mac.ppn.tar.gz

@oziee can you please try this file with your experiment? You need to uncompress it first!

ghost commented 6 years ago

I ran the experiment on Ubuntu.. got a Linux one?

I can have a play with it on Mac.

ghost commented 6 years ago

@kenarsa I hacked the wake word benchmark to run on the Mac using that file and the same test wave files used before under Ubuntu. Here is the output..

EDITED

So I removed the output, as I realized my tests use just "Janet"; the miss rate was high because the test waves had just "Janet" and not "Hey Janet".

However, with the new "Hey Janet" file you sent me, using the initial testing phrases, it still picks up odd-sounding names as listed in the first post.

kenarsa commented 6 years ago

Ok. That makes sense now. What's the sensitivity you are using with the new model?

ghost commented 6 years ago

0.5

kenarsa commented 6 years ago

OK. Let's reduce the sensitivity. If you have time/interest, I suggest reducing it by 0.1 iteratively until you get to a point where you are happy with it. Otherwise just try 0!
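The iterative reduction suggested here can be sketched as a simple loop (a hypothetical sketch; `measure_fa_per_hour` stands in for whatever test the user runs at each sensitivity, and the acceptance threshold is an assumption):

```python
# Step the sensitivity down by 0.1 until false alarms drop to an
# acceptable rate, bottoming out at 0.
def tune_sensitivity(measure_fa_per_hour, start=0.5, step=0.1, max_fa_per_hour=1.0):
    """measure_fa_per_hour: caller-supplied function that runs the detector
    at a given sensitivity and returns observed false alarms per hour."""
    s = start
    while s > 0 and measure_fa_per_hour(s) > max_fa_per_hour:
        s = round(s - step, 10)  # round to avoid float drift (0.30000000000000004)
    return max(s, 0.0)
```

Note the trade-off the thread keeps circling: each step down also raises the chance of missing genuine "hey janet" utterances.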

ghost commented 6 years ago

Ok.. good, the sensitivity is working now :)

ghost commented 6 years ago

My bad.. I just checked and it's set to sensitivities = [0.00001], so "jan Bert" etc. are still being detected with basically a 0 sensitivity.

I tested the same phrases on the Snowboy test webpage and it also detected the incorrect-sounding words, but I could not change the sensitivity there. I will run the benchmark again with all 3 engines and see how it goes.

I am creating a few different sets of sound samples to test with:

Set 1 - all files contain "hey janet"; the miss rate from sensitivity 0-1 was 0.
Set 2 - 2 files contain "hey janet" and 8 contain just "janet"; sensitivity 0 missed 8, sensitivity 1 missed 4.

Looking good so far.. setting to 0 might be the best place to start :)

ghost commented 6 years ago

So I have had it running all day with the TV on in the background... it's firing off 4-5 times an hour.

The sensitivity is set to 0.

kenarsa commented 6 years ago

That seems odd, as you mentioned the old model was firing 3 times an hour at a higher sensitivity. In order to help I need to reproduce this. I will get back to you in a few days' time.

ghost commented 6 years ago

I am adding some buffer code to collect the audio when the wake word triggers, just to see what it is hearing.
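The buffering idea described here can be sketched with a ring buffer of recent PCM frames that gets flattened into a clip whenever the wake word fires (all names and constants below are hypothetical illustrations, not the user's actual code; the 16 kHz / 512-sample figures match Porcupine's documented audio format):

```python
from collections import deque

SAMPLE_RATE = 16000   # Porcupine expects 16 kHz, 16-bit mono PCM
FRAME_LENGTH = 512    # samples per processed frame
SECONDS_TO_KEEP = 3
FRAMES_TO_KEEP = SAMPLE_RATE * SECONDS_TO_KEEP // FRAME_LENGTH  # 93 frames

# deque with maxlen silently drops the oldest frame once full.
ring = deque(maxlen=FRAMES_TO_KEEP)

def on_frame(pcm_frame, detected):
    """Buffer every frame; on a detection, return the buffered clip
    (the last ~3 seconds of audio) so it can be saved and listened to."""
    ring.append(pcm_frame)
    if detected:
        return [sample for frame in ring for sample in frame]
    return None
```

Saving the returned clip to a wave file on each trigger makes it easy to hear exactly which background sound caused the false alarm.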

ghost commented 6 years ago

I have been running it for 48 hrs now and not one false alarm. I wonder if I restarted it and then copied the file, instead of copying the file and then restarting.. I can bet it was something I did wrong, but set to 0 it has been working a treat with no problems so far.

abramovi commented 6 years ago

@oziee regarding your last test - is it with the patched model? (hey janet_mac.ppn.tar.gz)

ghost commented 6 years ago

@abramovi correct, the patched model.

jscottsf commented 6 years ago

We've been having the same issue. Even at 0 sensitivity, we get a fair amount of false triggers when just casually speaking. This is on Linux.

kenarsa commented 6 years ago

You need to wait for the v1.5 release. If you have a commercial application, reach out via email and we can provide you with an updated model faster.

jscottsf commented 6 years ago

> you need to wait for v1.5 release. If you have a commercial application reach out via email and we can provide you an updated model faster

Thank you so much. Sending an email to your sales team now.