Hmmm. That is odd. What is the miss rate, roughly?
I don't know the miss rate, or even how it's calculated. :)
I was just trying to think of sounds and words that sound like "hey janet" but contain other words: "hey jan vant", "hey jan bert", "hey john et". These trigger false positives even though they have very different vowel sounds, and I made sure the B, V, and H sounds were very pronounced so they wouldn't sound like "janet".
"hey vanet" and phrases like it do not trigger false positives.
I will leave it running in the TV room with the kids to see how often it fires, as I know they won't say things like "hey janet" in there.
I would say it falsely fires about 3 times every hour with just background noise and talking, even with the sensitivity set to 0.1.
Thanks. Did you have a chance to measure the miss rate? I'll take whatever information you can provide and use it to improve the next version.
It's slow, so I cut it down to only 61,000 example files and stopped the run early, but here it is, testing just for "JANET":
```
INFO:loaded background speech dataset with 61444 examples
INFO:loaded keyword dataset with 10 examples
INFO:PocketSphinx (1e-10):
INFO:false alarms per hour: 29.683759 (2306 / 77.685579)
INFO:miss detection rate: 0.000000 (0 / 10)
INFO:[Pocketsphinx][1e-10] took 180.30022630213332 minutes to finish
INFO:Porcupine (0.0):
INFO:false alarms per hour: 0.038617 (3 / 77.685579)
INFO:miss detection rate: 0.100000 (1 / 10)
INFO:[Porcupine][0.0] took 194.5665965698333 minutes to finish
INFO:PocketSphinx (2.1544346900318867e-07):
INFO:false alarms per hour: 14.494325 (1126 / 77.685579)
INFO:miss detection rate: 0.000000 (0 / 10)
INFO:[Pocketsphinx][2.1544346900318867e-07] took 178.7614756802167 minutes to finish
INFO:Porcupine (0.1111111111111111):
INFO:false alarms per hour: 0.064362 (5 / 77.685579)
INFO:miss detection rate: 0.100000 (1 / 10)
INFO:[Porcupine][0.1111111111111111] took 198.77709024325 minutes to finish
INFO:PocketSphinx (0.0004641588833612782):
INFO:false alarms per hour: 7.092693 (551 / 77.685579)
INFO:miss detection rate: 0.000000 (0 / 10)
INFO:[Pocketsphinx][0.0004641588833612782] took 169.37341605933332 minutes to finish
INFO:Porcupine (0.2222222222222222):
INFO:false alarms per hour: 0.090107 (7 / 77.685579)
INFO:miss detection rate: 0.100000 (1 / 10)
INFO:[Porcupine][0.2222222222222222] took 196.99930387758334 minutes to finish
```
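(For reference, the two metrics in this log are simple ratios: miss rate is missed keyword utterances over total keyword examples, and false alarms per hour is false triggers over hours of keyword-free background audio. A minimal sketch of that arithmetic, not the benchmark repo's actual code:)

```python
# Sketch of the two metrics reported in the log above.
# Not the wake-word-benchmark repo's code, just the arithmetic it reports.

def miss_rate(num_missed: int, num_keyword_examples: int) -> float:
    """Fraction of keyword utterances the engine failed to detect."""
    return num_missed / num_keyword_examples

def false_alarms_per_hour(num_false_alarms: int, background_hours: float) -> float:
    """False triggers normalized by hours of keyword-free background audio."""
    return num_false_alarms / background_hours

# Reproducing the Porcupine (0.0) line from the log:
print(false_alarms_per_hour(3, 77.685579))  # ~0.038617
print(miss_rate(1, 10))                     # 0.1
```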
Testing just Porcupine with the same 10 example WAV files I created containing the word "JANET".
I ran this with and without --add_noise, and both runs had exactly the same results.
This run excluded the WAV files that don't contain the word, so the false_alarm_per_hour column is meaningless here.
| sensitivity | false_alarm_per_hour | miss_rate |
|---|---|---|
| 0 | 0 | 0.1 |
| 0.11111111 | 0 | 0.1 |
| 0.22222222 | 0 | 0.1 |
| 0.33333333 | 0 | 0.1 |
| 0.44444444 | 0 | 0.1 |
| 0.55555556 | 0 | 0.1 |
| 0.66666667 | 0 | 0.1 |
| 0.77777778 | 0 | 0.1 |
| 0.88888889 | 0 | 0.1 |
| 1 | 0 | 0.1 |
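(Side note: those sensitivities are ten evenly spaced points from 0 to 1. For anyone who wants to reproduce this kind of sweep over their own keyword-only WAV files, here is a rough sketch using the pvporcupine Python binding. The model and WAV paths are placeholders, and newer pvporcupine releases also require an `access_key` argument to `create()`:)

```python
# Rough sketch of a sensitivity sweep over a set of test WAVs that all
# contain the keyword. Assumes the pvporcupine Python binding; newer
# releases also require an `access_key` argument. Paths are placeholders.
import glob
import struct
import wave

import pvporcupine

KEYWORD_PATH = "hey_janet_mac.ppn"         # hypothetical path to the custom model
TEST_WAVS = glob.glob("test_waves/*.wav")  # 16 kHz, 16-bit, mono recordings

def detected(porcupine, wav_path):
    """Return True if the keyword fires anywhere in the file."""
    with wave.open(wav_path, "rb") as f:
        while True:
            frame = f.readframes(porcupine.frame_length)
            if len(frame) < 2 * porcupine.frame_length:  # 2 bytes per sample
                return False
            pcm = struct.unpack_from("h" * porcupine.frame_length, frame)
            if porcupine.process(pcm) >= 0:
                return True

print("sensitivity | miss_rate")
for i in range(10):
    sensitivity = i / 9  # 0.0, 0.11111111, ..., 1.0 as in the table above
    porcupine = pvporcupine.create(
        keyword_paths=[KEYWORD_PATH],
        sensitivities=[sensitivity],
    )
    misses = sum(0 if detected(porcupine, w) else 1 for w in TEST_WAVS)
    print(f"{sensitivity:.8f} | {misses / len(TEST_WAVS):.1f}")
    porcupine.delete()
```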
I seem to be having the same experience on Linux x86_64 (multiple false alarms with mostly background noise). Let me know if you want the audio data; I can provide it (although I'm not sure how long the audio needs to be for detection).
@oziee this is really good information. I believe I have identified the problem. Let me work on the fix and I will get back to you in a few days' time. Thanks!
@aqiank could you please just provide the information @oziee did? I will look into that as well.
@oziee can you please try this file with your experiment? You need to uncompress it first!
I ran the experiment on Ubuntu. Do you have a Linux one?
I can have a play with it on Mac.
@kenarsa I hacked the wake word benchmark to run on the Mac using that file and the same test WAV files used before under Ubuntu. Here is the output:
EDITED: I removed the output, as I realized my tests used just "Janet", so the miss rate was high because the test WAVs contained only "Janet" and not "Hey Janet".
However, with the new "Hey Janet" file you sent me, using the initial testing phrases, it still picks up oddly sounding names like the ones listed in the first post.
Ok. That makes sense now. What's the sensitivity you are using with the new model?
0.5
OK. Let's reduce the sensitivity. If you have time/interest, I suggest reducing it by 0.1 iteratively until you get to a point where you are happy with it. Otherwise just try 0!
OK, good, that sensitivity is working now. :)
My bad, I just checked: it's set to `sensitivities = [0.00001]`, so "jan Bert" etc. are still being detected with essentially zero sensitivity.
I tested the same phrases on the Snowboy test webpage and it also detected the incorrect-sounding words, but I could not change the sensitivity there. I will run the benchmark again with all 3 engines and see how it goes.
I am creating a few different sets of sound samples to test with:
- Set 1: all files contain "hey janet"; the miss rate at every sensitivity from 0 to 1 was 0.
- Set 2: 2 files contain "hey janet" and 8 contain just "janet"; sensitivity 0 missed 8, sensitivity 1 missed 4.
Looking good so far. Setting it to 0 might be the best place to start. :)
So I have had it running all day with the TV on in the background, and it's firing off 4-5 times an hour.
The sensitivity is set to 0.
That seems odd, as you mentioned that the old model was firing 3 times per hour at a higher sensitivity. In order to help, I need to reproduce this. I will get back to you in a few days' time.
I am adding some buffer code to collect the audio whenever the wake word triggers, just to see what it's hearing.
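(In case it helps anyone else, something along these lines: keep a rolling buffer of the most recent audio and dump it to a WAV file whenever Porcupine fires. A rough sketch assuming the pvporcupine and PyAudio packages; the keyword path is a placeholder, and newer pvporcupine releases also need an `access_key`:)

```python
# Rolling-buffer capture: save the last few seconds of audio whenever the
# wake word fires, to inspect what actually triggered it.
# Sketch only: assumes pvporcupine + PyAudio; newer pvporcupine releases
# also require an `access_key` argument. The model path is a placeholder.
import collections
import struct
import time
import wave

import pvporcupine
import pyaudio

porcupine = pvporcupine.create(
    keyword_paths=["hey_janet.ppn"],  # hypothetical model path
    sensitivities=[0.0],
)

BUFFER_SECONDS = 3
max_chunks = int(BUFFER_SECONDS * porcupine.sample_rate / porcupine.frame_length)
ring = collections.deque(maxlen=max_chunks)  # holds raw PCM byte chunks

pa = pyaudio.PyAudio()
stream = pa.open(
    rate=porcupine.sample_rate,
    channels=1,
    format=pyaudio.paInt16,
    input=True,
    frames_per_buffer=porcupine.frame_length,
)

try:
    while True:
        data = stream.read(porcupine.frame_length, exception_on_overflow=False)
        ring.append(data)
        pcm = struct.unpack_from("h" * porcupine.frame_length, data)
        if porcupine.process(pcm) >= 0:
            # Dump the last few seconds so we can hear what fired it.
            path = f"trigger_{int(time.time())}.wav"
            with wave.open(path, "wb") as f:
                f.setnchannels(1)
                f.setsampwidth(2)
                f.setframerate(porcupine.sample_rate)
                f.writeframes(b"".join(ring))
            print(f"triggered, saved {path}")
finally:
    stream.close()
    pa.terminate()
    porcupine.delete()
```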
I have been running it for 48 hours now with not one false alarm. I wonder if I restarted it and then copied the file, instead of copying the file and then restarting; I'd bet it was something I did wrong. But set to 0 it has been working a treat, with no problems so far.
@oziee regarding your last test: is it with the patched model (hey janet_mac.ppn.tar.gz)?
@abramovi correct, the patched model.
We've been having the same issue. Even at 0 sensitivity, we get a fair amount of false triggers when just casually speaking. This is on Linux.
You need to wait for the v1.5 release. If you have a commercial application, reach out via email and we can provide you an updated model faster.
Thank you so much. Sending an email to your sales team now.
So I created a PPN file for Mac x86_64 for "hey janet".
Even with the sensitivities set to [0.1], it picks up odd words when I say things like "hey jant" or "hey jan Bert" (and I really make sure the Bert sound comes out strongly).
I even tried setting the sensitivity to [0.00001] and it still fires on odd sounds.
I would have thought a sensitivity closer to 1 would give lots of false positives, but even at 0.001 it false-positives on close-sounding words. "jan BERT" really has a B sound, which I would have expected a low sensitivity to reject.