Kitt-AI / snowboy

Future versions with model training module will be maintained through a forked version here: https://github.com/seasalt-ai/snowboy

Better rules to choose the hotword. #29

Closed Smanar closed 8 years ago

Smanar commented 8 years ago

Hi, I'm using your engine in a C++ project (for the moment it's more C) without problems, but I need a trick to reduce false positives. It may be due to my microphone, but if I record the word "momomo", I get a trigger when I say "tatata" (even with 0.5 sensitivity). I don't know how your algorithm works, but what can I do to choose a better hotword? Is it better to use a long word or a short word, more vowels or more consonants, simple word + pause + simple word, ...?

chenguoguo commented 8 years ago

Yes, it's definitely better to use long and unique words or phrases. We should probably add that to the documentation. Thanks for pointing this out. (Sent by email in reply to Smanar's message of Jul 2, 2016, 9:57 AM.)

Smanar commented 8 years ago

And what about the choice of vowels and consonants, does it have no impact? Aren't some "sounds" better than others? Is it better to use the same vowel 2 or 3 times in the word, or as many different ones as possible?

Or does only the length have an impact on the result?

chenguoguo commented 8 years ago

Things like vowel/consonant choice will affect the performance, yes, but maybe not by a lot. Your intuition should be correct to some extent, but we don't have a chart showing the performance difference between vowel/consonant combinations, so I don't feel like giving a suggestion on this yet. Also, if we suggest one "sound" over another, that may not hold if we update the model in the future :-) We do suggest that developers try different hotwords, though. And please keep giving us suggestions like this to make it a better tool, thanks!

Smanar commented 8 years ago

Lol, thanks a lot. I will continue my tests for fun, but since I don't know how your engine works, it's totally random ^^. In fact it's not really a problem, because I'm using a trick: with 2 hotwords, after the first one I have 3 to 4 s to say the second one, otherwise it resets. With that I have only 1 or 2 false positives in a complete day, but it's really frustrating when the engine mixes up words as different as "momomo" and "tatata".

But never mind, my project works fine with it. Snowboy is really a good engine, so powerful and so easy to set up.
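The two-hotword trick described above can be sketched as a small state machine. This is a minimal illustration, not code from the thread or from Snowboy: the class name `TwoStageGate`, the `on_detect` method, and the 3-second default window are all hypothetical, and it assumes the detector reports which of the two hotwords fired as an index (1 or 2).

```python
import time
from typing import Callable, Optional


class TwoStageGate:
    """Fire only when hotword 2 follows hotword 1 within `window_s` seconds.

    Hypothetical helper: the name and interface are assumptions made for
    this sketch, not part of any Snowboy API.
    """

    def __init__(self, window_s: float = 3.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_s = window_s
        self._clock = clock  # injectable so the gate can be tested without sleeping
        self._armed_at: Optional[float] = None  # when hotword 1 was last heard

    def on_detect(self, hotword_index: int) -> bool:
        """Feed one detection (1 or 2); return True when the pair completes."""
        now = self._clock()
        if hotword_index == 1:
            self._armed_at = now  # arm (or re-arm) the gate
            return False
        if hotword_index == 2 and self._armed_at is not None:
            in_time = (now - self._armed_at) <= self.window_s
            self._armed_at = None  # reset whether or not it arrived in time
            return in_time
        return False  # hotword 2 without hotword 1, or an unknown index
```

A detection loop would call `on_detect` each time either hotword triggers and act only when it returns `True`. A lone false trigger on either hotword is then ignored.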

chenguoguo commented 8 years ago

Love your trick; by probability it will reduce false positives quite a lot :-)

We will keep improving the model. Hopefully in the future it will better handle things like "momomo". Our model is currently trained on human speech, not "sound".
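The "by probability" remark can be made a little more concrete with a rough estimate. This is a sketch under assumptions that are not stated in the thread: false triggers of the two hotwords are taken to be independent Poisson processes with rates $\lambda_1$ and $\lambda_2$, and $T$ is the time window allowed between the first and the second hotword.

```latex
% Assumption: independent Poisson false-trigger rates \lambda_1, \lambda_2; window T.
% A false pair needs a false trigger of hotword 1, followed by at least one
% false trigger of hotword 2 within the next T seconds.
\[
  P(\text{false trigger of hotword 2 within } T) = 1 - e^{-\lambda_2 T}
  \approx \lambda_2 T \quad (\lambda_2 T \ll 1),
\]
\[
  \lambda_{\text{pair}} \approx \lambda_1 \left(1 - e^{-\lambda_2 T}\right)
  \approx \lambda_1 \lambda_2 T .
\]
```

Since $\lambda_2 T$ is small for a window of a few seconds, the paired rate is a small fraction of either single-hotword rate. If the two false triggers are correlated, for example because the same background noise sets off both models, the real rate would be higher than this estimate.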

chenguoguo commented 8 years ago

I'll add some suggestions on how to choose a hotword. Closing this issue.