Kitt-AI / snowboy

Future versions with model training module will be maintained through a forked version here: https://github.com/seasalt-ai/snowboy

can not distinguish between xiao(小) and shou(瘦) #235

Open wensheng opened 7 years ago

wensheng commented 7 years ago

I learned about snowboy from Baidu Create 2017 (congratulations on the acquisition, btw).

So I created a hotword, xiao zhou zhou, installed snowboy successfully on a Raspberry Pi, and started testing.

I copied the code verbatim from the hotword page to test.py:

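import snowboydecoder
def detected_callback():
    # Called by snowboy each time the hotword is detected.
    print("hotword detected")
# Load the personal model trained for "xiao zhou zhou".
detector = snowboydecoder.HotwordDetector("xiao_zhou_zhou.pmdl", sensitivity=0.5, audio_gain=1)
# Blocks and listens on the microphone, invoking the callback on each detection.
detector.start(detected_callback)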

I ran it, and whenever I said xiao zhou zhou (小周周), it printed hotword detected. So far so good.

So I began testing other similar phrases: "da zhou zhou", "pang zhou zhou", and "shou zhou zhou" (in Chinese, 大周周, 胖周周, 瘦周周). Whenever I said "shou zhou zhou", it also printed hotword detected.

xiǎo is pretty different from shòu, so I wonder how snowboy fails to tell them apart.

Another thing I found is that snowboy has no knowledge of tones at all. I tried combinations of the 4 tones on xiao zhou zhou (for example 笑皱皱, 肖肘肘, etc.), and they were all detected as the hotword. I guess the model was trained only on a non-tonal language like English.
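To make the tone point concrete, here is a minimal sketch of what "ignoring tones" means at the pinyin level. It says nothing about how snowboy works internally; the function name `strip_tones` and the pinyin strings are only illustrative. Once the four tone marks are removed, the tonal variants collapse to the same syllables as the hotword, while shou stays distinct from xiao.

```python
import unicodedata

# Combining marks used for the four Mandarin tones in pinyin:
# macron (1st), acute (2nd), caron (3rd), grave (4th).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin):
    """Return pinyin with tone marks removed (illustrative helper)."""
    # Decompose accented vowels into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose anything that is left (for example the dots on "ü").
    return unicodedata.normalize("NFC", kept)


hotword = "xiǎo zhōu zhōu"      # 小周周
variants = [
    "xiào zhòu zhòu",           # 笑皱皱
    "xiāo zhǒu zhǒu",           # 肖肘肘
]

for variant in variants:
    # Every tonal variant matches the hotword once tones are dropped.
    print(variant, strip_tones(variant) == strip_tones(hotword))

# "shòu" differs from "xiǎo" in its consonant and vowels, not only in tone.
print(strip_tones("shòu"), strip_tones("xiǎo"))
```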

I'm not sure whether it's a bug or a feature; you might say it's a feature. But for Chinese, ignoring tones will produce false positives very easily. Since you're part of Baidu now, it might be a good idea to train the model on Chinese, or to create a separate model for Chinese.

Since 3 of the 4 co-founders are Chinese, you can verify all of this easily. I'm looking forward to a fix or improvement.

chenguoguo commented 7 years ago

Thanks for testing this out.

The model we provide on the website is mainly for prototyping. For commercial use, we would recommend using a universal model.

Regarding tone, you are correct: we didn't model it, mostly because we started from English. Whether or not to enforce tone is also a design decision. Most of the time people say the wake word without paying much attention to it, so enforcing tone features would reduce the detection rate.

With that said, you can also play with the sensitivities.
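As a rough sketch of what playing with the sensitivities could look like, the snippet below builds a list of values to try and starts the detector with one of them. The helper names `candidate_sensitivities` and `run_detector` and the default range are assumptions for illustration only; the `HotwordDetector` call mirrors the one in the original post. A lower sensitivity should make the detector harder to trigger, which may reduce false positives such as "shou zhou zhou" at the cost of missing some real hotwords.

```python
def candidate_sensitivities(low=0.3, high=0.6, step=0.05):
    """Return sensitivity values from low to high inclusive (hypothetical helper)."""
    count = int(round((high - low) / step)) + 1
    return [round(low + i * step, 2) for i in range(count)]


def run_detector(model_path, sensitivity):
    """Listen with the given sensitivity until interrupted (hypothetical helper)."""
    # Imported here so the helper above can be used without snowboy installed.
    import snowboydecoder

    def detected_callback():
        print("hotword detected at sensitivity %.2f" % sensitivity)

    detector = snowboydecoder.HotwordDetector(
        model_path, sensitivity=sensitivity, audio_gain=1)
    try:
        # Blocks while listening on the microphone.
        detector.start(detected_callback)
    finally:
        # Release audio resources when listening stops.
        detector.terminate()


# Values to try one at a time, from least to most sensitive.
print(candidate_sensitivities())
```

For each value, one would call `run_detector("xiao_zhou_zhou.pmdl", value)`, say both the hotword and the similar phrases, and note which ones trigger, then keep the highest sensitivity that still rejects the similar phrases.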

78226415 commented 7 years ago

[https://github.com/Kitt-AI/snowboy/issues/251] It can't even distinguish "turn on" from "turn off" in English, let alone Chinese, haha