Very good recall and accuracy but does not recognize specific voices

dilerbatu commented 4 months ago

Hey everyone, I have a model with 0.90 accuracy and 0.81 recall, which is quite good in my opinion. It also does not fail in the field. The issue with this model is that it gives a very low probability for certain voices. My keyword is "Hey Py Za". The voices it fails to recognize are men and Indian speakers. Any advice?

I used 50k data samples, 700k steps, and a negative weight of 3000.

Thanks.

dscripka commented 2 months ago

In cases like this, the cause is almost always that the synthetic training data is not similar enough to the target voices. While the TTS model used to generate the training data (Piper) should produce a wide range of different voices, it was trained on the LibriTTS dataset, so it may have relatively low representation of different accents (including Indian speakers).
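To check whether the gap really follows speaker groups, it can help to measure recall per group instead of overall. The snippet below is a minimal sketch, not part of openWakeWord: `recall_by_group`, the group labels, and the 0.5 threshold are all hypothetical, and it assumes you have already scored a set of positive clips with your model and labeled each clip with a speaker group.

```python
from collections import defaultdict


def recall_by_group(scored_clips, threshold=0.5):
    """Return recall per speaker group.

    scored_clips: iterable of (group_label, score) pairs, one per clip that
    DOES contain the wake word. The labels and the threshold are assumptions
    chosen for illustration.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, score in scored_clips:
        totals[group] += 1
        if score >= threshold:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Made-up scores, only to show the call shape.
example = [
    ("group_a", 0.93),
    ("group_a", 0.88),
    ("group_b", 0.12),
    ("group_b", 0.71),
]
print(recall_by_group(example))
```

A large difference between groups at the same threshold would point to a training data coverage problem rather than a general model quality problem.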

It is difficult to fix this issue without adding training data that is more similar to the speakers you expect in deployment. If you have real audio samples, or another TTS model that can more effectively produce other languages/accents, you can add these to the training data and you should see improved performance.
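As a rough sketch of the "add real audio samples" step, the snippet below copies recorded clips into a folder of positive examples after checking their format. It uses only the Python standard library. The function names, the directory layout, and the `real_` filename prefix are hypothetical, and it assumes your training setup reads positive clips as 16 kHz, mono, 16-bit PCM WAV files from a single directory; adjust it to however your own pipeline is organized.

```python
import shutil
import wave
from pathlib import Path


def is_16k_mono_pcm16(path):
    """Return True if the file is a readable 16 kHz, mono, 16-bit PCM WAV."""
    try:
        with wave.open(str(path), "rb") as wav:
            return (
                wav.getframerate() == 16000
                and wav.getnchannels() == 1
                and wav.getsampwidth() == 2
            )
    except (wave.Error, EOFError):
        return False


def add_real_clips(source_dir, positive_dir):
    """Copy conforming clips from source_dir into positive_dir.

    Both directories are hypothetical placeholders. Returns two lists of
    file names: (copied, skipped).
    """
    source_dir, positive_dir = Path(source_dir), Path(positive_dir)
    positive_dir.mkdir(parents=True, exist_ok=True)
    copied, skipped = [], []
    for clip in sorted(source_dir.glob("*.wav")):
        if is_16k_mono_pcm16(clip):
            # Prefix the name so real recordings stay distinguishable
            # from synthetic clips in the same folder.
            shutil.copy2(clip, positive_dir / f"real_{clip.name}")
            copied.append(clip.name)
        else:
            skipped.append(clip.name)
    return copied, skipped
```

Skipped files are reported rather than converted, so you can resample them with a tool of your choice before adding them.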

dilerbatu commented 2 months ago

Thanks for the answer!

dscripka / openWakeWord

Very good recall and accuracy but does not recognize specific voices #190