Willow doesn't recognize my wife's voice

dslugPX commented 1 year ago

Filing this (1 of 3) from our conversation on reddit.

My wife has a relatively high voice, and it's higher when she's "trying" as she sometimes is when getting used to Willow.

Her voice is often not recognized at all until she actively works to use a much lower register when she speaks (Not Elizabeth Holmes low... but you get the idea!).

Happy to provide whatever additional details would make sense to help. And she's up for helping test too, so we can do some things specifically to help test if you provide some instructions.

kristiankielhofner commented 1 year ago

Welcome to Willow GH!

We are working through exposing virtually all of the configuration options we have for wake word detection and speech recognition in the Willow Configuration section of config. We'll reference this issue in those commits (likely later today) so you can experiment with them in testing with her.

One question - is wake the only issue? That is to say, can she speak in her natural (however high and excited) voice once she successfully activates wake and have accurate recognition of the command?

Something to try right now - we support multiple wake words. Have you tried an alternative such as Alexa? Pitch, tone, and pronunciation (among others) are fundamental aspects of wake words and we want to try to distinguish between them in working through this issue.

dslugPX commented 1 year ago

I haven't tried Alexa yet cause.... yuck! :)

But I will reflash one of them and give it a try for you for sure! We got ourselves a long weekend here and will have some time to mess about.


kristiankielhofner commented 1 year ago

We don't love it either but it has multiple purposes:

1) In cases where people are replacing Alexa they have been conditioned to say "Alexa ...." without thinking about it at this point. This allows them to have a smoother onboarding experience with Willow.

2) It's good for testing pronunciation and other issues like this one.

3) Some people actually like it.

kristiankielhofner commented 1 year ago

@dslugPX We have more config options in tree you can try but I just thought of something else - can you record speech samples of your wife's voice when Willow fails to wake? We do this in our testing and I'm embarrassed it took me this long to realize it would be helpful here too!

Ideally they would be high quality (two channel, 48 kHz). You can record several sessions of her speech when Willow fails to wake with your phone (or similar). When you play them back near Willow, Willow should again fail to wake.

Then, if you could also record a few examples with her higher register when Willow does wake and provide all of them we can do all of the debugging ourselves!
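For anyone preparing samples like these, here is a minimal sketch of how a recording could be checked against the suggested format (two channel, 48 kHz) before sharing it. It assumes the recording has been exported as an uncompressed PCM WAV file, which many phone recorders do not produce by default. The function names `describe_wav` and `meets_target` are hypothetical helpers for illustration, not part of Willow.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_s) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        frames = wav.getnframes()
    return channels, rate, frames / rate


def meets_target(path, channels=2, rate=48000):
    """Check a recording against the suggested format.

    The defaults mirror the "two channel, 48 kHz" suggestion in this thread;
    they are not a documented Willow requirement.
    """
    found_channels, found_rate, _ = describe_wav(path)
    return found_channels == channels and found_rate == rate
```

Calling `meets_target("failed_wake_01.wav")` (a hypothetical file name) returns `True` only when the file is stereo at 48 kHz. A recording that fails the check may still be useful, so this is a sanity check rather than a gate.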

dslugPX commented 1 year ago

@kristiankielhofner - Yes! We can do that. I've also compiled some more data for you around speech recognition. I have a bunch of aliases, and we have been trying to find ones that work better than others.

Some things that are definitely true: S sounds are really hard. "Skip Space" or "Skip Drums and Space" generally fails. "Two give me two" works almost every single time.

Anything with a natural pause in cadence is also causing issues.

Still seeing a bunch of issues with background noise too.

I owe you some files, and more data. We have been collecting it, but just haven't managed to pull it all together cohesively for you yet. But will. And soon.

kristiankielhofner commented 1 year ago

Very interesting information, thanks!

Now I'm really interested to hear a few recordings. We've not seen or heard any specific issues with "s sounds" or anything else... The background noise is going to be very interesting as well. The only issue of concern we have there is the end of VAD timing. In the testing we've done (different environments, accents, different types of noise), the source separation and noise cancellation seem to do a very good job as-is.

The recordings are best but can you elaborate on what you mean by "natural pause in cadence"?

toverainc / willow

Willow doesn't recognize my wife's voice #111