Open mattmon opened 1 year ago
Thanks!
Given the Espressif docs and the general industry "standard" of a minimum of three syllables, the first wake word we commission will likely be "Hey Willow". We will start a Kickstarter for this, likely using the Espressif service that collects and records the samples for you. Trying to source these samples from the community would likely yield many that don't meet the requirements for environment, audio quality, microphone distance, etc., and pruning them would likely be more trouble than it is worth. We feel this approach will produce a usable wake word with the least time and effort (our most precious resources). We are waiting until sometime after our first stable release (next week) to begin this process.
People are very passionate about wake words, so we'll likely take additional finalists suggested by the community that meet the requirements and start Kickstarter campaigns for them at pass-through cost.
In the (far) future we may develop our own wake word engine and a process to source samples and generate wake words, potentially using generative AI techniques to extrapolate a small number of samples for a given word into a training dataset of reasonable quality. I've learned that many users will only be happy with custom wake words, and this would be an attempt to support them, with the caveat that it is unlikely to match the reliability of the Espressif process in terms of wake activation and false wake avoidance.
"Hey Willow" is nice! Rolls off the tongue. Thanks for the update.
I'm in. It's just not like I'm constantly checking Kickstarter or similar for updates...
Great job, devs. I'm really impressed at how well the ESP-Box works in combination with the Willow inference server. I'm probably not alone in thinking that a dedicated 'Willow' wake word would be a great feature.
Aside from money, the requirements doc from Espressif says at least 20,000 voice samples are needed, from 'more than 500 people, including men and women of all ages and at least 100 children'.
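To make those numbers concrete, here is a rough sketch of how a collected sample set could be checked against the three thresholds quoted above. Only the numbers come from the requirements doc; the `Sample` record, its fields, and the `unmet_requirements` helper are hypothetical names made up for illustration, not anything Espressif or Willow provides.

```python
from dataclasses import dataclass

# Thresholds quoted from the requirements doc.
MIN_SAMPLES = 20_000
MIN_SPEAKERS = 501   # "more than 500 people"
MIN_CHILDREN = 100   # "at least 100 children"


@dataclass(frozen=True)
class Sample:
    """Hypothetical metadata for one recorded wake word utterance."""
    speaker_id: str
    is_child: bool


def unmet_requirements(samples):
    """Return a list of human-readable shortfalls (empty if all are met)."""
    speakers = {s.speaker_id for s in samples}
    children = {s.speaker_id for s in samples if s.is_child}

    problems = []
    if len(samples) < MIN_SAMPLES:
        problems.append(f"need {MIN_SAMPLES - len(samples)} more samples")
    if len(speakers) < MIN_SPEAKERS:
        problems.append(f"need {MIN_SPEAKERS - len(speakers)} more speakers")
    if len(children) < MIN_CHILDREN:
        problems.append(f"need {MIN_CHILDREN - len(children)} more child speakers")
    return problems
```

At the bare minimum of 20,000 samples from just over 500 people, that works out to slightly under 40 recordings per speaker on average.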
Are there any plans to start collecting samples? Has anyone inquired with Espressif about the actual cost?
I know I would contribute; in both audio and monetary units!
I would really like to see this happen. I think it would boost the mass appeal of this project and really give it legs as an alternative to Google and Amazon.