esphome / home-assistant-voice-pe

Home Assistant Voice PE

Enable out-of-the-box compatibility with third-party voice-kits like the FutureProofHomes Satellite1 PCB Dev Kit and Seeed Studio’s new ReSpeaker Lite Voice Assistant Kit? #44

Closed Hedda closed 1 month ago

Hedda commented 1 month ago

Maybe this is the wrong place to ask, but I am wondering if there are plans to work some template framework into ESPHome components such as voice_assistant and i2s_audio, to allow out-of-the-box compatibility with third-party voice-kits that are also based on ESPHome and use similar hardware (the XMOS DSP chip and ESP32-S3 combination) but perhaps use different pins or another SKU/model of XMOS xCORE DSP chip? Perhaps also reusable firmware builds for the most common XMOS xCORE DSP chips?

I know this is all still very early in development and Nabu Casa has not even formally announced its upcoming voice-kit hardware platform yet, but some third-party ESP32-based voice-kits are already available or announced, for example the newly released "ReSpeaker Lite Voice Assistant Kit" by @Seeed-Studio and the upcoming "Satellite1 PCB Dev Kit" open-source hardware product from @FutureProofHomes (which I understand is the project that @gnumpi and @ben-gineering also joined), and I am sure that we will see a few more related projects with hardware specifications before the end of this year. See:

I spotted @nielsnl68 and @nanosonde writing that they are using XMOS’s XK-VOICE-L71 (xCORE Voice Reference Design Evaluation Kit) and XFV3610-IN respectively, which, though very expensive, might still be interesting as reference hardware for development if the same codebase could be used for custom firmware builds with different XMOS DSP chips:

Then there is maybe the future consideration of other projects that might not use XMOS xCORE DSPs but other hardware-accelerated DSP solutions, such as this project by @alextrical where he is instead using the ZL38063, though I understand if that would be out of scope at this point in time.

PS: Off-topic, but I just wanted to add that personally I am hoping to also see one or more variants of updated ready-made clones of the "Onju Voice" open-source hardware PCB design project in the future, with XMOS xCORE DSP chips, that are drop-in replacements for the various models of Google Nest speakers and Google Home displays or Amazon Echo / Show voice assistant products and work out-of-the-box, so that users can reuse the enclosure and speakers of existing products that might soon be outdated or irrelevant once we start fully moving over to Home Assistant's Voice Assistants ecosystem.

tbrasser commented 1 month ago

Quick reaction/question (on that last part), what is the current onju lacking with respect to xmos based i2s dacs? (that voice-kit relies on?)

Hedda commented 1 month ago

Quick reaction/question (on that last part), what is the current onju lacking with respect to xmos based i2s dacs?

The current onju-voice PCB design does not feature an XMOS xCORE AI (DSP chip); instead it only has an ESP32-S3. See feature request discussion here:

kahrendt commented 1 month ago

We are moving fast and breaking things as we figure out the best way for these components to interact, but we will add all the code to the base ESPHome project once things are stable and working well. One goal for this repo is to not be reliant on specific hardware configurations. These other boards you linked are quite exciting, and I believe they should be compatible (or relatively easily made compatible) with our changes. For example, all the I2S settings and pins are still configurable in yaml, so it should be straightforward to add support for similar boards.

Very little of the code is reliant specifically on the XMOS chip (and the few lines that are should be adaptable or won't even be there in the final version as we clean up the code), so it should be possible to add support for other DSPs in the future.
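As a rough illustration of the point above that the I2S settings and pins are configurable in YAML, a board-specific profile might be sketched like this. The option names are those of the standard ESPHome `i2s_audio` component and `substitutions` feature; the IDs and every GPIO number are hypothetical placeholders and do not come from this repo or from any of the kits mentioned in this thread.

```yaml
# Hypothetical board profile: every GPIO number below is a placeholder,
# not taken from any real kit's schematic.
substitutions:
  i2s_lrclk_pin: GPIO7
  i2s_bclk_pin: GPIO8
  mic_din_pin: GPIO44

# Shared I2S bus, wired to whichever pins the board routes to its DSP/codec.
i2s_audio:
  - id: i2s_bus
    i2s_lrclk_pin: ${i2s_lrclk_pin}
    i2s_bclk_pin: ${i2s_bclk_pin}

# Microphone input read from the I2S bus defined above.
microphone:
  - platform: i2s_audio
    id: board_mic
    i2s_audio_id: i2s_bus
    i2s_din_pin: ${mic_din_pin}
    adc_type: external  # audio arrives over I2S, not via the ESP32's internal ADC
    pdm: false
```

Under these assumptions, porting to a similar board would mostly mean changing the three substitution values, while the component sections stay the same.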

Hedda commented 1 month ago

it should be possible to add support for other DSPs in the future.

This is great! Yeah, I think it could be a very cool idea to use the RP2040 or the new RP2350 MCU chip (Raspberry Pi Pico 2) and run open-source software audio DSP instead of the proprietary XMOS xCORE MCU chip.

That is, combine an ESP32 with an RP2xxx chip on the same development board, as I believe that combination would make for an interesting development kit that could attract developers from other projects as well.

UPDATE: FYI, I just read that the new RP2350's ARM Cortex-M33 cores have integrated DSP and FPU hardware acceleration pipelines, so the new RP2350 would definitely be a better choice than the older RP2040.

https://www.raspberrypi.com/products/rp2350/

There are a few projects that use the RP2040 / RPi Pico to create different DSP solutions. However, I could not find any projects that have voice-specific software for audio DSPs, but maybe someone else knows of one made for Arduino?

Other audio DSP projects for RP2040 / RPi Pico (not specifically for voice though):

https://github.com/playduck/pico-dsp

https://github.com/tooyipjee/DS-Pi

https://github.com/DatanoiseTV/PicoADK-Hardware

https://people.ece.cornell.edu/land/courses/ece4760/RP2040/C_SDK_DSP/index_vga_dsp.html

There are even projects that make use of the PIO (Programmable Input Output) of the RP2040 (and Raspberry Pi Pico) as a GPU for graphics output to a display. I guess one could maybe use the PIO as an ADC and DAC for external audio input or output?

PS: I would think that having a reference hardware platform for sale as a dev-kit that is attractive to other projects as well would be beneficial for achieving better economies of scale, which should keep the price down, which in turn should attract even more developers, especially those who like the concept of keeping everything 100% open-source.

Hedda commented 1 month ago

I think it could be a very cool idea to use the RP2040 or the new RP2350 MCU chip (Raspberry Pi Pico 2) and run open-source software audio DSP instead of the proprietary XMOS xCORE MCU chip.

That is, combine an ESP32 with an RP2xxx chip on the same development board, as I believe that combination would make for an interesting development kit that could attract developers from other projects as well.

UPDATE: FYI, I just read that the new RP2350's ARM Cortex-M33 cores have integrated DSP and FPU hardware acceleration pipelines, so the new RP2350 would definitely be a better choice than the older RP2040.

@kahrendt Awesome timing! FYI, the Raspberry Pi Foundation themselves posted, just a few days ago, this very extensive blog article about real-time ML audio noise suppression on Raspberry Pi Pico 2:

https://www.raspberrypi.com/news/real-time-ml-audio-noise-suppression-on-raspberry-pi-pico-2/

That long-format technical blog article delves into how an existing ML-based audio noise suppression algorithm can be deployed to Raspberry Pi’s RP2350 microcontroller used in the new Pico 2 board.

nielsnl68 commented 1 month ago

I spotted @nielsnl68 and @nanosonde writing that they are using XMOS’s XK-VOICE-L71 (xCORE Voice Reference Design Evaluation Kit) and XFV3610-IN

You spotted that wrong 😆 I'm using the M5Atom Echo for my Combadge project and the Raspaudio protokits for my in-house audio services :D