esphome / feature-requests

ESPHome Feature Request Tracker
https://esphome.io/

support for i2s audio #599

Closed robertmuth closed 2 years ago

robertmuth commented 4 years ago

Add support for I2S audio devices like

https://www.adafruit.com/product/3006?gclid=EAIaIQobChMI6qbJ8vvK5wIVxZyzCh0NwwRIEAQYASABEgJjWvD_BwE

One use case would be an alarm or siren. Wave files would be stored directly on the ESP32 or on SPI flash (maybe via some sort of asset management independent of the firmware).

Playback of these wave files could be triggered just like any other action.

lwqcz commented 4 years ago

Using I2S for a siren seems rather odd. Would it not be more bullet-proof to use ready-made sirens? They are cheap and really easy to use: just use a MOSFET, relay or transistor to switch the power supply.

robertmuth commented 4 years ago

The "siren" was just one example use case. Another one would be a doorbell.

In both cases, I would prefer a nice sound defined by an audio file (or generated programmatically, say, by combining sine waves).
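
To make the "combining sine waves" idea concrete, here is a minimal host-side sketch in Python (not ESPHome code). It mixes sine waves into signed 16-bit mono PCM and wraps the result in a WAV container using only the standard library. The function names, the 16 kHz sample rate and the note frequencies are illustrative assumptions, not anything ESPHome defines.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000  # Hz; an assumed rate, pick whatever your DAC/amp supports


def synth_tones(freqs_hz, duration_s, sample_rate=SAMPLE_RATE, volume=0.8):
    """Mix equal-weight sine waves and return signed 16-bit PCM samples."""
    n = int(duration_s * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate
        mixed = sum(math.sin(2 * math.pi * f * t) for f in freqs_hz) / len(freqs_hz)
        # A linear fade-out avoids an audible click at the end of the note.
        envelope = 1.0 - i / n
        samples.append(int(mixed * envelope * volume * 32767))
    return samples


def to_wav_bytes(samples, sample_rate=SAMPLE_RATE):
    """Wrap mono 16-bit samples in a WAV container and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


# Two-note "ding-dong" (roughly E5 then C5), each note mixed with its octave.
doorbell = synth_tones([659.25, 1318.5], 0.25) + synth_tones([523.25, 1046.5], 0.5)
wav_bytes = to_wav_bytes(doorbell)
```

Writing `wav_bytes` to a file gives something that can be opened in any audio editor to check the result before it goes anywhere near a microcontroller.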

glmnet commented 4 years ago

You can easily play audio files with the DFPlayer integration (https://esphome.io/components/dfplayer.html). Sound quality is quite good, and it even features an integrated low-quality amplifier so you can connect a speaker directly, which is ideal for doorbells or simple audio feedback.

Having I2S audio on the ESP32 might be doable, but I believe there are better alternatives already available.

robertmuth commented 4 years ago

The ESP32 has an I2S interface, so why not offer a way to use it? DFPlayer is certainly an alternative; whether it is a better one I do not know. I think choice is good. Why else would ESPHome support at least half a dozen temperature sensors?

How to prioritize this feature is a different story.

lwqcz commented 4 years ago

@robertmuth ESPHome needs to support dozens of different sensors because its original purpose was to be the brain of something that reads sensors and switches relays and so on. IMHO, I2S is rather hard to support, and unfortunately it will be used by only a few. If you need a siren, then use a ready-made one. Implementing your own siren wastes your time, and it wastes @OttoWinter's (or some other guy's) time as well, since someone has to implement this complex feature.

But I may be wrong as well ...

DrFrankReade commented 4 years ago

As somebody who's done some I2S work, it's a pain in the butt, since you probably need to connect I2C to the CODEC to configure it and so on, and each CODEC has its own config specifications, so then you're into supporting individual codecs. So you need a codec, an amp, and an SD card; or, if you store the audio in the ESP's flash, you need to be able to access that data and update the firmware whenever you want to change a sound, or develop a web UI to upload sounds, etc. While that may be easier in some cases than just swapping out an SD card, once you get through all the other peripherals required to support an I2S chip, you may as well just use a Raspberry Pi Zero or the aforementioned DFPlayer. On-the-fly generation is another thing entirely.

glmnet commented 4 years ago

I just want to apologize for using the word “better”. Mostly I just wanted to point out that you can get a similar result with DFPlayer. And yes, this I2S thing seems quite complex; if there is nothing already done (another library), then it is even harder (I don’t know if there is one, since the OP did not provide a link). As Otto would say, we are not rejecting it: “if somebody sends a PR, it will be welcomed”.

robertmuth commented 4 years ago

I have not implemented code interfacing i2s myself, so I cannot fully judge the complexity of the effort.

But ...

Again, I am not saying that this is the most important feature to have but having this feature request shows that there is some interest and helps prospective implementers coordinate efforts.

DrFrankReade commented 4 years ago

Hello,

Your point about existing hardware is well taken, and it really does make one's life much easier. Also, perhaps I was unclear with my use of the word CODEC. I simply meant the chip between the ESP32 and the amplifier. In industry they're often referred to as CODECs (see the Espressif link above), but they usually do not do things like decode MP3 files. Unfortunately, we risk going down the "is a wrap a sandwich" debate.

It's also important to remember that each chip has its own unique configuration requirements, over I2C, SPI or some other interface on top of the I2S. Simply dumping PCM data over the bus probably won't do anything unless you account for the unique requirements of each chip and configure it accordingly. Eventually, you end up with a huge library and a pretty hefty support burden.

I honestly think this would be a fantastic feature, and I'd probably find a use case for it eventually, but my concern would be that it probably deviates from the core mission of esphome, and supporting it means having to support each and every chip that the user would likely encounter.

ON THE OTHER HAND, you could likely just compile sound support yourself. Without a doubt, there's code that you can get running in PlatformIO, and ESPHome supports the use of external libraries and code, so folding in sound support for your particular use case might be a simple matter. I'd definitely try it if I had one of those reference boards, and I hope I can encourage you to as well.

timotoots commented 4 years ago

Actually, it could be pretty useful in this context. It could be used to create very simple sound players with Snapcast (https://github.com/badaix/snapcast), which Home Assistant already supports. So basically you could have ESP-based boards play sound (alarms or music) or record sound as well. Audio as a "sensor" or "actuator" is pretty powerful. Plenty of new ESP32 audio boards have been released that have external pins available and are cheap to use.

Only thing is that currently you need ESP-ADF for those audio functions.

Some discussion here: https://github.com/badaix/snapcast/issues/347

mrand commented 4 years ago

Possibly of interest on this topic: https://community.home-assistant.io/t/example-esphome-playing-wav-files-with-a-speaker/150302

dturgel commented 4 years ago

Please add support for mics and speakers - the new TTGO boards are ready for this and so are we.

acedrew commented 4 years ago

The m5Stick-c would support some interesting mic use cases as well.

balrog-kun commented 4 years ago

Note about the CODEC question, as far as I remember I2S is simply a serial protocol where you send PCM samples, there's not much that can vary between codecs. Additionally, the ESP32 has a built-in codec -- you send data through the I2S registers and it gets output through the DAC pins. All you need in that case is an amplifier -- you can use an LM386 chip or an NPN-transistor, two resistors and a capacitor.
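
As a rough illustration of "sending PCM samples", the host-side Python sketch below shows the two data layouts this comment touches on: interleaved left/right 16-bit frames, as an external I2S DAC would typically expect, and unsigned 8-bit values, which is all an 8-bit DAC can represent. The function names are hypothetical, and the exact format a given driver or chip expects is an assumption to check against its documentation.

```python
import struct


def mono_to_stereo_frames(samples):
    """Duplicate each signed 16-bit sample into a left/right little-endian frame."""
    out = bytearray()
    for s in samples:
        out += struct.pack("<hh", s, s)  # left channel, then right channel
    return bytes(out)


def pcm16_to_unsigned8(samples):
    """Map signed 16-bit samples (-32768..32767) onto unsigned 8-bit (0..255)."""
    # Shift the range to 0..65535, then keep only the 8 most significant bits.
    return bytes((s + 32768) >> 8 for s in samples)
```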

I think it's worth supporting this, but I have no code to back my words at this time.

I'd personally love to be able to stream audio from another device to an esphome node, but that'd have to be another development after basic support for playing back hardcoded samples, and would involve a new HTTP API. If anyone has ideas on how that could look, I'd like to hear them.

elbarsal commented 4 years ago

I have been playing with an M5StickC and speaker hat, with the intention of using it to interface to Rhasspy and Home Assistant. I do have it set up now with some kludgy custom components for the SPM1423 microphone (input via I2S) and the speaker hat (output via I2S) formatting and sending Hermes protocol messages via MQTT.

I actually have it working (I can say "hey computer" and it plays a recognition chime, then "open the door" and it succeeds / fails and plays the appropriate response). All the audio in both directions is being sent via MQTT in WAV format.

Relevant to this thread, I'm trying to make my code modular - including the I2S interface and the Hermes protocol portions. Can anybody point me to resources for structuring these pieces? From a component perspective, I see it as being separate modules:

I2S <--> Hermes <--> MQTT

so that the I2S portion could be used separately from the rest, if there were other audio sources and sinks.

TLDR: I have I2S working for microphone and speaker, the code is really ugly, and I would like to turn it into components. Any resources?

balrog-kun commented 4 years ago

Right, I see it that way too. I don't know anything about Hermes but I guess the I2S component is going to be a subclass of Component because there's nothing to stream audio in ESPHome yet, AFAIK. There's the esp32_camera that streams video frames as an example.

I guess ideally you'd configure the I2S component with a specific MQTT topic and audio format and not have to do much more. Or you'd configure it with the name of an audio source/sink (a new class), and you'd configure the Hermes component with the MQTT data, and it would register as an audio source/sink. This way future audio components could be connected to one another.

I don't use MQTT personally, so I'd need to add a similar component that works on top of the ESPHome protocol. Do you have the code posted anywhere yet?

elbarsal commented 4 years ago

Only a private repository elsewhere for now, I will see about putting something up on Github. Right now it's all mashed in a single component (SPM1423 - the microphone) and I started an attempt at modularizing.

The Hermes protocol is pretty simple - basically drop WAV data posted to a specific MQTT topic, that other clients will read. My vision of the I2S component would be that it produces bytes, that could be consumed by the next component (raw UDP broadcast, Hermes over MQTT, storage to SD card, etc.)
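
The "produces bytes, consumed by the next component" idea could be sketched as follows. This is a host-side Python model of the data flow only, not ESPHome component code; `AudioSink`, `MemorySink`, `fake_i2s_source` and `pump` are hypothetical names, and a real sink would wrap UDP, MQTT or SD card writes behind the same `write()` call.

```python
from typing import Iterable, Iterator, List, Protocol


class AudioSink(Protocol):
    """Anything that can accept a chunk of raw audio bytes."""

    def write(self, chunk: bytes) -> None: ...


class MemorySink:
    """Stand-in for a real sink (UDP, MQTT, SD card): keeps chunks in a list."""

    def __init__(self) -> None:
        self.chunks: List[bytes] = []

    def write(self, chunk: bytes) -> None:
        self.chunks.append(chunk)


def fake_i2s_source(n_chunks: int, chunk_size: int = 512) -> Iterator[bytes]:
    """Stand-in for the I2S reader: yields n_chunks buffers of dummy bytes."""
    for i in range(n_chunks):
        yield bytes([i % 256]) * chunk_size


def pump(source: Iterable[bytes], sinks: Iterable[AudioSink]) -> int:
    """Copy every chunk from source to every sink; return bytes read from source."""
    sinks = list(sinks)
    total = 0
    for chunk in source:
        for sink in sinks:
            sink.write(chunk)
        total += len(chunk)
    return total
```

Because the source only yields bytes and the sinks only accept bytes, the acquisition side never needs to know which transport sits behind it.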

The hardest part of it all has been making sense of the data coming in via I2S - the example code I started with was all just taking the 8 high order bits, but there are at least 12 (oddly, it looks like 13) bits of data coming from the SPM1423, and samples were what appeared to be word swapped - easily discovered by testing with a sine wave and displaying in Audacity, but just "poor quality" when dealing with voice.
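
A minimal sketch of the kind of fix-up described here, in host-side Python. It assumes "word swapped" means that each adjacent pair of 16-bit samples arrives in reversed order and that samples are little-endian; both are assumptions about this particular setup, not documented SPM1423 behaviour. `keep_top_bits` shows how much detail is lost when only the 8 high-order bits are kept compared with 12.

```python
import struct


def unswap_words(raw):
    """Decode little-endian int16 samples and swap each adjacent pair back."""
    count = len(raw) // 2
    samples = list(struct.unpack("<%dh" % count, raw[:count * 2]))
    # Swap (0,1), (2,3), ...; a trailing unpaired sample is left where it is.
    for i in range(0, count - 1, 2):
        samples[i], samples[i + 1] = samples[i + 1], samples[i]
    return samples


def keep_top_bits(sample, bits):
    """Zero everything below the `bits` most significant bits of a 16-bit sample."""
    mask = ~((1 << (16 - bits)) - 1)
    return sample & mask
```

Feeding a recorded ramp or sine wave through `unswap_words` and plotting the result is a quick way to confirm whether the swap assumption holds for a given microphone.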

When I put something up I will post it here (but it will undoubtedly be very rough...)

DieKatzchen commented 4 years ago

I would like this as well, especially if alternate protocols are supported. I know that Rhasspy will actually accept a raw PCM stream over UDP, which would reduce the overhead. I'm sure people will come up with other protocols over time as well.
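
For reference, sending raw PCM over UDP mostly comes down to cutting the stream into datagram-sized payloads without splitting a sample. The Python sketch below shows that step; the 1024-byte payload limit and the function names are assumptions, and it does not reproduce any framing that Rhasspy or another receiver may require.

```python
def iter_datagrams(pcm, max_payload=1024, frame_bytes=2):
    """Yield payloads of at most max_payload bytes, cut on whole sample frames."""
    step = max_payload - (max_payload % frame_bytes)
    if step <= 0:
        raise ValueError("max_payload must hold at least one frame")
    for offset in range(0, len(pcm), step):
        yield pcm[offset:offset + step]


def send_pcm(sock, address, pcm, max_payload=1024, frame_bytes=2):
    """Send pcm as a series of datagrams through sock; return how many were sent."""
    count = 0
    for payload in iter_datagrams(pcm, max_payload, frame_bytes):
        sock.sendto(payload, address)
        count += 1
    return count
```

With a real socket this would be called as `send_pcm(socket.socket(socket.AF_INET, socket.SOCK_DGRAM), (host, port), pcm)`, where `host` and `port` identify whatever receiver is listening.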

elbarsal commented 3 years ago

Digging deeper into things, it appears that the MQTT library ESPHome is using can't handle complex traffic - I may have to look at raw PCM over UDP. I have been trying to pull things apart to make it easier to deal with since everything is lumped into the SPM1423 component so far (including playback). Sorry it's slow, work has taken more of my time lately.

DieKatzchen commented 3 years ago

Audio streaming over MQTT never made sense to me. MQTT was intended for brief, lightweight messages from devices that needed to conserve power. They would wake, check if there were any messages intended for them, transmit any messages they needed to send, then go back to sleep. That's why you needed a broker to hold onto those messages, because it was unlikely that the sender and receiver would be awake at the same time.

There are already a ton of protocols available for streaming audio, which is what we're discussing here.

rarroyo6 commented 3 years ago

I can see an immediate use for this: sending TTS messages from HA to this device. HA already has the capability to generate the message in MP3 format, place it in a web-accessible folder, and send out the link to a "Media Player". ESPHome would receive the link, open the file, and pass the content to another module, which would decode the MP3 and play it. A cheap module like the ones using the VS1053 would work. This would allow you to put cheap speakers throughout the house, and would bypass MQTT, which just adds another possible point of failure.

frog32 commented 3 years ago

My use case would be to upgrade my intercom interface to one that lets me stream the audio as well.

elbarsal commented 3 years ago

Brief update - my hacked up code is getting cleaner, and some of the design decisions are starting to become evident (such as separating the audio acquisition from the transmission). Still not ready for publishing as there is still too much dead code and coupling.

@rarroyo6 yes this would ideally work in both directions - as the ugly code is right now, it can receive (limited) audio and play it back via WAV in MQTT. I'm hoping to make things pluggable.

@frog32 sounds like another interesting use case. The hardware I'm using doesn't have a great speaker, but that doesn't mean you couldn't use better.

timotoots commented 3 years ago

Maybe it is possible to somehow include ESP-ADF as an extra component, like it's done here (https://github.com/maxgerhardt/pio-esp-adf-example)? Even if it's a hacky way, it could be a start for building more audio-related functions in ESPHome. Could it be done already now?

Bazmundi commented 3 years ago

The LilyGo TTGO T-Camera V1.6.2 (https://www.aliexpress.com/i/4000009451126.html) I have bought has a built-in I2S mic and appears to use the code here: https://github.com/Xinyuan-LilyGO/TTGO-Camer-Mic

Arguing over use cases seems counterproductive, since most innovations are discovered by people who can find use cases and never by those who cannot. It is probably safe to say ESPHome is behind in its use cases.

dturgel commented 3 years ago

Correct, and I think the question is: can we somehow attach an I2S speaker component to this as well?

dturgel commented 3 years ago

Found a couple of things on github this weekend that might help - probably not but worth a look:

wwebers commented 3 years ago

I'm also very much interested in getting this library (ESP8266Audio) integrated into ESPHome. The reason in my use case is the lack of pins on my Wemos D1, a common problem with the ESP8266, as I learned. Because of that, I'm simply unable to use the UART for the DFPlayer...

ferbar commented 3 years ago

@wwebers: It was already mentioned in a previous post; have a look at https://community.home-assistant.io/t/example-esphome-playing-wav-files-with-a-speaker/150302/3 It wasn't too hard to write the custom integration. However, the code is for the versions from 2019 and may have to be adapted for the current versions.

dturgel commented 3 years ago

Thanks for sharing this link Christian!

wwebers commented 3 years ago

@ferbar Thanks, I knew your example already. However, it's just one example of how to stream a hard-coded WAV over HTTP. A fully functioning component would expose more, such as playing MP3 or RTTTL directly from the filesystem. But for that we would need a filesystem component as well, and a way to upload files.

TheGroundZero commented 3 years ago

Wanted to pitch in and also ask for I²S support for a microphone. Got myself a LILYGO® TTGO T-Camera and it comes with a microphone. Demo firmware even comes with a voice command to activate the camera.

kquinsland commented 3 years ago

... it's just one example of how to stream a hard-coded WAV over HTTP. A fully functioning component would expose more, such as playing MP3 or RTTTL directly from the filesystem.

Is that necessary? The title of this issue is support for i2s audio. I am interested precisely because I wish to stream either MP3 or WAV to a small ESP32 device with a speaker attached. While it would be nice to see ESPHome get a full media-player class of components, I don't see that as necessary to start using I2S with ESPHome.

As long as there is a function that I can call to 'feed' the samples into, that should be sufficient. Using a very cheap/common i2s chip, I would presume that the output component would look something like this:

i2s:
  lrc: GPIO15
  clk: GPIO16
  din: GPIO17

audio:
  - platform: i2s_max98357
    id: small_speaker
    lambda: |-
      it.send_pcm_stream(....);

# Some trigger fetches the audio and hands it to the speaker
on_...:
  - http_request.get:
      url: https://my-homeassistant-internal.url/some/path/to/audio.wav
      verify_ssl: false
      on_response:
        then:
          - logger.log:
              format: 'Response status: %d'
              args:
                - status_code
          - audio.small_speaker:
              send_pcm_stream: '%d'

I can use existing lambdas combined with some of the other ESPHome building blocks to watch an MQTT topic for text and, if that topic contains text that looks like a valid URL, fetch the WAV/PCM stream and dump it into the component.
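
The two checks this flow relies on, "does the text look like a URL" and "where is the PCM inside the WAV", can be prototyped on a host in Python before committing to firmware code. The sketch below uses only the standard library; `looks_like_url` and `wav_to_pcm` are hypothetical helper names, and the hand-off to a speaker component is left out because no such API is defined in this thread.

```python
import io
import wave
from urllib.parse import urlparse


def looks_like_url(text):
    """Return True if text parses as an http(s) URL with a host part."""
    try:
        parts = urlparse(text.strip())
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def wav_to_pcm(wav_bytes):
    """Return ((sample_rate, channels, sample_width_bytes), raw PCM frames)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        fmt = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth())
        return fmt, wav.readframes(wav.getnframes())
```

The format tuple matters because the receiving side has to set its sample rate and width to match before the PCM bytes mean anything.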

celer commented 3 years ago

Here are some nice examples of streaming MP3s from URLs: https://github.com/schreibfaul1/ESP32-audioI2S/wiki

My use case is that I want to stream text-to-speech from Nabu Casa via Home Assistant: https://www.nabucasa.com/config/tts/

jnoxon commented 3 years ago

My use case is similar to @celer. I'm not happy with my current audio notification setup. This would be a brilliant addition to esphome. I will set up several as soon as it's possible.

celer commented 3 years ago

I took a stab at this but got stuck. My approach was to try to integrate earlephilhower/ESP8266Audio as a custom component; my end goal was to have a custom MQTT component which would send text to ESP8266SAM for speech synthesis.

I had two stumbling blocks. The first was that PlatformIO seemed to magically create stub libraries for stuff in ESP8266/Arduino, specifically I2S, SD, and SDFat, but I couldn't figure out any way to refer to those stub libraries in the custom component. So I stripped out all file dependencies from ESP8266Audio and created my own local library by copying the libraries of ESP8266/Arduino.

This got me to the point where I couldn't get the I2S library to find esp8266/Arduino/cores/esp8266/core_esp8266_i2s.h

I'm coming at this with zero PlatformIO experience and zero ESPHome custom component experience. I will quickly try to see if I have more luck using esp-idf on the esp32 branch of ESPHome.

dturgel commented 3 years ago

Thanks Celer! Keep pushing it!

schmurtzm commented 2 years ago

Like others said here it has been done. It could be nice to have an updated feedback of this code :)

The most incredible audio player based on ESP32 is here: https://github.com/sle118/squeezelite-esp32 This is a Squeezebox player (good sound, multiroom, LCD, AirPlay, Spotify Connect...). It requires a Squeezebox server (LMS).

It could be very useful to have sound streaming and TTS in ESPHome! ESP8266Audio (https://github.com/earlephilhower/ESP8266Audio) could be a good source to examine! Mrdiy-audio-notifier (https://gitlab.com/MrDIYca/mrdiy-audio-notifier/-/tree/master) and ESParkle (https://github.com/CosmicMac/ESParkle) are based on it.

dturgel commented 2 years ago

Thanks Schmurtz!

nagyrobi commented 2 years ago

Implemented: https://esphome.io/components/media_player/i2s_audio.html

dturgel commented 2 years ago

Amazing - thank you so much!

hmjvaline commented 2 years ago

Implemented: https://esphome.io/components/media_player/i2s_audio.html

But it only supports the ESP32. On the ESP8266, audio can be output through the RX pin with a transistor and a resistor, so it can be done without a DAC. I would also like to know how to combine I2S, MQTT (sayText), SAM, Google Translate TTS and other applications in ESPHome on the ESP8266. If that is possible, it would be a beautiful thing.

mrdiy-audio-notifier (MQTT control to change content), mrdiy-audio-notifier + Google Translate TTS