Configured my S3-zero has per here. The speaker does work when configured with i2s directly without the adf_pipeline.

Relevant sections of the config ` substitutions:

pins

i2s_out_lrclk_pin: GPIO2 # LRC on Max98357 i2s_out_bclk_pin: GPIO3 # BCLK on Max98357 i2s_in_lrclk_pin: GPIO5 # WS on INMP441 i2s_in_bclk_pin: GPIO6 # SLK on INMP441 light_pin: GPIO21 # on-board LED speaker_pin: GPIO7 # DIN on Max98357 mic_pin: GPIO4 # SD on INMP441

esphome: name: kitchen-va friendly_name: Kitchen VA platformio_options: board_build.flash_mode: dio

esp32: board: esp32-s3-devkitc-1 variant: esp32s3 framework: type: esp-idf version: recommended sdkconfig_options: CONFIG_ESP32_S3_BOX_BOARD: "y"

psram: mode: quad speed: 80MHz

external_components:

source: type: git url: https://github.com/gnumpi/esphome_audio ref: main components:
- adf_pipeline
- i2s_audio refresh: 0s

adf_pipeline:

platform: i2s_audio type: sink id: adf_i2s_out i2s_audio_id: i2s_out i2s_dout_pin: ${speaker_pin}
platform: i2s_audio type: source id: adf_i2s_in i2s_audio_id: i2s_in i2s_din_pin: ${mic_pin} channel: left sample_rate: 16000 bits_per_sample: 16bit

microphone:

platform: adf_pipeline id: external_microphone pipeline:
- adf_i2s_in
- self

media_player:

platform: adf_pipeline id: external_speaker name: idf_player-mini internal: false pipeline:
- self
- adf_i2s_out

i2s_audio:

id: i2s_out i2s_lrclk_pin: ${i2s_out_lrclk_pin} i2s_bclk_pin: ${i2s_out_bclk_pin}
id: i2s_in i2s_lrclk_pin: ${i2s_in_lrclk_pin} i2s_bclk_pin: ${i2s_in_bclk_pin} `

Log `[12:39:10][D][micro_wake_word:362]: Wake word sliding average probability is 0.596 and most recent probability is 1.000 [12:39:10][D][micro_wake_word:128]: Wake Word Detected [12:39:10][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE [12:39:10][D][micro_wake_word:134]: Stopping Microphone [12:39:10][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [12:39:10][D][esp_adf.microphone:060]: Mic state: 3 [12:39:10][D][esp_adf.microphone:061]: Pipeline State 4: [12:39:10][D][esp_adf.microphone:088]: Mic state: 3 [12:39:10][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [12:39:10][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [12:39:10][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [12:39:10][D][esp_adf_pipeline:138]: Check element for stop [pcm_reader] status, 5 [12:39:10][D][esp_adf_pipeline:381]: Called deinit_all [12:39:10][D][esp-idf:000]: I (114374) AUDIO_PIPELINE: audio_pipeline_unlinked

[12:39:10][D][esp-idf:000]: W (114377) AUDIO_ELEMENT: [i2s_in] Element has not create when AUDIO_ELEMENT_TERMINATE

[12:39:10][D][esp-idf:000]: I (114381) I2S: DMA queue destroyed

[12:39:10][D][esp-idf:000]: W (114384) AUDIO_ELEMENT: [pcm_reader] Element has not create when AUDIO_ELEMENT_TERMINATE

[12:39:10][D][esp_adf.microphone:088]: Mic state: 0 [12:39:10][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE [12:39:10][D][media_player:059]: 'idf_player-mini' - Setting [12:39:10][D][media_player:066]: Media URL: https://:8123/local/va/awake.wav [12:39:10][D][adf_media_player:030]: Got control call in state 1 [12:39:10][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [12:39:10][D][esp-idf:000]: I (114435) MP3_DECODER: MP3 init

[12:39:10][D][i2s_audio:030]: Set I2S Lock Mode: 2 [12:39:10][D][esp-idf:000]: I (114444) I2S: DMA Malloc info, datalen=blocksize=3840, dma_buf_count=2

[12:39:10][D][esp_adf_pipeline:339]: pipeline tag 0, http [12:39:10][D][esp_adf_pipeline:339]: pipeline tag 1, decoder [12:39:11][D][esp_adf_pipeline:339]: pipeline tag 2, i2s_out [12:39:11][D][esp-idf:000]: I (114459) AUDIO_PIPELINE: link el->rb, el:0x3de1ff7c, tag:http, rb:0x3de20494

[12:39:11][D][esp-idf:000]: I (114463) AUDIO_PIPELINE: link el->rb, el:0x3de20130, tag:decoder, rb:0x3de208d4

[12:39:11][D][esp_adf_pipeline:351]: Setting up event listener. [12:39:11][D][esp_adf_pipeline:283]: State changed from UNINITIALIZED to PREPARING [12:39:11][I][adf_media_player:135]: got new pipeline state: 1 [12:39:11][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE [12:39:11][D][voice_assistant:422]: Desired state set to START_MICROPHONE [12:39:11][W][component:232]: Component micro_wake_word took a long time for an operation (73 ms). [12:39:11][W][component:233]: Components should block for at most 30 ms. [12:39:11][D][voice_assistant:118]: microphone not running [12:39:11][D][voice_assistant:202]: Requesting start... [12:39:11][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE [12:39:11][D][esp-idf:000]: I (114506) AUDIO_THREAD: The http task allocate stack on external memory

[12:39:11][D][esp-idf:000]: I (114512) AUDIO_ELEMENT: [http-0x3de1ff7c] Element task created

[12:39:11][D][esp-idf:000]: I (114516) AUDIO_THREAD: The decoder task allocate stack on external memory

[12:39:11][D][esp-idf:000]: I (114521) AUDIO_ELEMENT: [decoder-0x3de20130] Element task created

[12:39:11][D][esp-idf:000]: I (114526) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[12:39:11][D][esp-idf:000]: I (114530) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[12:39:11][D][esp_audio_sources:097]: Streamer status: 2 [12:39:11][D][esp_audio_sources:098]: decoder status: 5 [12:39:11][D][voice_assistant:437]: Client started, streaming microphone [12:39:11][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE [12:39:11][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE [12:39:11][D][voice_assistant:155]: Starting Microphone [12:39:11][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [12:39:11][D][i2s_audio:030]: Set I2S Lock Mode: 1 [12:39:11][D][esp-idf:000]: I (114585) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=4

[12:39:11][D][esp_adf_pipeline:339]: pipeline tag 0, i2s_in [12:39:11][D][esp_adf_pipeline:339]: pipeline tag 1, pcm_reader [12:39:11][D][esp-idf:000]: I (114599) AUDIO_PIPELINE: link el->rb, el:0x3de23c98, tag:i2s_in, rb:0x3de23f24

[12:39:11][D][esp_adf_pipeline:351]: Setting up event listener. [12:39:11][D][esp_adf_pipeline:283]: State changed from UNINITIALIZED to PREPARING [12:39:11][D][esp_adf.microphone:060]: Mic state: 0 [12:39:11][D][esp_adf.microphone:061]: Pipeline State 1: [12:39:11][D][esp_adf.microphone:088]: Mic state: 0 [12:39:11][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE [12:39:11][W][component:232]: Component voice_assistant took a long time for an operation (51 ms). [12:39:11][W][component:233]: Components should block for at most 30 ms. [12:39:11][D][i2s_audio:030]: Set I2S Lock Mode: 1 [12:39:11][D][esp_adf_pipeline:283]: State changed from PREPARING to STARTING [12:39:11][D][esp_adf.microphone:060]: Mic state: 1 [12:39:11][D][esp_adf.microphone:061]: Pipeline State 2: [12:39:11][D][esp_adf.microphone:088]: Mic state: 1 [12:39:11][D][esp-idf:000]: I (114652) AUDIO_ELEMENT: [i2s_in-0x3de23c98] Element task created

[12:39:11][D][esp-idf:000]: I (114656) AUDIO_ELEMENT: [pcm_reader-0x3de23ddc] Element task created

[12:39:11][D][esp-idf:000]: I (114659) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:2096699 Bytes, Inter:149548 Bytes, Dram:149548 Bytes

[12:39:11][D][esp-idf:000]: I (114663) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[12:39:11][D][esp-idf:000]: I (114668) AUDIO_PIPELINE: Pipeline started

[12:39:11][D][voice_assistant:523]: Event Type: 1 [12:39:11][D][voice_assistant:526]: Assist Pipeline running [12:39:11][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [12:39:11][D][esp_adf_pipeline:125]: Check element [i2s_in] status, 3 [12:39:11][D][esp_adf_pipeline:125]: Check element [pcm_reader] status, 3 [12:39:11][D][esp_adf_pipeline:283]: State changed from STARTING to RUNNING [12:39:11][D][esp_adf.microphone:060]: Mic state: 1 [12:39:11][D][esp_adf.microphone:061]: Pipeline State 3: [12:39:11][D][esp_adf.microphone:088]: Mic state: 2 [12:39:11][D][voice_assistant:523]: Event Type: 3 [12:39:11][D][voice_assistant:537]: STT started [12:39:11][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE [12:39:11][I][esp_adf_pipeline:200]: [ i2s_in ] status: 12 [12:39:11][D][light:036]: 'Status LED' Setting: [12:39:11][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[12:39:11][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [12:39:21][I][esp_adf_pipeline:171]: Pipeline preparation timeout! [12:39:21][D][esp_adf_pipeline:283]: State changed from PREPARING to STOPPING [12:39:21][I][adf_media_player:135]: got new pipeline state: 4 [12:39:21][D][esp-idf:000]: W (124482) AUDIO_PIPELINE: Without stop, st:1

[12:39:26][D][voice_assistant:523]: Event Type: 12 [12:39:26][D][voice_assistant:681]: STT by VAD end [12:39:26][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE [12:39:26][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE [12:39:26][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [12:39:26][D][esp_adf.microphone:060]: Mic state: 3 [12:39:26][D][esp_adf.microphone:061]: Pipeline State 4: [12:39:26][D][esp_adf.microphone:088]: Mic state: 3 [12:39:26][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [12:39:26][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [12:39:26][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 3 [12:39:26][D][light:036]: 'Status LED' Setting: [12:39:26][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[12:39:26][I][esp_adf_pipeline:200]: [ i2s_in ] status: 14 [12:39:26][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [12:39:26][D][esp_adf_pipeline:138]: Check element for stop [pcm_reader] status, 5 [12:39:26][D][esp_adf_pipeline:381]: Called deinit_all [12:39:26][D][esp-idf:000]: I (130165) AUDIO_PIPELINE: audio_pipeline_unlinked

[12:39:26][D][esp-idf:000]: W (130168) AUDIO_ELEMENT: [i2s_in] Element has not create when AUDIO_ELEMENT_TERMINATE

[12:39:26][D][esp-idf:000]: I (130172) I2S: DMA queue destroyed

[12:39:26][D][esp-idf:000]: W (130175) AUDIO_ELEMENT: [pcm_reader] Element has not create when AUDIO_ELEMENT_TERMINATE

[12:39:26][D][i2s_audio:045]: Release I2S Lock Mode: 1 [12:39:26][D][esp_adf_pipeline:283]: State changed from STOPPING to UNINITIALIZED [12:39:27][D][voice_assistant:523]: Event Type: 4 [12:39:27][D][voice_assistant:551]: Speech recognised as: " What is the weather?" [12:39:27][D][voice_assistant:523]: Event Type: 5 [12:39:27][D][voice_assistant:556]: Intent started [12:39:28][D][voice_assistant:523]: Event Type: 6 [12:39:28][D][voice_assistant:523]: Event Type: 7

[12:39:28][D][light:036]: 'Status LED' Setting: [12:39:28][D][light:051]: Brightness: 100% [12:39:28][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[12:39:28][D][voice_assistant:523]: Event Type: 8 [12:39:28][D][voice_assistant:599]: Response URL: "https:///api/tts_proxy/6e2e4e54058884224282e1e126e3541d73981250_en-gb_add2e9951e_tts.piper.mp3" [12:39:28][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE [12:39:28][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE [12:39:28][D][media_player:059]: 'idf_player-mini' - Setting [12:39:28][D][media_player:066]: Media URL: https:///api/tts_proxy/6e2e4e54058884224282e1e126e3541d73981250_en-gb_add2e9951e_tts.piper.mp3 [12:39:28][D][adf_media_player:030]: Got control call in state 2 [12:39:28][D][esp_adf_pipeline:079]: Called 'stop' while in STOPPING state. [12:39:28][D][voice_assistant:523]: Event Type: 2 [12:39:28][D][voice_assistant:613]: Assist Pipeline ended [12:40:42][D][sensor:094]: 'WiFi Signal Sensor': Sending state -62.00000 dBm with 0 decimals of accuracy `

Thanks for reporting! Could you share your complete config file? If I interpret your log file correctly you are trying to play a wav file first, which results in an error from which the media player isn't able to recover and thats why the mp3 file later isn't playing either... Actually I never tried playing a wav file, thanks for pointing at it. Probably I need to switch the decoder component in that case. But anyhow the media player should reset to normal after an error occurs which it doesn't seem to do. I will look into it. Another thing, please wait until the booting has finished before starting micro_wake word or voice assistant. And you could try to also import the modified voice_assistant component in the same repository. It fixes and issue in the order of starting the VA pipeline when the microphone is not ready yet.

Could you try using a mp3 file for announcing that the wake word got detected?

I've remove the wav file completely and am now just trying to play re response

INFO ESPHome 2024.3.1 INFO Reading configuration /config/esphome/kitchen-va.yaml... INFO Updating @.*** WARNING GPIO3 is a strapping PIN and should only be used for I/O with care. Attaching external pullup/down resistors to strapping pins can cause unexpected failures. See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins INFO Starting log output from 192.168.2.41 using esphome API INFO Successfully connected to kitchen-va @ 192.168.2.41 in 0.024s INFO Successful handshake with kitchen-va @ 192.168.2.41 in 0.831s [15:31:38][I][app:102]: ESPHome version 2024.3.1 compiled on Mar 28 2024, 15:21:02

[15:31:38][C][wifi:408]: Local MAC: 34:B7:DA:52:FF:E8 [15:31:38][C][wifi:413]: SSID: [redacted] [15:31:38][C][wifi:416]: IP Address: 192.168.2.41 [15:31:38][C][wifi:420]: BSSID: [redacted]

[15:31:38][C][wifi:423]: Signal strength: -44 dB ▂▄▆█ [15:31:38][C][wifi:427]: Channel: 2 [15:31:38][C][wifi:428]: Subnet: 255.255.255.0 [15:31:38][C][wifi:429]: Gateway: 192.168.2.2 [15:31:38][C][wifi:430]: DNS1: 0.0.0.0 [15:31:38][C][wifi:431]: DNS2: 0.0.0.0

[15:31:38][C][logger:167]: Level: DEBUG [15:31:38][C][logger:169]: Log Baud Rate: 115200 [15:31:38][C][logger:170]: Hardware UART: USB_SERIAL_JTAG [15:31:38][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip: [15:31:38][C][esp32_rmt_led_strip:176]: Pin: 21 [15:31:38][C][esp32_rmt_led_strip:177]: Channel: 0 [15:31:38][C][esp32_rmt_led_strip:202]: RGB Order: GRB [15:31:38][C][esp32_rmt_led_strip:203]: Max refresh rate: 0 [15:31:38][C][esp32_rmt_led_strip:204]: Number of LEDs: 1

[15:31:38][C][light:105]: Default Transition Length: 0.0s [15:31:38][C][light:106]: Gamma Correct: 2.80 [15:31:38][C][template.switch:068]: Template Switch 'Enable Voice Assistant'

[15:31:38][C][template.switch:091]: Restore Mode: restore defaults to ON [15:31:38][C][template.switch:057]: Optimistic: YES

[15:31:38][C][psram:021]: Available: YES [15:31:38][C][psram:024]: Size: 2047 KB [15:31:38][C][restart:068]: Restart Switch 'Restart'

[15:31:38][C][restart:091]: Restore Mode: always OFF

[15:31:38][C][mdns:116]: Hostname: kitchen-va [15:31:38][C][ota:096]: Over-The-Air Updates: [15:31:38][C][ota:097]: Address: 192.168.2.41:3232 [15:31:38][C][ota:100]: Using Password. [15:31:38][C][ota:103]: OTA version: 2. [15:31:38][W][ota:107]: Last Boot was an unhandled reset, will proceed to safe mode in 5 restarts [15:31:38][C][api:139]: API Server: [15:31:38][C][api:140]: Address: 192.168.2.41:6053 [15:31:38][C][api:142]: Using noise encryption: YES [15:31:38][C][wifi_signal.sensor:009]: WiFi Signal 'WiFi Signal Sensor' [15:31:38][C][wifi_signal.sensor:009]: Device Class: 'signal_strength' [15:31:38][C][wifi_signal.sensor:009]: State Class: 'measurement' [15:31:38][C][wifi_signal.sensor:009]: Unit of Measurement: 'dBm' [15:31:38][C][wifi_signal.sensor:009]: Accuracy Decimals: 0

[15:31:39][C][micro_wake_word:058]: Wake Word: okay nabu [15:31:39][C][micro_wake_word:059]: Probability cutoff: 0.500 [15:31:39][C][micro_wake_word:060]: Sliding window size: 10

[15:31:39][C][adf_media_player:018]: Number of ASPComponents: 2 [15:31:42][D][micro_wake_word:362]: Wake word sliding average probability is 0.581 and most recent probability is 1.000 [15:31:42][D][micro_wake_word:128]: Wake Word Detected [15:31:42][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE [15:31:42][D][micro_wake_word:134]: Stopping Microphone [15:31:42][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [15:31:42][D][esp_adf.microphone:060]: Mic state: 3 [15:31:42][D][esp_adf.microphone:061]: Pipeline State 4: [15:31:42][D][esp_adf.microphone:088]: Mic state: 3 [15:31:42][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [15:31:42][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [15:31:42][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 3 [15:31:42][D][esp-idf:000]: W (33199) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[15:31:42][D][esp-idf:000]: W (33201) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[15:31:42][D][esp-idf:000]: W (33204) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[15:31:42][D][esp-idf:000]: W (33206) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[15:31:42][I][esp_adf_pipeline:200]: [ i2s_in ] status: 14 [15:31:44][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [15:31:44][D][voice_assistant:437]: Client started, streaming microphone [15:31:44][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE [15:31:44][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE [15:31:44][D][voice_assistant:155]: Starting Microphone [15:31:44][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [15:31:44][D][i2s_audio:030]: Set I2S Lock Mode: 1 [15:31:44][D][esp-idf:000]: I (35421) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=4

[15:31:44][D][esp_adf_pipeline:339]: pipeline tag 0, i2s_in [15:31:44][D][esp_adf_pipeline:339]: pipeline tag 1, pcm_reader [15:31:44][D][esp-idf:000]: I (35428) AUDIO_PIPELINE: link el->rb, el:0x3de20008, tag:i2s_in, rb:0x3de20294

[15:31:44][D][esp_adf_pipeline:351]: Setting up event listener. [15:31:44][D][esp_adf_pipeline:283]: State changed from UNINITIALIZED to PREPARING [15:31:44][D][esp_adf.microphone:060]: Mic state: 0 [15:31:44][D][esp_adf.microphone:061]: Pipeline State 1: [15:31:44][D][esp_adf.microphone:088]: Mic state: 0 [15:31:44][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE [15:31:44][D][esp_adf.microphone:060]: Mic state: 1 [15:31:44][D][esp_adf.microphone:061]: Pipeline State 2: [15:31:44][D][esp_adf.microphone:088]: Mic state: 1 [15:31:44][D][esp-idf:000]: I (35453) AUDIO_ELEMENT: [i2s_in-0x3de20008] Element task created

[15:31:44][D][esp-idf:000]: I (35457) AUDIO_ELEMENT: [pcm_reader-0x3de2014c] Element task created

[15:31:44][D][esp-idf:000]: I (35461) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:2147307 Bytes, Inter:184584 Bytes, Dram:184584 Bytes

[15:31:44][D][esp-idf:000]: I (35464) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[15:31:44][D][esp-idf:000]: I (35467) AUDIO_PIPELINE: Pipeline started

[15:31:44][D][voice_assistant:523]: Event Type: 1 [15:31:44][D][voice_assistant:526]: Assist Pipeline running [15:31:44][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [15:31:44][D][esp_adf_pipeline:125]: Check element [i2s_in] status, 3 [15:31:44][D][esp_adf_pipeline:125]: Check element [pcm_reader] status, 3 [15:31:44][D][esp_adf_pipeline:283]: State changed from STARTING to RUNNING [15:31:44][D][esp_adf.microphone:060]: Mic state: 1 [15:31:44][D][esp_adf.microphone:061]: Pipeline State 3: [15:31:44][D][esp_adf.microphone:088]: Mic state: 2 [15:31:44][D][voice_assistant:523]: Event Type: 3 [15:31:44][D][voice_assistant:537]: STT started [15:31:44][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE [15:31:44][I][esp_adf_pipeline:200]: [ i2s_in ] status: 12 [15:31:44][D][light:036]: 'Status LED' Setting: [15:31:44][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[15:31:44][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [15:31:46][D][voice_assistant:523]: Event Type: 11 [15:31:46][D][voice_assistant:677]: Starting STT by VAD [15:31:48][D][voice_assistant:523]: Event Type: 12 [15:31:48][D][voice_assistant:681]: STT by VAD end [15:31:48][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE [15:31:48][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE [15:31:48][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [15:31:48][D][esp_adf.microphone:060]: Mic state: 3 [15:31:48][D][esp_adf.microphone:061]: Pipeline State 4: [15:31:48][D][esp_adf.microphone:088]: Mic state: 3 [15:31:48][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [15:31:48][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [15:31:48][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [15:31:48][D][esp_adf_pipeline:138]: Check element for stop [pcm_reader] status, 5 [15:31:48][D][esp_adf_pipeline:381]: Called deinit_all [15:31:48][D][esp-idf:000]: I (38859) AUDIO_PIPELINE: audio_pipeline_unlinked

[15:31:48][D][esp-idf:000]: W (38862) AUDIO_ELEMENT: [i2s_in] Element has not create when AUDIO_ELEMENT_TERMINATE

[15:31:48][D][esp-idf:000]: I (38865) I2S: DMA queue destroyed

[15:31:48][D][esp-idf:000]: W (38868) AUDIO_ELEMENT: [pcm_reader] Element has not create when AUDIO_ELEMENT_TERMINATE

[15:31:48][D][i2s_audio:045]: Release I2S Lock Mode: 1 [15:31:48][D][esp_adf_pipeline:283]: State changed from STOPPING to UNINITIALIZED [15:31:48][D][esp_adf.microphone:060]: Mic state: 3 [15:31:48][D][esp_adf.microphone:061]: Pipeline State 0: [15:31:48][D][esp_adf.microphone:088]: Mic state: 0 [15:31:48][D][light:036]: 'Status LED' Setting: [15:31:48][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[15:31:48][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE [15:31:48][D][voice_assistant:523]: Event Type: 4 [15:31:48][D][voice_assistant:551]: Speech recognised as: " What is the weather?" [15:31:48][D][voice_assistant:523]: Event Type: 5 [15:31:48][D][voice_assistant:556]: Intent started [15:31:49][D][voice_assistant:523]: Event Type: 6 [15:31:49][D][voice_assistant:523]: Event Type: 7

[15:31:49][D][light:036]: 'Status LED' Setting: [15:31:49][D][light:051]: Brightness: 100% [15:31:49][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[15:31:50][D][voice_assistant:523]: Event Type: 8 [15:31:50][D][voice_assistant:599]: Response URL: "https://

/api/tts_proxy/af42191a9a59e7b1585205c706f22365b925f3ba_en-gb_add2e9951e_tts.piper.mp3" [15:31:50][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE [15:31:50][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE [15:31:50][D][media_player:059]: 'idf_player-mini' - Setting [15:31:50][D][media_player:066]: Media URL: https:// /api/tts_proxy/af42191a9a59e7b1585205c706f22365b925f3ba_en-gb_add2e9951e_tts.piper.mp3 [15:31:50][D][adf_media_player:030]: Got control call in state 1 [15:31:50][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [15:31:50][D][esp-idf:000]: I (40758) MP3_DECODER: MP3 init [15:31:50][D][i2s_audio:030]: Set I2S Lock Mode: 2 [15:31:50][D][esp-idf:000]: I (40764) I2S: DMA Malloc info, datalen=blocksize=3840, dma_buf_count=2 [15:31:50][D][esp_adf_pipeline:339]: pipeline tag 0, http [15:31:50][D][esp_adf_pipeline:339]: pipeline tag 1, decoder [15:31:50][D][esp_adf_pipeline:339]: pipeline tag 2, i2s_out [15:31:50][D][esp-idf:000]: I (40774) AUDIO_PIPELINE: link el->rb, el:0x3de1ff7c, tag:http, rb:0x3de204d4 [15:31:50][D][esp-idf:000]: I (40776) AUDIO_PIPELINE: link el->rb, el:0x3de20170, tag:decoder, rb:0x3de20914 [15:31:50][D][esp_adf_pipeline:351]: Setting up event listener. ERROR Fatal error: protocol.data_received() call failed. protocol: transport: <_SelectorSocketTransport fd=6 read=polling write=> Traceback (most recent call last): File "/usr/lib/python3.11/asyncio/selector_events.py", line 1009, in _read_ready__data_received self._protocol.data_received(data) File "aioesphomeapi/_frame_helper/noise.py", line 136, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame File "/usr/local/lib/python3.11/dist-packages/noise/state.py", line 74, in decrypt_with_ad plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/noise/backends/default/ciphers.py", line 13, in decrypt return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "src/chacha20poly1305_reuseable/__init__.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt File "src/chacha20poly1305_reuseable/__init__.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt File "src/chacha20poly1305_reuseable/__init__.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len File "src/chacha20poly1305_reuseable/__init__.py", line 273, in chacha20poly1305_reuseable._decrypt_data cryptography.exceptions.InvalidTag WARNING kitchen-va @ 192.168.2.41: Connection error occurred: kitchen-va @ 192.168.2.41: Invalid encryption key: received_name=kitchen-va INFO Processing unexpected disconnect from ESPHome API for kitchen-va @ 192.168.2.41 WARNING Disconnected from API INFO Successfully connected to kitchen-va @ 192.168.2.41 in 0.044s INFO Successful handshake with kitchen-va @ 192.168.2.41 in 0.333s [15:32:00][I][esp_adf_pipeline:171]: Pipeline preparation timeout! [15:32:00][D][esp_adf_pipeline:283]: State changed from PREPARING to STOPPING [15:32:00][I][adf_media_player:135]: got new pipeline state: 4 [15:32:00][D][esp-idf:000]: W (50787) AUDIO_PIPELINE: Without stop, st:1 On Thu, Mar 28, 2024 at 3:15 PM Mischa Siekmann ***@***.***> wrote: > Could you try using a mp3 file for announcing that the wake word got > detected? > > — > Reply to this email directly, view it on GitHub > , > or unsubscribe > > . > You are receiving this because you authored the thread.Message ID: > ***@***.***> > -- ~Steve

somehow it can't access your HA to receive the tts response, never seen this before. Sorry I don't have an explanation for it right now. It says invalid encryption key though, is everything else related to HA working fine?

That was a one-off issue that hasn't come back. I am still not getting audio. Here is a log without the other issue

[19:40:06][D][micro_wake_word:362]: Wake word sliding average probability is 0.522 and most recent probability is 1.000 [19:40:06][D][micro_wake_word:128]: Wake Word Detected [19:40:06][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE [19:40:06][D][micro_wake_word:134]: Stopping Microphone [19:40:06][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [19:40:06][D][esp_adf.microphone:060]: Mic state: 3 [19:40:06][D][esp_adf.microphone:061]: Pipeline State 4: [19:40:06][D][esp_adf.microphone:088]: Mic state: 3 [19:40:06][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [19:40:06][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [19:40:06][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 3 [19:40:06][D][esp-idf:000]: W (11394491) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[19:40:06][D][esp-idf:000]: W (11394493) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[19:40:06][D][esp-idf:000]: W (11394496) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[19:40:06][D][esp-idf:000]: W (11394499) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[19:40:06][I][esp_adf_pipeline:200]: [ i2s_in ] status: 14 [19:40:09][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [19:40:09][D][voice_assistant:118]: microphone not running [19:40:09][D][voice_assistant:437]: Client started, streaming microphone [19:40:09][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE [19:40:09][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE [19:40:09][D][voice_assistant:155]: Starting Microphone [19:40:09][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [19:40:09][D][i2s_audio:030]: Set I2S Lock Mode: 1 [19:40:09][D][esp-idf:000]: I (11397936) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=4

[19:40:09][D][esp_adf_pipeline:339]: pipeline tag 0, i2s_in [19:40:09][D][esp_adf_pipeline:339]: pipeline tag 1, pcm_reader [19:40:09][D][esp-idf:000]: I (11397944) AUDIO_PIPELINE: link el->rb, el:0x3de23c9c, tag:i2s_in, rb:0x3de23f28

[19:40:09][D][esp_adf_pipeline:351]: Setting up event listener. [19:40:09][D][esp_adf_pipeline:283]: State changed from UNINITIALIZED to PREPARING [19:40:09][D][esp_adf.microphone:060]: Mic state: 0 [19:40:09][D][esp_adf.microphone:061]: Pipeline State 1: [19:40:09][D][esp_adf.microphone:088]: Mic state: 0 [19:40:09][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE [19:40:09][D][i2s_audio:030]: Set I2S Lock Mode: 1 [19:40:09][D][esp_adf_pipeline:283]: State changed from PREPARING to STARTING [19:40:09][D][esp_adf.microphone:060]: Mic state: 1 [19:40:09][D][esp_adf.microphone:061]: Pipeline State 2: [19:40:09][D][esp_adf.microphone:088]: Mic state: 1 [19:40:09][D][esp-idf:000]: I (11397972) AUDIO_ELEMENT: [i2s_in-0x3de23c9c] Element task created

[19:40:09][D][esp-idf:000]: I (11397975) AUDIO_ELEMENT: [pcm_reader-0x3de23de0] Element task created

[19:40:09][D][esp-idf:000]: I (11397978) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:2119039 Bytes, Inter:171892 Bytes, Dram:171892 Bytes

[19:40:09][D][esp-idf:000]: I (11397981) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[19:40:09][D][esp-idf:000]: I (11397984) AUDIO_PIPELINE: Pipeline started

[19:40:09][D][voice_assistant:523]: Event Type: 1 [19:40:09][D][voice_assistant:526]: Assist Pipeline running [19:40:09][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [19:40:09][D][esp_adf_pipeline:125]: Check element [i2s_in] status, 3 [19:40:09][D][esp_adf_pipeline:125]: Check element [pcm_reader] status, 3 [19:40:09][D][esp_adf_pipeline:283]: State changed from STARTING to RUNNING [19:40:09][D][esp_adf.microphone:060]: Mic state: 1 [19:40:09][D][esp_adf.microphone:061]: Pipeline State 3: [19:40:09][D][esp_adf.microphone:088]: Mic state: 2 [19:40:09][D][voice_assistant:523]: Event Type: 3 [19:40:09][D][voice_assistant:537]: STT started [19:40:09][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE [19:40:09][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [19:40:09][D][light:036]: 'Status LED' Setting: [19:40:09][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[19:40:09][I][esp_adf_pipeline:200]: [ i2s_in ] status: 12 [19:40:16][I][esp_adf_pipeline:171]: Pipeline preparation timeout! [19:40:16][D][esp_adf_pipeline:283]: State changed from PREPARING to STOPPING [19:40:16][I][adf_media_player:135]: got new pipeline state: 4 [19:40:16][D][esp-idf:000]: W (11404582) AUDIO_PIPELINE: Without stop, st:1

[19:40:26][D][voice_assistant:523]: Event Type: 11 [19:40:26][D][voice_assistant:677]: Starting STT by VAD [19:40:26][D][voice_assistant:523]: Event Type: 12 [19:40:26][D][voice_assistant:681]: STT by VAD end [19:40:26][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE [19:40:26][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE [19:40:26][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [19:40:26][D][esp_adf.microphone:060]: Mic state: 3 [19:40:26][D][esp_adf.microphone:061]: Pipeline State 4: [19:40:26][D][esp_adf.microphone:088]: Mic state: 3 [19:40:26][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [19:40:26][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [19:40:26][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 3 [19:40:26][D][esp_adf_pipeline:138]: Check element for stop [pcm_reader] status, 5 [19:40:26][D][esp_adf_pipeline:381]: Called deinit_all [19:40:26][D][esp-idf:000]: I (11414328) AUDIO_PIPELINE: audio_pipeline_unlinked

[19:40:26][D][esp-idf:000]: W (11414330) AUDIO_ELEMENT: [i2s_in] Element has not create when AUDIO_ELEMENT_TERMINATE

[19:40:26][D][esp-idf:000]: I (11414334) I2S: DMA queue destroyed

[19:40:26][D][esp-idf:000]: W (11414336) AUDIO_ELEMENT: [pcm_reader] Element has not create when AUDIO_ELEMENT_TERMINATE

[19:40:26][D][i2s_audio:045]: Release I2S Lock Mode: 1 [19:40:26][D][esp_adf_pipeline:283]: State changed from STOPPING to UNINITIALIZED [19:40:26][D][esp_adf.microphone:060]: Mic state: 3 [19:40:26][D][esp_adf.microphone:061]: Pipeline State 0: [19:40:26][D][esp_adf.microphone:088]: Mic state: 0 [19:40:26][D][light:036]: 'Status LED' Setting: [19:40:26][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[19:40:26][D][voice_assistant:523]: Event Type: 4 [19:40:26][D][voice_assistant:551]: Speech recognised as: " What is the temperature?" [19:40:26][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE [19:40:26][D][voice_assistant:523]: Event Type: 5 [19:40:26][D][voice_assistant:556]: Intent started [19:40:26][D][voice_assistant:523]: Event Type: 6 [19:40:26][D][voice_assistant:523]: Event Type: 7

[19:40:26][D][light:036]: 'Status LED' Setting: [19:40:26][D][light:051]: Brightness: 100% [19:40:26][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[19:40:26][D][voice_assistant:523]: Event Type: 8 [19:40:26][D][voice_assistant:599]: Response URL: "https://

k/api/tts_proxy/557d14b3a74d89cae750c0b48ba22c88ad8f62dc_en-gb_add2e9951e_tts.piper.mp3" [19:40:26][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE [19:40:26][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE [19:40:26][D][media_player:059]: 'idf_player-mini' - Setting [19:40:26][D][media_player:066]: Media URL: https:// k/api/tts_proxy/557d14b3a74d89cae750c0b48ba22c88ad8f62dc_en-gb_add2e9951e_tts.piper.mp3 [19:40:26][D][adf_media_player:030]: Got control call in state 2 [19:40:26][D][esp_adf_pipeline:079]: Called 'stop' while in STOPPING state. [19:40:26][D][voice_assistant:523]: Event Type: 2 [19:40:26][D][voice_assistant:613]: Assist Pipeline ended On Thu, Mar 28, 2024 at 4:52 PM Mischa Siekmann ***@***.***> wrote: > somehow it can't access your HA to receive the tts response, never seen > this before. Sorry I don't have an explanation for it right now. > > — > Reply to this email directly, view it on GitHub > , > or unsubscribe > > . > You are receiving this because you authored the thread.Message ID: > ***@***.***> > -- ~Steve

It's always DNS, I had a static IP in the config and didn't secify DNS.

gnumpi / esphome_audio

Media Player does not play audio #24

pins