gnumpi / esphome_audio

Custom audio components for ESPHome
Other
20 stars 8 forks source link

Media Player does not play audio #24

Closed steku closed 1 month ago

steku commented 2 months ago

Configured my S3-zero has per here. The speaker does work when configured with i2s directly without the adf_pipeline.

Relevant sections of the config ` substitutions:

pins

i2s_out_lrclk_pin: GPIO2 # LRC on Max98357 i2s_out_bclk_pin: GPIO3 # BCLK on Max98357 i2s_in_lrclk_pin: GPIO5 # WS on INMP441 i2s_in_bclk_pin: GPIO6 # SLK on INMP441 light_pin: GPIO21 # on-board LED speaker_pin: GPIO7 # DIN on Max98357 mic_pin: GPIO4 # SD on INMP441

esphome: name: kitchen-va friendly_name: Kitchen VA platformio_options: board_build.flash_mode: dio

esp32: board: esp32-s3-devkitc-1 variant: esp32s3 framework: type: esp-idf version: recommended sdkconfig_options: CONFIG_ESP32_S3_BOX_BOARD: "y"

psram: mode: quad speed: 80MHz

external_components:

adf_pipeline:

microphone:

media_player:

i2s_audio:

Log `[12:39:10][D][micro_wake_word:362]: Wake word sliding average probability is 0.596 and most recent probability is 1.000 [12:39:10][D][micro_wake_word:128]: Wake Word Detected [12:39:10][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE [12:39:10][D][micro_wake_word:134]: Stopping Microphone [12:39:10][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [12:39:10][D][esp_adf.microphone:060]: Mic state: 3 [12:39:10][D][esp_adf.microphone:061]: Pipeline State 4: [12:39:10][D][esp_adf.microphone:088]: Mic state: 3 [12:39:10][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [12:39:10][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [12:39:10][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [12:39:10][D][esp_adf_pipeline:138]: Check element for stop [pcm_reader] status, 5 [12:39:10][D][esp_adf_pipeline:381]: Called deinit_all [12:39:10][D][esp-idf:000]: I (114374) AUDIO_PIPELINE: audio_pipeline_unlinked

[12:39:10][D][esp-idf:000]: W (114377) AUDIO_ELEMENT: [i2s_in] Element has not create when AUDIO_ELEMENT_TERMINATE

[12:39:10][D][esp-idf:000]: I (114381) I2S: DMA queue destroyed

[12:39:10][D][esp-idf:000]: W (114384) AUDIO_ELEMENT: [pcm_reader] Element has not create when AUDIO_ELEMENT_TERMINATE

[12:39:10][D][esp_adf.microphone:088]: Mic state: 0 [12:39:10][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE [12:39:10][D][media_player:059]: 'idf_player-mini' - Setting [12:39:10][D][media_player:066]: Media URL: https://:8123/local/va/awake.wav [12:39:10][D][adf_media_player:030]: Got control call in state 1 [12:39:10][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [12:39:10][D][esp-idf:000]: I (114435) MP3_DECODER: MP3 init

[12:39:10][D][i2s_audio:030]: Set I2S Lock Mode: 2 [12:39:10][D][esp-idf:000]: I (114444) I2S: DMA Malloc info, datalen=blocksize=3840, dma_buf_count=2

[12:39:10][D][esp_adf_pipeline:339]: pipeline tag 0, http [12:39:10][D][esp_adf_pipeline:339]: pipeline tag 1, decoder [12:39:11][D][esp_adf_pipeline:339]: pipeline tag 2, i2s_out [12:39:11][D][esp-idf:000]: I (114459) AUDIO_PIPELINE: link el->rb, el:0x3de1ff7c, tag:http, rb:0x3de20494

[12:39:11][D][esp-idf:000]: I (114463) AUDIO_PIPELINE: link el->rb, el:0x3de20130, tag:decoder, rb:0x3de208d4

[12:39:11][D][esp_adf_pipeline:351]: Setting up event listener. [12:39:11][D][esp_adf_pipeline:283]: State changed from UNINITIALIZED to PREPARING [12:39:11][I][adf_media_player:135]: got new pipeline state: 1 [12:39:11][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE [12:39:11][D][voice_assistant:422]: Desired state set to START_MICROPHONE [12:39:11][W][component:232]: Component micro_wake_word took a long time for an operation (73 ms). [12:39:11][W][component:233]: Components should block for at most 30 ms. [12:39:11][D][voice_assistant:118]: microphone not running [12:39:11][D][voice_assistant:202]: Requesting start... [12:39:11][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE [12:39:11][D][esp-idf:000]: I (114506) AUDIO_THREAD: The http task allocate stack on external memory

[12:39:11][D][esp-idf:000]: I (114512) AUDIO_ELEMENT: [http-0x3de1ff7c] Element task created

[12:39:11][D][esp-idf:000]: I (114516) AUDIO_THREAD: The decoder task allocate stack on external memory

[12:39:11][D][esp-idf:000]: I (114521) AUDIO_ELEMENT: [decoder-0x3de20130] Element task created

[12:39:11][D][esp-idf:000]: I (114526) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[12:39:11][D][esp-idf:000]: I (114530) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[12:39:11][D][esp_audio_sources:097]: Streamer status: 2 [12:39:11][D][esp_audio_sources:098]: decoder status: 5 [12:39:11][D][voice_assistant:437]: Client started, streaming microphone [12:39:11][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE [12:39:11][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE [12:39:11][D][voice_assistant:155]: Starting Microphone [12:39:11][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [12:39:11][D][i2s_audio:030]: Set I2S Lock Mode: 1 [12:39:11][D][esp-idf:000]: I (114585) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=4

[12:39:11][D][esp_adf_pipeline:339]: pipeline tag 0, i2s_in [12:39:11][D][esp_adf_pipeline:339]: pipeline tag 1, pcm_reader [12:39:11][D][esp-idf:000]: I (114599) AUDIO_PIPELINE: link el->rb, el:0x3de23c98, tag:i2s_in, rb:0x3de23f24

[12:39:11][D][esp_adf_pipeline:351]: Setting up event listener. [12:39:11][D][esp_adf_pipeline:283]: State changed from UNINITIALIZED to PREPARING [12:39:11][D][esp_adf.microphone:060]: Mic state: 0 [12:39:11][D][esp_adf.microphone:061]: Pipeline State 1: [12:39:11][D][esp_adf.microphone:088]: Mic state: 0 [12:39:11][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE [12:39:11][W][component:232]: Component voice_assistant took a long time for an operation (51 ms). [12:39:11][W][component:233]: Components should block for at most 30 ms. [12:39:11][D][i2s_audio:030]: Set I2S Lock Mode: 1 [12:39:11][D][esp_adf_pipeline:283]: State changed from PREPARING to STARTING [12:39:11][D][esp_adf.microphone:060]: Mic state: 1 [12:39:11][D][esp_adf.microphone:061]: Pipeline State 2: [12:39:11][D][esp_adf.microphone:088]: Mic state: 1 [12:39:11][D][esp-idf:000]: I (114652) AUDIO_ELEMENT: [i2s_in-0x3de23c98] Element task created

[12:39:11][D][esp-idf:000]: I (114656) AUDIO_ELEMENT: [pcm_reader-0x3de23ddc] Element task created

[12:39:11][D][esp-idf:000]: I (114659) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:2096699 Bytes, Inter:149548 Bytes, Dram:149548 Bytes

[12:39:11][D][esp-idf:000]: I (114663) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[12:39:11][D][esp-idf:000]: I (114668) AUDIO_PIPELINE: Pipeline started

[12:39:11][D][voice_assistant:523]: Event Type: 1 [12:39:11][D][voice_assistant:526]: Assist Pipeline running [12:39:11][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [12:39:11][D][esp_adf_pipeline:125]: Check element [i2s_in] status, 3 [12:39:11][D][esp_adf_pipeline:125]: Check element [pcm_reader] status, 3 [12:39:11][D][esp_adf_pipeline:283]: State changed from STARTING to RUNNING [12:39:11][D][esp_adf.microphone:060]: Mic state: 1 [12:39:11][D][esp_adf.microphone:061]: Pipeline State 3: [12:39:11][D][esp_adf.microphone:088]: Mic state: 2 [12:39:11][D][voice_assistant:523]: Event Type: 3 [12:39:11][D][voice_assistant:537]: STT started [12:39:11][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE [12:39:11][I][esp_adf_pipeline:200]: [ i2s_in ] status: 12 [12:39:11][D][light:036]: 'Status LED' Setting: [12:39:11][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[12:39:11][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [12:39:21][I][esp_adf_pipeline:171]: Pipeline preparation timeout! [12:39:21][D][esp_adf_pipeline:283]: State changed from PREPARING to STOPPING [12:39:21][I][adf_media_player:135]: got new pipeline state: 4 [12:39:21][D][esp-idf:000]: W (124482) AUDIO_PIPELINE: Without stop, st:1

[12:39:26][D][voice_assistant:523]: Event Type: 12 [12:39:26][D][voice_assistant:681]: STT by VAD end [12:39:26][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE [12:39:26][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE [12:39:26][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [12:39:26][D][esp_adf.microphone:060]: Mic state: 3 [12:39:26][D][esp_adf.microphone:061]: Pipeline State 4: [12:39:26][D][esp_adf.microphone:088]: Mic state: 3 [12:39:26][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [12:39:26][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [12:39:26][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 3 [12:39:26][D][light:036]: 'Status LED' Setting: [12:39:26][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[12:39:26][I][esp_adf_pipeline:200]: [ i2s_in ] status: 14 [12:39:26][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [12:39:26][D][esp_adf_pipeline:138]: Check element for stop [pcm_reader] status, 5 [12:39:26][D][esp_adf_pipeline:381]: Called deinit_all [12:39:26][D][esp-idf:000]: I (130165) AUDIO_PIPELINE: audio_pipeline_unlinked

[12:39:26][D][esp-idf:000]: W (130168) AUDIO_ELEMENT: [i2s_in] Element has not create when AUDIO_ELEMENT_TERMINATE

[12:39:26][D][esp-idf:000]: I (130172) I2S: DMA queue destroyed

[12:39:26][D][esp-idf:000]: W (130175) AUDIO_ELEMENT: [pcm_reader] Element has not create when AUDIO_ELEMENT_TERMINATE

[12:39:26][D][i2s_audio:045]: Release I2S Lock Mode: 1 [12:39:26][D][esp_adf_pipeline:283]: State changed from STOPPING to UNINITIALIZED [12:39:27][D][voice_assistant:523]: Event Type: 4 [12:39:27][D][voice_assistant:551]: Speech recognised as: " What is the weather?" [12:39:27][D][voice_assistant:523]: Event Type: 5 [12:39:27][D][voice_assistant:556]: Intent started [12:39:28][D][voice_assistant:523]: Event Type: 6 [12:39:28][D][voice_assistant:523]: Event Type: 7

[12:39:28][D][light:036]: 'Status LED' Setting: [12:39:28][D][light:051]: Brightness: 100% [12:39:28][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[12:39:28][D][voice_assistant:523]: Event Type: 8 [12:39:28][D][voice_assistant:599]: Response URL: "https:///api/tts_proxy/6e2e4e54058884224282e1e126e3541d73981250_en-gb_add2e9951e_tts.piper.mp3" [12:39:28][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE [12:39:28][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE [12:39:28][D][media_player:059]: 'idf_player-mini' - Setting [12:39:28][D][media_player:066]: Media URL: https:///api/tts_proxy/6e2e4e54058884224282e1e126e3541d73981250_en-gb_add2e9951e_tts.piper.mp3 [12:39:28][D][adf_media_player:030]: Got control call in state 2 [12:39:28][D][esp_adf_pipeline:079]: Called 'stop' while in STOPPING state. [12:39:28][D][voice_assistant:523]: Event Type: 2 [12:39:28][D][voice_assistant:613]: Assist Pipeline ended [12:40:42][D][sensor:094]: 'WiFi Signal Sensor': Sending state -62.00000 dBm with 0 decimals of accuracy `

gnumpi commented 2 months ago

Thanks for reporting! Could you share your complete config file? If I interpret your log file correctly you are trying to play a wav file first, which results in an error from which the media player isn't able to recover and thats why the mp3 file later isn't playing either... Actually I never tried playing a wav file, thanks for pointing at it. Probably I need to switch the decoder component in that case. But anyhow the media player should reset to normal after an error occurs which it doesn't seem to do. I will look into it. Another thing, please wait until the booting has finished before starting micro_wake word or voice assistant. And you could try to also import the modified voice_assistant component in the same repository. It fixes and issue in the order of starting the VA pipeline when the microphone is not ready yet.

gnumpi commented 2 months ago

Could you try using a mp3 file for announcing that the wake word got detected?

steku commented 2 months ago

I've remove the wav file completely and am now just trying to play re response

INFO ESPHome 2024.3.1 INFO Reading configuration /config/esphome/kitchen-va.yaml... INFO Updating @.*** WARNING GPIO3 is a strapping PIN and should only be used for I/O with care. Attaching external pullup/down resistors to strapping pins can cause unexpected failures. See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins INFO Starting log output from 192.168.2.41 using esphome API INFO Successfully connected to kitchen-va @ 192.168.2.41 in 0.024s INFO Successful handshake with kitchen-va @ 192.168.2.41 in 0.831s [15:31:38][I][app:102]: ESPHome version 2024.3.1 compiled on Mar 28 2024, 15:21:02

[15:31:38][C][wifi:408]: Local MAC: 34:B7:DA:52:FF:E8 [15:31:38][C][wifi:413]: SSID: [redacted] [15:31:38][C][wifi:416]: IP Address: 192.168.2.41 [15:31:38][C][wifi:420]: BSSID: [redacted]

[15:31:38][C][wifi:423]: Signal strength: -44 dB ▂▄▆█ [15:31:38][C][wifi:427]: Channel: 2 [15:31:38][C][wifi:428]: Subnet: 255.255.255.0 [15:31:38][C][wifi:429]: Gateway: 192.168.2.2 [15:31:38][C][wifi:430]: DNS1: 0.0.0.0 [15:31:38][C][wifi:431]: DNS2: 0.0.0.0

[15:31:38][C][logger:167]: Level: DEBUG [15:31:38][C][logger:169]: Log Baud Rate: 115200 [15:31:38][C][logger:170]: Hardware UART: USB_SERIAL_JTAG [15:31:38][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip: [15:31:38][C][esp32_rmt_led_strip:176]: Pin: 21 [15:31:38][C][esp32_rmt_led_strip:177]: Channel: 0 [15:31:38][C][esp32_rmt_led_strip:202]: RGB Order: GRB [15:31:38][C][esp32_rmt_led_strip:203]: Max refresh rate: 0 [15:31:38][C][esp32_rmt_led_strip:204]: Number of LEDs: 1

[15:31:38][C][light:105]: Default Transition Length: 0.0s [15:31:38][C][light:106]: Gamma Correct: 2.80 [15:31:38][C][template.switch:068]: Template Switch 'Enable Voice Assistant'

[15:31:38][C][template.switch:091]: Restore Mode: restore defaults to ON [15:31:38][C][template.switch:057]: Optimistic: YES

[15:31:38][C][psram:021]: Available: YES [15:31:38][C][psram:024]: Size: 2047 KB [15:31:38][C][restart:068]: Restart Switch 'Restart'

[15:31:38][C][restart:091]: Restore Mode: always OFF

[15:31:38][C][mdns:116]: Hostname: kitchen-va [15:31:38][C][ota:096]: Over-The-Air Updates: [15:31:38][C][ota:097]: Address: 192.168.2.41:3232 [15:31:38][C][ota:100]: Using Password. [15:31:38][C][ota:103]: OTA version: 2. [15:31:38][W][ota:107]: Last Boot was an unhandled reset, will proceed to safe mode in 5 restarts [15:31:38][C][api:139]: API Server: [15:31:38][C][api:140]: Address: 192.168.2.41:6053 [15:31:38][C][api:142]: Using noise encryption: YES [15:31:38][C][wifi_signal.sensor:009]: WiFi Signal 'WiFi Signal Sensor' [15:31:38][C][wifi_signal.sensor:009]: Device Class: 'signal_strength' [15:31:38][C][wifi_signal.sensor:009]: State Class: 'measurement' [15:31:38][C][wifi_signal.sensor:009]: Unit of Measurement: 'dBm' [15:31:38][C][wifi_signal.sensor:009]: Accuracy Decimals: 0

[15:31:39][C][micro_wake_word:058]: Wake Word: okay nabu [15:31:39][C][micro_wake_word:059]: Probability cutoff: 0.500 [15:31:39][C][micro_wake_word:060]: Sliding window size: 10

[15:31:39][C][adf_media_player:018]: Number of ASPComponents: 2 [15:31:42][D][micro_wake_word:362]: Wake word sliding average probability is 0.581 and most recent probability is 1.000 [15:31:42][D][micro_wake_word:128]: Wake Word Detected [15:31:42][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE [15:31:42][D][micro_wake_word:134]: Stopping Microphone [15:31:42][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [15:31:42][D][esp_adf.microphone:060]: Mic state: 3 [15:31:42][D][esp_adf.microphone:061]: Pipeline State 4: [15:31:42][D][esp_adf.microphone:088]: Mic state: 3 [15:31:42][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [15:31:42][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [15:31:42][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 3 [15:31:42][D][esp-idf:000]: W (33199) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[15:31:42][D][esp-idf:000]: W (33201) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[15:31:42][D][esp-idf:000]: W (33204) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[15:31:42][D][esp-idf:000]: W (33206) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[15:31:42][I][esp_adf_pipeline:200]: [ i2s_in ] status: 14 [15:31:44][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [15:31:44][D][voice_assistant:437]: Client started, streaming microphone [15:31:44][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE [15:31:44][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE [15:31:44][D][voice_assistant:155]: Starting Microphone [15:31:44][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [15:31:44][D][i2s_audio:030]: Set I2S Lock Mode: 1 [15:31:44][D][esp-idf:000]: I (35421) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=4

[15:31:44][D][esp_adf_pipeline:339]: pipeline tag 0, i2s_in [15:31:44][D][esp_adf_pipeline:339]: pipeline tag 1, pcm_reader [15:31:44][D][esp-idf:000]: I (35428) AUDIO_PIPELINE: link el->rb, el:0x3de20008, tag:i2s_in, rb:0x3de20294

[15:31:44][D][esp_adf_pipeline:351]: Setting up event listener. [15:31:44][D][esp_adf_pipeline:283]: State changed from UNINITIALIZED to PREPARING [15:31:44][D][esp_adf.microphone:060]: Mic state: 0 [15:31:44][D][esp_adf.microphone:061]: Pipeline State 1: [15:31:44][D][esp_adf.microphone:088]: Mic state: 0 [15:31:44][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE [15:31:44][D][esp_adf.microphone:060]: Mic state: 1 [15:31:44][D][esp_adf.microphone:061]: Pipeline State 2: [15:31:44][D][esp_adf.microphone:088]: Mic state: 1 [15:31:44][D][esp-idf:000]: I (35453) AUDIO_ELEMENT: [i2s_in-0x3de20008] Element task created

[15:31:44][D][esp-idf:000]: I (35457) AUDIO_ELEMENT: [pcm_reader-0x3de2014c] Element task created

[15:31:44][D][esp-idf:000]: I (35461) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:2147307 Bytes, Inter:184584 Bytes, Dram:184584 Bytes

[15:31:44][D][esp-idf:000]: I (35464) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[15:31:44][D][esp-idf:000]: I (35467) AUDIO_PIPELINE: Pipeline started

[15:31:44][D][voice_assistant:523]: Event Type: 1 [15:31:44][D][voice_assistant:526]: Assist Pipeline running [15:31:44][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [15:31:44][D][esp_adf_pipeline:125]: Check element [i2s_in] status, 3 [15:31:44][D][esp_adf_pipeline:125]: Check element [pcm_reader] status, 3 [15:31:44][D][esp_adf_pipeline:283]: State changed from STARTING to RUNNING [15:31:44][D][esp_adf.microphone:060]: Mic state: 1 [15:31:44][D][esp_adf.microphone:061]: Pipeline State 3: [15:31:44][D][esp_adf.microphone:088]: Mic state: 2 [15:31:44][D][voice_assistant:523]: Event Type: 3 [15:31:44][D][voice_assistant:537]: STT started [15:31:44][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE [15:31:44][I][esp_adf_pipeline:200]: [ i2s_in ] status: 12 [15:31:44][D][light:036]: 'Status LED' Setting: [15:31:44][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[15:31:44][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [15:31:46][D][voice_assistant:523]: Event Type: 11 [15:31:46][D][voice_assistant:677]: Starting STT by VAD [15:31:48][D][voice_assistant:523]: Event Type: 12 [15:31:48][D][voice_assistant:681]: STT by VAD end [15:31:48][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE [15:31:48][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE [15:31:48][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [15:31:48][D][esp_adf.microphone:060]: Mic state: 3 [15:31:48][D][esp_adf.microphone:061]: Pipeline State 4: [15:31:48][D][esp_adf.microphone:088]: Mic state: 3 [15:31:48][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [15:31:48][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [15:31:48][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [15:31:48][D][esp_adf_pipeline:138]: Check element for stop [pcm_reader] status, 5 [15:31:48][D][esp_adf_pipeline:381]: Called deinit_all [15:31:48][D][esp-idf:000]: I (38859) AUDIO_PIPELINE: audio_pipeline_unlinked

[15:31:48][D][esp-idf:000]: W (38862) AUDIO_ELEMENT: [i2s_in] Element has not create when AUDIO_ELEMENT_TERMINATE

[15:31:48][D][esp-idf:000]: I (38865) I2S: DMA queue destroyed

[15:31:48][D][esp-idf:000]: W (38868) AUDIO_ELEMENT: [pcm_reader] Element has not create when AUDIO_ELEMENT_TERMINATE

[15:31:48][D][i2s_audio:045]: Release I2S Lock Mode: 1 [15:31:48][D][esp_adf_pipeline:283]: State changed from STOPPING to UNINITIALIZED [15:31:48][D][esp_adf.microphone:060]: Mic state: 3 [15:31:48][D][esp_adf.microphone:061]: Pipeline State 0: [15:31:48][D][esp_adf.microphone:088]: Mic state: 0 [15:31:48][D][light:036]: 'Status LED' Setting: [15:31:48][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[15:31:48][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE [15:31:48][D][voice_assistant:523]: Event Type: 4 [15:31:48][D][voice_assistant:551]: Speech recognised as: " What is the weather?" [15:31:48][D][voice_assistant:523]: Event Type: 5 [15:31:48][D][voice_assistant:556]: Intent started [15:31:49][D][voice_assistant:523]: Event Type: 6 [15:31:49][D][voice_assistant:523]: Event Type: 7

[15:31:49][D][light:036]: 'Status LED' Setting: [15:31:49][D][light:051]: Brightness: 100% [15:31:49][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[15:31:50][D][voice_assistant:523]: Event Type: 8 [15:31:50][D][voice_assistant:599]: Response URL: "https://

/api/tts_proxy/af42191a9a59e7b1585205c706f22365b925f3ba_en-gb_add2e9951e_tts.piper.mp3" [15:31:50][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE [15:31:50][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE [15:31:50][D][media_player:059]: 'idf_player-mini' - Setting [15:31:50][D][media_player:066]: Media URL: https:// /api/tts_proxy/af42191a9a59e7b1585205c706f22365b925f3ba_en-gb_add2e9951e_tts.piper.mp3 [15:31:50][D][adf_media_player:030]: Got control call in state 1 [15:31:50][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [15:31:50][D][esp-idf:000]: I (40758) MP3_DECODER: MP3 init [15:31:50][D][i2s_audio:030]: Set I2S Lock Mode: 2 [15:31:50][D][esp-idf:000]: I (40764) I2S: DMA Malloc info, datalen=blocksize=3840, dma_buf_count=2 [15:31:50][D][esp_adf_pipeline:339]: pipeline tag 0, http [15:31:50][D][esp_adf_pipeline:339]: pipeline tag 1, decoder [15:31:50][D][esp_adf_pipeline:339]: pipeline tag 2, i2s_out [15:31:50][D][esp-idf:000]: I (40774) AUDIO_PIPELINE: link el->rb, el:0x3de1ff7c, tag:http, rb:0x3de204d4 [15:31:50][D][esp-idf:000]: I (40776) AUDIO_PIPELINE: link el->rb, el:0x3de20170, tag:decoder, rb:0x3de20914 [15:31:50][D][esp_adf_pipeline:351]: Setting up event listener. ERROR Fatal error: protocol.data_received() call failed. protocol: transport: <_SelectorSocketTransport fd=6 read=polling write=> Traceback (most recent call last): File "/usr/lib/python3.11/asyncio/selector_events.py", line 1009, in _read_ready__data_received self._protocol.data_received(data) File "aioesphomeapi/_frame_helper/noise.py", line 136, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame File "/usr/local/lib/python3.11/dist-packages/noise/state.py", line 74, in decrypt_with_ad plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/noise/backends/default/ciphers.py", line 13, in decrypt return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "src/chacha20poly1305_reuseable/__init__.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt File "src/chacha20poly1305_reuseable/__init__.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt File "src/chacha20poly1305_reuseable/__init__.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len File "src/chacha20poly1305_reuseable/__init__.py", line 273, in chacha20poly1305_reuseable._decrypt_data cryptography.exceptions.InvalidTag WARNING kitchen-va @ 192.168.2.41: Connection error occurred: kitchen-va @ 192.168.2.41: Invalid encryption key: received_name=kitchen-va INFO Processing unexpected disconnect from ESPHome API for kitchen-va @ 192.168.2.41 WARNING Disconnected from API INFO Successfully connected to kitchen-va @ 192.168.2.41 in 0.044s INFO Successful handshake with kitchen-va @ 192.168.2.41 in 0.333s [15:32:00][I][esp_adf_pipeline:171]: Pipeline preparation timeout! [15:32:00][D][esp_adf_pipeline:283]: State changed from PREPARING to STOPPING [15:32:00][I][adf_media_player:135]: got new pipeline state: 4 [15:32:00][D][esp-idf:000]: W (50787) AUDIO_PIPELINE: Without stop, st:1 On Thu, Mar 28, 2024 at 3:15 PM Mischa Siekmann ***@***.***> wrote: > Could you try using a mp3 file for announcing that the wake word got > detected? > > — > Reply to this email directly, view it on GitHub > , > or unsubscribe > > . > You are receiving this because you authored the thread.Message ID: > ***@***.***> > -- ~Steve
gnumpi commented 2 months ago

somehow it can't access your HA to receive the tts response, never seen this before. Sorry I don't have an explanation for it right now. It says invalid encryption key though, is everything else related to HA working fine?

steku commented 2 months ago

That was a one-off issue that hasn't come back. I am still not getting audio. Here is a log without the other issue

[19:40:06][D][micro_wake_word:362]: Wake word sliding average probability is 0.522 and most recent probability is 1.000 [19:40:06][D][micro_wake_word:128]: Wake Word Detected [19:40:06][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE [19:40:06][D][micro_wake_word:134]: Stopping Microphone [19:40:06][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [19:40:06][D][esp_adf.microphone:060]: Mic state: 3 [19:40:06][D][esp_adf.microphone:061]: Pipeline State 4: [19:40:06][D][esp_adf.microphone:088]: Mic state: 3 [19:40:06][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [19:40:06][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [19:40:06][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 3 [19:40:06][D][esp-idf:000]: W (11394491) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[19:40:06][D][esp-idf:000]: W (11394493) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[19:40:06][D][esp-idf:000]: W (11394496) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[19:40:06][D][esp-idf:000]: W (11394499) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[19:40:06][I][esp_adf_pipeline:200]: [ i2s_in ] status: 14 [19:40:09][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 5 [19:40:09][D][voice_assistant:118]: microphone not running [19:40:09][D][voice_assistant:437]: Client started, streaming microphone [19:40:09][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE [19:40:09][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE [19:40:09][D][voice_assistant:155]: Starting Microphone [19:40:09][D][esp_adf_pipeline:044]: Starting request, current state UNINITIALIZED [19:40:09][D][i2s_audio:030]: Set I2S Lock Mode: 1 [19:40:09][D][esp-idf:000]: I (11397936) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=4

[19:40:09][D][esp_adf_pipeline:339]: pipeline tag 0, i2s_in [19:40:09][D][esp_adf_pipeline:339]: pipeline tag 1, pcm_reader [19:40:09][D][esp-idf:000]: I (11397944) AUDIO_PIPELINE: link el->rb, el:0x3de23c9c, tag:i2s_in, rb:0x3de23f28

[19:40:09][D][esp_adf_pipeline:351]: Setting up event listener. [19:40:09][D][esp_adf_pipeline:283]: State changed from UNINITIALIZED to PREPARING [19:40:09][D][esp_adf.microphone:060]: Mic state: 0 [19:40:09][D][esp_adf.microphone:061]: Pipeline State 1: [19:40:09][D][esp_adf.microphone:088]: Mic state: 0 [19:40:09][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE [19:40:09][D][i2s_audio:030]: Set I2S Lock Mode: 1 [19:40:09][D][esp_adf_pipeline:283]: State changed from PREPARING to STARTING [19:40:09][D][esp_adf.microphone:060]: Mic state: 1 [19:40:09][D][esp_adf.microphone:061]: Pipeline State 2: [19:40:09][D][esp_adf.microphone:088]: Mic state: 1 [19:40:09][D][esp-idf:000]: I (11397972) AUDIO_ELEMENT: [i2s_in-0x3de23c9c] Element task created

[19:40:09][D][esp-idf:000]: I (11397975) AUDIO_ELEMENT: [pcm_reader-0x3de23de0] Element task created

[19:40:09][D][esp-idf:000]: I (11397978) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:2119039 Bytes, Inter:171892 Bytes, Dram:171892 Bytes

[19:40:09][D][esp-idf:000]: I (11397981) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[19:40:09][D][esp-idf:000]: I (11397984) AUDIO_PIPELINE: Pipeline started

[19:40:09][D][voice_assistant:523]: Event Type: 1 [19:40:09][D][voice_assistant:526]: Assist Pipeline running [19:40:09][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [19:40:09][D][esp_adf_pipeline:125]: Check element [i2s_in] status, 3 [19:40:09][D][esp_adf_pipeline:125]: Check element [pcm_reader] status, 3 [19:40:09][D][esp_adf_pipeline:283]: State changed from STARTING to RUNNING [19:40:09][D][esp_adf.microphone:060]: Mic state: 1 [19:40:09][D][esp_adf.microphone:061]: Pipeline State 3: [19:40:09][D][esp_adf.microphone:088]: Mic state: 2 [19:40:09][D][voice_assistant:523]: Event Type: 3 [19:40:09][D][voice_assistant:537]: STT started [19:40:09][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE [19:40:09][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 12 [19:40:09][D][light:036]: 'Status LED' Setting: [19:40:09][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[19:40:09][I][esp_adf_pipeline:200]: [ i2s_in ] status: 12 [19:40:16][I][esp_adf_pipeline:171]: Pipeline preparation timeout! [19:40:16][D][esp_adf_pipeline:283]: State changed from PREPARING to STOPPING [19:40:16][I][adf_media_player:135]: got new pipeline state: 4 [19:40:16][D][esp-idf:000]: W (11404582) AUDIO_PIPELINE: Without stop, st:1

[19:40:26][D][voice_assistant:523]: Event Type: 11 [19:40:26][D][voice_assistant:677]: Starting STT by VAD [19:40:26][D][voice_assistant:523]: Event Type: 12 [19:40:26][D][voice_assistant:681]: STT by VAD end [19:40:26][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE [19:40:26][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE [19:40:26][D][esp_adf_pipeline:283]: State changed from RUNNING to STOPPING [19:40:26][D][esp_adf.microphone:060]: Mic state: 3 [19:40:26][D][esp_adf.microphone:061]: Pipeline State 4: [19:40:26][D][esp_adf.microphone:088]: Mic state: 3 [19:40:26][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE [19:40:26][I][esp_adf_pipeline:200]: [ pcm_reader ] status: 14 [19:40:26][D][esp_adf_pipeline:138]: Check element for stop [i2s_in] status, 3 [19:40:26][D][esp_adf_pipeline:138]: Check element for stop [pcm_reader] status, 5 [19:40:26][D][esp_adf_pipeline:381]: Called deinit_all [19:40:26][D][esp-idf:000]: I (11414328) AUDIO_PIPELINE: audio_pipeline_unlinked

[19:40:26][D][esp-idf:000]: W (11414330) AUDIO_ELEMENT: [i2s_in] Element has not create when AUDIO_ELEMENT_TERMINATE

[19:40:26][D][esp-idf:000]: I (11414334) I2S: DMA queue destroyed

[19:40:26][D][esp-idf:000]: W (11414336) AUDIO_ELEMENT: [pcm_reader] Element has not create when AUDIO_ELEMENT_TERMINATE

[19:40:26][D][i2s_audio:045]: Release I2S Lock Mode: 1 [19:40:26][D][esp_adf_pipeline:283]: State changed from STOPPING to UNINITIALIZED [19:40:26][D][esp_adf.microphone:060]: Mic state: 3 [19:40:26][D][esp_adf.microphone:061]: Pipeline State 0: [19:40:26][D][esp_adf.microphone:088]: Mic state: 0 [19:40:26][D][light:036]: 'Status LED' Setting: [19:40:26][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[19:40:26][D][voice_assistant:523]: Event Type: 4 [19:40:26][D][voice_assistant:551]: Speech recognised as: " What is the temperature?" [19:40:26][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE [19:40:26][D][voice_assistant:523]: Event Type: 5 [19:40:26][D][voice_assistant:556]: Intent started [19:40:26][D][voice_assistant:523]: Event Type: 6 [19:40:26][D][voice_assistant:523]: Event Type: 7

[19:40:26][D][light:036]: 'Status LED' Setting: [19:40:26][D][light:051]: Brightness: 100% [19:40:26][D][light:059]: Red: 0%, Green: 0%, Blue: 100%

[19:40:26][D][voice_assistant:523]: Event Type: 8 [19:40:26][D][voice_assistant:599]: Response URL: "https://

k/api/tts_proxy/557d14b3a74d89cae750c0b48ba22c88ad8f62dc_en-gb_add2e9951e_tts.piper.mp3" [19:40:26][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE [19:40:26][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE [19:40:26][D][media_player:059]: 'idf_player-mini' - Setting [19:40:26][D][media_player:066]: Media URL: https:// k/api/tts_proxy/557d14b3a74d89cae750c0b48ba22c88ad8f62dc_en-gb_add2e9951e_tts.piper.mp3 [19:40:26][D][adf_media_player:030]: Got control call in state 2 [19:40:26][D][esp_adf_pipeline:079]: Called 'stop' while in STOPPING state. [19:40:26][D][voice_assistant:523]: Event Type: 2 [19:40:26][D][voice_assistant:613]: Assist Pipeline ended On Thu, Mar 28, 2024 at 4:52 PM Mischa Siekmann ***@***.***> wrote: > somehow it can't access your HA to receive the tts response, never seen > this before. Sorry I don't have an explanation for it right now. > > — > Reply to this email directly, view it on GitHub > , > or unsubscribe > > . > You are receiving this because you authored the thread.Message ID: > ***@***.***> > -- ~Steve
steku commented 1 month ago

It's always DNS, I had a static IP in the config and didn't secify DNS.