Media player can't play mp3s after updating to 2024.5.0

The problem

Media player doesn't play mp3s after update to 2024.5.0

Which version of ESPHome has the issue?

2024.5.0

What type of installation are you using?

Home Assistant Add-on

Which version of Home Assistant has the issue?

Core 2024.5.3, Supervisor 2024.05.1 ,Operating System 12.3, Frontend 20240501.1

What platform are you using?

ESP32

Board

ESP32-C3-Zero like here: https://www.aliexpress.com/item/1005006524672028.html

Component causing the issue

esp-idf framework / adf_pipeline media player

Example YAML snippet

yaml Anything using media_player.play_media, like this button, fails to play the .mp3

button:
  - platform: template
    name: Test sound
    id: test_sound
    icon: "mdi:speaker-play"
    disabled_by_default: true           # Shows entity in HA, but disabled by default
    on_press:
      - media_player.play_media: "http://192.168.5.5:8123/local/Voice_assistant/im_listening.mp3"

Any tts.speak from HA fails to play too:

  - service: tts.speak
    metadata: {}
    data:
      cache: true
      media_player_entity_id: media_player.jarvis_media_player
      message: "Test TTS message"
      language: en_US
      options:
        voice: en_US-ryan-medium
    target:
      entity_id: tts.piper

Whole YAML for my assistant/smart speaker:

substitutions:                          # Substitutions
  friendly_name: Jarvis
  host_name: jarvis
  static_ip: 192.168.5.8
  log_level: DEBUG                      # NONE, ERROR, WARN, INFO, DEBUG (Default), VERBOSE, VERY_VERBOSE
#  log_level: VERBOSE                      # NONE, ERROR, WARN, INFO, DEBUG (Default), VERBOSE, VERY_VERBOSE

  start_volume: 45%
  # Voice assistant
  mic_volume_multiplier: "10.0"         # MIC volume multiplier
  # Mic_volume_multiplier: "5.0"
  noise_suppression_level: "3"
  # Mic
  i2s_lrclk_in_pin: GPIO7               # Mic WS
  i2s_bclk_in_pin: GPIO8                # Mic CK
  i2s_din_pin: GPIO9                    # Mic SD/DA
  auto_gain: 31dBFS                     # MIC gain
  # Speaker
  i2s_lrclk_out_pin: GPIO5              # Spk LRC
  i2s_bclk_out_pin: GPIO4               # Spk BCLK
  i2s_dout_pin: GPIO6                   # Spk DIN
  # Sounds
  connected_sound:         "http://192.168.5.5:8123/local/Voice_assistant/jarvis_connected.mp3"
  stop_listening_sound:    "http://192.168.5.5:8123/local/Voice_assistant/ill_stop_listening.mp3"
  start_listening_sound:   "http://192.168.5.5:8123/local/Voice_assistant/im_listening.mp3"
  ww_detected_sound:       "http://192.168.5.5:8123/local/Voice_assistant/pip.mp3"
  test_sound:              "http://192.168.5.5:8123/local/Voice_assistant/testing_1_2_3_testing.mp3"

esphome:
  name: $host_name
  friendly_name: $friendly_name
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    - light.turn_on:
        id: rgb_led
        red: 100%
        green: 100%
        blue: 0%
        brightness: 20%
        effect: fast pulse

esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  framework:
    type: esp-idf
    # version: recommended
    version: 4.4.6
    sdkconfig_options:
      CONFIG_ESP32_S3_BOX_BOARD: "y"

psram:
  mode: quad
  speed: 80MHz

external_components:                    # Import adf_pipeline & i2s_audio
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: main
    components: [ adf_pipeline, i2s_audio ]

# <<: !include common/logger.yaml
# <<: !include common/api_ota.yaml
# <<: !include common/wifi.yaml
<<: !include common/web_server.yaml
<<: !include common/binary_sensors.yaml
<<: !include common/sensors.yaml
<<: !include common/text_sensors.yaml
# <<: !include common/buttons.yaml
<<: !include common/sync_time.yaml

logger:                                 # Enable logging. https://esphome.io/components/logger.html
  level: $log_level                     # NONE, ERROR, WARN, INFO, DEBUG (Default), VERBOSE, VERY_VERBOSE
  baud_rate: 0                          # Stops logging over onboard UART

api:                                    # Home Assistant API
  encryption:
    key: !secret api_encryption_key
  on_client_connected:
    - media_player.volume_set: $start_volume

ota:                                    # OTA updates
  password: !secret ota_password
  safe_mode: True

wifi:                                   # WiFi connection details. Without domain:, defaults to .local. https://esphome.io/components/wifi.html
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: $static_ip
    gateway: 192.168.5.254
    subnet: 255.255.255.0
    dns1: 192.168.5.252
  domain: .home
  power_save_mode: LIGHT                # NONE, LIGHT, HIGH
  fast_connect: true

button:                                 # Useful buttons in HA
  - platform: restart
    id: "restart_device"
    name: "Restart"
    entity_category: "diagnostic"
  - platform: safe_mode
    id: "restart_device_safe_mode"
    name: "Restart (Safe Mode)"
    entity_category: "diagnostic"
    disabled_by_default: true           # Shows entity in HA, but disabled by default
  - platform: template
    name: Test sound
    id: test_sound
    icon: "mdi:speaker-play"
    disabled_by_default: true           # Shows entity in HA, but disabled by default
    on_press:
      - media_player.play_media: $test_sound
      # - media_player.stop

i2s_audio:                              # I2s audio pins
  - id: i2s_in
    i2s_lrclk_pin: $i2s_lrclk_in_pin    # Mic WS
    i2s_bclk_pin: $i2s_bclk_in_pin      # Mic CK
  - id: i2s_out
    i2s_lrclk_pin: $i2s_lrclk_out_pin   # Spk LRC
    i2s_bclk_pin: $i2s_bclk_out_pin     # Spk BCLK

adf_pipeline:                           # ADF pieline pins
  - platform: i2s_audio
    type: audio_out
    id: adf_i2s_out
    i2s_audio_id: i2s_out
    i2s_dout_pin: GPIO6                 # Spk DIN

  - platform: i2s_audio
    type: audio_in
    id: adf_i2s_in
    i2s_audio_id: i2s_in
    i2s_din_pin: $i2s_din_pin           # Mic SD/DA
    channel: left
    sample_rate: 16000
    bits_per_sample: 16bit
    pdm: false

media_player:                           # Media player
  - platform: adf_pipeline
    id: jarvis_media_player
    name: Media Player
    internal: false
    icon: mdi:speaker-wireless
    pipeline:
      - self
      - adf_i2s_out

microphone:                             # ADF Mic
  - platform: adf_pipeline
    id: mic
    pipeline:
      - adf_i2s_in
      - self

voice_assistant:                        # Voice Assistant
  microphone: mic
  media_player: jarvis_media_player
  use_wake_word: false
  noise_suppression_level: $noise_suppression_level
  auto_gain: $auto_gain
  volume_multiplier: $mic_volume_multiplier
  on_client_connected:
    - media_player.play_media: $connected_sound
    # - media_player.stop
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          # - voice_assistant.start:
          - light.turn_on:
              id: rgb_led
              red: 0%
              green: 100%
              blue: 0%
              brightness: 20%
              effect: none
          - micro_wake_word.start
        else:
          - light.turn_on:
              id: rgb_led
              red: 100%
              green: 100%
              blue: 0%
              brightness: 20%
              effect: none
    - logger.log:
        level: ERROR
        format: "****VA on_client_connected"
  on_client_disconnected:
    # - voice_assistant.stop
    - micro_wake_word.stop
    - light.turn_on:
        id: rgb_led
        red: 100%
        green: 100%
        blue: 0%
        brightness: 20%
        effect: fast pulse
    - logger.log:
        level: ERROR
        format: "****VA on_client_disconnected"
  on_start:
    - logger.log:
        level: ERROR
        format: "****VA on_start"
  # on_wake_word_detected:                # Will never trigger when using micro_wake_word
  #   - light.turn_on:
  #       id: rgb_led
  #       red: 0%
  #       green: 0%
  #       blue: 100%
  #       brightness: 40%
  #       effect: none
  #   - media_player.play_media: http://192.168.5.5:8123/local/Voice_assistant/listening.mp3
  #   - media_player.stop
  #   - logger.log:
  #       level: ERROR
  #       format: "****VA on_wake_word_detected"
  on_listening:
    - logger.log:
        level: ERROR
        format: "****VA on_listening"
  on_stt_vad_start:
    - logger.log:
        level: ERROR
        format: "****VA on_stt_vad_start"
  on_stt_vad_end:
    - logger.log:
        level: ERROR
        format: "****VA on_stt_vad_end"
  on_stt_end:
    - logger.log:
        level: ERROR
        format: "****VA on_stt_end"
  on_intent_start:
    - logger.log:
        level: ERROR
        format: "****VA on_intent_start"
  on_intent_end:
    - logger.log:
        level: ERROR
        format: "****VA on_intent_end"
  on_tts_start:
    - logger.log:
        level: ERROR
        format: "****VA on_tts_start"
  on_tts_end:
    - logger.log:
        level: ERROR
        format: "****VA on_tts_end"
  on_end:
    - logger.log:
        level: ERROR
        format: "****VA on_end"
    - light.turn_on:
        id: rgb_led
        red: 0%
        green: 100%
        blue: 0%
        brightness: 20%
        effect: none
    - micro_wake_word.start
  # on_idle:
  #   - logger.log:
  #       level: ERROR
  #       format: "****VA on_idle"
  on_error:
    - logger.log:
        level: ERROR
        format: "****VA on_error"
    - light.turn_on:
        id: rgb_led
        red: 100%
        green: 0%
        blue: 0%
        brightness: 25%
        effect: none

switch:                                 # Switches
  - platform: factory_reset
    name: Restart with Factory Default Settings
    disabled_by_default: true
    entity_category: diagnostic
  - platform: template
    name: Enable Assist
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    icon: mdi:microphone-message
    on_turn_on:
      # - logger.log:
      #     level: ERROR
      #     format: "****VA turned on"
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            # - voice_assistant.start
            - micro_wake_word.start
      - light.turn_on:
          id: rgb_led
          red: 0%
          green: 100%
          blue: 0%
          brightness: 20%
          effect: none
      # - media_player.volume_set: 45%
      # - media_player.play_media: $start_listening_sound
      # - media_player.play_media: http://192.168.5.5:8123/local/Voice_assistant/ready.mp3
      # - media_player.play_media: http://192.168.5.5:8123/local/Voice_assistant/listening.mp3
      # - media_player.stop
    on_turn_off:
      # - logger.log:
      #     level: ERROR
      #     format: "****VA turned off"
      - light.turn_on:
          id: rgb_led
          red: 100%
          green: 100%
          blue: 0%
          brightness: 25%
          effect: none
      # - media_player.play_media: $stop_listening_sound
      # - media_player.stop
      - voice_assistant.stop
      - micro_wake_word.stop

micro_wake_word:                        # On-device wake word
  sliding_window_average_size: 5
  model: hey_jarvis
  on_wake_word_detected:
    - media_player.play_media: $ww_detected_sound
    # - media_player.stop
    - light.turn_on:
        id: rgb_led
        red: 0%
        green: 0%
        blue: 100%
        brightness: 50%
        effect: none
    # - logger.log:
    #     level: ERROR
    #     format: "****MWW Wake word detected"
    - voice_assistant.start:
        wake_word: !lambda return wake_word;

light:                                  # RGB status LED
  - platform: esp32_rmt_led_strip
    rgb_order: RGB
    pin: GPIO21
    num_leds: 1
    rmt_channel: 0
    chipset: ws2812
    name: "RGB LED"
    id: "rgb_led"
    internal: true
    default_transition_length: 0s
    disabled_by_default: true           # Shows entity in HA, but disabled by default
    effects:
      - pulse:
          min_brightness: 10%
          max_brightness: 40%
      - pulse:
          name: "Fast Pulse"
          transition_length: 0.25s
          update_interval: 0.25s
          min_brightness: 10%
          max_brightness: 40%
      - pulse:
          name: "Slow Pulse"
          # transition_length: 1s       # defaults to 1s
          update_interval: 2s
          min_brightness: 10%
          max_brightness: 40%

Anything in the logs that might be useful for us?

Log as I press the button to play a mp3:
[23:01:32][D][button:010]: 'Test sound' Pressed.
[23:01:32][D][media_player:061]: 'Media Player' - Setting
[23:01:32][D][media_player:068]:   Media URL: http://192.168.5.5:8123/local/Voice_assistant/im_listening.mp3
[23:01:32][D][adf_media_player:030]: Got control call in state 1
[23:01:32][D][esp_adf_pipeline:050]: Starting request, current state UNINITIALIZED
[23:01:32][D][esp-idf:000]: I (132522) MP3_DECODER: MP3 init

[23:01:32][D][esp-idf:000]: I (132528) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=4

[23:01:32][D][i2s_audio:072]: Installing driver : yes
[23:01:32][D][esp_adf_pipeline:358]: pipeline tag 0, http
[23:01:32][D][esp_adf_pipeline:358]: pipeline tag 1, decoder
[23:01:32][D][esp_adf_pipeline:358]: pipeline tag 2, i2s_out
[23:01:32][D][esp-idf:000]: I (132540) AUDIO_PIPELINE: link el->rb, el:0x3de20d8c, tag:http, rb:0x3de21290

[23:01:32][D][esp-idf:000]: I (132543) AUDIO_PIPELINE: link el->rb, el:0x3de20f4c, tag:decoder, rb:0x3de222d0

[23:01:32][D][esp_adf_pipeline:370]: Setting up event listener.
[23:01:32][D][esp_adf_pipeline:302]: State changed from UNINITIALIZED to PREPARING
[23:01:32][I][adf_media_player:135]: got new pipeline state: 1
[23:01:32][D][adf_i2s_out:127]: Set final i2s settings: 22050
[23:01:32][D][esp-idf:000]: I (132575) AUDIO_THREAD: The http task allocate stack on external memory

[23:01:32][D][esp-idf:000]: I (132577) AUDIO_ELEMENT: [http-0x3de20d8c] Element task created

[23:01:32][D][esp-idf:000]: I (132580) AUDIO_THREAD: The decoder task allocate stack on external memory

[23:01:32][D][esp-idf:000]: I (132582) AUDIO_ELEMENT: [decoder-0x3de20f4c] Element task created

[23:01:32][D][esp-idf:000][http]: I (132585) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[23:01:32][D][esp-idf:000][decoder]: I (132588) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[23:01:32][D][esp_audio_sources:097]: Streamer status: 2
[23:01:32][D][esp_audio_sources:098]: decoder status: 2
[23:01:32][D][esp-idf:000][http]: I (132618) HTTP_CLIENT: Body received in fetch header state, 0x3fcc3819, 1703

[23:01:32][D][esp-idf:000][http]: I (132624) HTTP_STREAM: total_bytes=9541

[23:01:32][I][HTTPStreamReader:129]: [ * ] Receive music info from mp3 decoder, sample_rates=22050, bits=16, ch=1
[23:01:32][D][adf_i2s_out:127]: Set final i2s settings: 22050
[23:01:32][D][esp-idf:000][decoder]: W (132706) AUDIO_ELEMENT: OUT-[decoder] AEL_IO_ABORT

[23:01:32][D][esp-idf:000][decoder]: W (132711) MP3_DECODER: output aborted -3

[23:01:32][D][esp-idf:000][decoder]: I (132717) MP3_DECODER: Closed

[23:01:32][D][esp_adf_pipeline:302]: State changed from PREPARING to STARTING
[23:01:32][I][adf_media_player:135]: got new pipeline state: 2
[23:01:32][D][adf_i2s_out:127]: Set final i2s settings: 22050
[23:01:32][D][esp-idf:000]: I (132743) AUDIO_ELEMENT: [i2s_out-0x3de21118] Element task created

[23:01:32][D][esp-idf:000]: I (132745) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:2103415 Bytes, Inter:163960 Bytes, Dram:163960 Bytes

[23:01:32][D][esp-idf:000][http]: I (132747) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[23:01:32][D][esp-idf:000][decoder]: I (132751) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[23:01:32][D][esp-idf:000][i2s_out]: I (132754) AUDIO_ELEMENT: [i2s_out] AEL_MSG_CMD_RESUME,state:1

[23:01:32][D][esp-idf:000]: I (132757) AUDIO_PIPELINE: Pipeline started

[23:01:32][I][esp_adf_pipeline:214]: [ decoder ] status: 14
[23:01:32][D][esp-idf:000][http]: I (132833) HTTP_CLIENT: Body received in fetch header state, 0x3fcbf4c1, 1703

[23:01:32][D][esp-idf:000][http]: I (132840) HTTP_STREAM: total_bytes=9541

[23:01:32][I][esp_adf_pipeline:214]: [ http ] status: 14
[23:01:32][I][esp_adf_pipeline:214]: [ i2s_out ] status: 12
[23:01:32][D][esp_adf_pipeline:131]: Check element [http] status, 3
[23:01:32][D][esp_adf_pipeline:131]: Check element [decoder] status, 3
[23:01:32][D][esp_adf_pipeline:131]: Check element [i2s_out] status, 3
[23:01:32][D][esp_adf_pipeline:302]: State changed from STARTING to RUNNING
[23:01:32][I][adf_media_player:135]: got new pipeline state: 3
[23:01:32][D][adf_i2s_out:127]: Set final i2s settings: 22050
[23:01:32][I][esp_adf_pipeline:214]: [ http ] status: 12
[23:01:32][D][esp-idf:000][http]: W (132976) HTTP_STREAM: No more data,errno:0, total_bytes:9541, rlen = 0

[23:01:32][D][esp-idf:000][http]: I (132981) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0

[23:01:32][I][esp_adf_pipeline:214]: [ decoder ] status: 12
[23:01:32][I][HTTPStreamReader:129]: [ * ] Receive music info from mp3 decoder, sample_rates=22050, bits=16, ch=1
[23:01:32][D][adf_i2s_out:127]: Set final i2s settings: 22050
[23:01:32][I][esp_adf_pipeline:214]: [ http ] status: 15
[23:01:32][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[23:01:32][I][adf_media_player:135]: got new pipeline state: 4
[23:01:33][D][esp-idf:000][decoder]: I (133371) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2

[23:01:33][D][esp-idf:000][decoder]: I (133761) MP3_DECODER: Closed

[23:01:33][D][esp-idf:000][i2s_out]: I (133876) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[23:01:33][D][esp_adf_pipeline:400]: Called deinit_all
[23:01:33][D][esp-idf:000]: I (134074) AUDIO_PIPELINE: audio_pipeline_unlinked

[23:01:33][D][esp-idf:000]: W (134077) AUDIO_ELEMENT: [http] Element has not create when AUDIO_ELEMENT_TERMINATE

[23:01:33][D][esp-idf:000]: W (134079) AUDIO_ELEMENT: [decoder] Element has not create when AUDIO_ELEMENT_TERMINATE

[23:01:33][D][esp-idf:000]: W (134081) AUDIO_ELEMENT: [i2s_out] Element has not create when AUDIO_ELEMENT_TERMINATE

[23:01:34][D][esp-idf:000]: I (134085) I2S: DMA queue destroyed

[23:01:34][D][esp_adf_pipeline:302]: State changed from STOPPING to UNINITIALIZED
[23:01:34][I][adf_media_player:135]: got new pipeline state: 0

Additional information

Rolling back the update to 2024.5.0 and re-uploading to the device made it play mp3s reliably again.

I found that if I stick with 2024.5.0 and change

esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  framework:
    type: esp-idf
    version: recommended

esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  framework:
    type: esp-idf
    version: 4.4.6

...then things play OK again

The behaviours changed slightly since I installed esphome 24.6.6. With version: recommended, it listens and responds OK first time, but on the second try it detects the wake word and starts listening, I ask something and then it gets stuck and needs a reset. I see this in its log though, which I don't think I did prior to 24.6.6:

[22:38:12][E][main:650]: ****MWW Wake word detected
[22:38:12][W][component:237]: Component micro_wake_word took a long time for an operation (63 ms).
[22:38:12][W][component:238]: Components should block for at most 30 ms.
[22:38:12][D][esp-idf:000]: I (32079) AUDIO_THREAD: The http task allocate stack on external memory

[22:38:12][D][esp-idf:000]: I (32082) AUDIO_ELEMENT: [http-0x3de17d34] Element task created

[22:38:12][D][esp-idf:000]: I (32085) AUDIO_THREAD: The decoder task allocate stack on external memory

[22:38:12][D][esp-idf:000]: I (32088) AUDIO_ELEMENT: [decoder-0x3de17eec] Element task created

[22:38:12][D][esp-idf:000][http]: I (32090) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[22:38:12][D][esp-idf:000][decoder]: I (32093) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[22:38:12][D][esp_audio_sources:097]: Streamer status: 2
[22:38:12][D][esp_audio_sources:098]: decoder status: 2
[22:38:13][D][esp-idf:000][http]: I (32159) HTTP_CLIENT: Body received in fetch header state, 0x3fcc211c, 1704

[22:38:13][D][esp-idf:000][http]: I (32163) HTTP_STREAM: total_bytes=2520

[22:38:13][D][esp-idf:000][http]: W (32179) HTTP_STREAM: No more data,errno:0, total_bytes:2520, rlen = 0

[22:38:13][D][esp-idf:000][http]: I (32182) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0

[22:38:13][D][esp-idf:000][decoder]: I (32186) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2

[22:38:13][I][HTTPStreamReader:129]: [ * ] Receive music info from mp3 decoder, sample_rates=32000, bits=16, ch=1
[22:38:13][D][adf_i2s_out:114]: update i2s clk settings: rate:32000 bits:16 ch:1
[22:38:13][D][adf_i2s_out:127]: Set final i2s settings: 32000
[22:38:13][D][esp-idf:000][decoder]: W (32200) AUDIO_ELEMENT: OUT-[decoder] AEL_IO_ABORT

[22:38:13][D][esp-idf:000][decoder]: W (32205) MP3_DECODER: output aborted -3

[22:38:13][D][esp-idf:000][decoder]: I (32209) MP3_DECODER: Closed

[22:38:13][D][esp_adf_pipeline:302]: State changed from PREPARING to STARTING
[22:38:13][I][adf_media_player:135]: got new pipeline state: 2
[22:38:13][D][adf_i2s_out:127]: Set final i2s settings: 32000
[22:38:13][D][esp-idf:000]: I (32225) AUDIO_ELEMENT: [i2s_out-0x3de180b8] Element task created

[22:38:13][D][esp-idf:000]: I (32228) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:2123355 Bytes, Inter:165420 Bytes, Dram:165420 Bytes

[22:38:13][D][esp-idf:000][http]: I (32231) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[22:38:13][D][esp-idf:000][decoder]: I (32233) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[22:38:13][D][esp-idf:000][i2s_out]: I (32236) AUDIO_ELEMENT: [i2s_out] AEL_MSG_CMD_RESUME,state:1

[22:38:13][D][esp-idf:000][i2s_out]: I (32238) I2S_STREAM: AUDIO_STREAM_WRITER

[22:38:13][I][esp_adf_pipeline:214]: [ http ] status: 14
[22:38:13][I][esp_adf_pipeline:214]: [ i2s_out ] status: 12
[22:38:13][D][esp_adf_pipeline:131]: Check element [http] status, 2
[22:38:13][D][esp-idf:000][http]: I (32306) HTTP_CLIENT: Body received in fetch header state, 0x3fcbe770, 1704

[22:38:13][D][esp-idf:000][http]: I (32310) HTTP_STREAM: total_bytes=2520

[22:38:13][I][esp_adf_pipeline:214]: [ http ] status: 12
[22:38:13][D][esp_adf_pipeline:131]: Check element [http] status, 3
[22:38:13][D][esp_adf_pipeline:131]: Check element [decoder] status, 2
[22:38:13][I][esp_adf_pipeline:214]: [ decoder ] status: 12
[22:38:13][D][esp_adf_pipeline:131]: Check element [http] status, 3
ERROR Fatal error: protocol.data_received() call failed.
protocol: <aioesphomeapi._frame_helper.noise.APINoiseFrameHelper object at 0x7f95ad901590>
transport: <_SelectorSocketTransport fd=6 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
  File "/usr/lib/python3.11/asyncio/selector_events.py", line 1009, in _read_ready__data_received
    self._protocol.data_received(data)
  File "aioesphomeapi/_frame_helper/noise.py", line 136, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
  File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
  File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame
  File "/usr/local/lib/python3.11/dist-packages/noise/state.py", line 74, in decrypt_with_ad
    plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/noise/backends/default/ciphers.py", line 13, in decrypt
    return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/chacha20poly1305_reuseable/__init__.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len
  File "src/chacha20poly1305_reuseable/__init__.py", line 273, in chacha20poly1305_reuseable._decrypt_data
cryptography.exceptions.InvalidTag
WARNING jarvis @ 192.168.5.55: Connection error occurred: jarvis @ 192.168.5.55: Invalid encryption key: received_name=jarvis
INFO Processing unexpected disconnect from ESPHome API for jarvis @ 192.168.5.55
WARNING Disconnected from API
INFO Successfully connected to jarvis @ 192.168.5.55 in 0.006s
INFO Successful handshake with jarvis @ 192.168.5.55 in 0.065s
[22:38:13][D][esp-idf:000][decoder]: I (32461) MP3_DECODER: Closed

[22:38:13][D][esp-idf:000][i2s_out]: I (32532) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[22:38:13][D][esp_adf_pipeline:400]: Called deinit_all
[22:38:13][D][esp-idf:000]: I (32663) AUDIO_PIPELINE: audio_pipeline_unlinked

[22:38:13][D][esp-idf:000]: W (32666) AUDIO_ELEMENT: [http] Element has not create when AUDIO_ELEMENT_TERMINATE

[22:38:13][D][esp-idf:000]: W (32668) AUDIO_ELEMENT: [decoder] Element has not create when AUDIO_ELEMENT_TERMINATE

[22:38:13][D][esp-idf:000]: W (32671) AUDIO_ELEMENT: [i2s_out] Element has not create when AUDIO_ELEMENT_TERMINATE

[22:38:13][D][esp-idf:000]: I (32674) I2S: DMA queue destroyed

[22:38:13][D][esp_adf_pipeline:302]: State changed from STOPPING to UNINITIALIZED
[22:38:13][I][adf_media_player:135]: got new pipeline state: 0
[22:38:16][D][sensor:094]: 'Uptime seconds': Sending state 35.54700 s with 0 decimals of accuracy
[22:38:33][D][sensor:094]: 'WiFi Signal dB': Sending state -65.00000 dBm with 0 decimals of accuracy
[22:38:33][D][sensor:094]: 'WiFi Signal': Sending state 70.00000 % with 0 decimals of accuracy
[22:38:42][I][safe_mode:041]: Boot seems successful; resetting boot loop counter
[22:38:42][D][esp32.preferences:114]: Saving 1 preferences to flash...
[22:38:42][D][esp32.preferences:143]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed

Trying to play an mp3 (like media_player.play_media: http://IP/mp3.mp3) still doesn't work, but I see nothng that looks wrong in its log. Same with TTS.

Setting version: 4.4.6 hasn't improved anything for the last few versions.

I still have to stick with esphome 2024.4 to be able to use it properly.

esphome / issues