esphome / issues

Issue Tracker for ESPHome
https://esphome.io/

Voice assistant via i2s microphone crashes ESP32-S3 after ESPHome 2024.10.1 update #6379

Open JustEnoughDucks opened 4 weeks ago

JustEnoughDucks commented 4 weeks ago

The problem

I am using an ESP32-S3, a PCM5102 as a DAC to my Yamaha receiver, and an MSM261S4030H0R MEMS microphone connected via an interconnect PCB.

The goal (which already works) is a single device that provides full voice assistant satellite support while also acting as a media player for Home Assistant, Spotify, etc., and controlling my Yamaha rx496-RDS receiver via IR. So far it can do all of this; I just have to add automatic source switching and, if possible, better volume control than Home Assistant buttons.

After the ESPHome 2024.10.1 update, the ESP crashes after enabling the microphone and starting the voice assistant pipeline, which breaks voice assistant support. I am currently using a wake button instead of a wake word for easier testing, so I don't know whether the change is specific to button-activated operation.

Version | Voice Assistant | Audio playback (Music Assistant)
2024.8.0
2024.9.0
2024.10.0
2024.10.1
2024.10.2

Which version of ESPHome has the issue?

>=2024.10.1

What type of installation are you using?

Docker

Which version of Home Assistant has the issue?

NA

What platform are you using?

ESP32

Board

ESP32-S3

Component causing the issue

voice_assistant

Example YAML snippet

esp32:
  board: seeed_xiao_esp32s3
  variant: esp32s3
  framework:
    type: arduino
    version: latest
    platform_version: 6.3.2

...

voice_assistant:
  id: va
  microphone: yamaha_mic
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  media_player: media_out

  #on_start:

  #on_tts_start:

  on_tts_end:
    - media_player.play_media: !lambda return x;

  on_end:
    - delay: 2s
    - wait_until:
        not:
          media_player.is_playing: media_out
    - media_player.stop
    - voice_assistant.stop:

  #on_error:

i2s_audio:
  i2s_lrclk_pin: GPIO7
  i2s_bclk_pin: GPIO9

media_player:
  - platform: i2s_audio
    id: media_out
    name: Yamaha ESP Media Player
    dac_type: external
    i2s_dout_pin: GPIO8
    mode: stereo

microphone:
  - platform: i2s_audio
    id: yamaha_mic
    channel: left
    pdm: False
    adc_type: external
    i2s_din_pin: GPIO44

binary_sensor:
  - platform: gpio
    pin: 
      number: GPIO01
      inverted: true
      mode:
        input: true
        pullup: true
    name: PCB_Switch
    internal: true
    on_click:
      - if:
          condition: voice_assistant.is_running
          then:
            - voice_assistant.stop:
          else:
            - voice_assistant.start_continuous:

Anything in the logs that might be useful for us?

2024.10.0

][V][esp32.preferences:163]: nvs_get_blob('2827303864'): ESP_ERR_NVS_NOT_FOUND - the key might not be set yet
[10:21:00][V][esp32.preferences:126]: sync: key: 2827303864, len: 7
[10:21:00][D][esp32.preferences:143]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
[10:21:02][D][voice_assistant:637]: Event Type: 11
[10:21:02][D][voice_assistant:793]: Starting STT by VAD
[10:21:04][D][voice_assistant:637]: Event Type: 12
[10:21:04][D][voice_assistant:797]: STT by VAD end
[10:21:04][D][voice_assistant:514]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[10:21:04][D][voice_assistant:520]: Desired state set to AWAITING_RESPONSE
[10:21:04][D][voice_assistant:514]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[10:21:04][D][voice_assistant:514]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[10:21:05][D][voice_assistant:637]: Event Type: 4
[10:21:05][D][voice_assistant:665]: Speech recognised as: " Ffff.  Test, test, test."
[10:21:05][D][voice_assistant:637]: Event Type: 5
[10:21:05][D][voice_assistant:670]: Intent started
[10:21:05][D][voice_assistant:637]: Event Type: 6
[10:21:05][D][voice_assistant:637]: Event Type: 7
[10:21:05][D][voice_assistant:693]: Response: "Sorry, I couldn't understand that"
[10:21:05][D][voice_assistant:637]: Event Type: 8
[10:21:05][D][voice_assistant:715]: Response URL: "http://192.168.1.100:8123/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_9d6fb81c4b_tts.piper.mp3"
[10:21:05][D][voice_assistant:514]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[10:21:05][D][voice_assistant:520]: Desired state set to STREAMING_RESPONSE
[10:21:05][D][media_player:061]: 'Yamaha ESP Media Player' - Setting
[10:21:05][D][media_player:068]:   Media URL: http://192.168.1.100:8123/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_9d6fb81c4b_tts.piper.mp3
[10:21:05][D][media_player:074]:  Announcement: yes
[10:21:05][D][media_player:061]: 'Yamaha ESP Media Player' - Setting
[10:21:05][D][media_player:068]:   Media URL: http://192.168.1.100:8123/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_9d6fb81c4b_tts.piper.mp3
[10:21:05][D][voice_assistant:637]: Event Type: 2
[10:21:05][D][voice_assistant:729]: Assist Pipeline ended
[10:21:05][W][component:237]: Component i2s_audio.media_player took a long time for an operation (525 ms).
[10:21:05][W][component:238]: Components should block for at most 30 ms.
[10:21:08][W][component:237]: Component i2s_audio.media_player took a long time for an operation (371 ms).
[10:21:08][W][component:238]: Components should block for at most 30 ms.
[10:21:08][D][media_player:061]: 'Yamaha ESP Media Player' - Setting
[10:21:08][D][media_player:065]:   Command: STOP
[10:21:08][W][component:237]: Component i2s_audio.media_player took a long time for an operation (366 ms).
[10:21:08][W][component:238]: Components should block for at most 30 ms.
[10:21:10][I][safe_mode:041]: Boot seems successful; resetting boot loop counter
[10:21:10][D][esp32.preferences:114]: Saving 1 preferences to flash...
[10:21:10][V][esp32.preferences:126]: sync: key: 233825507, len: 4
[10:21:10][D][esp32.preferences:143]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
[10:21:10][D][voice_assistant:514]: State changed from STREAMING_RESPONSE to IDLE
[10:21:10][D][voice_assistant:520]: Desired state set to IDLE
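
The 2024.10.0 log above repeatedly shows `i2s_audio.media_player` blocking far longer than the 30 ms guideline. Whether that is related to the later crash is not established. As a minimal sketch for comparing logs across versions, assuming a log has been saved as plain text, the warnings can be summarised per component. The function name `slow_operations` is made up for illustration.

```python
import re
from collections import defaultdict

# Matches warnings such as:
# Component i2s_audio.media_player took a long time for an operation (525 ms).
SLOW_RE = re.compile(
    r"Component (?P<name>\S+) took a long time for an operation \((?P<ms>\d+) ms\)"
)


def slow_operations(log_text):
    """Map each component name to its list of blocking durations in ms."""
    durations = defaultdict(list)
    for match in SLOW_RE.finditer(log_text):
        durations[match["name"]].append(int(match["ms"]))
    return dict(durations)


# Warning lines copied from the 2024.10.0 log.
sample = """\
[10:21:05][W][component:237]: Component i2s_audio.media_player took a long time for an operation (525 ms).
[10:21:05][W][component:238]: Components should block for at most 30 ms.
[10:21:08][W][component:237]: Component i2s_audio.media_player took a long time for an operation (371 ms).
[10:21:08][W][component:238]: Components should block for at most 30 ms.
[10:21:08][W][component:237]: Component i2s_audio.media_player took a long time for an operation (366 ms).
[10:21:08][W][component:238]: Components should block for at most 30 ms.
"""

for name, ms_values in slow_operations(sample).items():
    print(f"{name}: {len(ms_values)} warnings, max {max(ms_values)} ms")
```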

2024.10.1

[10:23:56][I][app:100]: ESPHome version 2024.10.1 compiled on Oct 25 2024, 10:23:27
[10:23:56][C][wifi:600]: WiFi:
[10:23:56][C][wifi:428]:   Local MAC: DC:DA:0C:57:7E:98
[10:23:56][C][wifi:433]:   SSID: [redacted]
[10:23:56][C][wifi:436]:   IP Address: 192.168.1.123
[10:23:56][C][wifi:440]:   BSSID: [redacted]
[10:23:56][C][wifi:441]:   Hostname: 'esphome-speaker-dongle'
[10:23:56][C][wifi:443]:   Signal strength: -54 dB ▂▄▆█
[10:23:56][C][wifi:447]:   Channel: 10
[10:23:56][C][wifi:448]:   Subnet: 255.255.255.0
[10:23:56][C][wifi:449]:   Gateway: 192.168.0.1
[10:23:56][C][wifi:450]:   DNS1: 0.0.0.0
[10:23:56][C][wifi:451]:   DNS2: 0.0.0.0
[10:23:56][C][logger:185]: Logger:
[10:23:56][C][logger:186]:   Level: VERBOSE
[10:23:56][C][logger:188]:   Log Baud Rate: 115200
[10:23:56][C][logger:189]:   Hardware UART: USB_CDC
[10:23:56][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'PCB_Switch'
[10:23:56][C][gpio.binary_sensor:016]:   Pin: GPIO1
[10:23:56][C][remote_transmitter:015]: Remote Transmitter...
[10:23:56][C][remote_transmitter:016]:   Channel: 0
[10:23:56][C][remote_transmitter:017]:   RMT memory blocks: 1
[10:23:56][C][remote_transmitter:018]:   Clock divider: 80
[10:23:56][C][remote_transmitter:019]:   Pin: GPIO6
[10:23:56][C][remote_transmitter:022]:     Carrier Duty: 50%
[10:23:56][C][captive_portal:089]: Captive Portal:
[10:23:56][C][mdns:116]: mDNS:
[10:23:56][C][mdns:117]:   Hostname: esphome-speaker-dongle
[10:23:56][V][mdns:118]:   Services:
[10:23:56][V][mdns:120]:   - _esphomelib, _tcp, 6053
[10:23:56][V][mdns:122]:     TXT: friendly_name = Yamaha AV Dongle
[10:23:56][V][mdns:122]:     TXT: version = 2024.10.1
[10:23:56][V][mdns:122]:     TXT: mac = dcda0c577e98
[10:23:56][V][mdns:122]:     TXT: platform = ESP32
[10:23:56][V][mdns:122]:     TXT: board = seeed_xiao_esp32s3
[10:23:56][V][mdns:122]:     TXT: network = wifi
[10:23:56][V][mdns:122]:     TXT: api_encryption = Noise_NNpsk0_25519_ChaChaPoly_SHA256
[10:23:56][C][esphome.ota:073]: Over-The-Air updates:
[10:23:56][C][esphome.ota:074]:   Address: 192.168.1.123:3232
[10:23:56][C][esphome.ota:075]:   Version: 2
[10:23:56][C][esphome.ota:078]:   Password configured
[10:23:56][C][safe_mode:018]: Safe Mode:
[10:23:56][C][safe_mode:020]:   Boot considered successful after 60 seconds
[10:23:56][C][safe_mode:021]:   Invoke after 10 boot attempts
[10:23:56][C][safe_mode:023]:   Remain in safe mode for 300 seconds
[10:23:56][C][api:140]: API Server:
[10:23:56][C][api:141]:   Address: 192.168.1.123:6053
[10:23:56][C][api:143]:   Using noise encryption: YES
[10:23:56][C][audio:225]: Audio:
[10:23:56][C][audio:247]:   External DAC channels: 2
[10:23:56][C][audio:248]:   I2S DOUT Pin: 8
[10:23:56][D][api:103]: Accepted 192.168.1.100
[10:23:56][V][api.connection:1428]: Hello from client: 'Home Assistant 2024.10.3' | 192.168.1.100 | API Version 1.10
[10:23:56][D][api.connection:1446]: Home Assistant 2024.10.3 (192.168.1.100): Connected successfully
[10:24:02][D][binary_sensor:036]: 'PCB_Switch': Sending state ON
[10:24:03][D][binary_sensor:036]: 'PCB_Switch': Sending state OFF
[10:24:32][D][binary_sensor:036]: 'PCB_Switch': Sending state ON
[10:24:33][D][binary_sensor:036]: 'PCB_Switch': Sending state OFF
[10:24:33][D][voice_assistant:510]: State changed from IDLE to START_MICROPHONE
[10:24:33][D][voice_assistant:516]: Desired state set to START_PIPELINE
[10:24:33][D][voice_assistant:222]: Starting Microphone
WARNING esphome-speaker-dongle @ 192.168.1.123: Connection error occurred: [Errno 104] Connection reset by peer
INFO Processing unexpected disconnect from ESPHome API for esphome-speaker-dongle @ 192.168.1.123
WARNING Disconnected from API
INFO Successfully connected to esphome-speaker-dongle @ 192.168.1.123 in 0.002s
INFO Successful handshake with esphome-speaker-dongle @ 192.168.1.123 in 0.056s
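
To see where the pipeline stops in each version, the `voice_assistant` state transitions can be pulled out of a saved log. The following is a minimal sketch, assuming the log format shown above; the helper names `state_transitions` and `last_state` are hypothetical and not part of ESPHome.

```python
import re

# Matches lines such as:
# [10:24:33][D][voice_assistant:510]: State changed from IDLE to START_MICROPHONE
STATE_RE = re.compile(
    r"\[(?P<time>\d{2}:\d{2}:\d{2})\]\[D\]\[voice_assistant:\d+\]: "
    r"State changed from (?P<old>\w+) to (?P<new>\w+)"
)


def state_transitions(log_text):
    """Return (time, old_state, new_state) tuples in log order."""
    return [
        (match["time"], match["old"], match["new"])
        for match in STATE_RE.finditer(log_text)
    ]


def last_state(log_text):
    """Return the last state reached, or None if no transition was logged."""
    transitions = state_transitions(log_text)
    return transitions[-1][2] if transitions else None


# Lines copied from the end of the 2024.10.1 log.
sample = """\
[10:24:33][D][voice_assistant:510]: State changed from IDLE to START_MICROPHONE
[10:24:33][D][voice_assistant:516]: Desired state set to START_PIPELINE
[10:24:33][D][voice_assistant:222]: Starting Microphone
WARNING esphome-speaker-dongle @ 192.168.1.123: Connection error occurred: [Errno 104] Connection reset by peer
"""

print(last_state(sample))
```

Run against the 2024.10.1 excerpt, the last logged state is `START_MICROPHONE`, whereas the 2024.10.0 log continues through `STREAMING_MICROPHONE` and eventually returns to `IDLE`.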

Additional information

No response

BigBobbas commented 3 weeks ago

This is resolved in 2024.10.2

JustEnoughDucks commented 3 weeks ago

@BigBobbas this is not resolved. As I posted, I tested 2024.10.2 and it shows exactly the same behavior as 2024.10.1. In fact, I first noticed the behavior after updating from 2024.9.0 to 2024.10.2.