Removing Speaker from ESPHome Voice Assistant causes HA to disconnect from Echo after request.

The problem

The original goal was to stop the M5 Echo from speaking and instead use on_tts_start to call back to HA and play the response on a media player, i.e. I only want to use the M5 Echo for its microphone.

I've tried removing the 'speaker' from the voice assistant configuration, and even the whole speaker component configuration.
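
For reference, the two pieces in question look roughly like the sketch below. The speaker block assumes the stock M5Stack Atom Echo definition, so the pin and the dac_type/mode values are assumptions rather than values copied from the actual config; echo_speaker is the id referenced in the snippet further down.

```yaml
# Assumed speaker component (modelled on the stock Atom Echo example; values may differ)
speaker:
  - platform: i2s_audio
    id: echo_speaker
    i2s_dout_pin: GPIO22   # assumption
    dac_type: external     # assumption
    mode: mono             # assumption

voice_assistant:
  id: va
  microphone: echo_microphone
  speaker: echo_speaker    # attempt 1: comment out only this line
# attempt 2: additionally remove the whole speaker: block above
```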

However, the result is always the same: the M5 Echo recognises one voice request and responds correctly (via the HA media_player device), but then it seems to disconnect from HA, and you have to wait for the reconnection before trying another request.

If I then re-add the speaker configuration, it works perfectly, but the response is played by both the M5 Echo and the media_player device.

All that is necessary to replicate the problem is to comment out the speaker configuration line under voice_assistant.

Which version of ESPHome has the issue?

2024.4.1 and 2024.4.2

What type of installation are you using?

Home Assistant Add-on

Which version of Home Assistant has the issue?

2024.4.4

What platform are you using?

ESP32

Board

M5Stack Echo

Component causing the issue

Voice Assistant

Example YAML snippet

voice_assistant:
  id: va
  microphone: echo_microphone
  # speaker: echo_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  vad_threshold: 3
  on_listening:
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        effect: "Slow Pulse"
  on_stt_vad_end:
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        effect: "Fast Pulse"
  on_tts_start:
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: none
    - homeassistant.service:
        service: tts.cloud_say
        data:
          entity_id: ${media_player}
          message: !lambda 'return x;'
  on_end:
    - delay: 100ms
    - wait_until:
        not:
          speaker.is_playing:
    - script.wait: reset_led
  on_error:
    - light.turn_on:
        id: led
        red: 100%
        green: 0%
        blue: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
  on_client_connected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous:
          - script.execute: reset_led
  on_client_disconnected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.stop:
          - light.turn_off: led
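
Note that on_end in the snippet above still waits on speaker.is_playing, which presumably requires a speaker component to remain defined even when voice_assistant no longer references it. A microphone-only variant of that handler might look like the sketch below; this is an assumption about how the handler would be trimmed, not a confirmed fix for the disconnect.

```yaml
voice_assistant:
  id: va
  microphone: echo_microphone
  # no speaker: entry, so nothing here depends on the speaker component
  on_end:
    - delay: 100ms
    - script.wait: reset_led   # same script as in the snippet above
```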

Anything in the logs that might be useful for us?

[14:23:13][D][voice_assistant:439]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[14:23:13][D][voice_assistant:563]: Event Type: 4
[14:23:13][D][voice_assistant:591]: Speech recognised as: "What's the weather like?"
[14:23:13][D][voice_assistant:563]: Event Type: 5
[14:23:13][D][voice_assistant:596]: Intent started
[14:23:15][D][voice_assistant:563]: Event Type: 6
[14:23:15][D][voice_assistant:563]: Event Type: 7
[14:23:15][D][voice_assistant:619]: Response: "The current weather is cloudy with a temperature of 13.6°C. It's a bit breezy too, with winds blowing at about 18.84 knots."
[14:23:15][D][light:036]: 'M5Stack Office' Setting:
[14:23:15][D][light:051]:   Brightness: 100%
[14:23:15][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
WARNING m5stack-office @ 10.0.51.205: Connection error occurred: [Errno 104] Connection reset by peer
INFO Processing unexpected disconnect from ESPHome API for m5stack-office @ 10.0.51.205
WARNING Disconnected from API
INFO Successfully connected to m5stack-office @ 10.0.51.205 in 0.003s
INFO Successful handshake with m5stack-office @ 10.0.51.205 in 0.120s
[14:23:55][D][voice_assistant:563]: Event Type: 0
[14:23:55][D][voice_assistant:563]: Event Type: 2
[14:23:55][D][voice_assistant:653]: Assist Pipeline ended
[14:23:55][D][voice_assistant:439]: State changed from STREAMING_MICROPHONE to WAIT_FOR_VAD
[14:23:55][D][voice_assistant:445]: Desired state set to WAITING_FOR_VAD
[14:23:55][D][voice_assistant:180]: Waiting for speech...

Additional information

No response
