esphome / issues

Issue Tracker for ESPHome
https://esphome.io/

I2S Audio Speaker gets stopped spontaneously before TTS Stream End of Audio #5461

Closed. michaelberg79 closed this issue 3 months ago

michaelberg79 commented 8 months ago

The problem

Hello, I am trying to get a RaspiAudio Muse Proto running as a voice assistant. I installed the voice assistant configuration with the esp-idf framework. So far it works... sometimes. The voice commands are recognized and executed, but the spoken response is played only about 50% of the time. I tried several times and found some differences in the logs between good and bad runs, but I can't find any further details about this behavior.

Can anyone help with this?

The log level is set to VERBOSE.

Best wishes, and thanks for your help. Michael

Which version of ESPHome has the issue?

2023.12.9

What type of installation are you using?

Home Assistant Add-on

Which version of Home Assistant has the issue?

2024.1.6

What platform are you using?

ESP32-IDF

Board

raspiaudio muse proto

Component causing the issue

i2s_audio (speaker platform)

Example YAML snippet

---
esphome:
  name: muse-proto
  friendly_name: RaspiAudio Muse Proto
  name_add_mac_suffix: false
  project:
    name: raspiaudio.muse-proto-voice-assistant
    version: "1.0"
  min_version: 2023.11.1

esp32:
  board: esp-wrover-kit
  framework:
    type: esp-idf

logger:
  level: VERBOSE
ota:

dashboard_import:
  package_import_url: github://esphome/firmware/voice-assistant/raspiaudio-muse-proto.yaml@main

wifi:
  on_connect:
    - delay: 5s # Gives time for improv results to be transmitted
    - ble.disable:
  on_disconnect:
    - ble.enable:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  ap:

improv_serial:

esp32_improv:
  authorizer: none

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

external_components:
  - source: github://pr#5230
    components:
      - esp_adf
    refresh: 0s

i2s_audio:
  - i2s_lrclk_pin: GPIO25
    i2s_bclk_pin: GPIO5

microphone:
  - platform: i2s_audio
    id: board_microphone
    channel: left
    i2s_din_pin: GPIO35
    adc_type: external
    pdm: false

speaker:
  - platform: i2s_audio
    id: board_speaker
    dac_type: external
    i2s_dout_pin: GPIO26
    mode: mono

output:
  - platform: gpio
    pin:
      number: GPIO21
      inverted: true
    id: mute_pin

esp_adf:

voice_assistant:
  id: va
  microphone: board_microphone
  speaker: board_speaker
  use_wake_word: true
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  vad_threshold: 3
  on_listening:
    - output.turn_on: mute_pin
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        effect: "Slow Pulse"
  on_stt_vad_end:
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        effect: "Fast Pulse"
  on_tts_start:
    - output.turn_off: mute_pin
    - light.turn_on:
        id: led
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: none
  on_end:
    - delay: 100ms
    - wait_until:
        not:
          speaker.is_playing:
    - output.turn_on: mute_pin
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
  on_client_connected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous:
          - script.execute: reset_led
  on_client_disconnected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.stop:
          - light.turn_off: led

binary_sensor:
  - platform: gpio
    pin:
      number: GPIO0
      inverted: true
      mode:
        input: true
        pullup: true
    name: Action
    disabled_by_default: true
    on_multi_click:
      - timing:
          - ON for at least 250ms
          - OFF for at least 50ms
        then:
          - if:
              condition:
                switch.is_off: use_wake_word
              then:
                - if:
                    condition: voice_assistant.is_running
                    then:
                      - voice_assistant.stop:
                      - script.execute: reset_led
                    else:
                      - voice_assistant.start:
              else:
                - voice_assistant.stop
                - delay: 1s
                - script.execute: reset_led
                - script.wait: reset_led
                - voice_assistant.start_continuous:
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn

light:
  - platform: esp32_rmt_led_strip
    rmt_channel: 0
    name: None
    id: led
    disabled_by_default: true
    pin: GPIO22
    chipset: WS2812
    num_leds: 1
    rgb_order: grb
    effects:
      - pulse:
          name: "Slow Pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "Fast Pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led
                red: 100%
                green: 89%
                blue: 71%
                brightness: 60%
                effect: none
          else:
            - light.turn_off: led

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);
      - script.execute: reset_led
api:
  encryption:
    key: #####################################

Anything in the logs that might be useful for us?

### Logfile for a bad run:
[11:08:23][D][voice_assistant:412]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[11:08:24][D][voice_assistant:519]: Event Type: 4
[11:08:24][D][voice_assistant:547]: Speech recognised as: " Schalte Püscher egal ein."
[11:08:24][D][voice_assistant:519]: Event Type: 5
[11:08:24][D][voice_assistant:552]: Intent started
[11:08:27][D][voice_assistant:519]: Event Type: 6
[11:08:27][D][voice_assistant:519]: Event Type: 7
[11:08:27][D][voice_assistant:575]: Response: "Something went wrong: Unable to find entity ['switch.puescher']"
[11:08:27][D][light:036]: 'RaspiAudio Muse Proto' Setting:
[11:08:27][D][light:051]:   Brightness: 100%
[11:08:27][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[11:08:27][D][light:085]:   Transition length: 1.0s
[11:08:27][D][light:091]:   Effect: 'None'
[11:08:27][D][voice_assistant:519]: Event Type: 8
[11:08:27][D][voice_assistant:595]: Response URL: "https://ha.r6qfwldtqd1u1ftg.myfritz.net/api/tts_proxy/361f2be5c6cf6ff9650a6caec0f0219f6b82258d_de-de_effe3150cd_tts.piper.wav"
[11:08:27][D][voice_assistant:412]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[11:08:27][D][voice_assistant:418]: Desired state set to STREAMING_RESPONSE
[11:08:27][D][esp-idf:000]: I (56942) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[11:08:27][D][i2s_audio.speaker:164]: Started I2S Audio Speaker
[11:08:27][D][i2s_audio.speaker:167]: Stopping I2S Audio Speaker   <----- Speaker gets stopped
[11:08:27][D][i2s_audio.speaker:178]: Stopped I2S Audio Speaker
[11:08:27][D][light:036]: 'RaspiAudio Muse Proto' Setting:
[11:08:27][D][light:051]:   Brightness: 60%
[11:08:27][D][light:059]:   Red: 100%, Green: 89%, Blue: 71%
[11:08:27][D][light:085]:   Transition length: 1.0s
[11:08:27][D][voice_assistant:519]: Event Type: 98
[11:08:27][D][voice_assistant:657]: TTS stream start
[11:08:27][D][i2s_audio.speaker:161]: Starting I2S Audio Speaker
[11:08:27][D][i2s_audio.speaker:164]: Started I2S Audio Speaker
[11:08:31][D][voice_assistant:519]: Event Type: 99
[11:08:31][D][voice_assistant:665]: TTS stream end
[11:08:31][D][voice_assistant:283]: End of audio stream received
[11:08:31][D][voice_assistant:412]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[11:08:31][D][voice_assistant:418]: Desired state set to RESPONSE_FINISHED
[11:08:31][D][i2s_audio.speaker:167]: Stopping I2S Audio Speaker
[11:08:31][D][i2s_audio.speaker:178]: Stopped I2S Audio Speaker
[11:08:31][D][voice_assistant:315]: Speaker has finished outputting all audio
[11:08:31][D][voice_assistant:412]: State changed from RESPONSE_FINISHED to IDLE
[11:08:31][D][voice_assistant:418]: Desired state set to IDLE
[11:08:31][D][voice_assistant:412]: State changed from IDLE to START_MICROPHONE
[11:08:31][D][voice_assistant:418]: Desired state set to WAIT_FOR_VAD
[11:08:31][D][voice_assistant:153]: Starting Microphone
[11:08:31][D][voice_assistant:412]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[11:08:31][D][esp-idf:000]: I (61430) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[11:08:32][D][voice_assistant:412]: State changed from STARTING_MICROPHONE to WAIT_FOR_VAD

### Logfile for a good run:

[11:12:50][D][voice_assistant:412]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[11:12:51][D][voice_assistant:519]: Event Type: 4
[11:12:51][D][voice_assistant:547]: Speech recognised as: " Schalte das Bücher egal ein."
[11:12:51][D][voice_assistant:519]: Event Type: 5
[11:12:51][D][voice_assistant:552]: Intent started
[11:12:52][D][voice_assistant:519]: Event Type: 6
[11:12:52][D][voice_assistant:519]: Event Type: 7
[11:12:52][D][voice_assistant:575]: Response: "Entschuldigung, ich habe das Gerät "Bücher egal" nicht gefunden."
[11:12:52][D][light:036]: 'RaspiAudio Muse Proto' Setting:
[11:12:52][D][light:051]:   Brightness: 100%
[11:12:52][D][light:059]:   Red: 0%, Green: 0%, Blue: 100%
[11:12:53][D][light:085]:   Transition length: 1.0s
[11:12:53][D][light:091]:   Effect: 'None'
[11:12:53][D][voice_assistant:519]: Event Type: 8
[11:12:53][D][voice_assistant:595]: Response URL: "https://ha.r6qfwldtqd1u1ftg.myfritz.net/api/tts_proxy/8d0ede4ae3f99da9ade0629aa597db7e7fb5dcfb_de-de_effe3150cd_tts.piper.wav"
[11:12:53][D][voice_assistant:412]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[11:12:53][D][voice_assistant:418]: Desired state set to STREAMING_RESPONSE
[11:12:53][D][esp-idf:000]: I (322468) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[11:12:53][D][i2s_audio.speaker:164]: Started I2S Audio Speaker
[11:12:55][D][voice_assistant:519]: Event Type: 99
[11:12:55][D][voice_assistant:665]: TTS stream end
[11:12:55][D][voice_assistant:283]: End of audio stream received
[11:12:55][D][voice_assistant:412]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[11:12:55][D][voice_assistant:418]: Desired state set to RESPONSE_FINISHED
[11:12:56][D][i2s_audio.speaker:167]: Stopping I2S Audio Speaker
[11:12:56][D][i2s_audio.speaker:178]: Stopped I2S Audio Speaker
[11:12:56][D][light:036]: 'RaspiAudio Muse Proto' Setting:
[11:12:56][D][light:051]:   Brightness: 60%
[11:12:56][D][light:059]:   Red: 100%, Green: 89%, Blue: 71%
[11:12:56][D][light:085]:   Transition length: 1.0s
[11:12:56][D][voice_assistant:315]: Speaker has finished outputting all audio
[11:12:56][D][voice_assistant:412]: State changed from RESPONSE_FINISHED to IDLE
[11:12:56][D][voice_assistant:418]: Desired state set to IDLE
[11:12:56][D][voice_assistant:412]: State changed from IDLE to START_MICROPHONE
[11:12:56][D][voice_assistant:418]: Desired state set to WAIT_FOR_VAD
[11:12:56][D][voice_assistant:153]: Starting Microphone
[11:12:56][D][voice_assistant:412]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[11:12:56][D][esp-idf:000]: I (325949) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[11:12:56][D][voice_assistant:412]: State changed from STARTING_MICROPHONE to WAIT_FOR_VAD

Additional information

No response

michaelberg79 commented 8 months ago

Hello, I ran some more tests. The line

[i2s_audio.speaker:167]: Stopping I2S Audio Speaker

also appears in successful runs, so that is not the reason for the missing responses.

github-actions[bot] commented 4 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

LumenSoftNL commented 3 months ago

Try pr#6718 and see if that helps.
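
For anyone who wants to test this, here is a minimal sketch of how the PR could be pulled in. It follows the same `external_components` pattern the config above already uses for pr#5230. Which components pr#6718 actually changes is not stated in this thread, so the commented-out `components` line is only a hypothetical placeholder and should be checked against the PR's changed files.

```yaml
external_components:
  # Existing entry from the original config, unchanged
  - source: github://pr#5230
    components:
      - esp_adf
    refresh: 0s
  # Additional entry for the suggested PR
  - source: github://pr#6718
    # Assumption: with no "components" list, all components from the PR
    # branch are used. To narrow it down, list only what the PR changes,
    # for example (hypothetical, not confirmed in this thread):
    # components:
    #   - i2s_audio
    refresh: 0s  # re-fetch the source on every compile
```

After changing the entry, a clean build and a fresh install on the device should make sure the PR code is really what is running before comparing the logs again.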