esphome / issues

Issue Tracker for ESPHome
https://esphome.io/
Voice Assistant with ESP32-S3, no speaker sound #6231

Open aviitzhaki opened 1 month ago

aviitzhaki commented 1 month ago

The problem

I am using an esp32-s3-devkitc-1 connected to an INMP441 microphone and a MAX98357A audio amplifier. I am using the esp-idf framework to implement a voice assistant with micro wake word. The microphone is working fine (it responds to the wake word; see example). I have no sound on the speaker. I checked the I2S signals and see LRCLK and BCLK but no data. I also see in the log that there is an issue with the speaker task. I have the audio pipeline set up correctly.

Has anyone encountered an issue with the speaker?

Which version of ESPHome has the issue?

ESPHome 2024.8.3

What type of installation are you using?

Home Assistant Add-on

Which version of Home Assistant has the issue?

2024.9.1

What platform are you using?

ESP32-IDF

Board

esp32-s3-devkitc-1

Component causing the issue

I2S speaker

Example YAML snippet

esphome:
  name: voiceassist
  friendly_name: VoiceAssist
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    - light.turn_on:
        id: led_ww
        blue: 100%
        brightness: 60%
        effect: fast pulse

esp32:
  board:   esp32-s3-devkitc-1
  variant: esp32s3
  flash_size: 16MB  
  framework:
    type: esp-idf
    version: 4.4.6
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"

psram:
  mode: octal # Please change this to quad for N8R2 and octal for N16R8
  speed: 80MHz

# Enable Home Assistant API
api:
  encryption:
    key: ***
  on_client_connected:
        then:
          - delay: 50ms
          - light.turn_off: led_ww
          - micro_wake_word.start:
  on_client_disconnected:
        then:
          - voice_assistant.stop: 

logger:

ota:
  platform: esphome

wifi:
  ssid: ***
  password: ***

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: ***
    password: ***

captive_portal:

button:
  - platform: restart
    name: "Restart"
    id: but_rest

switch:
  - platform: template
    id: mute
    name: mute
    optimistic: true
    on_turn_on: 
      - micro_wake_word.stop:
      - voice_assistant.stop:
      - light.turn_on:
          id: led_ww           
          red: 100%
          green: 0%
          blue: 0%
          brightness: 60%
          effect: fast pulse 
      - light.turn_on:
          id: led_strip           
          red: 100%
          green: 0%
          blue: 0%
          brightness: 60%
          effect: fast pulse 

      - delay: 2s
      - light.turn_off:
          id: led_ww
      - light.turn_off:
          id: led_strip

      - light.turn_on:
          id: led_ww          
          red: 100%
          green: 0%
          blue: 0%
          brightness: 30%
      - light.turn_on:
          id: led_strip           
          red: 100%
          green: 0%
          blue: 0%
          brightness: 30%

    on_turn_off:
      - micro_wake_word.start:
      - light.turn_on:
          id: led_ww           
          red: 0%
          green: 100%
          blue: 0%
          brightness: 60%
          effect: fast pulse 
      - light.turn_on:
          id: led_strip  
          red: 0%
          green: 100%
          blue: 0%
          brightness: 60%
          effect: fast pulse 
      - delay: 2s
      - light.turn_off:
          id: led_strip
      - light.turn_off:
          id: led_ww

binary_sensor:
  - platform: gpio
    id: button01
    name: "Mute Button" # Physical Mute switch
    pin:
      number: GPIO10  #Physical Button connected to this pin.
      inverted: True
      mode:
        input: True
        pullup: True
    on_press: 
      then:
        - switch.toggle: mute

light:
  - platform: esp32_rmt_led_strip
    id: led_ww
    rgb_order: GRB
    pin: GPIO48
    num_leds: 1
    rmt_channel: 0
    chipset: ws2812
    name: "On board light"
    effects:
      - pulse:
      - pulse:
          name: "Fast Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
          min_brightness: 0%
          max_brightness: 100%

  - platform: esp32_rmt_led_strip
    id: led_strip
    rgb_order: GRB
    pin: GPIO09
    num_leds: 29
    rmt_channel: 1
    chipset: ws2812
    name: "Led Strip"
    effects:
      - pulse:
      - pulse:
          name: "Fast Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
          min_brightness: 0%
          max_brightness: 100%
      - addressable_scan:
          name: "Scan Effect With Custom Values"
          move_interval: 5ms
          scan_width: 10

 # Audio and Voice Assistant Config          
i2s_audio:
  - id: i2s_in # For microphone
    i2s_lrclk_pin: GPIO3  #WS 
    i2s_bclk_pin: GPIO2 #SCK

  - id: i2s_speaker #For Speaker
    i2s_lrclk_pin: GPIO6  #LRC 
    i2s_bclk_pin: GPIO20 #BCLK

microphone:
  - platform: i2s_audio
    id: va_mic
    adc_type: external
    i2s_din_pin: GPIO4 #SD
    channel: left
    pdm: false
    i2s_audio_id: i2s_in
    bits_per_sample: 32bit

speaker:
    platform: i2s_audio
    id: va_speaker
    i2s_audio_id: i2s_speaker
    dac_type: external
    i2s_dout_pin: GPIO21   #  DIN Pin of the MAX98357A Audio Amplifier
    mode: mono

micro_wake_word:
  models:
    - hey_jarvis
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
        silence_detection: true
    - light.turn_on:
        id: led_ww           
        red: 30%
        green: 30%
        blue: 70%
        brightness: 60%
        effect: fast pulse 
    - light.turn_on:
        id: led_strip
        effect: "Scan Effect With Custom Values"
        red: 80%
        green: 0%
        blue: 80%
        brightness: 80%

voice_assistant:
  id: va
  microphone: va_mic
  auto_gain: 31dBFS
  noise_suppression_level: 2
  volume_multiplier: 4.0
  speaker: va_speaker
  on_stt_end:
       then: 
         - light.turn_off: led_ww
         - light.turn_off: led_strip
  on_error:
          - micro_wake_word.start:  
  on_end:
        then:
          - light.turn_off: led_ww
          - light.turn_off: led_strip
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start:

Anything in the logs that might be useful for us?

[20:39:18][D][micro_wake_word:162]: The 'hey jarvis' model sliding average probability is 0.990 and most recent probability is 1.000
[20:39:18][D][micro_wake_word:123]: Wake Word 'hey jarvis' Detected
[20:39:18][D][micro_wake_word:195]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[20:39:18][D][micro_wake_word:129]: Stopping Microphone
[20:39:18][D][micro_wake_word:195]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[20:39:18][D][esp-idf:000]: I (5949882) I2S: DMA queue destroyed

[20:39:18][D][micro_wake_word:195]: State changed from STOPPING_MICROPHONE to IDLE
[20:39:18][D][voice_assistant:504]: State changed from IDLE to START_MICROPHONE
[20:39:18][D][voice_assistant:510]: Desired state set to START_PIPELINE
[20:39:18][D][light:036]: 'On board light' Setting:
[20:39:18][D][light:047]: State: ON
[20:39:18][D][light:051]: Brightness: 60%
[20:39:18][D][light:059]: Red: 43%, Green: 43%, Blue: 100%
[20:39:18][D][light:109]: Effect: 'Fast Pulse'
[20:39:18][D][light:036]: 'Led Strip' Setting:
[20:39:18][D][light:047]: State: ON
[20:39:18][D][light:051]: Brightness: 80%
[20:39:18][D][light:059]: Red: 100%, Green: 0%, Blue: 100%
[20:39:18][D][light:109]: Effect: 'Scan Effect With Custom Values'
[20:39:18][D][voice_assistant:221]: Starting Microphone
[20:39:18][D][voice_assistant:504]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[20:39:18][D][esp-idf:000]: I (5949928) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[20:39:18][D][voice_assistant:504]: State changed from STARTING_MICROPHONE to START_PIPELINE
[20:39:18][D][voice_assistant:275]: Requesting start...
[20:39:18][D][voice_assistant:504]: State changed from START_PIPELINE to STARTING_PIPELINE
[20:39:18][D][voice_assistant:525]: Client started, streaming microphone
[20:39:18][D][voice_assistant:504]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[20:39:18][D][voice_assistant:510]: Desired state set to STREAMING_MICROPHONE
[20:39:18][D][voice_assistant:627]: Event Type: 1
[20:39:18][D][voice_assistant:630]: Assist Pipeline running
[20:39:18][D][voice_assistant:627]: Event Type: 3
[20:39:18][D][voice_assistant:641]: STT started
[20:39:20][D][voice_assistant:627]: Event Type: 11
[20:39:20][D][voice_assistant:783]: Starting STT by VAD
[20:39:23][D][voice_assistant:627]: Event Type: 12
[20:39:23][D][voice_assistant:787]: STT by VAD end
[20:39:23][D][voice_assistant:504]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[20:39:23][D][voice_assistant:510]: Desired state set to AWAITING_RESPONSE
[20:39:23][D][voice_assistant:504]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[20:39:23][D][esp-idf:000]: I (5954617) I2S: DMA queue destroyed

[20:39:23][D][voice_assistant:504]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[20:39:30][D][voice_assistant:627]: Event Type: 4
[20:39:30][D][voice_assistant:655]: Speech recognised as: " Turn on the lights."
[20:39:30][D][light:036]: 'On board light' Setting:
[20:39:30][D][light:047]: State: OFF
[20:39:30][D][light:085]: Transition length: 1.0s
[20:39:30][D][light:091]: Effect: 'None'
[20:39:30][D][light:036]: 'Led Strip' Setting:
[20:39:30][D][light:047]: State: OFF
[20:39:30][D][light:085]: Transition length: 1.0s
[20:39:30][D][light:091]: Effect: 'None'
[20:39:30][D][voice_assistant:627]: Event Type: 5
[20:39:30][D][voice_assistant:660]: Intent started
[20:39:31][D][light:036]: 'Led Strip' Setting:
[20:39:31][D][light:047]: State: ON
[20:39:31][D][light:085]: Transition length: 1.0s
[20:39:31][D][light:036]: 'On board light' Setting:
[20:39:31][D][light:047]: State: ON
[20:39:31][D][light:085]: Transition length: 1.0s
[20:39:31][D][voice_assistant:627]: Event Type: 6
[20:39:31][D][voice_assistant:627]: Event Type: 7
[20:39:31][D][voice_assistant:683]: Response: "Turned on the lights"
[20:39:31][D][voice_assistant:627]: Event Type: 2
[20:39:31][D][voice_assistant:719]: Assist Pipeline ended
[20:39:31][D][voice_assistant:504]: State changed from AWAITING_RESPONSE to IDLE
[20:39:31][D][voice_assistant:510]: Desired state set to IDLE
[20:39:31][D][esp-idf:000][speaker_task]: I (5962148) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[20:39:31][D][light:036]: 'On board light' Setting:
[20:39:31][D][light:047]: State: OFF
[20:39:31][D][light:085]: Transition length: 1.0s
[20:39:31][D][light:036]: 'Led Strip' Setting:
[20:39:31][D][light:047]: State: OFF
[20:39:31][D][light:085]: Transition length: 1.0s
[20:39:31][D][micro_wake_word:399]: Resetting buffers and probabilities
[20:39:31][D][micro_wake_word:195]: State changed from IDLE to START_MICROPHONE
[20:39:31][D][micro_wake_word:107]: Starting Microphone
[20:39:31][D][micro_wake_word:195]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[20:39:31][D][i2s_audio.speaker:211]: Starting I2S Audio Speaker
[20:39:31][D][esp-idf:000]: I (5962206) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[20:39:31][D][micro_wake_word:195]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[20:39:31][D][i2s_audio.speaker:214]: Started I2S Audio Speaker
[20:39:31][D][esp-idf:000][speaker_task]: I (5962251) I2S: DMA queue destroyed

[20:39:31][D][i2s_audio.speaker:218]: Stopping I2S Audio Speaker
[20:39:31][D][i2s_audio.speaker:230]: Stopped I2S Audio Speaker
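
In the log above, the speaker is started and stopped at 20:39:31 with nothing played in between. One way to separate a wiring problem from a pipeline problem is to play a tone through the same speaker without involving the voice assistant. The fragment below is only a sketch: it assumes the `rtttl` component's `speaker` option is available in the installed ESPHome version, and the `test_tone` id, the button name, and the melody string are made up for illustration.

```yaml
# Hypothetical hardware check, not part of the original config.
# Plays a short tone on the existing va_speaker when the button is pressed.
rtttl:
  id: test_tone          # illustrative id
  speaker: va_speaker    # the speaker id defined in the config above

# Merge this entry into the existing `button:` list rather than adding
# a second top-level `button:` key.
button:
  - platform: template
    name: "Speaker Test Tone"   # illustrative name
    on_press:
      - rtttl.play: "test:d=8,o=5,b=100:c,e,g,c6"
```

If the tone is audible, the amplifier wiring and I2S output are likely fine, and the missing sound is more likely caused by the voice pipeline never delivering audio to the device.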

Additional information

No response

Coldness00 commented 1 month ago

Hello, I had the same issue. I fixed it by creating a media_player entity:

media_player:
  - platform: i2s_audio
    id: speaki2s
    i2s_audio_id: i2s_out
    name: ESPHome VA Media Player
    dac_type: external
    i2s_dout_pin: GPIO16
    mode: mono
    on_pause:
      - media_player.stop

Then, in voice_assistant, I do not define a speaker. Instead, I tell it to play the audio file on the media player, using the URL that is passed in:

voice_assistant:
  id: va
  microphone: mici2s
  use_wake_word: true
  noise_suppression_level: 4
  auto_gain: 31dBFS
  volume_multiplier: 8.0
  on_tts_end:
        - media_player.play_media:
            id: speaki2s
            media_url: !lambda 'return x;'

aviitzhaki commented 1 month ago

Thanks

The media_player entity needs the arduino platform, but micro_wake_word needs the esp-idf platform.

How can I resolve this?

Coldness00 commented 1 month ago

I use push to talk :/ Maybe try it with speaker to see if it also works.

aviitzhaki commented 1 month ago

Thank you @Coldness00

The issue was related to the network settings.

The solution:

  1. Turn off "Local network".
  2. Set the local network URL to https://homeassistant.localdomain
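
The thread does not explain why the network setting mattered. Assuming it changes the address Home Assistant advertises for the response audio, a sketch like the following could help confirm that: it logs the URL passed to `on_tts_end` (the same `x` variable used in the media player workaround above), so you can check whether the device can actually reach that host. The log message text is illustrative.

```yaml
voice_assistant:
  # ... keep the existing options (id, microphone, speaker, etc.) as they are
  on_tts_end:
    # `x` is the URL of the TTS audio, as in the media_player example above.
    - logger.log:
        format: "TTS audio URL from Home Assistant: %s"
        args: ["x.c_str()"]
```

If the logged host name cannot be resolved or reached from the device's network, adjusting the Home Assistant URL as described in the solution above is the likely fix.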