Voice Assistant-Add support for the espressif esp32-korvo-v1.1

Voice Assistant-Add support for the espressif esp32-korvo-v1.1 #2430

Open rarroyo6 opened 1 year ago

rarroyo6 commented 1 year ago

Describe the problem you have/What new integration you would like

Add support for the Espressif ESP32-Korvo v1.1; see the documentation here: https://github.com/espressif/esp-skainet/blob/master/docs/en/hw-reference/esp32/user-guide-esp32-korvo-v1.1.md

Please describe your use case for this integration and alternatives you've tried:

This will provide another option for a voice assistant with all the needed features built in.

Additional context

This board is available, relatively inexpensive, and has a microphone array, LEDs, and speaker output.

alextrical commented 8 months ago

I'm in the process of designing a smart speaker case for this series of boards (Korvo v1.1 and Korvo1 V5), with the intent of making it both elegant (3D printed, but not looking like it) and compatible with the Echo Dot V3 mounting accessories ecosystem. (Image: render of the current WIP case.)

For more details of the case design, see here: https://community.home-assistant.io/t/far-field-satellite-with-an-elegant-3d-printed-enclosures/699893

Further testing, and hopefully a working YAML, will be posted once I get back from holiday in a week and get some tests done. After that I will start making a cost-optimised version of the board, while aiming to keep future upgrade paths for AEC and beamforming with a 3-mic array.

alextrical commented 8 months ago

Edit: Connecting this to a GPIO still didn't work for audio output.

~~Looking at the YAML, when trying to compile, it seems that it will not compile if i2s_mclk_pin is missing or set to the common unused-pin identifier of -1.~~

I would assume that this means the ES7210 actually needs this pin to be connected for the audio output to work. Are you aware that the ES7210 MCLK pin isn't connected in hardware (R211, 0R, is NC)? At least according to the schematic.

~~Edit: I can confirm that the ES7210 MCLK has no connection to the ESP32.~~ (Photo: 20240319_194136)

~~I'm going to try both bridging the pad to share GPIO0 and connecting it to an unused GPIO, possibly 16 or 17, depending on which is a safe pin to use.~~

alextrical commented 8 months ago

It's also interesting to see that this board has pads for both a WROOM and a WROVER module. That gives hope to the idea of transplanting in an S3 module at some point in the future.

alextrical commented 8 months ago

Edit: Connecting this to a GPIO still didn't work for audio output. ~~Definitely a PITA to solder, but it should be connected up. If this DAC works in the ESP32-S3-BOX v2.5 it should be a lot closer to working.~~

janstadt commented 8 months ago

Has anyone had any luck piping the TTS out to an external media_player in Home Assistant? I see that there is a speaker section in the configs, but I also followed an M5Stack tutorial on piping the response from on_tts_end in the voice assistant to a Home Assistant speaker, like this:

on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.master_bedroom
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"

More info here: https://www.youtube.com/watch?v=o3yZWD_sFIE

Unfortunately, this didn't work for this device for some reason. Neither TTS, STT, nor the wake word would work with this section in there. I do have the boolean set that allows the ESP device to make calls to HA, if anyone is wondering.

janstadt commented 8 months ago

OK, so I have it working, and it seems to keep working after a while too. Increasing the delay time seems to have fixed it. I updated to the release that just came out and it seems even more stable, so I made the config require it (min_version). I don't get audio out, and I think that's because the v1.1 only supports Chinese according to its docs. I also got rid of the GPIO0 reuse error and the warning it shows.

esphome:
  name: bender
  friendly_name: bender
  min_version: 2023.12.8
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    - priority: -100
      then:
        - wait_until: api.connected
        - delay: 4s
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:

esp32:
  board: esp-wrover-kit
  framework:
    type: arduino
    version: recommended

external_components:
  - source: github://rpatel3001/esphome@es8311
    components: [ es8311 ]
  - source: github://rpatel3001/esphome@es7210
    components: [ es7210 ]
  - source: github://pr#5230
    components:
      - esp_adf

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: !secret haapienc

ota:
  password: !secret haotaenc

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

i2c:
  sda: GPIO19 #GPIO1
  scl: GPIO32 #GPIO2
  scan: true
  frequency: 400kHz

es8311:
  address: 0x18

es7210:
  address: 0x40

output:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO12 #GPIO38

i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22 #GPIO41 #ws
    i2s_bclk_pin: GPIO25 #GPIO40 #clk
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true
  - id: mic_adc
    i2s_lrclk_pin: GPIO26 #GPIO9 #ws
    i2s_bclk_pin: GPIO27 #GPIO10 #clk
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: codec
    i2s_dout_pin: GPIO13 #GPIO39
    mode: mono

microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_audio_id: mic_adc
    i2s_din_pin: GPIO36 #GPIO11
    pdm: false

voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  noise_suppression_level: 2
  auto_gain: 15dBFS
  volume_multiplier: 0.5
  use_wake_word: false
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
  on_end:
    - delay: 100ms
    - wait_until:
        not:
          speaker.is_playing:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 30%
                red: 0%
                green: 0%
                brightness: 25%
                effect: none
          else:
            - light.turn_off: led_ring

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO33 #GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 40ms
          reverse: false

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    on_press:
      - output.turn_on: pa_ctrl
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          brightness: 100%
          effect: "Wakeword"
    on_release:
      - voice_assistant.stop:
      - output.turn_off: pa_ctrl
      - light.turn_off:
          id: led_ring

sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 39 #8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF

I can't seem to get it to continue listening for wake words after the first time. Also, there seem to be some issues when I restart the device: it doesn't connect to my Wi-Fi, and I basically have to reflash via UART.

janstadt commented 8 months ago

Interesting. If I hold down the RECORD button, the wake word works just fine. Is there a way to keep that on always? I see the check that starts continuous listening if use_wake_word is on. If I toggle my wake word boolean in HA off and on (even though it says it's on), then things start working. I wonder why that boolean isn't being honored on startup?
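One thing that might be worth trying, as an untested sketch rather than a confirmed fix: the config above starts continuous listening from on_boot after a fixed delay, which may fire before the Home Assistant API client is actually connected. Assuming the ESPHome release in use provides the on_client_connected and on_client_disconnected triggers on voice_assistant, the start could be tied to the client connection instead. The ids (voice_asst, external_mic, external_speaker, use_wake_word, reset_led, led_ring) are the ones from the config above; everything else is an assumption.

```yaml
# Sketch only: merge into the existing voice_assistant block and keep
# the other options (noise_suppression_level, on_listening, ...) as they are.
# Assumption: on_client_connected / on_client_disconnected are available
# in the ESPHome version being used.
voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  on_client_connected:
    # Start listening only once Home Assistant has connected.
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous:
          - script.execute: reset_led
  on_client_disconnected:
    # Stop cleanly when the connection goes away.
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.stop:
          - light.turn_off: led_ring
```

If this is used, the on_boot block that calls voice_assistant.start_continuous after a delay would presumably become redundant and could be removed.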

janstadt commented 8 months ago

substitutions:
  name: alexa-livingroom
  friendly_name: Alexa Living Room
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  min_version: 2023.12.8
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    - priority: -100
      then:
        # - wait_until: api.connected
        - delay: 15s #4s
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:

esp32:
  board: esp-wrover-kit
  framework:
    type: arduino
    version: recommended

external_components:
  - source: github://rpatel3001/esphome@es8311
    components: [ es8311 ]
  - source: github://rpatel3001/esphome@es7210
    components: [ es7210 ]
  - source: github://pr#5230
    components:
      - esp_adf

# Enable logging
logger:
  baud_rate: 9600
  level: INFO

# Enable Home Assistant API
api:
  reboot_timeout: 1min
  encryption:
    key: "[key]"

ota:
  password: "[pwd]"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  domain: [domain]
  fast_connect: true
  manual_ip:
    static_ip: [ip]
    gateway: [gateway]
    subnet: [subnet]

i2c:
  sda: GPIO19 #GPIO1
  scl: GPIO32 #GPIO2
  scan: true
  frequency: 400kHz

es8311:
  address: 0x18

es7210:
  address: 0x40

output:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO12 #GPIO38

i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22 #GPIO41 #ws
    i2s_bclk_pin: GPIO25 #GPIO40 #clk
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true
  - id: mic_adc
    i2s_lrclk_pin: GPIO26 #GPIO9 #ws
    i2s_bclk_pin: GPIO27 #GPIO10 #clk
    i2s_mclk_pin:
       number: GPIO0
       allow_other_uses: true
       ignore_strapping_warning: true

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: codec
    i2s_dout_pin: GPIO13 #GPIO39
    mode: mono

microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_audio_id: mic_adc
    i2s_din_pin: GPIO36 #GPIO11
    pdm: false

voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  noise_suppression_level: 2
  auto_gain: 15dBFS
  volume_multiplier: 0.5
  use_wake_word: false
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.the_kitchen
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
  on_end:
    - delay: 100ms
    - wait_until:
        not:
          speaker.is_playing:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 30%
                red: 0%
                green: 0%
                brightness: 25%
                effect: none
          else:
            - light.turn_off: led_ring

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO33 #GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 40ms
          reverse: false

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    on_press:
      - output.turn_on: pa_ctrl
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          brightness: 100%
          effect: "Wakeword"
    on_release:
      - voice_assistant.stop:
      - output.turn_off: pa_ctrl
      - light.turn_off:
          id: led_ring

sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 39 #8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF

OK, so this is working for me now. I had to do some random things with the Wi-Fi settings to make it reconnect on reboot, but overall I think this is a good solution if you have an external speaker connected to HA to pipe audio to.

janstadt commented 8 months ago

Bummer, the network still won't connect automatically. Does anyone have any ideas on that?

janstadt commented 8 months ago

Oh wow, I clicked the RST button while connected to the device after power cycling, and it kicked on. Does that give anyone any ideas? Why isn't it auto-resetting or something on power cycle?

DuploDom commented 8 months ago

Edit: Connecting this to a GPIO still didn't work for audio output. ~~Definitely a PITA to solder, but it should be connected up. If this DAC works in the ESP32-S3-BOX v2.5 it should be a lot closer to working.~~

Did you find a solution to make the audio out / speaker work?

huishizhao commented 8 months ago

Could not call the Home Assistant service to speak the intent response.

(Configuration identical to the one janstadt posted above, including the on_tts_end call to media_player.play_media.)

Could not call the Home Assistant media_player.play_media service to speak the intent response through a media player. Error message:

2024-04-05 15:57:58.039 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/core.py", line 2542, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2579, in _execute_service
    return await target(service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 971, in entity_service_call
    single_response = await _handle_entity_call(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 1043, in _handle_entity_call
    result = await task
             ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/mpd/media_player.py", line 504, in async_play_media
    await self._client.add(media_id)
  File "/usr/local/lib/python3.12/site-packages/mpd/asyncio.py", line 318, in __run
    await result._feed_from(self)
  File "/usr/local/lib/python3.12/site-packages/mpd/asyncio.py", line 45, in _feed_from
    line = await mpdclient._read_line()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/mpd/asyncio.py", line 416, in _read_line
    raise CommandError(error)
mpd.base.CommandError: [50@0] {add} No such directory

From the ESPHome log, I found the URL of the response audio, and I could play it in the browser:

[voice_assistant:599]: Response URL: "https://xxxxx.duckdns.org:8123/api/tts_proxy/9fa665f3bba2007e2f73f5b06628d6918897e9d9_zh_16de212ebd_baidu.wav"

huishizhao commented 8 months ago

Built an on-device wake word setup using microWakeWord:

substitutions:
  name: esp32-korvo-1
  friendly_name: esp32-korvo-1
  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"
  micro_wake_word_model: okay_nabu
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  min_version: 2023.12.8
  platformio_options:
    board_build.flash_mode: dio
  project:
    name: esphome.voice-assistant
    version: "2.0"
  on_boot:
    - priority: -100
      then:
        - light.turn_on:
            id: led_ring
            blue: 0%
            red: 100%
            green: 0%
            effect: Fast Pulse
        - delay: 1s
        - wait_until:
            condition:
              wifi.connected:
        - light.turn_on:
            id: led_ring
            blue: 0%
            red: 100%
            green: 50%
            effect: Slow Pulse
        - wait_until: 
            condition:
              api.connected
        - lambda: id(init_in_progress) = false;
        - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        - script.execute: reset_led
esp32:
  board: esp-wrover-kit
  flash_size: 16MB
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_IDF_TARGET_ESP32: y
      CONFIG_ESPTOOLPY_FLASHMODE_QIO: y
      CONFIG_ESPTOOLPY_FLASHFREQ_80M: y
      CONFIG_ESPTOOLPY_FLASHSIZE_16MB: y
      CONFIG_PARTITION_TABLE_CUSTOM: y
      CONFIG_PARTITION_TABLE_CUSTOM_FILENAME: "default_16MB.csv" #"partitions_esp32.csv"
      CONFIG_PARTITION_TABLE_FILENAME: "default_16MB.csv" #"partitions_esp32.csv"
      CONFIG_PARTITION_TABLE_OFFSET: "0x8000"
      CONFIG_ESP32_DEFAULT_CPU_FREQ_240: y
      CONFIG_ESP32_SPIRAM_SUPPORT: y
      CONFIG_SPIRAM_SPEED_80M: y
      CONFIG_ESP_SYSTEM_PANIC_SILENT_REBOOT: y
      CONFIG_I2S_ENABLE_DEBUG_LOG: y
#psram:
#  mode: octal
#  speed: 80MHz
external_components:
  - source: github://rpatel3001/esphome@es8311
    components: [ es8311 ]
  - source: github://rpatel3001/esphome@es7210
    components: [ es7210 ]
  - source: github://pr#5230
    components:
      - esp_adf

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "vRvf5APYhFeBjsFt8zzQ6xpuiZqn3oCAIbyVHCBawWM="

ota:
  password: "9522b9fe61f659e429743438edf3240e"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Korvo-1 Fallback Hotspot"
    password: "vBJEmQ5iJHQx"

captive_portal:

i2c:
  - id: bus
    sda: GPIO19
    scl: GPIO32
    scan: true
    frequency: 400kHz

es8311:
  address: 0x18

es7210:
  address: 0x40

output:
  - platform: gpio
    id: pa_ctrl
    pin:
      number: GPIO12
      ignore_strapping_warning: true
i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22
    i2s_bclk_pin: GPIO25
    i2s_mclk_pin:
      number: GPIO0
      allow_other_uses: true
      ignore_strapping_warning: true
  - id: mic_adc
    i2s_lrclk_pin: GPIO26
    i2s_bclk_pin: GPIO27
    i2s_mclk_pin:
      number: GPIO0
      allow_other_uses: true
      ignore_strapping_warning: true

esp_adf:

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: codec
    i2s_dout_pin: GPIO13
    mode: mono

microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_audio_id: mic_adc
    i2s_din_pin: GPIO36
    pdm: false

micro_wake_word:
  model: ${micro_wake_word_model}  #okay_nabu
  on_wake_word_detected:
    then:
      - voice_assistant.start:
          wake_word: !lambda return wake_word;

voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  noise_suppression_level: 2
  auto_gain: 15dBFS
  volume_multiplier: 0.5

  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: reset_led
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: reset_led
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 100%
        brightness: 60%
        effect: Working
  # on_tts_end passes the TTS response URL as x; on_stt_end would pass the recognised text instead
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.ke_ting
          media_content_id: !lambda return x;
          media_content_type: music
          announce: "true"

  on_tts_stream_start:
    - output.turn_on: pa_ctrl
    - delay: 100ms
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: reset_led

  on_end:
    - wait_until:
        not:
          speaker.is_playing:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: reset_led
    - if:
        condition:
          and:
            - switch.is_off: mute
            - lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start:

  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: reset_led
          - delay: 2s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: reset_led

  on_client_connected:
    - if:
        condition:
          switch.is_off: mute
        then:
          - if:
              condition:
                lambda: return id(wake_word_engine_location).state == "In Home Assistant";
              then:
                - lambda: id(voice_asst).set_use_wake_word(true);
                - voice_assistant.start_continuous:
          - if:
              condition:
                lambda: return id(wake_word_engine_location).state == "On device";
              then:
                - micro_wake_word.start
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false;
    - script.execute: reset_led

  on_client_disconnected:
    - if:
        condition:
          lambda: return id(wake_word_engine_location).state == "In Home Assistant";
        then:
          - lambda: id(voice_asst).set_use_wake_word(false);
          - voice_assistant.stop:
    - if:
        condition:
          lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - micro_wake_word.stop
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - script.execute: reset_led

script:
  - id: reset_led
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_listening_phase_id};
                then:                     
                  - light.turn_on:
                      id: led_ring
                      blue: 0%
                      red: 0%
                      green: 100%
                      brightness: 100%
                      effect: Wakeword
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_thinking_phase_id};
                then:                     
                  - light.turn_on:
                      id: led_ring
                      blue: 100%
                      red: 100%
                      green: 0%
                      brightness: 100%
                      effect: Working
                  - delay: 100ms
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_replying_phase_id};
                then:                     
                  - light.turn_on:
                      id: led_ring
                      blue: 100%
                      red: 0%
                      green: 0%
                      brightness: 100%
                      effect: Working
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 100%
                      red: 0%
                      green: 0%
                      brightness: 40%
                      effect: none
                  - delay: 200ms
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_not_ready_phase_id};
                then:                     
                  - light.turn_on:
                      id: led_ring
                      blue: 40%
                      red: 100%
                      green: 0%
                      effect: Slow Pulse
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_error_phase_id};
                then:                     
                  - light.turn_on:
                      id: led_ring
                      blue: 0%
                      red: 100%
                      green: 0%
                      brightness: 100%
                      effect: none
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id};
                then:                     
                  - light.turn_off: led_ring
          else:
            - light.turn_on:
                id: led_ring
                blue: 0%
                red: 100%
                green: 0%
                effect: Fast Pulse

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO33 #GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 300ms
          update_interval: 300ms
          min_brightness: 50%
          max_brightness: 100%

      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 3ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 40ms
          reverse: false
      - pulse:
          name: "Slow Pulse"
          transition_length: 0.5s
          update_interval: 1s
          min_brightness: 0%
          max_brightness: 100%
      - pulse:
          name: "Fast Pulse"
          transition_length: 50ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%

switch:
  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - if:
                      condition:
                        lambda: return id(wake_word_engine_location).state == "In Home Assistant";
                      then:
                        - lambda: id(voice_asst).set_use_wake_word(true);
                        - voice_assistant.start_continuous
                  - if:
                      condition:
                        lambda: return id(wake_word_engine_location).state == "On device";
                      then:
                        - micro_wake_word.start
            - script.execute: reset_led
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_asst).set_use_wake_word(false);
            - voice_assistant.stop
            - micro_wake_word.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: reset_led
  - platform: restart
    name: "${name} Restart"
select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - wait_until:
          lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
      - if:
          condition:
            lambda: return x == "In Home Assistant";
          then:
            - micro_wake_word.stop
            - delay: 500ms
            - if:
                condition:
                  switch.is_off: mute
                then:
                  - lambda: id(voice_asst).set_use_wake_word(true);
                  - voice_assistant.start_continuous:
      - if:
          condition:
            lambda: return x == "On device";
          then:
            - lambda: id(voice_asst).set_use_wake_word(false);
            - voice_assistant.stop
            - delay: 500ms
            - micro_wake_word.start

globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    publish_initial_state: true
    on_press:
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 0%
          green: 100%
          brightness: 100%
          effect: "Wakeword"
#    on_release:
#      - voice_assistant.stop:
#      - output.turn_off: pa_ctrl
#      - light.turn_off:
#          id: led_ring
sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 39 #8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF
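
For anyone adapting the button section: the `button_adc` sensor above reads a resistor ladder, so each button shows up as a voltage window on one ADC pin. Here is a rough Python sketch of that mapping, using the thresholds from the `on_value_range` entries. The function name and the inclusive boundary handling are my assumptions, and the real automation also depends on the median and delta filters.

```python
# Voltage windows copied from the on_value_range entries of the button_adc sensor.
# Bounds are in volts; None means "no bound on that side".
BUTTON_WINDOWS = [
    ("btn_volume_up", None, 0.55),
    ("btn_volume_down", 0.65, 0.92),
    ("btn_set", 1.02, 1.33),
    ("btn_play", 1.43, 1.77),
    ("btn_mode", 1.87, 2.15),
    ("btn_record", 2.25, 2.56),
]
RELEASED_ABOVE = 2.8  # above this voltage every button is published as OFF


def classify(voltage):
    """Return the pressed button id, "released", or None for a dead-band reading."""
    if voltage >= RELEASED_ABOVE:
        return "released"
    for button_id, low, high in BUTTON_WINDOWS:
        # Inclusive bounds are an assumption made for this sketch.
        if (low is None or voltage >= low) and voltage <= high:
            return button_id
    return None  # reading fell in a gap between windows: no state change


for v in (0.30, 0.80, 1.20, 2.40, 2.70, 3.10):
    print(f"{v:.2f} V -> {classify(v)}")
```

The gaps between windows (for example 0.55 V to 0.65 V) act as dead bands, so a noisy reading near a boundary does not flip between two buttons.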

Krull56 commented 8 months ago

Hi @huishizhao

Is audio out working?

huishizhao commented 8 months ago

Can anyone tell from the log why I2S audio isn't working?

I don't understand why the ES8311 I2C address is set to 0x18, or what gets written to its registers. According to the ES8311 manual, the chip address must be 0011 00x, where x equals CE. What does CE mean? (ES8311 PB.pdf)
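
On the address question, here is a small sketch of how the quoted pattern "0011 00x" maps to a 7-bit I2C address. It assumes CE is the codec's address-select pin, which should be checked against the datasheet and the board schematic. The function name is made up for illustration.

```python
def es8311_i2c_address(ce):
    """Return the 7-bit I2C address for a given CE level (0 or 1).

    Assumption: CE is the address-select pin and supplies the lowest address bit.
    """
    if ce not in (0, 1):
        raise ValueError("CE must be 0 or 1")
    # Pattern "0011 00x": six fixed bits followed by the CE bit.
    return (0b001100 << 1) | ce


for level in (0, 1):
    print(f"CE={level}: 0x{es8311_i2c_address(level):02X}")
```

Under that reading, 0x18 corresponds to CE tied low and 0x19 to CE tied high. With `scan: true` set on the I2C bus, the boot log should list whichever address actually responds.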

[15:16:31][D][micro_wake_word:177]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[15:27:53][D][micro_wake_word:362]: Wake word sliding average probability is 0.529 and most recent probability is 1.000
[15:27:53][D][micro_wake_word:128]: Wake Word Detected
[15:27:53][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[15:27:53][D][micro_wake_word:134]: Stopping Microphone
[15:27:53][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[15:27:53][D][esp-idf:000]: I (1042611) I2S: DMA queue destroyed
[15:27:53][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE
[15:27:53][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE
[15:27:53][D][voice_assistant:422]: Desired state set to START_MICROPHONE
[15:27:53][D][voice_assistant:118]: microphone not running
[15:27:53][D][voice_assistant:202]: Requesting start...
[15:27:53][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE
[15:27:53][D][voice_assistant:118]: microphone not running
[15:27:53][D][voice_assistant:118]: microphone not running
[15:27:53][D][voice_assistant:437]: Client started, streaming microphone
[15:27:53][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE
[15:27:53][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE
[15:27:53][D][voice_assistant:155]: Starting Microphone
[15:27:53][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[15:27:53][D][esp-idf:000]: I (1042703) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4
[15:27:53][D][esp-idf:000]: I (1042713) I2S: I2S1, MCLK output by GPIO0
[15:27:53][D][voice_assistant:523]: Event Type: 1
[15:27:53][D][voice_assistant:526]: Assist Pipeline running
[15:27:53][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[15:27:53][D][voice_assistant:523]: Event Type: 3
[15:27:53][D][voice_assistant:537]: STT started
[15:27:53][D][light:036]: 'esp32-korvo-2 Light' Setting:
[15:27:53][D][light:051]: Brightness: 100%
[15:27:53][D][light:059]: Red: 0%, Green: 100%, Blue: 0%
[15:27:55][D][voice_assistant:523]: Event Type: 11
[15:27:55][D][voice_assistant:677]: Starting STT by VAD
[15:27:56][D][voice_assistant:523]: Event Type: 12
[15:27:56][D][voice_assistant:681]: STT by VAD end
[15:27:56][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[15:27:56][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE
[15:27:56][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[15:27:56][D][light:036]: 'esp32-korvo-2 Light' Setting:
[15:27:57][D][light:051]: Brightness: 100%
[15:27:57][D][light:059]: Red: 100%, Green: 0%, Blue: 100%
[15:27:57][D][esp-idf:000]: I (1045924) I2S: DMA queue destroyed
[15:27:57][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[15:27:57][D][voice_assistant:523]: Event Type: 4
[15:27:57][D][voice_assistant:551]: Speech recognised as: "turn on kitchen light."
[15:27:57][D][voice_assistant:523]: Event Type: 5
[15:27:57][D][voice_assistant:556]: Intent started
[15:27:57][D][voice_assistant:523]: Event Type: 6
[15:27:57][D][voice_assistant:523]: Event Type: 7
[15:27:57][D][light:036]: 'esp32-korvo-2 Light' Setting:
[15:27:57][D][light:051]: Brightness: 60%
[15:27:57][D][light:059]: Red: 100%, Green: 100%, Blue: 0%
[15:27:57][D][voice_assistant:523]: Event Type: 8
[15:27:57][D][voice_assistant:599]: Response URL: "https://xxxxxx.duckdns.org:8123/api/tts_proxy/9fa665f3bba2007e2f73f5b06628d6918897e9d9_zh_16de212ebd_baidu.wav"
[15:27:57][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[15:27:57][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE
[15:27:57][D][esp-idf:000]: I (1046606) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8
[15:27:57][D][esp-idf:000]: I (1046609) I2S: I2S0, MCLK output by GPIO0
[15:27:57][D][i2s_audio.speaker:164]: Started I2S Audio Speaker
[15:27:57][D][light:036]: 'esp32-korvo-2 Light' Setting:
[15:27:57][D][light:051]: Brightness: 100%
[15:27:57][D][light:059]: Red: 0%, Green: 0%, Blue: 100%
[15:27:59][D][voice_assistant:523]: Event Type: 99
[15:27:59][D][voice_assistant:672]: TTS stream end
[15:27:59][D][voice_assistant:287]: End of audio stream received
[15:27:59][D][voice_assistant:416]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[15:27:59][D][voice_assistant:422]: Desired state set to RESPONSE_FINISHED
[15:27:59][D][i2s_audio.speaker:167]: Stopping I2S Audio Speaker
[15:27:59][D][i2s_audio.speaker:178]: Stopped I2S Audio Speaker
[15:27:59][D][light:036]: 'esp32-korvo-2 Light' Setting:
[15:27:59][D][light:051]: Brightness: 40%
[15:27:59][D][light:059]: Red: 0%, Green: 0%, Blue: 100%
[15:27:59][D][voice_assistant:319]: Speaker has finished outputting all audio
[15:27:59][D][voice_assistant:416]: State changed from RESPONSE_FINISHED to IDLE
[15:27:59][D][voice_assistant:422]: Desired state set to IDLE
[15:27:59][D][micro_wake_word:177]: State changed from IDLE to START_MICROPHONE
[15:27:59][D][micro_wake_word:115]: Starting Microphone
[15:27:59][D][micro_wake_word:177]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[15:28:00][D][esp-idf:000]: I (1048651) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4
[15:28:00][D][esp-idf:000]: I (1048653) I2S: I2S1, MCLK output by GPIO0
[15:28:00][D][micro_wake_word:177]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD

Krull56 commented 8 months ago

Thanks @huishizhao, I just tested it and micro-wakeword works fine. I hope someone finds a solution for audio out with the ES8311, which would make this board a perfect low-cost Assist satellite.

huishizhao commented 8 months ago

> Hi @huishizhao
>
> Is audio out working?

No, I could not find the root cause. My guess is an I2S clock or I2C address issue. There are no errors in the ESPHome log.

TomG736 commented 8 months ago

I almost have audio working with this, but at the moment I just get a load of static. The MCLK pin isn't connected, so it needs to be disabled. With it disabled, the driver tries to set the internal frequency to 512000, which isn't supported, so it fails. With that fixed in the driver, I just get static on the speaker. So progress, but no luck so far.
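
One possible reading of the 512000 figure: it matches the I2S bit clock for 16 kHz, 16-bit audio with two slots per frame. Those three values are assumptions on my part, not confirmed in this thread. It would fit a driver that falls back to the bit clock as its reference when MCLK is disabled, though whether the driver really derives the value this way is also unconfirmed. A quick check of the arithmetic:

```python
def i2s_bclk_hz(sample_rate_hz, bits_per_sample, slots):
    """Bit clock: one clock per bit, for every bit of every slot of every frame."""
    return sample_rate_hz * bits_per_sample * slots


# Assumed stream format: 16 kHz, 16-bit samples, 2 slots (left/right) per frame.
bclk = i2s_bclk_hz(sample_rate_hz=16_000, bits_per_sample=16, slots=2)
print(bclk)            # bit clock in Hz
print(bclk // 16_000)  # bit clocks per audio frame
```

If that reading is right, the codec would have to run from a clock only 32 times the sample rate, which is much lower than a typical MCLK of 256 times the sample rate.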

GinAndBacon commented 8 months ago

Hmm. I'm a bit confused by the microWakeWord config above. My understanding was that it has to run on an ESP32-S3 with PSRAM because of how it works. Is this no longer the case? My basic understanding is that it listens for the wake word every 20 ms or so and stores the audio in PSRAM, since it takes a few ms to determine whether it has been triggered. The audio stays in PSRAM, and the oldest data is simply deleted when the PSRAM gets close to full.

I do know that when microWakeWord first came out it didn't work on S3 models with only 2 MB of PSRAM, and this was confirmed by the ESPHome devs. They eventually got that working, but 2 MB of PSRAM was still required. Have they now gotten it to work with any ESP32 variant, so that PSRAM is no longer required? HA can move so fast that, if so, I missed that announcement.

Also, the voice pipeline used should have no wake word value in Home Assistant because it's defined in the ESPHome configuration file. While I have been working on the Korvo-1, and I know this isn't the thread for that one, I'm just curious whether PSRAM is no longer required for microWakeWord, because everything I've read says it's a requirement. Additionally, the ESPHome documentation says it's still required, although it may not have been updated.

https://esphome.io/components/micro_wake_word.html

The micro_wake_word component requires an ESP32-S3 with PSRAM to function.
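
The buffering model described in this comment (keep recent audio, drop the oldest data when the buffer fills up) is essentially a ring buffer. Here is a minimal sketch of that idea only. It is not the actual micro_wake_word implementation, and the class name and chunk labels are made up for illustration.

```python
from collections import deque


class AudioRingBuffer:
    """Fixed-capacity buffer that silently drops the oldest chunk when full."""

    def __init__(self, capacity_chunks):
        self._chunks = deque(maxlen=capacity_chunks)

    def push(self, chunk):
        # deque(maxlen=...) evicts the oldest entry once capacity is reached.
        self._chunks.append(chunk)

    def snapshot(self):
        # Return the buffered chunks, oldest first.
        return list(self._chunks)


buf = AudioRingBuffer(capacity_chunks=3)
for i in range(5):
    buf.push(f"chunk-{i}")
print(buf.snapshot())
```

How much memory this needs depends on the capacity and chunk size, which is why the PSRAM question comes down to how much audio and model state the component actually keeps around.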

huishizhao commented 7 months ago

> Hmm. I'm a bit confused by the microWakeWord config above. My understanding was that it has to run on an ESP32-S3 with PSRAM because of how it works. Is this no longer the case? My basic understanding is that it listens for the wake word every 20 ms or so and stores the audio in PSRAM, since it takes a few ms to determine whether it has been triggered. The audio stays in PSRAM, and the oldest data is simply deleted when the PSRAM gets close to full.
>
> I do know that when microWakeWord first came out it didn't work on S3 models with only 2 MB of PSRAM, and this was confirmed by the ESPHome devs. They eventually got that working, but 2 MB of PSRAM was still required. Have they now gotten it to work with any ESP32 variant, so that PSRAM is no longer required? HA can move so fast that, if so, I missed that announcement.
>
> Also, the voice pipeline used should have no wake word value in Home Assistant because it's defined in the ESPHome configuration file. While I have been working on the Korvo-1, and I know this isn't the thread for that one, I'm just curious whether PSRAM is no longer required for microWakeWord, because everything I've read says it's a requirement. Additionally, the ESPHome documentation says it's still required, although it may not have been updated.
>
> https://esphome.io/components/micro_wake_word.html
>
> The micro_wake_word component requires an ESP32-S3 with PSRAM to function.

From the ESP32-Korvo V1.1 user manual: "ESP32-WROVER-E: This ESP32 module contains the latest ESP32-D0WD-V3, a 16 MB flash and a 8 MB PSRAM for flexible data storage, featuring Wi-Fi / BT connectivity and data processing capability." https://github.com/espressif/esp-skainet/blob/master/docs/en/hw-reference/esp32/user-guide-esp32-korvo-v1.1.md

nagyrobi commented 7 months ago

https://github.com/espressif/esp-skainet

ESP32-S3 is recommend to run speech commands recognition, which supports AI instructions and high-speed octal SPI PSRAM. The Latest models will be deployed on ESP32-S3

GinAndBacon commented 7 months ago

EDIT: removed, see below post

GinAndBacon commented 7 months ago

Edit: Have you tried a voice pipeline with no wake word defined (I believe it defaults to okay_nabu for openWakeWord)? If not, create one, use the Alexa wake word model in your config, and see if it still works with either Jarvis or Alexa. With microWakeWord, no wake word value should be populated for the voice pipeline.

Maybe I'm wrong; maybe PSRAM is the secret sauce. It looks like you have it defined in your config, so maybe it somehow sees it and uses it, since it's probably pretty rare for WROVER development boards to have external PSRAM, at least as far as I'm aware.

janstadt commented 7 months ago

Build an on-device wake word using microWakeWord:

substitutions:
  name: esp32-korvo-1
  friendly_name: esp32-korvo-1
  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"
  micro_wake_word_model: okay_nabu
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  min_version: 2023.12.8
  platformio_options:
    board_build.flash_mode: dio
  project:
    name: esphome.voice-assistant
    version: "2.0"
  on_boot:
    - priority: -100
      then:
        - light.turn_on:
            id: led_ring
            blue: 0%
            red: 100%
            green: 0%
            effect: Fast Pulse
        - delay: 1s
        - wait_until:
            condition:
              wifi.connected:
        - light.turn_on:
            id: led_ring
            blue: 0%
            red: 100%
            green: 50%
            effect: Slow Pulse
        - wait_until:
            condition:
              api.connected:
        - lambda: id(init_in_progress) = false;
        - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        - script.execute: reset_led
esp32:
  board: esp-wrover-kit
  flash_size: 16MB
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_IDF_TARGET_ESP32: y
      CONFIG_ESPTOOLPY_FLASHMODE_QIO: y
      CONFIG_ESPTOOLPY_FLASHFREQ_80M: y
      CONFIG_ESPTOOLPY_FLASHSIZE_16MB: y
      CONFIG_PARTITION_TABLE_CUSTOM: y
      CONFIG_PARTITION_TABLE_CUSTOM_FILENAME: "default_16MB.csv" #"partitions_esp32.csv"
      CONFIG_PARTITION_TABLE_FILENAME: "default_16MB.csv" #"partitions_esp32.csv"
      CONFIG_PARTITION_TABLE_OFFSET: "0x8000"
      CONFIG_ESP32_DEFAULT_CPU_FREQ_240: y
      CONFIG_ESP32_SPIRAM_SUPPORT: y
      CONFIG_SPIRAM_SPEED_80M: y
      CONFIG_ESP_SYSTEM_PANIC_SILENT_REBOOT: y
      CONFIG_I2S_ENABLE_DEBUG_LOG: y
#psram:
#  mode: octal
#  speed: 80MHz
external_components:
  - source: github://rpatel3001/esphome@es8311
    components: [ es8311 ]
  - source: github://rpatel3001/esphome@es7210
    components: [ es7210 ]
  - source: github://pr#5230
    components:
      - esp_adf

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "vRvf5APYhFeBjsFt8zzQ6xpuiZqn3oCAIbyVHCBawWM="

ota:
  password: "9522b9fe61f659e429743438edf3240e"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Korvo-1 Fallback Hotspot"
    password: "vBJEmQ5iJHQx"

captive_portal:

i2c:
  - id: bus
    sda: GPIO19
    scl: GPIO32
    scan: true
    frequency: 400kHz

es8311:
  address: 0x18

es7210:
  address: 0x40

output:
  - platform: gpio
    id: pa_ctrl
    pin:
      number: GPIO12
      ignore_strapping_warning: true
i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22
    i2s_bclk_pin: GPIO25
    i2s_mclk_pin:
      number: GPIO0
      allow_other_uses: true
      ignore_strapping_warning: true
  - id: mic_adc
    i2s_lrclk_pin: GPIO26
    i2s_bclk_pin: GPIO27
    i2s_mclk_pin:
      number: GPIO0
      allow_other_uses: true
      ignore_strapping_warning: true

esp_adf:

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: codec
    i2s_dout_pin: GPIO13
    mode: mono

microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_audio_id: mic_adc
    i2s_din_pin: GPIO36
    pdm: false

micro_wake_word:
  model: ${micro_wake_word_model}  #okay_nabu
  on_wake_word_detected:
    then:
      - voice_assistant.start:
          wake_word: !lambda return wake_word;

voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  noise_suppression_level: 2
  auto_gain: 15dBFS
  volume_multiplier: 0.5

  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: reset_led
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: reset_led
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 100%
        brightness: 60%
        effect: Working
  # on_tts_end passes the TTS response URL as x; on_stt_end would pass the recognised text instead
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.ke_ting
          media_content_id: !lambda return x;
          media_content_type: music
          announce: "true"

  on_tts_stream_start:
    - output.turn_on: pa_ctrl
    - delay: 100ms
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: reset_led

  on_end:
    - wait_until:
        not:
          speaker.is_playing:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: reset_led
    - if:
        condition:
          and:
            - switch.is_off: mute
            - lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start:

  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: reset_led
          - delay: 2s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: reset_led

  on_client_connected:
    - if:
        condition:
          switch.is_off: mute
        then:
          - if:
              condition:
                lambda: return id(wake_word_engine_location).state == "In Home Assistant";
              then:
                - lambda: id(voice_asst).set_use_wake_word(true);
                - voice_assistant.start_continuous:
          - if:
              condition:
                lambda: return id(wake_word_engine_location).state == "On device";
              then:
                - micro_wake_word.start
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false;
    - script.execute: reset_led

  on_client_disconnected:
    - if:
        condition:
          lambda: return id(wake_word_engine_location).state == "In Home Assistant";
        then:
          - lambda: id(voice_asst).set_use_wake_word(false);
          - voice_assistant.stop:
    - if:
        condition:
          lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - micro_wake_word.stop
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - script.execute: reset_led

script:
  - id: reset_led
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_listening_phase_id};
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 0%
                      red: 0%
                      green: 100%
                      brightness: 100%
                      effect: wakeword
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_thinking_phase_id};
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 100%
                      red: 100%
                      green: 0%
                      brightness: 100%
                      effect: Working
                  - delay: 100ms
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_replying_phase_id};
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 100%
                      red: 0%
                      green: 0%
                      brightness: 100%
                      effect: Working
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 100%
                      red: 0%
                      green: 0%
                      brightness: 40%
                      effect: none
                  - delay: 200ms
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_not_ready_phase_id};
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 40%
                      red: 100%
                      green: 0%
                      effect: Slow Pulse
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_error_phase_id};
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 0%
                      red: 100%
                      green: 0%
                      brightness: 100%
                      effect: none
            - if:
                condition:
                  lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id};
                then:
                  - light.turn_off: led_ring
          else:
            - light.turn_on:
                id: led_ring
                blue: 0%
                red: 100%
                green: 0%
                effect: Fast Pulse

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO33 #GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 300ms
          update_interval: 300ms
          min_brightness: 50%
          max_brightness: 100%

      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 3ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 40ms
          reverse: false
      - pulse:
          name: "Slow Pulse"
          transition_length: 0.5s
          update_interval: 1s
          min_brightness: 0%
          max_brightness: 100%
      - pulse:
          name: "Fast Pulse"
          transition_length: 50ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%

switch:
  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - if:
                      condition:
                        lambda: return id(wake_word_engine_location).state == "In Home Assistant";
                      then:
                        - lambda: id(voice_asst).set_use_wake_word(true);
                        - voice_assistant.start_continuous
                  - if:
                      condition:
                        lambda: return id(wake_word_engine_location).state == "On device";
                      then:
                        - micro_wake_word.start
            - script.execute: reset_led
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_asst).set_use_wake_word(false);
            - voice_assistant.stop
            - micro_wake_word.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: reset_led
  - platform: restart
    name: "${name} Restart"

select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - wait_until:
          lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
      - if:
          condition:
            lambda: return x == "In Home Assistant";
          then:
            - micro_wake_word.stop
            - delay: 500ms
            - if:
                condition:
                  switch.is_off: mute
                then:
                  - lambda: id(voice_asst).set_use_wake_word(true);
                  - voice_assistant.start_continuous:
      - if:
          condition:
            lambda: return x == "On device";
          then:
            - lambda: id(voice_asst).set_use_wake_word(false);
            - voice_assistant.stop
            - delay: 500ms
            - micro_wake_word.start

globals:
  - id: init_in_progress
    type: bool
    restore_value: false
    initial_value: "true"
  - id: voice_assistant_phase
    type: int
    restore_value: false
    initial_value: ${voice_assist_not_ready_phase_id}

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    publish_initial_state: true
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    publish_initial_state: true
    on_press:
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 0%
          green: 100%
          brightness: 100%
          effect: "Wakeword"
#    on_release:
#      - voice_assistant.stop:
#      - output.turn_off: pa_ctrl
#      - light.turn_off:
#          id: led_ring

sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 39 #8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF

@huishizhao is this fully working for you? I am having all sorts of issues with it and am hoping to find a configuration for the Korvo v1.1 that works and doesn't have many (or any) quirks.

janstadt commented 7 months ago

> I almost have audio working with this... I just get a load of static at the moment. The MCLK pin isn't connected, so it needs to be disabled. When it is disabled, the driver tries to set the internal frequency to 512000, which isn't supported, so it fails. If this is fixed in the driver I just get static on the speaker... So progress, but no luck so far.

@TomG736 Did you ever make any progress on this? Any configs you'd be willing to share? TIA.

TomG736 commented 7 months ago

> > I almost have audio working with this... I just get a load of static at the moment. The MCLK pin isn't connected, so it needs to be disabled. When it is disabled, the driver tries to set the internal frequency to 512000, which isn't supported, so it fails. If this is fixed in the driver I just get static on the speaker... So progress, but no luck so far.
>
> @TomG736 Did you ever make any progress on this? Any configs you'd be willing to share? TIA.

No I've only ever got static out of the speaker :(

janstadt commented 7 months ago

@TomG736 Bummer. Are you using microWakeWord or piper via HA? I'm really trying to get that microWakeWord setup working properly but haven't had much success.

janstadt commented 7 months ago

substitutions:
  friendly_name: esp32-voice-3

esphome:
  name: esp32-voice-3
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    - priority: -100
      then:
        - wait_until: api.connected
        - delay: 1s
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:

esp32:
  board: esp-wrover-kit
  framework:
    #type: esp-idf
    type: arduino
    version: recommended

external_components:
  - source: github://rpatel3001/esphome@es8311
    components: [ es8311 ]
  - source: github://rpatel3001/esphome@es7210
    components: [ es7210 ]
  - source: github://pr#5230
    components:
      - esp_adf

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: <REDACTED>

ota:
  password: <REDACTED>

wifi:
  ssid: <REDACTED>
  password: <REDACTED>
  use_address: <REDACTED>

i2c:
  sda: GPIO19 #GPIO1
  scl: GPIO32 #GPIO2
  scan: true
  frequency: 400kHz

es8311:
  address: 0x18

es7210:
  address: 0x40

output:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO12 #GPIO38

i2s_audio:
  - id: codec
    i2s_lrclk_pin: GPIO22 #GPIO41 #ws
    i2s_bclk_pin: GPIO25 #GPIO40 #clk
    i2s_mclk_pin: GPIO0 #GPIO42
  - id: mic_adc
    i2s_lrclk_pin: GPIO26 #GPIO9 #ws
    i2s_bclk_pin: GPIO27 #GPIO10 #clk
    i2s_mclk_pin: GPIO0 #GPIO20

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: codec
    i2s_dout_pin: GPIO13 #GPIO39
    mode: mono

microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_audio_id: mic_adc
    i2s_din_pin: GPIO36 #GPIO11
    pdm: false

voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  noise_suppression_level: 2
  auto_gain: 15dBFS
  volume_multiplier: 0.5
  use_wake_word: false
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
  on_end:
    - delay: 100ms
    - wait_until:
        not:
          speaker.is_playing:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 30%
                red: 0%
                green: 0%
                brightness: 25%
                effect: none
          else:
            - light.turn_off: led_ring

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO33 #GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:

@pbanj have you made any additional updates/fixes to your yaml for the korvo 1.1?

TomG736 commented 7 months ago

> @TomG736 Bummer. Are you using microWakeWord or piper via HA? I'm really trying to get that microWakeWord setup working properly but haven't had much success.

I'm using piper via HA

janstadt commented 7 months ago

> > @TomG736 Bummer. Are you using microWakeWord or piper via HA? I'm really trying to get that microWakeWord setup working properly but haven't had much success.
>
> I'm using piper via HA

Would you mind sharing your YAML, or are you using the one posted up top by @pbanj?

TomG736 commented 7 months ago

> > > @TomG736 Bummer. Are you using microWakeWord or piper via HA? I'm really trying to get that microWakeWord setup working properly but haven't had much success.
> >
> > I'm using piper via HA
>
> Would you mind sharing your YAML, or are you using the one posted up top by @pbanj?

I'm pretty much using the one you posted here, but with minor changes where I was trying to switch to tapping the button to talk instead of holding it.
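
For anyone wanting to try the same change, here is a rough sketch of the two button behaviours side by side. It is untested on the Korvo and makes a few assumptions: the two `id` values are hypothetical (the configs in this thread use a single `btn_record` fed from the ADC button ladder), and the 50ms to 500ms click window is an arbitrary choice. The `voice_assistant.start`, `voice_assistant.stop` and `voice_assistant.is_running` names are the same ones used in the configs posted above.

```yaml
binary_sensor:
  # Hold-to-talk: listen only while the button is held down.
  - platform: template
    id: btn_record_hold        # hypothetical id, for illustration only
    on_press:
      - voice_assistant.start:
    on_release:
      - voice_assistant.stop:

  # Tap-to-talk: a short click toggles the assistant on or off.
  - platform: template
    id: btn_record_tap         # hypothetical id, for illustration only
    on_click:
      min_length: 50ms         # assumed click window, tune to taste
      max_length: 500ms
      then:
        - if:
            condition:
              voice_assistant.is_running:
            then:
              - voice_assistant.stop:
            else:
              - voice_assistant.start:
```

Only one of the two would be used on a real device. With tap-to-talk the assistant stops on its own when the pipeline finishes, so the second click is only needed to cancel early.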

Did you ever fix the reset issue where you have to press the rst button after turning it on?

janstadt commented 7 months ago

Nah, I haven't figured that out yet. I ended up seeing later on that micro wake word is supported, so I started focusing on that. I can circle back and see what's going on there though. I wish there were smarter people out there than me who could just get this stuff working so we don't have to fumble through it all. lol.

janstadt commented 7 months ago

So, I've changed back to piper from micro wake word using the same config I posted earlier, and my device is unresponsive to wake words. I wonder if all of the futzing with micro wake word messed up some partitions or something on the device?

pbanj commented 7 months ago

> @pbanj have you made any additional updates/fixes to your yaml for the korvo 1.1?

No, I ended up in the hospital for a bit, but I'm going to start messing with it again.

janstadt commented 7 months ago

Sorry to hear that, @pbanj. Keep me posted! I need to get these devices hardened before my wife loses her mind without a voice assistant. lol.

sajov commented 7 months ago

Hi, I found this project https://github.com/haade-administrator/korvo-esphome/blob/main/esp32korvo_esphome.yaml

https://haade.fr/en/blog/homeassistant-assist-esphome-esp32-korvo-wroover-e-microphone-array

maybe helpful

janstadt commented 7 months ago

@sajov thanks. I tried that setup already, and while it worked alright, I had better luck with the previous config, and I think it's a bit more flexible for microWakeWord.

sajov commented 7 months ago

@janstadt good to know. I have followed and tested the setup here, and today I tried the other one. I'm just looking for a stable setup and am using a Google Nest Mini. Both solutions stop working after some time. I have to spend more time testing and understanding.

janstadt commented 7 months ago

So I disabled the service call to my Home Assistant media player, and now it's not hanging and goes back to idle. I'll need to look at why it's doing that now. One thing after another. lol

GinAndBacon commented 6 months ago

> So, I've changed back to piper from micro wake word using the same config I posted earlier, and my device is unresponsive to wake words. I wonder if all of the futzing with micro wake word messed up some partitions or something on the device?

I know this is an older post, but I think you are confusing Piper with openWakeWord. openWakeWord can't listen for the trigger word on the ESP32 itself; the code was never optimized for it. With an ESP32 and openWakeWord, the HA server has to constantly listen for the trigger word, which uses a lot of resources, especially with multiple assistants. I think 4 ESP32 assistants using openWakeWord can crash a Raspberry Pi running HA because of this. microWakeWord actually has the ESP32 device listen for the trigger word, so no constant streaming is needed. Piper (the text-to-speech engine) is always used in a voice pipeline, either locally or through Nabu Casa cloud. A Raspberry Pi constantly having to stream and check every few milliseconds to see if the trigger word was spoken uses a lot of CPU and RAM. This can easily be confirmed by switching an ESP32 to microWakeWord and looking at the openWakeWord add-on page, where it shows the resources used by the add-on.

openWakeWord can work the same way on a Wyoming satellite, but that requires something like a Pi Zero and a ReSpeaker HAT, as openWakeWord is optimized for ARM, and you actually have to install Wyoming and Piper onto the Pi before connecting to HA. Honestly, I'm thinking of going that route: all the benefits of microWakeWord without the current headaches. I've got a spare Pi 4 and a ReSpeaker HAT, and I'm going to set that up today, hoping for more stable performance. I'll report back with my results. I have a feeling this is the best solution until they get microWakeWord working more reliably. I'm also going to build just an ESP32 with a PDM microphone to compare. That is the entire reason microWakeWord was created: HA wanting to leverage ESPHome, for obvious reasons.
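
In ESPHome terms, the difference between the two approaches looks roughly like the sketch below. Treat it as an illustration under assumptions rather than a tested Korvo config: the `okay_nabu` model name is an assumption, newer ESPHome releases may expect a `models:` list instead of `model:`, and `micro_wake_word` needs the esp-idf framework. The `external_mic` and `voice_asst` ids are borrowed from the configs posted earlier in this thread.

```yaml
# Option A: wake word detected on the ESP32 itself (microWakeWord).
# Audio is only streamed to Home Assistant after the wake word fires.
micro_wake_word:
  model: okay_nabu             # assumed model name; check your ESPHome version
  on_wake_word_detected:
    - voice_assistant.start:

voice_assistant:
  id: voice_asst
  microphone: external_mic
  use_wake_word: false         # HA does not need to run a wake word engine

# Option B: wake word detected in Home Assistant (openWakeWord).
# The device streams audio continuously and HA does the listening.
# Replace the voice_assistant block above with:
#
# voice_assistant:
#   id: voice_asst
#   microphone: external_mic
#   use_wake_word: true
#
# and run `voice_assistant.start_continuous` once the API is connected.
```

The longer configs above switch between these two modes at runtime with a `select` entity and explicit `micro_wake_word.start` / `micro_wake_word.stop` calls, which is the same idea with more plumbing.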

Detailed information on setup and how it can listen for wake words with no constant streaming: https://youtu.be/eTKgc0YDCwE?si=MEXuauwBqZvZEtrV

Great video on setting up HA with OpenAI. The downsides are that it is cloud based and costs money, although I just set my limit to 1 dollar, so it simply stops working once I reach my limit and only costs me 1 dollar. It also shows how to use the same integration for a local LLM, but you need a PC running Linux with a decent GPU for halfway decent response times.

https://youtu.be/pAKqKTkx5X4?si=phSJxqVK9_-cUR-b

Interview with the creator of HA on how Nvidia has reached out to them to build a specific LLM for HA. The downside is that it will require something like an Nvidia Jetson Nano.

https://youtu.be/hMlkzt-2qgk?si=jzA77uZFsM0inDn0

Project https://forums.developer.nvidia.com/t/jetson-ai-lab-home-assistant-integration/288225

ALX-TH commented 5 months ago

Is there no working (mic + speaker) solution for that board?

sajov commented 5 months ago

> Is there no working (mic + speaker) solution for that board?

I used the versions above and it's working, including the speaker.

ALX-TH commented 5 months ago

@sajov you mean this one https://github.com/haade-administrator/korvo-esphome/blob/main/esp32korvo_esphome.yaml, right?

dwitgen commented 5 months ago

So I was able to get the speaker to work using a modified jesserockz/esp-adf that enables and disables the PA in the start and stop functions when streaming the audio. I am not sure how well it works; I just know I finally got clear audio from the Korvo 1 with the WROVER chip. I am completely new to this, so I don't know how to provide the changes to the group, and there are some other changes that would need to happen in order to reference the GPIO the proper way instead of directly in the code. I have to head to work, so I will try to at least post what I changed, maybe tomorrow, and maybe someone smarter than me can get it incorporated into a PR or something.

ALX-TH commented 5 months ago

@dwitgen can you please share the modified esp-adf?

dwitgen commented 5 months ago

> @dwitgen can you please share the modified esp-adf?

Unfortunately I work afternoons; I will try to post something tomorrow. I had given up on the Korvo 1 non-S3 version and was only using it for testing and learning. I actually was just trying to control the PA inside esp-adf and, lo and behold, I got audio. I was always able to get static but never any audio, just like everyone else.

dwitgen commented 5 months ago

> @dwitgen can you please share the modified esp-adf?

Ok, so I did some more testing, and what I did in esp-adf had nothing to do with it working. I can only assume there was a change somewhere down the line in jesserockz/esp-adf. Sorry for any confusion, but I went back to using pr#5230.

pascalmtts commented 5 months ago

Is audio working for you through the 3.5mm port or with the dedicated speaker port?

dwitgen commented 5 months ago

> Is audio working for you through the 3.5mm port or with the dedicated speaker port?

Both. Enabling the PA inside the esp-adf speaker removes the popping from the speaker, but it is still there on the 3.5mm headphone output. That output does not go through the PA, so the PA is not what is causing the popping. I tried to attach a video, but the smallest I was able to get was 16MB and 10MB is the max.

I am completely new to this and to GitHub, so I am still figuring out how to share config. However, I can say that what appears to make it work is the changes in esp-adf from jesserockz. You can use that directly, but I believe you need a custom audio board config, or at least that is how I am using it. Below is what I am using to set up the board, the esp-adf speaker, and the microphone. I will leave the custom board component open for now, but I am in complete testing mode trying to figure things out, so if someone wants to grab that board config and add it to a more stable repository, or see about getting it added directly to the esp_adf boards, feel free. Note that under esp_adf I am referencing board: esp32s3korvo1. This has to be there, but it is using the custom audio board. I cannot even figure out how to paste properly, but hopefully this will help. Most everything else in the config is the same as what others have used, so I am not going to repeat it.

I also tried micro wake word. It compiled but used up almost all of the flash and gave space errors. I could switch it to use Home Assistant and it worked, but I never got it to work on the device. What I have tried on the esp32-s3-korvo-1 for micro wake word did work, but horribly. Again, no guarantees on the custom board; it is just me testing.

esp32:
  board: esp-wrover-kit
  variant: esp32
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
    components:
      - name: esp32_korvo1_board
        source: github://dwitgen/esphome_test_audio_boards@main
        refresh: 0s

psram:
  mode: octal
  speed: 80MHz

external_components:
  - source: github://pr#5230
    components:
      - esp_adf

esp_adf:
  board: esp32s3korvo1

speaker:
  - platform: esp_adf
    id: external_speaker

microphone:
  - platform: esp_adf
    id: external_mic
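
The PA change described above was made inside the esp-adf speaker code. For anyone who would rather experiment from the YAML side, a rough and untested approximation is to switch the amplifier enable pin around playback. This sketch assumes the `pa_ctrl` output on GPIO12 from the Korvo v1.1 configs posted earlier in this thread and the same `external_speaker` and `external_mic` ids; whether it removes the popping in the same way is not confirmed.

```yaml
output:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO12                # PA enable pin taken from the earlier configs

voice_assistant:
  id: voice_asst
  microphone: external_mic
  speaker: external_speaker
  on_tts_start:
    # Power the amplifier up before the reply starts playing.
    - output.turn_on: pa_ctrl
  on_end:
    # Wait for playback to finish, then power the amplifier down again.
    - wait_until:
        not:
          speaker.is_playing:
    - output.turn_off: pa_ctrl
```

Since the 3.5mm output does not go through the PA, this would not be expected to change anything on the headphone jack.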

GinAndBacon commented 5 months ago

I know people have had issues getting the speakers working, and the 3.5mm jack and speaker outputs can't both be used at the same time on either the V1.1 or the -1 version. I think that is the main issue. See the earlier post linked below. One of the pins is either high or low when the 3.5mm jack is plugged in. I have the Korvo-1 and it has the "popping" issue when beginning a reply. I have given up on trying to figure that one out. The V1.1 and -1 are almost identical apart from the S3 and, I think, a slightly different DSP chip or codec. The top mic/LED array is the same for both, so the only difference is the bottom board, which has the same layout.

https://github.com/esphome/feature-requests/issues/2430#issuecomment-1932816462
