tetele / onju-voice-satellite

An ESPHome config for the Onju Voice which makes it a Home Assistant voice satellite
MIT License

Problem with long wait for response and sound effect #85

Closed: witold-gren closed this 1 day ago

witold-gren commented 2 days ago

Flavor

MicroWakeWord

Describe the issue

Hi, I don't understand why every response with my configuration is so slow. Each time I ask a question, I have to wait a few seconds for an answer. The delay is clearly visible in this video: https://www.youtube.com/watch?v=TcIm_3co8XQ Is there anything I can do to make the sound effect trigger sooner and the response be generated faster?

[3 images attached]

I have Home Assistant installed on Proxmox, running on an Intel NUC13ANHi7 (13th Gen Core i7-1360P). I have no problem generating voices when I do it in the browser. The images also show that intent processing is really fast. Below is the configuration I use in ESPHome:

substitutions:
  name: sypialnia-onju-voice
  friendly_name: Sypialnia Onju Voice
  project_version: "1.1.0"
  device_description: "Onju Voice Satellite with ESPHome software and microWakeWord"
  wakeup_sound_url: "https://homeassistant.local:8123/local/sounds/wakeup.mp3"
  error_sound_url: "https://homeassistant.local:8123/local/sounds/error.mp3"
  timer_finished_sound_url: "https://homeassistant.local:8123/local/sounds/timer_finished.mp3"

external_components:
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: dev-next
    components: [adf_pipeline, i2s_audio]

esphome:
  name: "${name}"
  friendly_name: "${friendly_name}"
  comment: "${device_description}"
  name_add_mac_suffix: false
  project:
    name: tetele.onju_voice_satellite
    version: "${project_version}"
  min_version: 2024.7.3
  platformio_options:
    board_build.flash_mode: dio
    board_build.arduino.memory_type: qio_opi
  on_boot:
    then:
      - light.turn_on:
          id: top_led
          effect: slow_pulse
          red: 100%
          green: 60%
          blue: 0%
      - wait_until:
          condition:
            wifi.connected:
      - light.turn_on:
          id: top_led
          effect: pulse
          red: 0%
          green: 100%
          blue: 0%
      - wait_until:
          condition:
            api.connected:
      - light.turn_on:
          id: top_led
          effect: none
          red: 0%
          green: 100%
          blue: 0%
      - delay: 1s
      - script.execute: reset_led

dashboard_import:
  package_import_url: github://tetele/onju-voice-satellite/esphome/onju-voice-microwakeword.yaml@main

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      # need to set a s3 compatible board for the adf-sdk to compile
      # board specific code is not used though
      CONFIG_ESP32_S3_BOX_BOARD: "y"

psram:
  mode: octal
  speed: 80MHz

# Enable logging
logger:

# Allow OTA updates
ota:
  platform: esphome

# Allow provisioning Wi-Fi via serial
improv_serial:

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Set up a wifi access point
  ap:
    ssid: "${friendly_name}"
    password: !secret fallback_password

# In combination with the `ap` this allows the user
# to provision wifi credentials to the device via WiFi AP.
captive_portal:

api:
  encryption:
    key: "XXXYYYZZZ"

  services:
    - service: start_va
      then:
        - voice_assistant.start
    - service: stop_va
      then:
        - voice_assistant.stop
    - service: notification_on
      then:
        - script.execute: turn_on_notification
    - service: notification_clear
      then:
        - script.execute: clear_notification

globals:
  - id: thresh_percent
    type: float
    initial_value: "0.03"
    restore_value: false
  - id: touch_calibration_values_left
    type: uint32_t[5]
    restore_value: false
  - id: touch_calibration_values_center
    type: uint32_t[5]
    restore_value: false
  - id: touch_calibration_values_right
    type: uint32_t[5]
    restore_value: false
  - id: notification
    type: bool
    restore_value: false

interval:
  - interval: 1s
    then:
      - script.execute:
          id: calibrate_touch
          button: 0
      - script.execute:
          id: calibrate_touch
          button: 1
      - script.execute:
          id: calibrate_touch
          button: 2

i2s_audio:
  - id: i2s_shared
    i2s_lrclk_pin: GPIO13
    i2s_bclk_pin: GPIO18
    access_mode: duplex

adf_pipeline:
  - platform: i2s_audio
    type: audio_out
    id: adf_i2s_out
    i2s_audio_id: i2s_shared
    i2s_dout_pin: GPIO12
    sample_rate: 16000
    adf_alc: true
    alc_max: .5
    bits_per_sample: 32bit
    fixed_settings: true
    channel: left

  - platform: i2s_audio
    type: audio_in
    id: adf_i2s_in
    i2s_audio_id: i2s_shared
    i2s_din_pin: GPIO17
    channel: left
    pdm: false
    sample_rate: 16000
    bits_per_sample: 32bit
    fixed_settings: true

microphone:
  - platform: adf_pipeline
    id: onju_microphone
    keep_pipeline_alive: true
    gain_log2: 3
    pipeline:
      - adf_i2s_in
      - self

media_player:
  - platform: adf_pipeline
    id: onju_out
    name: None
    internal: false
    keep_pipeline_alive: true
    pipeline:
      - self
      - resampler
      - adf_i2s_out
    on_state:
      then:
        - lambda: |-
            static float old_volume = -1;
            float new_volume = id(onju_out).volume;
            if(abs(new_volume-old_volume) > 0.0001) {
              if(old_volume != -1) {
                id(show_volume)->execute();
              }
            }
            old_volume = new_volume;

micro_wake_word:
  models:
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/hey_jarvis.json
  vad:
    model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/vad.json
  on_wake_word_detected:
    - if:
        condition: media_player.is_playing
        then:
          - media_player.pause
    - media_player.play_media: "${wakeup_sound_url}"
    - wait_until:
        not:
          media_player.is_playing: onju_out
    - voice_assistant.start:
        wake_word: !lambda return wake_word;

voice_assistant:
  id: va
  microphone: onju_microphone
  media_player: onju_out
  use_wake_word: false
  on_listening:
    - light.turn_on:
        id: top_led
        blue: 100%
        red: 100%
        green: 100%
        brightness: 100%
        effect: listening
  on_stt_vad_end:
    - light.turn_on:
        id: top_led
        blue: 100%
        red: 0%
        green: 20%
        brightness: 70%
        effect: processing
  on_tts_end:
    - light.turn_on:
        id: top_led
        blue: 0%
        red: 20%
        green: 100%
        effect: speaking
  on_end:
    - wait_until:
        not:
          media_player.is_playing:
    - script.execute: reset_led
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - micro_wake_word.start
  on_timer_started:
    - light.turn_on:
        id: top_led
        effect: random_twinkle
  on_timer_finished:
    - media_player.play_media: "${timer_finished_sound_url}"
    - light.turn_on:
        id: top_led
        blue: 0%
        red: 100%
        green: 80%
        effect: slow_pulse
    - delay: 5s
    - script.execute: reset_led
  on_client_connected:
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - micro_wake_word.start:
  on_client_disconnected:
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - voice_assistant.stop:
          - micro_wake_word.stop:
  on_error:
    - media_player.play_media: "${error_sound_url}"
    - light.turn_on:
        id: top_led
        blue: 0%
        red: 100%
        green: 0%
        effect: none
    - delay: 1s
    - script.execute: reset_led

number:
  - platform: template
    name: "Touch threshold percentage"
    id: touch_threshold_percentage
    update_interval: never
    entity_category: config
    initial_value: 0.75
    min_value: 0.25
    max_value: 5
    step: 0.05
    optimistic: true
    on_value:
      then:
        - lambda: !lambda |-
            id(thresh_percent) = 0.01 * x;

esp32_touch:
  setup_mode: false
  sleep_duration: 2ms
  measurement_duration: 800us
  low_voltage_reference: 0.8V
  high_voltage_reference: 2.4V

  filter_mode: IIR_16
  debounce_count: 2
  noise_threshold: 0
  jitter_step: 0
  smooth_mode: IIR_2

  denoise_grade: BIT8
  denoise_cap_level: L0

binary_sensor:
  - platform: esp32_touch
    id: volume_down
    pin: GPIO4
    threshold: 539000
    on_press:
      then:
        - light.turn_on: left_led
        - script.execute:
            id: set_volume
            volume: -0.05
        - delay: 750ms
        - while:
            condition:
              binary_sensor.is_on: volume_down
            then:
              - script.execute:
                  id: set_volume
                  volume: -0.05
              - delay: 150ms
    on_release:
      then:
        - light.turn_off: left_led

  - platform: esp32_touch
    id: volume_up
    pin: GPIO2
    threshold: 580000
    on_press:
      then:
        - light.turn_on: right_led
        - script.execute:
            id: set_volume
            volume: 0.05
        - delay: 750ms
        - while:
            condition:
              binary_sensor.is_on: volume_up
            then:
              - script.execute:
                  id: set_volume
                  volume: 0.05
              - delay: 150ms
    on_release:
      then:
        - light.turn_off: right_led

  - platform: esp32_touch
    id: action
    pin: GPIO3
    threshold: 751000
    on_click:
      - if:
          condition:
            or:
              - switch.is_off: use_wake_word
              - binary_sensor.is_on: mute_switch
          then:
            - logger.log:
                tag: "action_click"
                format: "Voice assistant is running: %s"
                args: ['id(va).is_running() ? "yes" : "no"']
            - if:
                condition: media_player.is_playing
                then:
                  - media_player.stop
            - if:
                condition: voice_assistant.is_running
                then:
                  - voice_assistant.stop:
                else:
                  - voice_assistant.start:
          else:
            - logger.log:
                tag: "action_click"
                format: "Voice assistant was running with wake word detection enabled. Starting continuously"
            - if:
                condition: media_player.is_playing
                then:
                  - media_player.stop
            - voice_assistant.stop
            - delay: 1s
            - script.execute: reset_led
            - script.wait: reset_led
            - voice_assistant.start_continuous:

  - platform: gpio
    id: mute_switch
    pin:
      number: GPIO38
      mode: INPUT_PULLUP
    name: Disable wake word
    on_press:
      - script.execute: turn_off_wake_word
    on_release:
      - script.execute: turn_on_wake_word

light:
  - platform: esp32_rmt_led_strip
    id: leds
    pin: GPIO11
    chipset: SK6812
    num_leds: 6
    rgb_order: grb
    rmt_channel: 0
    default_transition_length: 0s
    gamma_correct: 2.8
  - platform: partition
    id: left_led
    segments:
      - id: leds
        from: 0
        to: 0
    default_transition_length: 100ms
  - platform: partition
    id: top_led
    segments:
      - id: leds
        from: 1
        to: 4
    default_transition_length: 100ms
    effects:
      - pulse:
          name: pulse
          transition_length: 250ms
          update_interval: 250ms
      - pulse:
          name: slow_pulse
          transition_length: 1s
          update_interval: 2s
      - addressable_lambda:
          name: show_volume
          update_interval: 50ms
          lambda: |-
            int int_volume = int(id(onju_out).volume * 100.0f * it.size());
            int full_leds = int_volume / 100;
            int last_brightness = int_volume % 100;
            int i = 0;
            for(; i < full_leds; i++) {
              it[i] = Color::WHITE;
            }
            if(i < 4) {
              it[i++] = Color(64, 64, 64).fade_to_white(last_brightness*256/100);
            }
            for(; i < it.size(); i++) {
              it[i] = Color(64, 64, 64);
            }
      - addressable_twinkle:
          name: listening_ww
          twinkle_probability: 1%
      - addressable_twinkle:
          name: listening
          twinkle_probability: 45%
      - addressable_scan:
          name: processing
          move_interval: 80ms
      - addressable_flicker:
          name: speaking
          intensity: 35%
      - addressable_random_twinkle:
          name: random_twinkle
          twinkle_probability: 45%
  - platform: partition
    id: right_led
    segments:
      - id: leds
        from: 5
        to: 5
    default_transition_length: 100ms

script:
  - id: reset_led
    then:
      - if:
          condition:
            - lambda: return id(notification);
          then:
            - light.turn_on:
                id: top_led
                blue: 100%
                red: 100%
                green: 0%
                brightness: 100%
                effect: slow_pulse
          else:
            - if:
                condition:
                  and:
                    - switch.is_on: use_wake_word
                    - switch.is_on: flicker_wake_word
                    - binary_sensor.is_off: mute_switch
                then:
                  - light.turn_on:
                      id: top_led
                      blue: 100%
                      red: 100%
                      green: 0%
                      brightness: 60%
                      effect: listening_ww
                else:
                  - light.turn_off: top_led

  - id: turn_on_notification
    then:
      - lambda: id(notification) = true;
      - script.execute: reset_led

  - id: clear_notification
    then:
      - lambda: id(notification) = false;
      - script.execute: reset_led

  - id: set_volume
    mode: restart
    parameters:
      volume: float
    then:
      - media_player.volume_set:
          id: onju_out
          volume: !lambda return clamp(id(onju_out).volume+volume, 0.0f, 1.0f);

  - id: show_volume
    mode: restart
    then:
      - light.turn_on:
          id: top_led
          effect: show_volume
      - delay: 1s
      - script.execute: reset_led

  - id: turn_on_wake_word
    then:
      - if:
          condition:
            and:
              - binary_sensor.is_off: mute_switch
              - switch.is_on: use_wake_word
          then:
            - micro_wake_word.start
            - script.execute: reset_led
          else:
            - logger.log:
                tag: "turn_on_wake_word"
                format: "Trying to start listening for wake word, but %s"
                args:
                  [
                    'id(mute_switch).state ? "mute switch is on" : "use wake word toggle is off"',
                  ]
                level: "INFO"

  - id: turn_off_wake_word
    then:
      - micro_wake_word.stop
      - script.execute: reset_led

  - id: calibrate_touch
    parameters:
      button: int
    then:
      - lambda: |-
          static uint8_t thresh_indices[3] = {0, 0, 0};
          static uint32_t sums[3] = {0, 0, 0};
          static uint8_t qsizes[3] = {0, 0, 0};
          static uint16_t consecutive_anomalies_per_button[3] = {0, 0, 0};

          uint32_t newval;
          uint32_t* calibration_values;
          switch(button) {
            case 0:
              newval = id(volume_down).get_value();
              calibration_values = id(touch_calibration_values_left);
              break;
            case 1:
              newval = id(action).get_value();
              calibration_values = id(touch_calibration_values_center);
              break;
            case 2:
              newval = id(volume_up).get_value();
              calibration_values = id(touch_calibration_values_right);
              break;
            default:
              ESP_LOGE("touch_calibration", "Invalid button ID (%d)", button);
              return;
          }

          if(newval == 0) return;

          //ESP_LOGD("touch_calibration", "[%d] qsize %d, sum %d, thresh_index %d, consecutive_anomalies %d", button, qsizes[button], sums[button], thresh_indices[button], consecutive_anomalies_per_button[button]);
          //ESP_LOGD("touch_calibration", "[%d] New value is %d", button, newval);

          if(qsizes[button] == 5) {
            float avg = float(sums[button])/float(qsizes[button]);
            if((fabs(float(newval)-avg)/avg) > id(thresh_percent)) {
              consecutive_anomalies_per_button[button]++;
              //ESP_LOGD("touch_calibration", "[%d] %d anomalies detected.", button, consecutive_anomalies_per_button[button]);
              if(consecutive_anomalies_per_button[button] < 10)
                return;
            } 
          }

          //ESP_LOGD("touch_calibration", "[%d] Resetting consecutive anomalies counter.", button);
          consecutive_anomalies_per_button[button] = 0;

          if(qsizes[button] == 5) {
            //ESP_LOGD("touch_calibration", "[%d] Queue full, removing %d.", button, id(touch_calibration_values)[thresh_indices[button]]);
            sums[button] -= (uint32_t) *(calibration_values+thresh_indices[button]);// id(touch_calibration_values)[thresh_indices[button]];
            qsizes[button]--;
          }
          *(calibration_values+thresh_indices[button]) = newval;
          sums[button] += newval;
          qsizes[button]++;
          thresh_indices[button] = (thresh_indices[button] + 1) % 5;

          //ESP_LOGD("touch_calibration", "[%d] Average value is %d", button, sums[button]/qsizes[button]);
          uint32_t newthresh = uint32_t((sums[button]/qsizes[button]) * (1.0 + id(thresh_percent)));
          //ESP_LOGD("touch_calibration", "[%d] Setting threshold %d", button, newthresh);

          switch(button) {
            case 0:
              id(volume_down).set_threshold(newthresh);
              break;
            case 1:
              id(action).set_threshold(newthresh);
              break;
            case 2:
              id(volume_up).set_threshold(newthresh);
              break;
            default:
              ESP_LOGE("touch_calibration", "Invalid button ID (%d)", button);
              return;
          }

switch:
  - platform: template
    name: Use Wake Word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - script.execute: turn_on_wake_word
    on_turn_off:
      - script.execute: turn_off_wake_word
  - platform: template
    name: Wake Word Listening Light
    id: flicker_wake_word
    entity_category: config
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - script.execute: reset_led
    on_turn_off:
      - script.execute: reset_led
  - platform: gpio
    id: dac_mute
    restore_mode: ALWAYS_OFF
    pin:
      number: GPIO21
      inverted: True

Reproduction steps

Just run a voice command with Onju Voice.

Debug logs

INFO ESPHome 2024.9.0
INFO Reading configuration /config/esphome/sypialnia-onju-voice.yaml...
INFO Starting log output from 10.20.0.79 using esphome API
INFO Successfully connected to sypialnia-onju-voice @ 10.20.0.79 in 0.093s
INFO Successful handshake with sypialnia-onju-voice @ 10.20.0.79 in 0.097s
[16:42:00][I][app:100]: ESPHome version 2024.8.3 compiled on Sep 17 2024, 13:44:40
[16:42:00][I][app:102]: Project tetele.onju_voice_satellite version 1.1.0
[16:42:00][C][wifi:600]: WiFi:
[16:42:00][C][wifi:428]:   Local MAC: XX:XX:XX:XX:7A:80
[16:42:00][C][wifi:433]:   SSID: [redacted]
[16:42:00][C][wifi:436]:   IP Address: 10.20.0.79
[16:42:00][C][wifi:440]:   BSSID: [redacted]
[16:42:00][C][wifi:441]:   Hostname: 'sypialnia-onju-voice'
[16:42:00][C][wifi:443]:   Signal strength: -48 dB ▂▄▆█
[16:42:00][C][wifi:447]:   Channel: 9
[16:42:00][C][wifi:448]:   Subnet: 255.255.254.0
[16:42:00][C][wifi:449]:   Gateway: 10.20.0.1
[16:42:00][C][wifi:450]:   DNS1: 10.20.0.1
[16:42:00][C][wifi:451]:   DNS2: 0.0.0.0
[16:42:00][C][logger:185]: Logger:
[16:42:00][C][logger:186]:   Level: DEBUG
[16:42:00][C][logger:188]:   Log Baud Rate: 115200
[16:42:00][C][logger:189]:   Hardware UART: USB_SERIAL_JTAG
[16:42:00][C][template.number:050]: Template Number 'Touch threshold percentage'
[16:42:00][C][template.number:051]:   Optimistic: YES
[16:42:00][C][template.number:052]:   Update Interval: never
[16:42:00][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[16:42:00][C][esp32_rmt_led_strip:176]:   Pin: 11
[16:42:00][C][esp32_rmt_led_strip:177]:   Channel: 0
[16:42:00][C][esp32_rmt_led_strip:202]:   RGB Order: GRB
[16:42:00][C][esp32_rmt_led_strip:203]:   Max refresh rate: 0
[16:42:00][C][esp32_rmt_led_strip:204]:   Number of LEDs: 6
[16:42:00][C][switch.gpio:068]: GPIO Switch 'dac_mute'
[16:42:00][C][switch.gpio:091]:   Restore Mode: always OFF
[16:42:00][C][switch.gpio:031]:   Pin: GPIO21
[16:42:00][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Disable wake word'
[16:42:00][C][gpio.binary_sensor:016]:   Pin: GPIO38
[16:42:00][C][light:103]: Light 'leds'
[16:42:00][C][light:105]:   Default Transition Length: 0.0s
[16:42:00][C][light:106]:   Gamma Correct: 2.80
[16:42:00][C][light:103]: Light 'left_led'
[16:42:00][C][light:105]:   Default Transition Length: 0.1s
[16:42:00][C][light:106]:   Gamma Correct: 2.80
[16:42:00][C][light:103]: Light 'top_led'
[16:42:00][C][light:105]:   Default Transition Length: 0.1s
[16:42:00][C][light:106]:   Gamma Correct: 2.80
[16:42:00][C][light:103]: Light 'right_led'
[16:42:00][C][light:105]:   Default Transition Length: 0.1s
[16:42:00][C][light:106]:   Gamma Correct: 2.80
[16:42:00][C][template.switch:068]: Template Switch 'Use Wake Word'
[16:42:00][C][template.switch:091]:   Restore Mode: restore defaults to ON
[16:42:00][C][template.switch:057]:   Optimistic: YES
[16:42:00][C][template.switch:068]: Template Switch 'Wake Word Listening Light'
[16:42:00][C][template.switch:091]:   Restore Mode: restore defaults to ON
[16:42:00][C][template.switch:057]:   Optimistic: YES
[16:42:01][C][psram:020]: PSRAM:
[16:42:01][C][psram:021]:   Available: YES
[16:42:01][C][psram:024]:   Size: 8191 KB
[16:42:01][C][i2s_audio:028]: I2SController:
[16:42:01][C][i2s_audio:029]:   AccessMode: duplex
[16:42:01][C][i2s_audio:030]:   Port: 0
[16:42:01][C][i2s_audio:032]:   Reader registered.
[16:42:01][C][i2s_audio:035]:   Writer registered.
[16:42:01][C][i2s_audio:139]: I2S-Writer (Fixed-CFG):
[16:42:01][C][i2s_audio:141]:   sample-rate: 16000 bits_per_sample: 32
[16:42:01][C][i2s_audio:142]:   channel_fmt: 4 channels: 1
[16:42:01][C][i2s_audio:143]:   use_apll: no, use_pdm: no
[16:42:01][C][i2s_audio:136]: I2S-Reader (Fixed-CFG):
[16:42:01][C][i2s_audio:141]:   sample-rate: 16000 bits_per_sample: 32
[16:42:01][C][i2s_audio:142]:   channel_fmt: 4 channels: 1
[16:42:01][C][i2s_audio:143]:   use_apll: no, use_pdm: no
[16:42:01][C][esp32_touch:073]: Config for ESP32 Touch Hub:
[16:42:01][C][esp32_touch:074]:   Meas cycle: 0.80ms
[16:42:01][C][esp32_touch:075]:   Sleep cycle: 2.00ms
[16:42:01][C][esp32_touch:095]:   Low Voltage Reference: 0.8V
[16:42:01][C][esp32_touch:115]:   High Voltage Reference: 2.4V
[16:42:01][C][esp32_touch:135]:   Voltage Attenuation: 0V
[16:42:01][C][esp32_touch:169]:   Filter mode: IIR_16
[16:42:01][C][esp32_touch:170]:   Debounce count: 2
[16:42:01][C][esp32_touch:171]:   Noise threshold coefficient: 0
[16:42:01][C][esp32_touch:172]:   Jitter filter step size: 0
[16:42:01][C][esp32_touch:191]:   Smooth level: IIR_2
[16:42:01][C][esp32_touch:213]:   Denoise grade: BIT8
[16:42:01][C][esp32_touch:245]:   Denoise capacitance level: L0
[16:42:01][C][esp32_touch:260]:   Touch Pad 'volume_down'
[16:42:01][C][esp32_touch:261]:     Pad: T4
[16:42:01][C][esp32_touch:262]:     Threshold: 536923
[16:42:01][C][esp32_touch:260]:   Touch Pad 'volume_up'
[16:42:01][C][esp32_touch:261]:     Pad: T2
[16:42:01][C][esp32_touch:262]:     Threshold: 592401
[16:42:01][C][esp32_touch:260]:   Touch Pad 'action'
[16:42:01][C][esp32_touch:261]:     Pad: T3
[16:42:01][C][esp32_touch:262]:     Threshold: 699885
[16:42:01][C][captive_portal:088]: Captive Portal:
[16:42:01][C][mdns:116]: mDNS:
[16:42:01][C][mdns:117]:   Hostname: sypialnia-onju-voice
[16:42:01][C][esphome.ota:073]: Over-The-Air updates:
[16:42:01][C][esphome.ota:074]:   Address: sypialnia-onju-voice.local:3232
[16:42:01][C][esphome.ota:075]:   Version: 2
[16:42:01][C][safe_mode:018]: Safe Mode:
[16:42:01][C][safe_mode:020]:   Boot considered successful after 60 seconds
[16:42:01][C][safe_mode:021]:   Invoke after 10 boot attempts
[16:42:01][C][safe_mode:023]:   Remain in safe mode for 300 seconds
[16:42:01][C][api:139]: API Server:
[16:42:01][C][api:140]:   Address: sypialnia-onju-voice.local:6053
[16:42:01][C][api:142]:   Using noise encryption: YES
[16:42:01][C][improv_serial:032]: Improv Serial:
[16:42:01][C][micro_wake_word:051]: microWakeWord:
[16:42:01][C][micro_wake_word:052]:   models:
[16:42:01][C][micro_wake_word:015]:     - Wake Word: hey jarvis
[16:42:01][C][micro_wake_word:016]:       Probability cutoff: 0.970
[16:42:01][C][micro_wake_word:017]:       Sliding window size: 5
[16:42:01][C][micro_wake_word:021]:     - VAD Model
[16:42:01][C][micro_wake_word:022]:       Probability cutoff: 0.500
[16:42:01][C][micro_wake_word:023]:       Sliding window size: 5
[16:42:01][C][esp_adf_pipeline.microphone:020]: ADF-Microphone
[16:42:01][C][adf_media_player:016]: ESP-ADF-MediaPlayer:
[16:42:01][C][adf_media_player:018]:   MP_ANNOUNCE enabled
[16:42:01][C][adf_media_player:024]:   Number of ADFComponents: 3
[16:42:16][D][micro_wake_word:162]: The 'hey jarvis' model sliding average probability is 0.975 and most recent probability is 1.000
[16:42:16][D][micro_wake_word:123]: Wake Word 'hey jarvis' Detected
[16:42:16][D][micro_wake_word:195]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[16:42:16][D][micro_wake_word:129]: Stopping Microphone
[16:42:16][D][esp_adf_pipeline:070]: Called 'stop' while in RUNNING state.
[16:42:16][D][micro_wake_word:195]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[16:42:16][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from RUNNING to ABORTING. (REQ: 1)
[16:42:16][D][adf_audio_element:324]: [i2s_in] Checking State for stopping, got 3
[16:42:16][D][esp-idf:000][i2s_in]: W (269805618) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[16:42:16][D][esp-idf:000][i2s_in]: W (269805621) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[16:42:16][D][esp-idf:000][i2s_in]: W (269805624) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[16:42:16][D][esp-idf:000][i2s_in]: W (269805627) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[16:42:16][D][esp-idf:000][i2s_in]: W (269805631) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[16:42:16][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from ABORTING to STOPPED. (REQ: 1)
[16:42:16][D][micro_wake_word:195]: State changed from STOPPING_MICROPHONE to IDLE
[16:42:16][D][media_player:061]: 'Sypialnia Onju Voice' - Setting
[16:42:16][D][media_player:068]:   Media URL: https://homeassistant.local:8123/local/sounds/wakeup.mp3
[16:42:16][D][esp_audio_sources:098]: Set new uri: https://homeassistant.local:8123/local/sounds/wakeup.mp3
[16:42:16][D][adf_media_player:057]: Got control call in state IDLE
[16:42:16][D][adf_media_player:058]: req_track stream uri: https://homeassistant.local:8123/local/sounds/wakeup.mp3
[16:42:16][D][esp_adf_pipeline:060]: Starting request, current state STOPPED
[16:42:16][D][voice_assistant:504]: State changed from IDLE to START_MICROPHONE
[16:42:16][D][voice_assistant:510]: Desired state set to START_PIPELINE
[16:42:16][D][voice_assistant:221]: Starting Microphone
[16:42:16][D][esp_adf_pipeline.microphone:025]: start request while ine state 0
[16:42:16][D][esp_adf_pipeline:060]: Starting request, current state STOPPED
[16:42:16][D][voice_assistant:504]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[16:42:16][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from STOPPED to PREPARING. (REQ: 0)
[16:42:16][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from STOPPED to PREPARING. (REQ: 0)
[16:42:16][I][adf_media_player:192]: got new pipeline state: 3, while in MP state IDLE
[16:42:16][D][adf_i2s_out:141]: Set final i2s settings: 16000
[16:42:16][D][esp_audio_processors:124]: Current settings: SRC: rate: 44100, ch: 2 bits: 16, DST: rate: 16000, ch: 1, bits 16
[16:42:16][I][adf_media_player:256]: current mp state: PLAYING
[16:42:16][I][adf_media_player:257]: anouncement: false
[16:42:16][I][adf_media_player:258]: play_intent: false
[16:42:16][I][adf_media_player:259]: current_uri_: yes
[16:42:16][D][adf_audio_element:108]: Preparing [i2s_in]...
[16:42:16][D][esp_audio_sources:103]: Prepare elements called (initial_call)!
[16:42:16][D][esp_audio_sources:137]: Use fixed settings: no
[16:42:16][D][esp_audio_sources:138]: Streamer status: 6
[16:42:16][D][esp_audio_sources:139]: decoder status: 6
[16:42:16][D][esp_audio_sources:140]: stream uri: https://homeassistant.local:8123/local/sounds/wakeup.mp3
[16:42:16][D][adf_audio_element:108]: Preparing [http]...
[16:42:16][D][adf_audio_element:108]: Preparing [decoder]...
[16:42:16][D][adf_audio_element:108]: Preparing [pcm_reader]...
[16:42:16][D][adf_audio_element:108]: Preparing [resampler]...
[16:42:16][D][adf_audio_element:108]: Preparing [i2s_out]...
[16:42:16][D][esp_adf_pipeline:342]: wait for preparation, done
[16:42:16][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[16:42:16][D][adf_audio_element:165]: Resuming [i2s_in]...
[16:42:16][D][adf_audio_element:172]: [i2s_in] Sending resume command.
[16:42:16][D][esp-idf:000][i2s_in]: I (269805760) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[16:42:16][D][adf_audio_element:191]: [i2s_in] Checking State, got 78
[16:42:16][I][esp_adf_pipeline:132]: [ i2s_in ] status: 12
[16:42:16][D][adf_audio_element:165]: Resuming [http]...
[16:42:16][D][adf_audio_element:172]: [http] Sending resume command.
[16:42:16][D][adf_audio_element:165]: Resuming [decoder]...
[16:42:16][D][adf_audio_element:172]: [decoder] Sending resume command.
[16:42:16][D][esp-idf:000][decoder]: I (269805782) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[16:42:16][D][esp-idf:000][decoder]: I (269805792) MP3_DECODER: MP3 opened

[16:42:16][D][voice_assistant:627]: Event Type: 1
[16:42:16][D][voice_assistant:630]: Assist Pipeline running
[16:42:16][D][voice_assistant:627]: Event Type: 3
[16:42:16][D][voice_assistant:641]: STT started
[16:42:16][D][light:036]: 'top_led' Setting:
[16:42:16][D][light:051]:   Brightness: 100%
[16:42:16][D][light:059]:   Red: 100%, Green: 100%, Blue: 100%
[16:42:16][D][light:109]:   Effect: 'listening'
[16:42:17][D][voice_assistant:627]: Event Type: 11
[16:42:17][D][voice_assistant:783]: Starting STT by VAD
[16:42:18][I][esp_audio_sources:033][http]: Receive http event: 2
[16:42:18][I][esp_audio_sources:033][http]: Receive http event: 4
[16:42:18][D][esp-idf:000][http]: I (269807534) HTTP_STREAM: total_bytes=19527

[16:42:18][I][HTTPStreamReader:230]: Codec Format reported: 3.
[16:42:18][I][HTTPStreamReader:240]: [ * ] Receive music info from decoder, sample_rates=44100, bits=16, ch=2
[16:42:18][I][HTTPStreamReader:243]: [ * ] Receive music info from decoder, codec_fmt=3, bps=192000, duration=0, bytes=-93
[16:42:18][D][adf_i2s_out:141]: Set final i2s settings: 16000
[16:42:18][D][esp_audio_processors:124]: Current settings: SRC: rate: 44100, ch: 2 bits: 16, DST: rate: 16000, ch: 1, bits 16
[16:42:18][D][adf_audio_element:108]: Preparing [http]...
[16:42:18][D][adf_audio_element:108]: Preparing [decoder]...
[16:42:18][D][esp-idf:000][decoder]: W (269807610) AUDIO_ELEMENT: OUT-[decoder] AEL_IO_ABORT

[16:42:18][D][esp-idf:000][decoder]: W (269807615) MP3_DECODER: output aborted -3

[16:42:18][D][esp-idf:000][decoder]: I (269807619) MP3_DECODER: Closed

[16:42:18][D][esp_audio_sources:193]: Preparation done!
[16:42:18][D][esp_adf_pipeline:342]: wait for preparation, done
[16:42:18][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[16:42:18][I][adf_media_player:192]: got new pipeline state: 5, while in MP state PLAYING
[16:42:18][I][adf_media_player:256]: current mp state: PLAYING
[16:42:18][I][adf_media_player:257]: anouncement: false
[16:42:18][I][adf_media_player:258]: play_intent: false
[16:42:18][I][adf_media_player:259]: current_uri_: yes
[16:42:18][D][adf_audio_element:165]: Resuming [http]...
[16:42:18][D][adf_audio_element:172]: [http] Sending resume command.
[16:42:18][D][adf_audio_element:165]: Resuming [decoder]...
[16:42:18][D][adf_audio_element:172]: [decoder] Sending resume command.
[16:42:18][D][esp-idf:000][decoder]: I (269807672) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:5

[16:42:18][D][adf_audio_element:191]: [http] Checking State, got 79
[16:42:18][D][adf_audio_element:191]: [decoder] Checking State, got 79
[16:42:18][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from STARTING to RUNNING. (REQ: 0)
[16:42:18][I][adf_media_player:192]: got new pipeline state: 6, while in MP state PLAYING
[16:42:18][I][adf_media_player:256]: current mp state: PLAYING
[16:42:18][I][adf_media_player:257]: anouncement: false
[16:42:18][I][adf_media_player:258]: play_intent: false
[16:42:18][I][adf_media_player:259]: current_uri_: yes
[16:42:19][D][voice_assistant:627]: Event Type: 12
[16:42:19][D][voice_assistant:787]: STT by VAD end
[16:42:19][D][voice_assistant:504]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[16:42:19][D][voice_assistant:510]: Desired state set to AWAITING_RESPONSE
[16:42:19][D][esp_adf_pipeline:070]: Called 'stop' while in RUNNING state.
[16:42:19][D][voice_assistant:504]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[16:42:19][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from RUNNING to ABORTING. (REQ: 1)
[16:42:19][D][light:036]: 'top_led' Setting:
[16:42:19][D][light:051]:   Brightness: 70%
[16:42:19][D][light:059]:   Red: 0%, Green: 20%, Blue: 100%
[16:42:19][D][light:109]:   Effect: 'processing'
[16:42:19][D][voice_assistant:627]: Event Type: 4
[16:42:19][D][voice_assistant:655]: Speech recognised as: "oświeć światło w sypialni"
[16:42:19][D][adf_audio_element:324]: [i2s_in] Checking State for stopping, got 3
[16:42:19][D][esp-idf:000][i2s_in]: W (269809231) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[16:42:19][D][voice_assistant:627]: Event Type: 5
[16:42:19][D][voice_assistant:660]: Intent started
[16:42:19][D][adf_audio_element:324]: [pcm_reader] Checking State for stopping, got 3
[16:42:19][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from ABORTING to STOPPED. (REQ: 1)
[16:42:19][D][voice_assistant:504]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[16:42:20][D][voice_assistant:627]: Event Type: 6
[16:42:20][D][voice_assistant:627]: Event Type: 7
[16:42:20][D][voice_assistant:683]: Response: "Włączono światło"
[16:42:20][D][voice_assistant:627]: Event Type: 8
[16:42:20][D][voice_assistant:705]: Response URL: "https://homeassistant.local/api/tts_proxy/ab04a6e6cb798f6134cd8c0a3c7b1807ee4a8b64_pl-pl_6d43988cf6_tts.piper.mp3"
[16:42:20][D][voice_assistant:504]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[16:42:20][D][voice_assistant:510]: Desired state set to STREAMING_RESPONSE
[16:42:20][D][media_player:061]: 'Sypialnia Onju Voice' - Setting
[16:42:20][D][media_player:068]:   Media URL: https://homeassistant.local/api/tts_proxy/ab04a6e6cb798f6134cd8c0a3c7b1807ee4a8b64_pl-pl_6d43988cf6_tts.piper.mp3
[16:42:20][D][media_player:074]:  Announcement: yes
[16:42:20][D][adf_media_player:057]: Got control call in state PLAYING
[16:42:20][D][adf_media_player:058]: req_track stream uri: https://homeassistant.local/api/tts_proxy/ab04a6e6cb798f6134cd8c0a3c7b1807ee4a8b64_pl-pl_6d43988cf6_tts.piper.mp3
[16:42:20][D][esp_adf_pipeline:070]: Called 'stop' while in RUNNING state.
[16:42:20][D][light:036]: 'top_led' Setting:
[16:42:20][D][light:059]:   Red: 20%, Green: 100%, Blue: 0%
[16:42:20][D][light:109]:   Effect: 'speaking'
[16:42:20][D][voice_assistant:627]: Event Type: 2
[16:42:20][D][voice_assistant:719]: Assist Pipeline ended
[16:42:20][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from RUNNING to ABORTING. (REQ: 1)
[16:42:20][I][adf_media_player:192]: got new pipeline state: 10, while in MP state PLAYING
[16:42:20][I][adf_media_player:256]: current mp state: PLAYING
[16:42:20][I][adf_media_player:257]: anouncement: yes
[16:42:20][I][adf_media_player:258]: play_intent: false
[16:42:20][I][adf_media_player:259]: current_uri_: yes
[16:42:20][D][adf_audio_element:324]: [http] Checking State for stopping, got 2
[16:42:20][D][adf_audio_element:324]: [decoder] Checking State for stopping, got 2
[16:42:20][D][esp-idf:000][resampler]: W (269809528) AUDIO_ELEMENT: IN-[resampler] AEL_IO_ABORT

[16:42:20][D][esp-idf:000][decoder]: W (269809624) AUDIO_ELEMENT: IN-[decoder] AEL_IO_ABORT

[16:42:20][D][esp-idf:000][decoder]: E (269809628) MP3_DECODER: failed to read audio data (line 117)

[16:42:20][D][esp-idf:000][decoder]: W (269809632) AUDIO_ELEMENT: [decoder] AEL_IO_ABORT, -3

[16:42:20][D][esp-idf:000][decoder]: I (269809637) MP3_DECODER: Closed

[16:42:20][I][esp_audio_sources:033][http]: Receive http event: 2
[16:42:20][I][esp_audio_sources:033][http]: Receive http event: 4
[16:42:20][D][esp-idf:000][http]: I (269809688) HTTP_STREAM: total_bytes=19527

[16:42:20][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from ABORTING to STOPPED. (REQ: 1)
[16:42:20][I][adf_media_player:192]: got new pipeline state: 4, while in MP state PLAYING
[16:42:20][D][esp_adf_pipeline:054]: Starting request, current state STOPPED
[16:42:20][I][adf_media_player:256]: current mp state: IDLE
[16:42:20][I][adf_media_player:257]: anouncement: yes
[16:42:20][I][adf_media_player:258]: play_intent: false
[16:42:20][I][adf_media_player:259]: current_uri_: yes
[16:42:20][D][light:036]: 'top_led' Setting:
[16:42:20][D][light:051]:   Brightness: 60%
[16:42:20][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[16:42:20][D][light:109]:   Effect: 'listening_ww'
[16:42:20][D][micro_wake_word:399]: Resetting buffers and probabilities
[16:42:20][D][micro_wake_word:195]: State changed from IDLE to START_MICROPHONE
[16:42:20][D][micro_wake_word:107]: Starting Microphone
[16:42:20][D][esp_adf_pipeline.microphone:025]: start request while ine state 0
[16:42:20][D][esp_adf_pipeline:060]: Starting request, current state STOPPED
[16:42:20][D][micro_wake_word:195]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[16:42:20][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from STOPPED to PREPARING. (REQ: 0)
[16:42:20][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from STOPPED to PREPARING. (REQ: 3)
[16:42:20][I][adf_media_player:192]: got new pipeline state: 3, while in MP state IDLE
[16:42:20][D][adf_i2s_out:141]: Set final i2s settings: 16000
[16:42:20][D][esp_audio_processors:124]: Current settings: SRC: rate: 44100, ch: 2 bits: 16, DST: rate: 16000, ch: 1, bits 16
[16:42:20][I][adf_media_player:256]: current mp state: ANNOUNCING
[16:42:20][I][adf_media_player:257]: anouncement: yes
[16:42:20][I][adf_media_player:258]: play_intent: false
[16:42:20][I][adf_media_player:259]: current_uri_: yes
[16:42:20][D][adf_audio_element:108]: Preparing [i2s_in]...
[16:42:20][D][esp_audio_sources:103]: Prepare elements called (initial_call)!
[16:42:20][D][esp_audio_sources:137]: Use fixed settings: no
[16:42:20][D][esp_audio_sources:138]: Streamer status: 5
[16:42:20][D][esp_audio_sources:139]: decoder status: 5
[16:42:20][D][esp_audio_sources:140]: stream uri: https://homeassistant.local/api/tts_proxy/ab04a6e6cb798f6134cd8c0a3c7b1807ee4a8b64_pl-pl_6d43988cf6_tts.piper.mp3
[16:42:20][D][adf_audio_element:108]: Preparing [http]...
[16:42:20][D][adf_audio_element:108]: Preparing [decoder]...
[16:42:20][D][adf_audio_element:108]: Preparing [pcm_reader]...
[16:42:20][D][adf_audio_element:108]: Preparing [resampler]...
[16:42:20][D][adf_audio_element:108]: Preparing [i2s_out]...
[16:42:20][D][esp_adf_pipeline:342]: wait for preparation, done
[16:42:20][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[16:42:20][I][HTTPStreamReader:230]: Codec Format reported: 3.
[16:42:20][D][adf_audio_element:165]: Resuming [i2s_in]...
[16:42:20][D][adf_audio_element:172]: [i2s_in] Sending resume command.
[16:42:20][D][esp-idf:000][i2s_in]: I (269809883) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[16:42:20][D][adf_audio_element:191]: [i2s_in] Checking State, got 78
[16:42:20][I][esp_adf_pipeline:132]: [ i2s_in ] status: 12
[16:42:20][D][adf_audio_element:165]: Resuming [http]...
[16:42:20][D][adf_audio_element:172]: [http] Sending resume command.
[16:42:20][D][adf_audio_element:165]: Resuming [decoder]...
[16:42:20][D][adf_audio_element:172]: [decoder] Sending resume command.
[16:42:20][D][esp-idf:000][decoder]: I (269809906) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[16:42:20][D][esp-idf:000][decoder]: I (269809911) MP3_DECODER: MP3 opened

[16:42:23][I][esp_audio_sources:033][http]: Receive http event: 2
[16:42:23][I][esp_audio_sources:033][http]: Receive http event: 4
[16:42:23][D][esp-idf:000][http]: I (269812486) HTTP_CLIENT: Body received in fetch header state, 0x3fcc8a43, 1693

[16:42:23][D][esp-idf:000][http]: I (269812490) HTTP_STREAM: total_bytes=14169

[16:42:23][I][HTTPStreamReader:230]: Codec Format reported: 3.
[16:42:23][I][HTTPStreamReader:240]: [ * ] Receive music info from decoder, sample_rates=22050, bits=16, ch=1
[16:42:23][I][HTTPStreamReader:243]: [ * ] Receive music info from decoder, codec_fmt=3, bps=77000, duration=1332, bytes=-1146
[16:42:23][D][adf_i2s_out:141]: Set final i2s settings: 16000
[16:42:23][D][esp_audio_processors:108]: Received request from: HTTPStreamReader
[16:42:23][D][esp_audio_processors:113]: New settings: SRC: rate: 22050, ch: 1 bits: 16, DST: rate: 16000, ch: 1, bits 16
[16:42:23][D][esp_audio_processors:124]: Current settings: SRC: rate: 22050, ch: 1 bits: 16, DST: rate: 16000, ch: 1, bits 16
[16:42:23][D][adf_audio_element:108]: Preparing [http]...
[16:42:23][D][adf_audio_element:108]: Preparing [decoder]...
[16:42:23][D][esp-idf:000][decoder]: W (269812545) AUDIO_ELEMENT: OUT-[decoder] AEL_IO_ABORT

[16:42:23][D][esp-idf:000][decoder]: W (269812549) MP3_DECODER: output aborted -3

[16:42:23][D][esp-idf:000][decoder]: I (269812554) MP3_DECODER: Closed

[16:42:23][D][esp_audio_sources:193]: Preparation done!
[16:42:23][D][esp_adf_pipeline:342]: wait for preparation, done
[16:42:23][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[16:42:23][I][adf_media_player:192]: got new pipeline state: 5, while in MP state ANNOUNCING
[16:42:23][I][adf_media_player:256]: current mp state: ANNOUNCING
[16:42:23][I][adf_media_player:257]: anouncement: yes
[16:42:23][I][adf_media_player:258]: play_intent: false
[16:42:23][I][adf_media_player:259]: current_uri_: yes
[16:42:23][D][adf_audio_element:165]: Resuming [http]...
[16:42:23][D][adf_audio_element:172]: [http] Sending resume command.
[16:42:23][D][adf_audio_element:165]: Resuming [decoder]...
[16:42:23][D][adf_audio_element:172]: [decoder] Sending resume command.
[16:42:23][D][esp-idf:000][decoder]: I (269812727) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[16:42:23][D][esp-idf:000][decoder]: I (269812949) MP3_DECODER: MP3 opened

[16:42:23][D][adf_audio_element:191]: [http] Checking State, got 79
[16:42:23][D][adf_audio_element:191]: [decoder] Checking State, got 79
[16:42:23][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from STARTING to RUNNING. (REQ: 0)
[16:42:23][I][adf_media_player:192]: got new pipeline state: 6, while in MP state ANNOUNCING
[16:42:23][I][adf_media_player:256]: current mp state: ANNOUNCING
[16:42:23][I][adf_media_player:257]: anouncement: yes
[16:42:23][I][adf_media_player:258]: play_intent: false
[16:42:23][I][adf_media_player:259]: current_uri_: yes
[16:42:26][I][esp_audio_sources:033][http]: Receive http event: 2
[16:42:26][I][esp_audio_sources:033][http]: Receive http event: 4
[16:42:26][D][esp-idf:000][http]: I (269815403) HTTP_CLIENT: Body received in fetch header state, 0x3fcc77b3, 1693

[16:42:26][D][esp-idf:000][http]: I (269815407) HTTP_STREAM: total_bytes=14169

[16:42:26][I][HTTPStreamReader:230]: Codec Format reported: 3.
[16:42:26][I][esp_adf_pipeline:132]: [ http ] status: 12
[16:42:26][I][esp_adf_pipeline:132]: [ decoder ] status: 12
[16:42:26][I][HTTPStreamReader:240]: [ * ] Receive music info from decoder, sample_rates=22050, bits=16, ch=1
[16:42:26][I][HTTPStreamReader:243]: [ * ] Receive music info from decoder, codec_fmt=3, bps=77000, duration=1332, bytes=-1146
[16:42:26][D][esp-idf:000][http]: W (269815760) HTTP_STREAM: No more data,errno:0, total_bytes:14169, rlen = 0

[16:42:26][I][esp_audio_sources:033][http]: Receive http event: 7
[16:42:26][D][esp-idf:000][http]: I (269815766) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0

[16:42:26][I][esp_adf_pipeline:123]: [ http ] byte_pos: 0, total: 14169
[16:42:26][I][esp_adf_pipeline:132]: [ http ] status: 15
[16:42:26][I][esp_adf_pipeline:135]: current state: RUNNING
[16:42:26][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from RUNNING to FINISHING. (REQ: 0)
[16:42:26][I][adf_media_player:192]: got new pipeline state: 7, while in MP state ANNOUNCING
[16:42:26][I][adf_media_player:256]: current mp state: ANNOUNCING
[16:42:26][I][adf_media_player:257]: anouncement: yes
[16:42:26][I][adf_media_player:258]: play_intent: false
[16:42:26][I][adf_media_player:259]: current_uri_: yes
[16:42:26][D][esp-idf:000][decoder]: I (269816080) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2

[16:42:27][D][esp-idf:000][decoder]: I (269816557) MP3_DECODER: Closed

[16:42:27][I][esp_adf_pipeline:123]: [ decoder ] byte_pos: 0, total: -1146
[16:42:27][I][esp_adf_pipeline:132]: [ decoder ] status: 15
[16:42:27][I][esp_adf_pipeline:135]: current state: FINISHING
[16:42:27][D][esp-idf:000][resampler]: I (269816652) AUDIO_ELEMENT: IN-[resampler] AEL_IO_DONE,-2

[16:42:27][I][esp_adf_pipeline:132]: [ resampler ] status: 15
[16:42:27][I][esp_adf_pipeline:135]: current state: FINISHING
[16:42:27][D][esp-idf:000][i2s_out]: I (269816747) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[16:42:27][I][esp_adf_pipeline:123]: [ i2s_out ] byte_pos: 0, total: 0
[16:42:27][I][esp_adf_pipeline:132]: [ i2s_out ] status: 15
[16:42:27][I][esp_adf_pipeline:135]: current state: FINISHING
[16:42:27][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from FINISHING to STOPPED. (REQ: 1)
[16:42:27][I][adf_media_player:192]: got new pipeline state: 4, while in MP state ANNOUNCING
[16:42:27][D][esp_adf_pipeline:054]: Starting request, current state STOPPED
[16:42:27][I][adf_media_player:256]: current mp state: IDLE
[16:42:27][I][adf_media_player:257]: anouncement: false
[16:42:27][I][adf_media_player:258]: play_intent: false
[16:42:27][I][adf_media_player:259]: current_uri_: yes
[16:42:27][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from STOPPED to PREPARING. (REQ: 3)
[16:42:27][I][adf_media_player:192]: got new pipeline state: 3, while in MP state IDLE
[16:42:27][D][adf_i2s_out:141]: Set final i2s settings: 16000
[16:42:27][D][esp_audio_processors:124]: Current settings: SRC: rate: 22050, ch: 1 bits: 16, DST: rate: 16000, ch: 1, bits 16
[16:42:27][I][adf_media_player:256]: current mp state: PLAYING
[16:42:27][I][adf_media_player:257]: anouncement: false
[16:42:27][I][adf_media_player:258]: play_intent: false
[16:42:27][I][adf_media_player:259]: current_uri_: yes
[16:42:27][D][esp_audio_sources:103]: Prepare elements called (initial_call)!
[16:42:27][D][esp_audio_sources:137]: Use fixed settings: no
[16:42:27][D][esp_audio_sources:138]: Streamer status: 6
[16:42:27][D][esp_audio_sources:139]: decoder status: 6
[16:42:27][D][esp_audio_sources:140]: stream uri: https://homeassistant.local:8123/local/sounds/wakeup.mp3
[16:42:27][D][adf_audio_element:108]: Preparing [http]...
[16:42:27][D][adf_audio_element:108]: Preparing [decoder]...
[16:42:27][D][adf_audio_element:108]: Preparing [resampler]...
[16:42:27][D][adf_audio_element:108]: Preparing [i2s_out]...
[16:42:27][D][adf_audio_element:165]: Resuming [http]...
[16:42:27][D][adf_audio_element:172]: [http] Sending resume command.
[16:42:27][D][adf_audio_element:165]: Resuming [decoder]...
[16:42:27][D][adf_audio_element:172]: [decoder] Sending resume command.
[16:42:27][D][esp-idf:000][decoder]: I (269817154) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[16:42:27][D][esp-idf:000][decoder]: I (269817167) MP3_DECODER: MP3 opened

[16:42:27][D][adf_audio_element:191]: [http] Checking State, got 79
[16:42:27][D][adf_audio_element:191]: [decoder] Checking State, got 79
[16:42:29][I][esp_audio_sources:033][http]: Receive http event: 2
[16:42:29][I][esp_audio_sources:033][http]: Receive http event: 4
[16:42:29][D][esp-idf:000][http]: I (269818532) HTTP_STREAM: total_bytes=19527

[16:42:29][I][HTTPStreamReader:230]: Codec Format reported: 3.
[16:42:29][I][HTTPStreamReader:240]: [ * ] Receive music info from decoder, sample_rates=44100, bits=16, ch=2
[16:42:29][I][HTTPStreamReader:243]: [ * ] Receive music info from decoder, codec_fmt=3, bps=192000, duration=0, bytes=-93
[16:42:29][D][adf_i2s_out:141]: Set final i2s settings: 16000
[16:42:29][D][esp_audio_processors:108]: Received request from: HTTPStreamReader
[16:42:29][D][esp_audio_processors:113]: New settings: SRC: rate: 44100, ch: 2 bits: 16, DST: rate: 16000, ch: 1, bits 16
[16:42:29][D][esp_audio_processors:124]: Current settings: SRC: rate: 44100, ch: 2 bits: 16, DST: rate: 16000, ch: 1, bits 16
[16:42:29][D][adf_audio_element:108]: Preparing [http]...
[16:42:29][D][adf_audio_element:108]: Preparing [decoder]...
[16:42:29][D][esp-idf:000][decoder]: W (269818613) AUDIO_ELEMENT: OUT-[decoder] AEL_IO_ABORT

[16:42:29][D][esp-idf:000][decoder]: W (269818618) MP3_DECODER: output aborted -3

[16:42:29][D][esp-idf:000][decoder]: I (269818622) MP3_DECODER: Closed

[16:42:29][D][esp_audio_sources:193]: Preparation done!
[16:42:29][D][esp_adf_pipeline:342]: wait for preparation, done
[16:42:29][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[16:42:29][I][adf_media_player:192]: got new pipeline state: 5, while in MP state PLAYING
[16:42:29][I][adf_media_player:256]: current mp state: PLAYING
[16:42:29][I][adf_media_player:257]: anouncement: false
[16:42:29][I][adf_media_player:258]: play_intent: false
[16:42:29][I][adf_media_player:259]: current_uri_: yes
[16:42:29][D][adf_audio_element:165]: Resuming [http]...
[16:42:29][D][adf_audio_element:172]: [http] Sending resume command.
[16:42:29][D][adf_audio_element:165]: Resuming [decoder]...
[16:42:29][D][adf_audio_element:172]: [decoder] Sending resume command.
[16:42:29][D][esp-idf:000][decoder]: I (269818778) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[16:42:29][D][esp-idf:000][decoder]: I (269818985) MP3_DECODER: MP3 opened

[16:42:29][D][adf_audio_element:191]: [http] Checking State, got 79
[16:42:29][D][adf_audio_element:191]: [decoder] Checking State, got 79
[16:42:29][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from STARTING to RUNNING. (REQ: 0)
[16:42:29][I][adf_media_player:192]: got new pipeline state: 6, while in MP state PLAYING
[16:42:29][I][adf_media_player:256]: current mp state: PLAYING
[16:42:29][I][adf_media_player:257]: anouncement: false
[16:42:29][I][adf_media_player:258]: play_intent: false
[16:42:29][I][adf_media_player:259]: current_uri_: yes
[16:42:29][D][voice_assistant:504]: State changed from STREAMING_RESPONSE to IDLE
[16:42:29][D][voice_assistant:510]: Desired state set to IDLE
[16:42:30][I][esp_audio_sources:033][http]: Receive http event: 2
[16:42:30][I][esp_audio_sources:033][http]: Receive http event: 4
[16:42:31][D][esp-idf:000][http]: I (269820328) HTTP_STREAM: total_bytes=19527

[16:42:31][I][HTTPStreamReader:230]: Codec Format reported: 3.
[16:42:31][I][esp_adf_pipeline:132]: [ http ] status: 12
[16:42:31][I][esp_adf_pipeline:132]: [ decoder ] status: 12
[16:42:31][I][HTTPStreamReader:240]: [ * ] Receive music info from decoder, sample_rates=44100, bits=16, ch=2
[16:42:31][I][HTTPStreamReader:243]: [ * ] Receive music info from decoder, codec_fmt=3, bps=192000, duration=0, bytes=-93
[16:42:31][D][esp-idf:000][http]: W (269820772) HTTP_STREAM: No more data,errno:0, total_bytes:19527, rlen = 0

[16:42:31][I][esp_audio_sources:033][http]: Receive http event: 7
[16:42:31][D][esp-idf:000][http]: I (269820778) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0

[16:42:31][I][esp_adf_pipeline:123]: [ http ] byte_pos: 0, total: 19527
[16:42:31][I][esp_adf_pipeline:132]: [ http ] status: 15
[16:42:31][I][esp_adf_pipeline:135]: current state: RUNNING
[16:42:31][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from RUNNING to FINISHING. (REQ: 0)
[16:42:31][I][adf_media_player:192]: got new pipeline state: 7, while in MP state PLAYING
[16:42:31][I][adf_media_player:256]: current mp state: PLAYING
[16:42:31][I][adf_media_player:257]: anouncement: false
[16:42:31][I][adf_media_player:258]: play_intent: false
[16:42:31][I][adf_media_player:259]: current_uri_: yes
[16:42:31][D][esp-idf:000][decoder]: I (269820931) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2

[16:42:31][D][esp-idf:000][decoder]: I (269821021) MP3_DECODER: Closed

[16:42:31][I][esp_adf_pipeline:123]: [ decoder ] byte_pos: 0, total: -93
[16:42:31][I][esp_adf_pipeline:132]: [ decoder ] status: 15
[16:42:31][I][esp_adf_pipeline:135]: current state: FINISHING
[16:42:31][D][esp-idf:000][resampler]: I (269821052) AUDIO_ELEMENT: IN-[resampler] AEL_IO_DONE,-2

[16:42:31][I][esp_adf_pipeline:132]: [ resampler ] status: 15
[16:42:31][I][esp_adf_pipeline:135]: current state: FINISHING
[16:42:31][D][esp-idf:000][i2s_out]: I (269821131) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[16:42:32][I][esp_adf_pipeline:123]: [ i2s_out ] byte_pos: 0, total: 0
[16:42:32][I][esp_adf_pipeline:132]: [ i2s_out ] status: 15
[16:42:32][I][esp_adf_pipeline:135]: current state: FINISHING
[16:42:32][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from FINISHING to STOPPED. (REQ: 1)
[16:42:32][I][adf_media_player:192]: got new pipeline state: 4, while in MP state PLAYING
[16:42:32][I][adf_media_player:256]: current mp state: IDLE
[16:42:32][I][adf_media_player:257]: anouncement: false
[16:42:32][I][adf_media_player:258]: play_intent: false
[16:42:32][I][adf_media_player:259]: current_uri_: yes
TheStigh commented 2 days ago
  1. Serve the wakeup sound over http rather than https, and use the IP address instead of a hostname, since the DNS lookup costs you some milliseconds. Do this and the wakeup response will be immediate!

  2. The wait for the response is a combination of things. The first is https again, but here you can't do anything, because the ESP32 firmware is pointed at your external URL (really bad from the HA team!); it should at least point to the internal URL. Https and downloading mp3 files is not a good combination.

As for the second part, I'm about to test something this weekend; hopefully it will improve the response time.
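
For anyone who wants to put a number on the delay, here is a rough Python sketch that reads it off the device log. It assumes the number in parentheses on the `esp-idf` lines is a millisecond tick since boot, and it treats the gap between `MP3_DECODER: MP3 opened` and the next `HTTP_STREAM: total_bytes=` line as the time spent setting up the connection (DNS, TLS, request). That interpretation is a guess based on the log above, not something confirmed from the component source. `fetch_delays_ms` and `SAMPLE` are names made up for this example; the sample lines are copied from the log in the first post.

```python
import re

# Matches lines such as:
# [16:42:20][D][esp-idf:000][decoder]: I (269809911) MP3_DECODER: MP3 opened
TICK_RE = re.compile(
    r"\[esp-idf:\d+\]\[(?P<element>\w+)\]: [IWE] "
    r"\((?P<tick>\d+)\) (?P<tag>\w+): (?P<msg>.*)"
)


def fetch_delays_ms(log_lines):
    """Return the tick gaps between 'MP3 opened' and the next HTTP response."""
    delays = []
    opened_at = None
    for line in log_lines:
        match = TICK_RE.search(line)
        if not match:
            continue
        tick = int(match.group("tick"))
        tag, msg = match.group("tag"), match.group("msg")
        if tag == "MP3_DECODER" and msg.startswith("MP3 opened"):
            # Decoder is ready and waiting for data from the http element.
            opened_at = tick
        elif (
            tag == "HTTP_STREAM"
            and msg.startswith("total_bytes=")
            and opened_at is not None
        ):
            # First sign that the server has answered the request.
            delays.append(tick - opened_at)
            opened_at = None
    return delays


SAMPLE = [
    "[16:42:16][D][esp-idf:000][decoder]: I (269805792) MP3_DECODER: MP3 opened",
    "[16:42:18][D][esp-idf:000][http]: I (269807534) HTTP_STREAM: total_bytes=19527",
    "[16:42:20][D][esp-idf:000][decoder]: I (269809911) MP3_DECODER: MP3 opened",
    "[16:42:23][D][esp-idf:000][http]: I (269812490) HTTP_STREAM: total_bytes=14169",
]

print(fetch_delays_ms(SAMPLE))  # [1742, 2579]
```

On those sample lines the wakeup sound waits about 1.7 s and the TTS response about 2.6 s before any audio data arrives, which matches the pauses visible in the timestamps.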

witold-gren commented 2 days ago

@TheStigh Thanks for your answer, I will try your recommendations. PS: The real DNS name of my HA instance is completely different; I had to hide it so that the domain is not made public :)

witold-gren commented 2 days ago

I can confirm that after replacing https with http it works really fast. I also removed https from the internal URL and switched it from the domain name to the internal IP address directly, and now the Piper response is really fast too. Home Assistant configuration file:

homeassistant:
  name: Mieszkanie
  ...
  external_url: "https://MY_EXTERNAL_DOMAIN"
  internal_url: "http://10.20.1.4:8123"
...
http:
  server_port: 8123
  use_x_forwarded_for: true
  trusted_proxies:
    - 127.0.0.1
    - 10.20.0.0/23
  # IMPORTANT: removed these SSL certificate settings
  # ssl_certificate: /ssl/fullchain.pem
  # ssl_key: /ssl/privkey.pem
  ip_ban_enabled: true
  login_attempts_threshold: 10

Configuration of Onju Voice:

substitutions:
  name: salon-onju-voice
  friendly_name: Salon Onju Voice
  project_version: "1.1.0"
  device_description: "Onju Voice Satellite with ESPHome software and microWakeWord"
  wakeup_sound_url: "http://10.20.1.4:8123/local/sounds/wakeup.mp3"
  error_sound_url: "http://10.20.1.4:8123/local/sounds/error.mp3"
  timer_finished_sound_url: "http://10.20.1.4:8123/local/sounds/timer_finished.mp3"
...

Below you can find logs from my Onju Voice device:

INFO ESPHome 2024.9.0
INFO Reading configuration /config/esphome/kuchnia-onju-voice.yaml...
INFO Starting log output from 10.20.0.77 using esphome API
INFO Successfully connected to kuchnia-onju-voice @ 10.20.0.77 in 0.192s
INFO Successful handshake with kuchnia-onju-voice @ 10.20.0.77 in 0.072s
[21:35:15][I][app:100]: ESPHome version 2024.9.0 compiled on Sep 20 2024, 20:47:51
[21:35:15][I][app:102]: Project tetele.onju_voice_satellite version 1.1.0
[21:35:15][C][wifi:600]: WiFi:
[21:35:15][C][wifi:428]:   Local MAC: 64:E8:33:47:7A:98
[21:35:15][C][wifi:433]:   SSID: [redacted]
[21:35:15][C][wifi:436]:   IP Address: 10.20.0.77
[21:35:15][C][wifi:440]:   BSSID: [redacted]
[21:35:15][C][wifi:441]:   Hostname: 'kuchnia-onju-voice'
[21:35:15][C][wifi:443]:   Signal strength: -67 dB ▂▄▆█
[21:35:15][C][wifi:447]:   Channel: 4
[21:35:15][C][wifi:448]:   Subnet: 255.255.254.0
[21:35:15][C][wifi:449]:   Gateway: 10.20.0.1
[21:35:15][C][wifi:450]:   DNS1: 10.20.0.1
[21:35:15][C][wifi:451]:   DNS2: 0.0.0.0
[21:35:15][C][logger:185]: Logger:
[21:35:15][C][logger:186]:   Level: DEBUG
[21:35:15][C][logger:188]:   Log Baud Rate: 115200
[21:35:15][C][logger:189]:   Hardware UART: USB_SERIAL_JTAG
[21:35:15][C][template.number:050]: Template Number 'Touch threshold percentage'
[21:35:15][C][template.number:051]:   Optimistic: YES
[21:35:15][C][template.number:052]:   Update Interval: never
[21:35:15][C][esp32_rmt_led_strip:187]: ESP32 RMT LED Strip:
[21:35:15][C][esp32_rmt_led_strip:188]:   Pin: 11
[21:35:15][C][esp32_rmt_led_strip:189]:   Channel: 0
[21:35:15][C][esp32_rmt_led_strip:214]:   RGB Order: GRB
[21:35:15][C][esp32_rmt_led_strip:215]:   Max refresh rate: 0
[21:35:15][C][esp32_rmt_led_strip:216]:   Number of LEDs: 6
[21:35:15][C][switch.gpio:068]: GPIO Switch 'dac_mute'
[21:35:15][C][switch.gpio:091]:   Restore Mode: always OFF
[21:35:15][C][switch.gpio:031]:   Pin: GPIO21
[21:35:15][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Disable wake word'
[21:35:15][C][gpio.binary_sensor:016]:   Pin: GPIO38
[21:35:15][C][light:103]: Light 'leds'
[21:35:15][C][light:105]:   Default Transition Length: 0.0s
[21:35:15][C][light:106]:   Gamma Correct: 2.80
[21:35:15][C][light:103]: Light 'left_led'
[21:35:15][C][light:105]:   Default Transition Length: 0.1s
[21:35:15][C][light:106]:   Gamma Correct: 2.80
[21:35:15][C][light:103]: Light 'top_led'
[21:35:15][C][light:105]:   Default Transition Length: 0.1s
[21:35:15][C][light:106]:   Gamma Correct: 2.80
[21:35:15][C][light:103]: Light 'right_led'
[21:35:15][C][light:105]:   Default Transition Length: 0.1s
[21:35:15][C][light:106]:   Gamma Correct: 2.80
[21:35:15][C][template.switch:068]: Template Switch 'Use Wake Word'
[21:35:15][C][template.switch:091]:   Restore Mode: restore defaults to ON
[21:35:15][C][template.switch:057]:   Optimistic: YES
[21:35:15][C][template.switch:068]: Template Switch 'Wake Word Listening Light'
[21:35:15][C][template.switch:091]:   Restore Mode: restore defaults to ON
[21:35:15][C][template.switch:057]:   Optimistic: YES
[21:35:15][C][psram:020]: PSRAM:
[21:35:15][C][psram:021]:   Available: YES
[21:35:15][C][psram:024]:   Size: 8191 KB
[21:35:15][C][i2s_audio:028]: I2SController:
[21:35:15][C][i2s_audio:029]:   AccessMode: duplex
[21:35:15][C][i2s_audio:030]:   Port: 0
[21:35:15][C][i2s_audio:032]:   Reader registered.
[21:35:15][C][i2s_audio:035]:   Writer registered.
[21:35:15][C][i2s_audio:139]: I2S-Writer (Fixed-CFG):
[21:35:15][C][i2s_audio:141]:   sample-rate: 16000 bits_per_sample: 32
[21:35:15][C][i2s_audio:142]:   channel_fmt: 4 channels: 1
[21:35:15][C][i2s_audio:143]:   use_apll: no, use_pdm: no
[21:35:15][C][i2s_audio:136]: I2S-Reader (Fixed-CFG):
[21:35:15][C][i2s_audio:141]:   sample-rate: 16000 bits_per_sample: 32
[21:35:15][C][i2s_audio:142]:   channel_fmt: 4 channels: 1
[21:35:15][C][i2s_audio:143]:   use_apll: no, use_pdm: no
[21:35:15][C][esp32_touch:073]: Config for ESP32 Touch Hub:
[21:35:15][C][esp32_touch:074]:   Meas cycle: 0.80ms
[21:35:15][C][esp32_touch:075]:   Sleep cycle: 2.00ms
[21:35:15][C][esp32_touch:095]:   Low Voltage Reference: 0.8V
[21:35:15][C][esp32_touch:115]:   High Voltage Reference: 2.4V
[21:35:15][C][esp32_touch:135]:   Voltage Attenuation: 0V
[21:35:15][C][esp32_touch:169]:   Filter mode: IIR_16
[21:35:15][C][esp32_touch:170]:   Debounce count: 2
[21:35:15][C][esp32_touch:171]:   Noise threshold coefficient: 0
[21:35:15][C][esp32_touch:172]:   Jitter filter step size: 0
[21:35:15][C][esp32_touch:191]:   Smooth level: IIR_2
[21:35:15][C][esp32_touch:213]:   Denoise grade: BIT8
[21:35:15][C][esp32_touch:245]:   Denoise capacitance level: L0
[21:35:15][C][esp32_touch:260]:   Touch Pad 'volume_down'
[21:35:15][C][esp32_touch:261]:     Pad: T4
[21:35:15][C][esp32_touch:262]:     Threshold: 477347
[21:35:15][C][esp32_touch:260]:   Touch Pad 'volume_up'
[21:35:15][C][esp32_touch:261]:     Pad: T2
[21:35:15][C][esp32_touch:262]:     Threshold: 545831
[21:35:15][C][esp32_touch:260]:   Touch Pad 'action'
[21:35:15][C][esp32_touch:261]:     Pad: T3
[21:35:15][C][esp32_touch:262]:     Threshold: 702088
[21:35:15][C][captive_portal:089]: Captive Portal:
[21:35:15][C][mdns:116]: mDNS:
[21:35:15][C][mdns:117]:   Hostname: kuchnia-onju-voice
[21:35:15][C][esphome.ota:073]: Over-The-Air updates:
[21:35:15][C][esphome.ota:074]:   Address: kuchnia-onju-voice.local:3232
[21:35:15][C][esphome.ota:075]:   Version: 2
[21:35:15][C][safe_mode:018]: Safe Mode:
[21:35:15][C][safe_mode:020]:   Boot considered successful after 60 seconds
[21:35:15][C][safe_mode:021]:   Invoke after 10 boot attempts
[21:35:15][C][safe_mode:023]:   Remain in safe mode for 300 seconds
[21:35:15][C][api:139]: API Server:
[21:35:15][C][api:140]:   Address: kuchnia-onju-voice.local:6053
[21:35:15][C][api:142]:   Using noise encryption: YES
[21:35:15][C][improv_serial:032]: Improv Serial:
[21:35:15][C][micro_wake_word:051]: microWakeWord:
[21:35:15][C][micro_wake_word:052]:   models:
[21:35:15][C][micro_wake_word:015]:     - Wake Word: hey jarvis
[21:35:15][C][micro_wake_word:016]:       Probability cutoff: 0.970
[21:35:15][C][micro_wake_word:017]:       Sliding window size: 5
[21:35:15][C][micro_wake_word:021]:     - VAD Model
[21:35:15][C][micro_wake_word:022]:       Probability cutoff: 0.500
[21:35:15][C][micro_wake_word:023]:       Sliding window size: 5
[21:35:15][C][esp_adf_pipeline.microphone:020]: ADF-Microphone
[21:35:15][C][adf_media_player:016]: ESP-ADF-MediaPlayer:
[21:35:15][C][adf_media_player:018]:   MP_ANNOUNCE enabled
[21:35:15][C][adf_media_player:024]:   Number of ADFComponents: 3
[21:35:28][D][micro_wake_word:162]: The 'hey jarvis' model sliding average probability is 0.987 and most recent probability is 1.000
[21:35:28][D][micro_wake_word:123]: Wake Word 'hey jarvis' Detected
[21:35:28][D][micro_wake_word:195]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[21:35:28][D][micro_wake_word:129]: Stopping Microphone
[21:35:28][D][esp_adf_pipeline:070]: Called 'stop' while in RUNNING state.
[21:35:28][D][micro_wake_word:195]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[21:35:28][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from RUNNING to ABORTING. (REQ: 1)
[21:35:28][D][adf_audio_element:324]: [i2s_in] Checking State for stopping, got 3
[21:35:28][D][esp-idf:000][i2s_in]: W (1916424) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[21:35:28][D][esp-idf:000][i2s_in]: W (1916427) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[21:35:28][D][esp-idf:000][i2s_in]: W (1916430) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[21:35:28][D][esp-idf:000][i2s_in]: W (1916433) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[21:35:28][D][esp-idf:000][i2s_in]: W (1916436) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[21:35:28][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from ABORTING to STOPPED. (REQ: 1)
[21:35:28][D][micro_wake_word:195]: State changed from STOPPING_MICROPHONE to IDLE
[21:35:28][D][media_player:061]: 'Kuchnia Onju Voice' - Setting
[21:35:28][D][media_player:068]:   Media URL: http://10.20.1.4:8123/local/sounds/wakeup.mp3
[21:35:28][D][esp_audio_sources:098]: Set new uri: http://10.20.1.4:8123/local/sounds/wakeup.mp3
[21:35:28][D][adf_media_player:057]: Got control call in state IDLE
[21:35:28][D][adf_media_player:058]: req_track stream uri: http://10.20.1.4:8123/local/sounds/wakeup.mp3
[21:35:28][D][esp_adf_pipeline:060]: Starting request, current state STOPPED
[21:35:28][D][voice_assistant:514]: State changed from IDLE to START_MICROPHONE
[21:35:28][D][voice_assistant:520]: Desired state set to START_PIPELINE
[21:35:28][D][voice_assistant:226]: Starting Microphone
[21:35:28][D][esp_adf_pipeline.microphone:025]: start request while ine state 0
[21:35:28][D][esp_adf_pipeline:060]: Starting request, current state STOPPED
[21:35:28][D][voice_assistant:514]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[21:35:28][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from STOPPED to PREPARING. (REQ: 0)
[21:35:28][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from STOPPED to PREPARING. (REQ: 0)
[21:35:28][I][adf_media_player:192]: got new pipeline state: 3, while in MP state IDLE
[21:35:28][D][adf_i2s_out:141]: Set final i2s settings: 16000
[21:35:28][D][esp_audio_processors:124]: Current settings: SRC: rate: 22050, ch: 1 bits: 16, DST: rate: 16000, ch: 1, bits 16
[21:35:28][I][adf_media_player:256]: current mp state: PLAYING
[21:35:28][I][adf_media_player:257]: anouncement: false
[21:35:28][I][adf_media_player:258]: play_intent: false
[21:35:28][I][adf_media_player:259]: current_uri_: yes
[21:35:28][D][adf_audio_element:108]: Preparing [i2s_in]...
[21:35:28][D][esp_audio_sources:103]: Prepare elements called (initial_call)!
[21:35:28][D][esp_audio_sources:137]: Use fixed settings: no
[21:35:28][D][esp_audio_sources:138]: Streamer status: 6
[21:35:28][D][esp_audio_sources:139]: decoder status: 6
[21:35:28][D][esp_audio_sources:140]: stream uri: http://10.20.1.4:8123/local/sounds/wakeup.mp3
[21:35:28][D][adf_audio_element:108]: Preparing [http]...
[21:35:28][D][adf_audio_element:108]: Preparing [decoder]...
[21:35:28][D][adf_audio_element:108]: Preparing [pcm_reader]...
[21:35:28][D][adf_audio_element:108]: Preparing [resampler]...
[21:35:28][D][adf_audio_element:108]: Preparing [i2s_out]...
[21:35:28][D][esp_adf_pipeline:342]: wait for preparation, done
[21:35:28][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[21:35:28][D][adf_audio_element:165]: Resuming [i2s_in]...
[21:35:28][D][adf_audio_element:172]: [i2s_in] Sending resume command.
[21:35:28][D][esp-idf:000][i2s_in]: I (1916574) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[21:35:28][D][adf_audio_element:191]: [i2s_in] Checking State, got 78
[21:35:28][I][esp_adf_pipeline:132]: [ i2s_in ] status: 12
[21:35:28][D][adf_audio_element:165]: Resuming [http]...
[21:35:28][D][adf_audio_element:172]: [http] Sending resume command.
[21:35:28][D][adf_audio_element:165]: Resuming [decoder]...
[21:35:28][D][adf_audio_element:172]: [decoder] Sending resume command.
[21:35:28][D][adf_audio_element:191]: [pcm_reader] Checking State, got 65
[21:35:28][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from STARTING to RUNNING. (REQ: 0)
[21:35:28][D][voice_assistant:514]: State changed from STARTING_MICROPHONE to START_PIPELINE
[21:35:28][D][voice_assistant:280]: Requesting start...
[21:35:28][D][voice_assistant:514]: State changed from START_PIPELINE to STARTING_PIPELINE
[21:35:28][D][adf_audio_element:191]: [http] Checking State, got 79
[21:35:28][D][adf_audio_element:191]: [decoder] Checking State, got 79
[21:35:28][D][voice_assistant:535]: Client started, streaming microphone
[21:35:28][D][voice_assistant:514]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[21:35:28][D][voice_assistant:520]: Desired state set to STREAMING_MICROPHONE
[21:35:28][I][HTTPStreamReader:230]: Codec Format reported: 3.
[21:35:28][D][voice_assistant:637]: Event Type: 1
[21:35:28][D][voice_assistant:640]: Assist Pipeline running
[21:35:28][D][voice_assistant:637]: Event Type: 3
[21:35:28][D][voice_assistant:651]: STT started
[21:35:28][D][light:036]: 'top_led' Setting:
[21:35:28][D][light:051]:   Brightness: 100%
[21:35:28][D][light:059]:   Red: 100%, Green: 100%, Blue: 100%
[21:35:28][D][light:109]:   Effect: 'listening'
[21:35:28][I][HTTPStreamReader:240]: [ * ] Receive music info from decoder, sample_rates=44100, bits=16, ch=2
[21:35:28][I][HTTPStreamReader:243]: [ * ] Receive music info from decoder, codec_fmt=3, bps=192000, duration=0, bytes=-93
[21:35:28][D][adf_i2s_out:141]: Set final i2s settings: 16000
[21:35:28][D][esp_audio_processors:108]: Received request from: HTTPStreamReader
[21:35:28][D][esp_audio_processors:113]: New settings: SRC: rate: 44100, ch: 2 bits: 16, DST: rate: 16000, ch: 1, bits 16
[21:35:28][D][esp_audio_processors:124]: Current settings: SRC: rate: 44100, ch: 2 bits: 16, DST: rate: 16000, ch: 1, bits 16
[21:35:28][D][adf_audio_element:108]: Preparing [http]...
[21:35:28][D][adf_audio_element:108]: Preparing [decoder]...
[21:35:28][D][esp-idf:000][decoder]: W (1916719) AUDIO_ELEMENT: OUT-[decoder] AEL_IO_ABORT

[21:35:28][D][esp-idf:000][decoder]: W (1916722) MP3_DECODER: output aborted -3

[21:35:28][D][esp-idf:000][decoder]: I (1916728) MP3_DECODER: Closed

[21:35:28][D][esp_audio_sources:193]: Preparation done!
[21:35:28][D][esp_adf_pipeline:342]: wait for preparation, done
[21:35:28][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[21:35:28][I][adf_media_player:192]: got new pipeline state: 5, while in MP state PLAYING
[21:35:28][I][adf_media_player:256]: current mp state: PLAYING
[21:35:28][I][adf_media_player:257]: anouncement: false
[21:35:28][I][adf_media_player:258]: play_intent: false
[21:35:28][I][adf_media_player:259]: current_uri_: yes
[21:35:28][D][adf_audio_element:165]: Resuming [http]...
[21:35:28][D][adf_audio_element:172]: [http] Sending resume command.
[21:35:28][D][adf_audio_element:165]: Resuming [decoder]...
[21:35:28][D][adf_audio_element:172]: [decoder] Sending resume command.
[21:35:28][D][esp-idf:000][decoder]: I (1916774) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:5

[21:35:29][D][esp-idf:000][decoder]: I (1916778) MP3_DECODER: MP3 opened

[21:35:29][D][esp-idf:000][http]: I (1917080) HTTP_CLIENT: Body received in fetch header state, 0x3fccab9e, 1702

[21:35:29][D][esp-idf:000][http]: I (1917085) HTTP_STREAM: total_bytes=19527

[21:35:29][I][HTTPStreamReader:230]: Codec Format reported: 3.
[21:35:29][I][esp_adf_pipeline:132]: [ http ] status: 12
[21:35:29][I][esp_adf_pipeline:132]: [ decoder ] status: 12
[21:35:29][I][HTTPStreamReader:240]: [ * ] Receive music info from decoder, sample_rates=44100, bits=16, ch=2
[21:35:29][I][HTTPStreamReader:243]: [ * ] Receive music info from decoder, codec_fmt=3, bps=192000, duration=0, bytes=-93
[21:35:29][D][esp-idf:000][http]: W (1917524) HTTP_STREAM: No more data,errno:0, total_bytes:19527, rlen = 0

[21:35:29][I][esp_audio_sources:033][http]: Receive http event: 7
[21:35:29][D][esp-idf:000][http]: I (1917533) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0

[21:35:29][I][esp_adf_pipeline:123]: [ http ] byte_pos: 19527, total: 19527
[21:35:29][I][esp_adf_pipeline:132]: [ http ] status: 15
[21:35:29][I][esp_adf_pipeline:135]: current state: RUNNING
[21:35:29][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from RUNNING to FINISHING. (REQ: 0)
[21:35:29][I][adf_media_player:192]: got new pipeline state: 7, while in MP state PLAYING
[21:35:29][I][adf_media_player:256]: current mp state: PLAYING
[21:35:29][I][adf_media_player:257]: anouncement: false
[21:35:29][I][adf_media_player:258]: play_intent: false
[21:35:29][I][adf_media_player:259]: current_uri_: yes
[21:35:29][D][esp-idf:000][decoder]: I (1917684) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2

[21:35:29][D][esp-idf:000][decoder]: I (1917770) MP3_DECODER: Closed

[21:35:29][I][esp_adf_pipeline:123]: [ decoder ] byte_pos: 0, total: -93
[21:35:29][I][esp_adf_pipeline:132]: [ decoder ] status: 15
[21:35:29][I][esp_adf_pipeline:135]: current state: FINISHING
[21:35:29][D][esp-idf:000][resampler]: I (1917801) AUDIO_ELEMENT: IN-[resampler] AEL_IO_DONE,-2

[21:35:29][I][esp_adf_pipeline:132]: [ resampler ] status: 15
[21:35:29][I][esp_adf_pipeline:135]: current state: FINISHING
[21:35:29][D][esp-idf:000][i2s_out]: I (1917880) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[21:35:30][I][esp_adf_pipeline:123]: [ i2s_out ] byte_pos: 0, total: 0
[21:35:30][I][esp_adf_pipeline:132]: [ i2s_out ] status: 15
[21:35:30][I][esp_adf_pipeline:135]: current state: FINISHING
[21:35:30][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from FINISHING to STOPPED. (REQ: 1)
[21:35:30][I][adf_media_player:192]: got new pipeline state: 4, while in MP state PLAYING
[21:35:30][I][adf_media_player:256]: current mp state: IDLE
[21:35:30][I][adf_media_player:257]: anouncement: false
[21:35:30][I][adf_media_player:258]: play_intent: false
[21:35:30][I][adf_media_player:259]: current_uri_: yes
[21:35:30][D][voice_assistant:637]: Event Type: 11
[21:35:30][D][voice_assistant:793]: Starting STT by VAD
[21:35:32][D][voice_assistant:637]: Event Type: 12
[21:35:32][D][voice_assistant:797]: STT by VAD end
[21:35:32][D][voice_assistant:514]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[21:35:32][D][voice_assistant:520]: Desired state set to AWAITING_RESPONSE
[21:35:32][D][esp_adf_pipeline:070]: Called 'stop' while in RUNNING state.
[21:35:32][D][voice_assistant:514]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[21:35:32][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from RUNNING to ABORTING. (REQ: 1)
[21:35:32][D][light:036]: 'top_led' Setting:
[21:35:32][D][light:051]:   Brightness: 70%
[21:35:32][D][light:059]:   Red: 0%, Green: 20%, Blue: 100%
[21:35:32][D][light:109]:   Effect: 'processing'
[21:35:32][D][adf_audio_element:324]: [i2s_in] Checking State for stopping, got 3
[21:35:32][D][adf_audio_element:324]: [pcm_reader] Checking State for stopping, got 3
[21:35:32][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from ABORTING to STOPPED. (REQ: 1)
[21:35:32][D][voice_assistant:514]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[21:35:32][D][voice_assistant:637]: Event Type: 4
[21:35:32][D][voice_assistant:665]: Speech recognised as: "zgaś światło w kuchni"
[21:35:32][D][voice_assistant:637]: Event Type: 5
[21:35:32][D][voice_assistant:670]: Intent started
[21:35:32][D][voice_assistant:637]: Event Type: 6
[21:35:32][D][voice_assistant:637]: Event Type: 7
[21:35:32][D][voice_assistant:693]: Response: "Wyłączono światło"
[21:35:32][D][voice_assistant:637]: Event Type: 8
[21:35:32][D][voice_assistant:715]: Response URL: "http://10.20.1.4:8123/api/tts_proxy/721138292ba7c094c6131830fda7bea4a1865fb4_pl-pl_6d43988cf6_tts.piper.mp3"
[21:35:32][D][voice_assistant:514]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[21:35:32][D][voice_assistant:520]: Desired state set to STREAMING_RESPONSE
[21:35:32][D][media_player:061]: 'Kuchnia Onju Voice' - Setting
[21:35:32][D][media_player:068]:   Media URL: http://10.20.1.4:8123/api/tts_proxy/721138292ba7c094c6131830fda7bea4a1865fb4_pl-pl_6d43988cf6_tts.piper.mp3
[21:35:32][D][media_player:074]:  Announcement: yes
[21:35:32][D][adf_media_player:057]: Got control call in state IDLE
[21:35:32][D][adf_media_player:058]: req_track stream uri: http://10.20.1.4:8123/api/tts_proxy/721138292ba7c094c6131830fda7bea4a1865fb4_pl-pl_6d43988cf6_tts.piper.mp3
[21:35:32][D][esp_adf_pipeline:060]: Starting request, current state STOPPED
[21:35:32][D][light:036]: 'top_led' Setting:
[21:35:32][D][light:059]:   Red: 20%, Green: 100%, Blue: 0%
[21:35:32][D][light:109]:   Effect: 'speaking'
[21:35:32][D][voice_assistant:637]: Event Type: 2
[21:35:32][D][voice_assistant:729]: Assist Pipeline ended
[21:35:32][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from STOPPED to PREPARING. (REQ: 0)
[21:35:32][I][adf_media_player:192]: got new pipeline state: 3, while in MP state IDLE
[21:35:32][D][adf_i2s_out:141]: Set final i2s settings: 16000
[21:35:32][D][esp_audio_processors:124]: Current settings: SRC: rate: 44100, ch: 2 bits: 16, DST: rate: 16000, ch: 1, bits 16
[21:35:32][I][adf_media_player:256]: current mp state: ANNOUNCING
[21:35:32][I][adf_media_player:257]: anouncement: yes
[21:35:32][I][adf_media_player:258]: play_intent: false
[21:35:32][I][adf_media_player:259]: current_uri_: yes
[21:35:32][D][light:036]: 'top_led' Setting:
[21:35:32][D][light:051]:   Brightness: 60%
[21:35:32][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[21:35:32][D][light:109]:   Effect: 'listening_ww'
[21:35:32][D][micro_wake_word:399]: Resetting buffers and probabilities
[21:35:32][D][micro_wake_word:195]: State changed from IDLE to START_MICROPHONE
[21:35:32][D][micro_wake_word:107]: Starting Microphone
[21:35:32][D][esp_adf_pipeline.microphone:025]: start request while ine state 0
[21:35:32][D][esp_adf_pipeline:060]: Starting request, current state STOPPED
[21:35:32][D][micro_wake_word:195]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[21:35:32][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from STOPPED to PREPARING. (REQ: 0)
[21:35:32][D][esp_audio_sources:103]: Prepare elements called (initial_call)!
[21:35:32][D][esp_audio_sources:137]: Use fixed settings: no
[21:35:32][D][esp_audio_sources:138]: Streamer status: 6
[21:35:32][D][esp_audio_sources:139]: decoder status: 6
[21:35:32][D][esp_audio_sources:140]: stream uri: http://10.20.1.4:8123/api/tts_proxy/721138292ba7c094c6131830fda7bea4a1865fb4_pl-pl_6d43988cf6_tts.piper.mp3
[21:35:32][D][adf_audio_element:108]: Preparing [http]...
[21:35:32][D][adf_audio_element:108]: Preparing [decoder]...
[21:35:32][D][adf_audio_element:108]: Preparing [i2s_in]...
[21:35:32][D][adf_audio_element:108]: Preparing [resampler]...
[21:35:32][D][adf_audio_element:108]: Preparing [pcm_reader]...
[21:35:32][D][adf_audio_element:108]: Preparing [i2s_out]...
[21:35:32][D][esp_adf_pipeline:342]: wait for preparation, done
[21:35:32][D][esp_adf_pipeline:448]: [ADFMicrophone] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[21:35:32][D][adf_audio_element:165]: Resuming [i2s_in]...
[21:35:32][D][adf_audio_element:172]: [i2s_in] Sending resume command.
[21:35:32][D][esp-idf:000][i2s_in]: I (1920721) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[21:35:32][I][esp_audio_sources:033][http]: Receive http event: 1
[21:35:32][D][micro_wake_word:195]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[21:35:32][D][adf_audio_element:191]: [http] Checking State, got 79
[21:35:32][D][adf_audio_element:191]: [decoder] Checking State, got 79
[21:35:32][I][esp_audio_sources:033][http]: Receive http event: 2
[21:35:32][I][esp_audio_sources:033][http]: Receive http event: 4
[21:35:32][D][esp-idf:000][http]: I (1920778) HTTP_CLIENT: Body received in fetch header state, 0x3fcc8573, 1841

[21:35:32][D][esp-idf:000][http]: I (1920782) HTTP_STREAM: total_bytes=14223

[21:35:32][I][HTTPStreamReader:230]: Codec Format reported: 3.
[21:35:32][I][HTTPStreamReader:240]: [ * ] Receive music info from decoder, sample_rates=22050, bits=16, ch=1
[21:35:32][I][HTTPStreamReader:243]: [ * ] Receive music info from decoder, codec_fmt=3, bps=75000, duration=1384, bytes=-1147
[21:35:32][D][adf_i2s_out:141]: Set final i2s settings: 16000
[21:35:32][D][esp_audio_processors:108]: Received request from: HTTPStreamReader
[21:35:32][D][esp_audio_processors:113]: New settings: SRC: rate: 22050, ch: 1 bits: 16, DST: rate: 16000, ch: 1, bits 16
[21:35:32][D][esp_audio_processors:124]: Current settings: SRC: rate: 22050, ch: 1 bits: 16, DST: rate: 16000, ch: 1, bits 16
[21:35:32][D][adf_audio_element:108]: Preparing [http]...
[21:35:32][D][adf_audio_element:108]: Preparing [decoder]...
[21:35:32][D][esp-idf:000][decoder]: W (1920854) AUDIO_ELEMENT: OUT-[decoder] AEL_IO_ABORT

[21:35:32][D][esp-idf:000][decoder]: W (1920858) MP3_DECODER: output aborted -3

[21:35:32][D][esp-idf:000][decoder]: I (1920861) MP3_DECODER: Closed

[21:35:32][D][esp_audio_sources:193]: Preparation done!
[21:35:32][D][esp_adf_pipeline:342]: wait for preparation, done
[21:35:32][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[21:35:32][I][adf_media_player:192]: got new pipeline state: 5, while in MP state ANNOUNCING
[21:35:32][I][adf_media_player:256]: current mp state: ANNOUNCING
[21:35:32][I][adf_media_player:257]: anouncement: yes
[21:35:32][I][adf_media_player:258]: play_intent: false
[21:35:32][I][adf_media_player:259]: current_uri_: yes
[21:35:32][D][adf_audio_element:165]: Resuming [http]...
[21:35:32][D][adf_audio_element:172]: [http] Sending resume command.
[21:35:32][D][adf_audio_element:165]: Resuming [decoder]...
[21:35:33][D][adf_audio_element:172]: [decoder] Sending resume command.
[21:35:33][D][esp-idf:000][decoder]: I (1920988) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[21:35:33][D][esp-idf:000][decoder]: I (1921217) MP3_DECODER: MP3 opened

[21:35:33][D][esp-idf:000][http]: I (1921234) HTTP_CLIENT: Body received in fetch header state, 0x3fcca637, 1841

[21:35:33][D][esp-idf:000][http]: I (1921238) HTTP_STREAM: total_bytes=14223

[21:35:33][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from STARTING to RUNNING. (REQ: 0)
[21:35:33][I][adf_media_player:192]: got new pipeline state: 6, while in MP state ANNOUNCING
[21:35:33][I][adf_media_player:256]: current mp state: ANNOUNCING
[21:35:33][I][adf_media_player:257]: anouncement: yes
[21:35:33][I][adf_media_player:258]: play_intent: false
[21:35:33][I][adf_media_player:259]: current_uri_: yes
[21:35:33][I][HTTPStreamReader:230]: Codec Format reported: 3.
[21:35:33][I][esp_adf_pipeline:132]: [ http ] status: 12
[21:35:33][I][esp_adf_pipeline:132]: [ decoder ] status: 12
[21:35:33][I][HTTPStreamReader:240]: [ * ] Receive music info from decoder, sample_rates=22050, bits=16, ch=1
[21:35:33][I][HTTPStreamReader:243]: [ * ] Receive music info from decoder, codec_fmt=3, bps=75000, duration=1384, bytes=-1147
[21:35:33][D][esp-idf:000][http]: W (1921648) HTTP_STREAM: No more data,errno:0, total_bytes:14223, rlen = 0

[21:35:33][I][esp_audio_sources:033][http]: Receive http event: 7
[21:35:33][D][esp-idf:000][http]: I (1921659) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0

[21:35:33][I][esp_adf_pipeline:123]: [ http ] byte_pos: 0, total: 14223
[21:35:33][I][esp_adf_pipeline:132]: [ http ] status: 15
[21:35:33][I][esp_adf_pipeline:135]: current state: RUNNING
[21:35:33][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from RUNNING to FINISHING. (REQ: 0)
[21:35:33][I][adf_media_player:192]: got new pipeline state: 7, while in MP state ANNOUNCING
[21:35:33][I][adf_media_player:256]: current mp state: ANNOUNCING
[21:35:33][I][adf_media_player:257]: anouncement: yes
[21:35:33][I][adf_media_player:258]: play_intent: false
[21:35:33][I][adf_media_player:259]: current_uri_: yes
[21:35:33][D][esp-idf:000][decoder]: I (1921996) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2

[21:35:34][D][esp-idf:000][decoder]: I (1922440) MP3_DECODER: Closed

[21:35:34][I][esp_adf_pipeline:123]: [ decoder ] byte_pos: 0, total: -1147
[21:35:34][I][esp_adf_pipeline:132]: [ decoder ] status: 15
[21:35:34][I][esp_adf_pipeline:135]: current state: FINISHING
[21:35:34][D][esp-idf:000][resampler]: I (1922568) AUDIO_ELEMENT: IN-[resampler] AEL_IO_DONE,-2

[21:35:34][I][esp_adf_pipeline:132]: [ resampler ] status: 15
[21:35:34][I][esp_adf_pipeline:135]: current state: FINISHING
[21:35:34][D][esp-idf:000][i2s_out]: I (1922631) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[21:35:34][I][esp_adf_pipeline:123]: [ i2s_out ] byte_pos: 0, total: 0
[21:35:34][I][esp_adf_pipeline:132]: [ i2s_out ] status: 15
[21:35:34][I][esp_adf_pipeline:135]: current state: FINISHING
[21:35:34][D][esp_adf_pipeline:448]: [MediaPlayer] Pipeline changed from FINISHING to STOPPED. (REQ: 1)
[21:35:34][I][adf_media_player:192]: got new pipeline state: 4, while in MP state ANNOUNCING
[21:35:34][I][adf_media_player:256]: current mp state: IDLE
[21:35:34][I][adf_media_player:257]: anouncement: false
[21:35:34][I][adf_media_player:258]: play_intent: false
[21:35:34][I][adf_media_player:259]: current_uri_: yes
[21:35:36][D][voice_assistant:514]: State changed from STREAMING_RESPONSE to IDLE
[21:35:36][D][voice_assistant:520]: Desired state set to IDLE
witold-gren commented 2 days ago

@TheStigh I have one more question. It now works so quickly that the prefix "bo" is added to every sentence; it looks like the wake-up sound itself is being transcribed. Can I somehow delay listening for the command by 1 second? 😀

Currently we get "bo zgaś światło w sypialni", but it should be "zgaś światło w sypialni" ("turn off the light in the bedroom") 😀 I see this "bo" prefix in practically every sentence.

witold-gren commented 1 day ago

I found a solution. I just added delay: 500ms when the wake word is detected:

micro_wake_word:
  models:
    #- model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/alexa.json
    # - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/okay_nabu.json
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/hey_jarvis.json
    #- model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/hey_mycroft.json
  vad:
    model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/vad.json
  on_wake_word_detected:
    - if:
        condition: media_player.is_playing
        then:
          - media_player.pause
    - media_player.play_media: "${wakeup_sound_url}"
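    # Added delay so the wake-up sound is not picked up as part of the command
    - delay: 500ms
    # Start listening only once the media player is no longer playing
    - wait_until:
        not:
          media_player.is_playing: onju_out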
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
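
A possible alternative to the fixed delay, offered as an untested sketch. The log above shows voice_assistant moving to START_MICROPHONE while the MediaPlayer pipeline is still PREPARING, so the wait_until check may pass before the player reports that it is playing. Waiting first for playback to begin and then for it to end, each with a timeout, would cover that gap. This assumes that media_player.is_playing on onju_out follows the real playback state; both timeout values are guesses.

```yaml
micro_wake_word:
  # models: and vad: stay as in the config above
  on_wake_word_detected:
    - if:
        condition: media_player.is_playing
        then:
          - media_player.pause
    - media_player.play_media: "${wakeup_sound_url}"
    # Wait for playback to start; give up after 500ms (assumed value)
    - wait_until:
        condition:
          media_player.is_playing: onju_out
        timeout: 500ms
    # Then wait for the wake-up sound to finish; the 3s cap is an assumed value
    - wait_until:
        condition:
          not:
            media_player.is_playing: onju_out
        timeout: 3s
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
```

With the timeouts in place the automation still reaches voice_assistant.start even if the sound fails to play.
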
witold-gren commented 1 day ago

All problems have been solved 😀 I think it is worth adding this information to the README.md file:

  1. If possible, use an IP address instead of a domain name for the internal address.
  2. On the internal network, use http rather than https (the encrypted connection slows communication down considerably).
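
As a sketch, the two points applied to the substitutions from the original config could look like this. The address 10.20.1.4:8123 is the one visible in the log above for the wake-up sound; using the same address for the error and timer sounds is an assumption.

```yaml
substitutions:
  # Internal IP address instead of homeassistant.local, and http instead of https
  wakeup_sound_url: "http://10.20.1.4:8123/local/sounds/wakeup.mp3"
  error_sound_url: "http://10.20.1.4:8123/local/sounds/error.mp3"
  timer_finished_sound_url: "http://10.20.1.4:8123/local/sounds/timer_finished.mp3"
```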

cc: @tetele

TheStigh commented 1 day ago

Great that it worked out :) Though, how many seconds does it take from when you finish talking until you hear the response? I think there is more time to cut here, since the response is downloaded as an mp3 file, and that download also goes through the external URL, which is https (as you still use).

I think I've found a way to cut at least 1 second, but I will need help from somebody who knows the ESPHome code better than I do.

EDIT: @witold-gren I just looked through your logs and see that the response is played from your INTERNAL URL and not your EXTERNAL one? I'm baffled... So I changed my configuration.yaml to match yours, tested with http: and port 8123, and now it points to my INTERNAL URL too. But I don't understand WHY it changed from EXTERNAL to INTERNAL.

witold-gren commented 1 day ago

I posted a new video showing how fast it works now: https://youtu.be/h63s-1HTkN8?feature=shared

Unfortunately, since I added the 500ms delay, the LED lighting behaves a bit strangely, as you can see in the video. I don't know how to handle the listening delay properly, and so far I haven't found a reasonable solution.