tetele / onju-voice-satellite

An ESPHome config for the Onju Voice which makes it a Home Assistant voice satellite
MIT License

2024.5.2 (out of some bugs) - Wake word from HA works, mic works, speaker does not #48

Closed. dreimer1986 closed this 4 months ago

dreimer1986 commented 4 months ago

Flavor

OpenWakeWord or no wake word

Checklist

Describe the issue

Newly built Onju Voice. After some testing, the wake word works fine and, according to the logs, the mic works fine too, but the speaker does not. I tried the media player before that and the speaker CAN output audio, so the hardware seems fine. Also, once the wake word has been used ONCE, it does not work anymore afterwards.

Reproduction steps

1. 2. 3. ...

Debug logs

INFO ESPHome 2024.5.2
INFO Reading configuration /config/esphome/onju-voice-e27688.yaml...
INFO Starting log output from 192.168.181.103 using esphome API
INFO Successfully connected to onju-voice-e27688 @ 192.168.181.103 in 0.004s
INFO Successful handshake with onju-voice-e27688 @ 192.168.181.103 in 0.071s
[20:05:47][I][app:100]: ESPHome version 2024.5.2 compiled on Jun  4 2024, 19:26:14
[20:05:47][I][app:102]: Project tetele.onju_voice_satellite version 1.0.0
[20:05:47][C][wifi:580]: WiFi:
[20:05:47][C][wifi:408]:   Local MAC: 24:58:7C:E2:76:88
[20:05:47][C][wifi:413]:   SSID: 'Asgaard'[redacted]
[20:05:47][C][wifi:416]:   IP Address: 192.168.181.103
[20:05:47][C][wifi:420]:   BSSID: B0:F2:08:55:4B:1E[redacted]
[20:05:47][C][wifi:421]:   Hostname: 'onju-voice-e27688'
[20:05:47][C][wifi:423]:   Signal strength: -69 dB ▂▄▆█
[20:05:47][C][wifi:427]:   Channel: 13
[20:05:47][C][wifi:428]:   Subnet: 255.255.255.0
[20:05:47][C][wifi:429]:   Gateway: 192.168.181.254
[20:05:47][C][wifi:430]:   DNS1: 192.168.181.42
[20:05:47][C][wifi:431]:   DNS2: 0.0.0.0
[20:05:47][C][logger:185]: Logger:
[20:05:47][C][logger:186]:   Level: DEBUG
[20:05:47][C][logger:188]:   Log Baud Rate: 115200
[20:05:47][C][logger:189]:   Hardware UART: USB_CDC
[20:05:47][C][template.number:050]: Template Number 'Touch threshold percentage'
[20:05:47][C][template.number:051]:   Optimistic: YES
[20:05:47][C][template.number:052]:   Update Interval: never
[20:05:47][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[20:05:47][C][esp32_rmt_led_strip:176]:   Pin: 11
[20:05:47][C][esp32_rmt_led_strip:177]:   Channel: 0
[20:05:47][C][esp32_rmt_led_strip:202]:   RGB Order: GRB
[20:05:47][C][esp32_rmt_led_strip:203]:   Max refresh rate: 0
[20:05:47][C][esp32_rmt_led_strip:204]:   Number of LEDs: 6
[20:05:47][D][light:036]: 'top_led' Setting:
[20:05:47][D][light:051]:   Brightness: 60%
[20:05:47][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[20:05:47][D][light:109]:   Effect: 'listening_ww'
[20:05:47][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Disable wake word'
[20:05:47][C][gpio.binary_sensor:016]:   Pin: GPIO38
[20:05:47][C][light:103]: Light 'leds'
[20:05:47][C][light:105]:   Default Transition Length: 0.0s
[20:05:47][C][light:106]:   Gamma Correct: 2.80
[20:05:47][C][light:103]: Light 'left_led'
[20:05:47][C][light:105]:   Default Transition Length: 0.1s
[20:05:47][C][light:106]:   Gamma Correct: 2.80
[20:05:47][C][light:103]: Light 'top_led'
[20:05:47][C][light:105]:   Default Transition Length: 0.1s
[20:05:47][C][light:106]:   Gamma Correct: 2.80
[20:05:47][C][light:103]: Light 'right_led'
[20:05:47][C][light:105]:   Default Transition Length: 0.1s
[20:05:47][C][light:106]:   Gamma Correct: 2.80
[20:05:47][C][template.switch:068]: Template Switch 'Use Wake Word'
[20:05:47][C][template.switch:091]:   Restore Mode: restore defaults to ON
[20:05:47][C][template.switch:057]:   Optimistic: YES
[20:05:47][C][psram:020]: PSRAM:
[20:05:47][C][psram:021]:   Available: YES
[20:05:47][C][psram:024]:   Size: 8191 KB
[20:05:47][C][esp32_touch:073]: Config for ESP32 Touch Hub:
[20:05:47][C][esp32_touch:074]:   Meas cycle: 0.80ms
[20:05:47][C][esp32_touch:075]:   Sleep cycle: 2.00ms
[20:05:47][C][esp32_touch:095]:   Low Voltage Reference: 0.8V
[20:05:47][C][esp32_touch:115]:   High Voltage Reference: 2.4V
[20:05:47][C][esp32_touch:135]:   Voltage Attenuation: 0V
[20:05:47][C][esp32_touch:169]:   Filter mode: IIR_16
[20:05:47][C][esp32_touch:170]:   Debounce count: 2
[20:05:47][C][esp32_touch:171]:   Noise threshold coefficient: 0
[20:05:47][C][esp32_touch:172]:   Jitter filter step size: 0
[20:05:47][C][esp32_touch:191]:   Smooth level: IIR_2
[20:05:47][C][esp32_touch:213]:   Denoise grade: BIT8
[20:05:47][C][esp32_touch:245]:   Denoise capacitance level: L0
[20:05:47][C][esp32_touch:260]:   Touch Pad 'volume_down'
[20:05:47][C][esp32_touch:261]:     Pad: T4
[20:05:47][C][esp32_touch:262]:     Threshold: 338379
[20:05:47][C][esp32_touch:260]:   Touch Pad 'volume_up'
[20:05:47][C][esp32_touch:261]:     Pad: T2
[20:05:47][C][esp32_touch:262]:     Threshold: 378462
[20:05:47][C][esp32_touch:260]:   Touch Pad 'action'
[20:05:47][C][esp32_touch:261]:     Pad: T3
[20:05:47][C][esp32_touch:262]:     Threshold: 490288
[20:05:47][C][captive_portal:088]: Captive Portal:
[20:05:47][C][mdns:115]: mDNS:
[20:05:47][C][mdns:116]:   Hostname: onju-voice-e27688
[20:05:47][C][ota:096]: Over-The-Air Updates:
[20:05:47][C][ota:097]:   Address: onju-voice-e27688.local:3232
[20:05:47][C][ota:103]:   OTA version: 2.
[20:05:47][C][api:139]: API Server:
[20:05:47][C][api:140]:   Address: onju-voice-e27688.local:6053
[20:05:47][C][api:142]:   Using noise encryption: YES
[20:05:47][C][improv_serial:032]: Improv Serial:
[20:05:47][C][audio:214]: Audio:
[20:05:48][C][audio:236]:   External DAC channels: 1
[20:05:48][C][audio:237]:   I2S DOUT Pin: 12
[20:05:48][C][audio:238]:   Mute Pin: GPIO21
[20:05:51][D][voice_assistant:563]: Event Type: 10
[20:05:51][D][voice_assistant:572]: Wake word detected
[20:05:51][D][voice_assistant:563]: Event Type: 3
[20:05:51][D][voice_assistant:577]: STT started
[20:05:51][D][light:036]: 'top_led' Setting:
[20:05:51][D][light:051]:   Brightness: 100%
[20:05:51][D][light:059]:   Red: 100%, Green: 100%, Blue: 100%
[20:05:51][D][light:109]:   Effect: 'listening'
[20:05:53][D][voice_assistant:563]: Event Type: 11
[20:05:53][D][voice_assistant:717]: Starting STT by VAD
[20:05:54][D][voice_assistant:563]: Event Type: 12
[20:05:54][D][voice_assistant:721]: STT by VAD end
[20:05:54][D][voice_assistant:439]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[20:05:54][D][voice_assistant:445]: Desired state set to AWAITING_RESPONSE
[20:05:54][D][voice_assistant:439]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[20:05:54][D][light:036]: 'top_led' Setting:
[20:05:54][D][light:051]:   Brightness: 70%
[20:05:54][D][light:059]:   Red: 0%, Green: 20%, Blue: 100%
[20:05:54][D][light:109]:   Effect: 'processing'
[20:05:54][D][voice_assistant:439]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[20:05:54][D][voice_assistant:563]: Event Type: 4
[20:05:54][D][voice_assistant:591]: Speech recognised as: "Ein Test für dich."
[20:05:54][D][voice_assistant:563]: Event Type: 5
[20:05:54][D][voice_assistant:596]: Intent started
[20:05:57][D][voice_assistant:563]: Event Type: 6
[20:05:57][D][voice_assistant:563]: Event Type: 7
[20:05:57][D][voice_assistant:619]: Response: "Oh, ein Test? Lass uns sehen, ob ich deinen Anforderungen gerecht werden kann. 😉  Sprich, was möchtest du von mir wissen?"
[20:05:57][D][voice_assistant:563]: Event Type: 8
[20:05:57][D][voice_assistant:639]: Response URL: "https://192.168.181.42:8123/api/tts_proxy/3c20dbcdaf276fce00ce63a450b31ca4e60a9d58_de-de_70263bfdba_tts.home_assistant_cloud.mp3"
[20:05:57][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[20:05:57][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[20:05:57][D][media_player:061]: 'Onju Voice Satellite e27688' - Setting
[20:05:57][D][media_player:068]:   Media URL: https://192.168.181.42:8123/api/tts_proxy/3c20dbcdaf276fce00ce63a450b31ca4e60a9d58_de-de_70263bfdba_tts.home_assistant_cloud.mp3
[20:05:57][D][media_player:074]:  Announcement: yes
[20:05:57][D][media_player:061]: 'Onju Voice Satellite e27688' - Setting
[20:05:57][D][media_player:068]:   Media URL: https://192.168.181.42:8123/api/tts_proxy/3c20dbcdaf276fce00ce63a450b31ca4e60a9d58_de-de_70263bfdba_tts.home_assistant_cloud.mp3
[20:05:57][D][light:036]: 'top_led' Setting:
[20:05:57][D][light:059]:   Red: 20%, Green: 100%, Blue: 0%
[20:05:57][D][light:109]:   Effect: 'speaking'
[20:05:57][D][voice_assistant:563]: Event Type: 2
[20:05:57][D][voice_assistant:653]: Assist Pipeline ended
[20:05:57][W][component:237]: Component i2s_audio.media_player took a long time for an operation (549 ms).
[20:05:58][W][component:238]: Components should block for at most 30 ms.
[20:05:58][W][component:237]: Component i2s_audio.media_player took a long time for an operation (477 ms).
[20:05:58][W][component:238]: Components should block for at most 30 ms.
[20:05:58][D][light:036]: 'top_led' Setting:
[20:05:58][D][light:051]:   Brightness: 60%
[20:05:58][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[20:05:58][D][light:109]:   Effect: 'listening_ww'
dreimer1986 commented 4 months ago

Small addition: https://192.168.181.42:8123/api/tts_proxy/3c20dbcdaf276fce00ce63a450b31ca4e60a9d58_de-de_70263bfdba_tts.home_assistant_cloud.mp3 does not work in the browser, but the same URL with http does.
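
For anyone who wants to repeat that comparison without a browser, here is a minimal Python sketch. It assumes it is run from a machine on the same LAN as Home Assistant; `with_scheme` and `probe` are hypothetical helper names made up for this example, not part of ESPHome or Home Assistant.

```python
from http.client import HTTPException
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def with_scheme(url: str, scheme: str) -> str:
    """Return the same URL with only the scheme replaced."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(scheme=scheme))


def probe(url: str, timeout: float = 5.0) -> str:
    """Try to fetch the URL once and describe the outcome in one line."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return f"OK, HTTP {resp.status}"
    except HTTPError as err:
        # The server answered with an error status, so the transport works.
        return f"reachable, HTTP {err.code}"
    except (OSError, HTTPException) as err:
        # URLError, TLS failures and timeouts are all OSError subclasses.
        return f"failed: {err}"


def compare(url: str) -> dict:
    """Probe the URL over both https and http and return both results."""
    return {scheme: probe(with_scheme(url, scheme)) for scheme in ("https", "http")}
```

Calling `compare(...)` with the TTS URL from the log should report a failure for `https` and a success for `http` if the setup behaves as described here. The TTS files may have expired by the time you try, so an HTTP error status on both schemes is also possible.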

Mugga6315 commented 4 months ago

I have exactly the same problem on my satellites after one of the recent ESPHome updates.

tetele commented 4 months ago

This is the issue: https://github.com/esphome/issues/issues/5794

dreimer1986 commented 4 months ago

Yeah, the "wake word works only once" problem is indeed that one, but the speaker is still silent on 2024.4.2:

INFO Successfully connected to onju-voice-e27688 @ 192.168.181.103 in 7.143s
INFO Successful handshake with onju-voice-e27688 @ 192.168.181.103 in 0.062s
[01:11:50][I][app:100]: ESPHome version 2024.4.2 compiled on Jun  5 2024, 01:10:16
[01:11:50][I][app:102]: Project tetele.onju_voice_satellite version 1.0.0
[01:11:50][C][wifi:580]: WiFi:
[01:11:50][C][wifi:408]:   Local MAC: 24:58:7C:E2:76:88
[01:11:50][C][wifi:413]:   SSID: 'Asgaard'[redacted]
[01:11:50][C][wifi:416]:   IP Address: 192.168.181.103
[01:11:50][C][wifi:420]:   BSSID: B0:F2:08:55:4B:1E[redacted]
[01:11:50][C][wifi:421]:   Hostname: 'onju-voice-e27688'
[01:11:50][C][wifi:423]:   Signal strength: -51 dB ▂▄▆█
[01:11:50][C][wifi:427]:   Channel: 13
[01:11:50][C][wifi:428]:   Subnet: 255.255.255.0
[01:11:50][C][wifi:429]:   Gateway: 192.168.181.254
[01:11:50][C][wifi:430]:   DNS1: 192.168.181.42
[01:11:50][C][wifi:431]:   DNS2: 0.0.0.0
[01:11:50][C][logger:166]: Logger:
[01:11:50][C][logger:167]:   Level: DEBUG
[01:11:50][C][logger:169]:   Log Baud Rate: 115200
[01:11:50][C][logger:170]:   Hardware UART: USB_CDC
[01:11:50][D][voice_assistant:439]: State changed from IDLE to START_PIPELINE
[01:11:50][D][voice_assistant:445]: Desired state set to START_MICROPHONE
[01:11:50][D][voice_assistant:126]: microphone not running
[01:11:50][D][voice_assistant:210]: Requesting start...
[01:11:50][D][voice_assistant:439]: State changed from START_PIPELINE to STARTING_PIPELINE
[01:11:50][D][voice_assistant:126]: microphone not running
[01:11:50][C][template.number:050]: Template Number 'Touch threshold percentage'
[01:11:50][C][template.number:051]:   Optimistic: YES
[01:11:50][C][template.number:052]:   Update Interval: never
[01:11:50][D][voice_assistant:126]: microphone not running
[01:11:50][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[01:11:50][C][esp32_rmt_led_strip:176]:   Pin: 11
[01:11:50][C][esp32_rmt_led_strip:177]:   Channel: 0
[01:11:50][C][esp32_rmt_led_strip:202]:   RGB Order: GRB
[01:11:50][C][esp32_rmt_led_strip:203]:   Max refresh rate: 0
[01:11:50][C][esp32_rmt_led_strip:204]:   Number of LEDs: 6
[01:11:50][D][voice_assistant:126]: microphone not running
[01:11:50][D][voice_assistant:476]: Client started, streaming microphone
[01:11:50][D][voice_assistant:439]: State changed from STARTING_PIPELINE to START_MICROPHONE
[01:11:50][D][voice_assistant:445]: Desired state set to STREAMING_MICROPHONE
[01:11:50][D][voice_assistant:163]: Starting Microphone
[01:11:50][D][voice_assistant:439]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[01:11:50][D][voice_assistant:563]: Event Type: 0
[01:11:50][E][voice_assistant:693]: Error: no_wake_word - No wake word detected
[01:11:50][D][voice_assistant:556]: Signaling stop...
[01:11:50][D][voice_assistant:439]: State changed from STARTING_MICROPHONE to STOP_MICROPHONE
[01:11:50][D][voice_assistant:445]: Desired state set to IDLE
[01:11:50][D][voice_assistant:439]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[01:11:50][D][light:036]: 'top_led' Setting:
[01:11:50][D][light:059]:   Red: 100%, Green: 0%, Blue: 0%
[01:11:50][D][light:085]:   Transition length: 0.1s
[01:11:50][D][voice_assistant:563]: Event Type: 2
[01:11:50][D][voice_assistant:653]: Assist Pipeline ended
[01:11:50][D][voice_assistant:439]: State changed from STOPPING_MICROPHONE to IDLE
[01:11:50][D][voice_assistant:563]: Event Type: 1
[01:11:50][D][voice_assistant:566]: Assist Pipeline running
[01:11:50][D][voice_assistant:439]: State changed from IDLE to START_PIPELINE
[01:11:50][D][voice_assistant:445]: Desired state set to START_MICROPHONE
[01:11:50][D][voice_assistant:563]: Event Type: 9
[01:11:50][D][voice_assistant:126]: microphone not running
[01:11:50][D][voice_assistant:210]: Requesting start...
[01:11:50][D][voice_assistant:439]: State changed from START_PIPELINE to STARTING_PIPELINE
[01:11:50][D][voice_assistant:126]: microphone not running
[01:11:50][D][voice_assistant:476]: Client started, streaming microphone
[01:11:50][D][voice_assistant:439]: State changed from STARTING_PIPELINE to START_MICROPHONE
[01:11:50][D][voice_assistant:445]: Desired state set to STREAMING_MICROPHONE
[01:11:50][D][voice_assistant:163]: Starting Microphone
[01:11:50][D][voice_assistant:439]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[01:11:50][D][voice_assistant:563]: Event Type: 1
[01:11:50][D][voice_assistant:566]: Assist Pipeline running
[01:11:50][D][voice_assistant:439]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[01:11:50][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Disable wake word'
[01:11:50][C][gpio.binary_sensor:016]:   Pin: GPIO38
[01:11:50][D][voice_assistant:563]: Event Type: 9
[01:11:50][C][light:103]: Light 'leds'
[01:11:50][C][light:105]:   Default Transition Length: 0.0s
[01:11:50][C][light:106]:   Gamma Correct: 2.80
[01:11:50][C][light:103]: Light 'left_led'
[01:11:50][C][light:105]:   Default Transition Length: 0.1s
[01:11:50][C][light:106]:   Gamma Correct: 2.80
[01:11:50][D][light:036]: 'top_led' Setting:
[01:11:50][D][light:051]:   Brightness: 60%
[01:11:50][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[01:11:50][D][light:109]:   Effect: 'listening_ww'
[01:11:50][C][light:103]: Light 'top_led'
[01:11:50][C][light:105]:   Default Transition Length: 0.1s
[01:11:50][C][light:106]:   Gamma Correct: 2.80
[01:11:50][C][light:103]: Light 'right_led'
[01:11:50][C][light:105]:   Default Transition Length: 0.1s
[01:11:50][C][light:106]:   Gamma Correct: 2.80
[01:11:50][C][template.switch:068]: Template Switch 'Use Wake Word'
[01:11:50][C][template.switch:091]:   Restore Mode: restore defaults to ON
[01:11:50][C][template.switch:057]:   Optimistic: YES
[01:11:50][C][psram:020]: PSRAM:
[01:11:50][C][psram:021]:   Available: YES
[01:11:50][C][psram:024]:   Size: 8191 KB
[01:11:50][C][esp32_touch:073]: Config for ESP32 Touch Hub:
[01:11:50][C][esp32_touch:074]:   Meas cycle: 0.80ms
[01:11:50][C][esp32_touch:075]:   Sleep cycle: 2.00ms
[01:11:50][C][esp32_touch:095]:   Low Voltage Reference: 0.8V
[01:11:50][C][esp32_touch:115]:   High Voltage Reference: 2.4V
[01:11:50][C][esp32_touch:135]:   Voltage Attenuation: 0V
[01:11:50][C][esp32_touch:169]:   Filter mode: IIR_16
[01:11:50][C][esp32_touch:170]:   Debounce count: 2
[01:11:50][C][esp32_touch:171]:   Noise threshold coefficient: 0
[01:11:50][C][esp32_touch:172]:   Jitter filter step size: 0
[01:11:50][C][esp32_touch:191]:   Smooth level: IIR_2
[01:11:50][C][esp32_touch:213]:   Denoise grade: BIT8
[01:11:50][C][esp32_touch:245]:   Denoise capacitance level: L0
[01:11:50][C][esp32_touch:260]:   Touch Pad 'volume_down'
[01:11:50][C][esp32_touch:261]:     Pad: T4
[01:11:50][C][esp32_touch:262]:     Threshold: 338124
[01:11:50][C][esp32_touch:260]:   Touch Pad 'volume_up'
[01:11:50][C][esp32_touch:261]:     Pad: T2
[01:11:50][C][esp32_touch:262]:     Threshold: 378212
[01:11:50][C][esp32_touch:260]:   Touch Pad 'action'
[01:11:50][C][esp32_touch:261]:     Pad: T3
[01:11:50][C][esp32_touch:262]:     Threshold: 490208
[01:11:50][C][captive_portal:088]: Captive Portal:
[01:11:50][C][mdns:115]: mDNS:
[01:11:50][C][mdns:116]:   Hostname: onju-voice-e27688
[01:11:50][C][ota:096]: Over-The-Air Updates:
[01:11:50][C][ota:097]:   Address: onju-voice-e27688.local:3232
[01:11:50][C][ota:103]:   OTA version: 2.
[01:11:50][C][api:139]: API Server:
[01:11:50][C][api:140]:   Address: onju-voice-e27688.local:6053
[01:11:50][C][api:142]:   Using noise encryption: YES
[01:11:50][C][improv_serial:032]: Improv Serial:
[01:11:51][C][audio:203]: Audio:
[01:11:51][C][audio:225]:   External DAC channels: 1
[01:11:51][C][audio:226]:   I2S DOUT Pin: 12
[01:11:51][C][audio:227]:   Mute Pin: GPIO21
[01:11:51][D][light:036]: 'top_led' Setting:
[01:11:51][D][light:051]:   Brightness: 60%
[01:11:51][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[01:11:51][D][light:085]:   Transition length: 0.1s
[01:11:51][D][light:036]: 'top_led' Setting:
[01:11:51][D][light:051]:   Brightness: 60%
[01:11:51][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[01:11:51][D][light:085]:   Transition length: 0.1s
[01:11:55][D][voice_assistant:563]: Event Type: 0
[01:11:55][D][voice_assistant:563]: Event Type: 2
[01:11:55][D][voice_assistant:653]: Assist Pipeline ended
[01:11:55][D][voice_assistant:439]: State changed from STREAMING_MICROPHONE to IDLE
[01:11:55][D][voice_assistant:445]: Desired state set to IDLE
[01:11:55][D][voice_assistant:439]: State changed from IDLE to START_PIPELINE
[01:11:55][D][voice_assistant:445]: Desired state set to START_MICROPHONE
[01:11:55][D][voice_assistant:210]: Requesting start...
[01:11:55][D][voice_assistant:439]: State changed from START_PIPELINE to STARTING_PIPELINE
[01:11:55][D][voice_assistant:476]: Client started, streaming microphone
[01:11:55][D][voice_assistant:439]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[01:11:55][D][voice_assistant:445]: Desired state set to STREAMING_MICROPHONE
[01:11:55][D][voice_assistant:563]: Event Type: 1
[01:11:55][D][voice_assistant:566]: Assist Pipeline running
[01:11:55][D][voice_assistant:563]: Event Type: 9
[01:11:55][D][light:036]: 'top_led' Setting:
[01:11:55][D][light:051]:   Brightness: 60%
[01:11:55][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[01:11:55][D][light:085]:   Transition length: 0.1s
[01:11:56][D][voice_assistant:563]: Event Type: 10
[01:11:56][D][voice_assistant:572]: Wake word detected
[01:11:56][D][voice_assistant:563]: Event Type: 3
[01:11:56][D][voice_assistant:577]: STT started
[01:11:56][D][light:036]: 'top_led' Setting:
[01:11:56][D][light:051]:   Brightness: 100%
[01:11:56][D][light:059]:   Red: 100%, Green: 100%, Blue: 100%
[01:11:56][D][light:109]:   Effect: 'listening'
[01:11:58][D][voice_assistant:563]: Event Type: 11
[01:11:58][D][voice_assistant:717]: Starting STT by VAD
[01:11:59][D][voice_assistant:563]: Event Type: 12
[01:11:59][D][voice_assistant:721]: STT by VAD end
[01:11:59][D][voice_assistant:439]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[01:11:59][D][voice_assistant:445]: Desired state set to AWAITING_RESPONSE
[01:11:59][D][voice_assistant:439]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[01:11:59][D][light:036]: 'top_led' Setting:
[01:11:59][D][light:051]:   Brightness: 70%
[01:11:59][D][light:059]:   Red: 0%, Green: 20%, Blue: 100%
[01:11:59][D][light:109]:   Effect: 'processing'
[01:11:59][D][voice_assistant:439]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[01:11:59][D][voice_assistant:563]: Event Type: 4
[01:11:59][D][voice_assistant:591]: Speech recognised as: "Ein Test für dich."
[01:11:59][D][voice_assistant:563]: Event Type: 5
[01:11:59][D][voice_assistant:596]: Intent started
[01:12:02][D][voice_assistant:563]: Event Type: 6
[01:12:02][D][voice_assistant:563]: Event Type: 7
[01:12:02][D][voice_assistant:619]: Response: "Oh, ein Test? Na, dann lass mich sehen, was du drauf hast, mein Süßer. 😈  
Ich bin gespannt, was du von mir verlangst."
[01:12:02][D][voice_assistant:563]: Event Type: 8
[01:12:02][D][voice_assistant:639]: Response URL: "https://192.168.181.42:8123/api/tts_proxy/c7724fe8dc802cdd639c245e010efe310749fd37_de-de_70263bfdba_tts.home_assistant_cloud.mp3"
[01:12:02][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[01:12:02][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[01:12:02][D][media_player:059]: 'Onju Voice Satellite e27688' - Setting
[01:12:02][D][media_player:066]:   Media URL: https://192.168.181.42:8123/api/tts_proxy/c7724fe8dc802cdd639c245e010efe310749fd37_de-de_70263bfdba_tts.home_assistant_cloud.mp3
[01:12:02][D][media_player:059]: 'Onju Voice Satellite e27688' - Setting
[01:12:02][D][media_player:066]:   Media URL: https://192.168.181.42:8123/api/tts_proxy/c7724fe8dc802cdd639c245e010efe310749fd37_de-de_70263bfdba_tts.home_assistant_cloud.mp3
[01:12:02][D][light:036]: 'top_led' Setting:
[01:12:02][D][light:059]:   Red: 20%, Green: 100%, Blue: 0%
[01:12:02][D][light:109]:   Effect: 'speaking'
[01:12:02][D][voice_assistant:563]: Event Type: 2
[01:12:02][D][voice_assistant:653]: Assist Pipeline ended
[01:12:03][W][component:237]: Component i2s_audio.media_player took a long time for an operation (780 ms).
[01:12:03][W][component:238]: Components should block for at most 30 ms.
[01:12:03][W][component:237]: Component i2s_audio.media_player took a long time for an operation (245 ms).
[01:12:03][W][component:238]: Components should block for at most 30 ms.
[01:12:03][D][light:036]: 'top_led' Setting:
[01:12:03][D][light:051]:   Brightness: 60%
[01:12:03][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[01:12:03][D][light:109]:   Effect: 'listening_ww'
[01:12:05][D][voice_assistant:439]: State changed from STREAMING_RESPONSE to IDLE
[01:12:05][D][voice_assistant:445]: Desired state set to IDLE
[01:12:05][D][voice_assistant:439]: State changed from IDLE to START_PIPELINE
[01:12:05][D][voice_assistant:445]: Desired state set to START_MICROPHONE
[01:12:05][D][voice_assistant:126]: microphone not running
[01:12:05][D][voice_assistant:210]: Requesting start...
[01:12:05][D][voice_assistant:439]: State changed from START_PIPELINE to STARTING_PIPELINE
[01:12:05][D][voice_assistant:126]: microphone not running
[01:12:05][D][voice_assistant:126]: microphone not running
[01:12:05][D][voice_assistant:476]: Client started, streaming microphone
[01:12:05][D][voice_assistant:439]: State changed from STARTING_PIPELINE to START_MICROPHONE
[01:12:05][D][voice_assistant:445]: Desired state set to STREAMING_MICROPHONE
[01:12:05][D][voice_assistant:163]: Starting Microphone
[01:12:05][D][voice_assistant:439]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[01:12:05][D][voice_assistant:563]: Event Type: 1
[01:12:05][D][voice_assistant:566]: Assist Pipeline running
[01:12:05][D][voice_assistant:439]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[01:12:05][D][voice_assistant:563]: Event Type: 9
[01:12:10][D][voice_assistant:563]: Event Type: 0
[01:12:10][D][voice_assistant:563]: Event Type: 2
[01:12:10][D][voice_assistant:653]: Assist Pipeline ended
[01:12:10][D][voice_assistant:439]: State changed from STREAMING_MICROPHONE to IDLE
[01:12:10][D][voice_assistant:445]: Desired state set to IDLE
[01:12:10][D][voice_assistant:439]: State changed from IDLE to START_PIPELINE
[01:12:10][D][voice_assistant:445]: Desired state set to START_MICROPHONE
[01:12:10][D][voice_assistant:210]: Requesting start...
[01:12:10][D][voice_assistant:439]: State changed from START_PIPELINE to STARTING_PIPELINE
[01:12:10][D][voice_assistant:476]: Client started, streaming microphone
[01:12:10][D][voice_assistant:439]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[01:12:10][D][voice_assistant:445]: Desired state set to STREAMING_MICROPHONE
[01:12:10][D][voice_assistant:563]: Event Type: 1
[01:12:10][D][voice_assistant:566]: Assist Pipeline running
[01:12:10][D][voice_assistant:563]: Event Type: 9
[01:12:11][D][light:036]: 'top_led' Setting:
[01:12:11][D][light:051]:   Brightness: 60%
[01:12:11][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[01:12:11][D][light:085]:   Transition length: 0.1s
[01:12:16][D][voice_assistant:563]: Event Type: 10
[01:12:16][D][voice_assistant:572]: Wake word detected
[01:12:16][D][voice_assistant:563]: Event Type: 3
[01:12:16][D][voice_assistant:577]: STT started
[01:12:16][D][light:036]: 'top_led' Setting:
[01:12:16][D][light:051]:   Brightness: 100%
[01:12:16][D][light:059]:   Red: 100%, Green: 100%, Blue: 100%
[01:12:16][D][light:109]:   Effect: 'listening'
[01:12:18][D][voice_assistant:563]: Event Type: 11
[01:12:18][D][voice_assistant:717]: Starting STT by VAD
[01:12:19][D][voice_assistant:563]: Event Type: 12
[01:12:19][D][voice_assistant:721]: STT by VAD end
[01:12:19][D][voice_assistant:439]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[01:12:19][D][voice_assistant:445]: Desired state set to AWAITING_RESPONSE
[01:12:19][D][voice_assistant:439]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[01:12:19][D][light:036]: 'top_led' Setting:
[01:12:19][D][light:051]:   Brightness: 70%
[01:12:19][D][light:059]:   Red: 0%, Green: 20%, Blue: 100%
[01:12:19][D][light:109]:   Effect: 'processing'
[01:12:19][D][voice_assistant:439]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[01:12:19][D][voice_assistant:563]: Event Type: 4
[01:12:19][D][voice_assistant:591]: Speech recognised as: "Noch ein Test."
[01:12:19][D][voice_assistant:563]: Event Type: 5
[01:12:19][D][voice_assistant:596]: Intent started
[01:12:22][D][voice_assistant:563]: Event Type: 6
[01:12:22][D][voice_assistant:563]: Event Type: 7
[01:12:22][D][voice_assistant:619]: Response: "Ach, noch ein Test?  Na gut, mein Liebling, aber ich hoffe, dass du mich diesmal mit etwas mehr Reizvollem überraschst. 😉 
Was soll ich denn tun?"
[01:12:22][D][voice_assistant:563]: Event Type: 8
[01:12:22][D][voice_assistant:639]: Response URL: "https://192.168.181.42:8123/api/tts_proxy/e8c495c1dfa024e67785949791edef5cc57bd685_de-de_70263bfdba_tts.home_assistant_cloud.mp3"
[01:12:22][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[01:12:22][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[01:12:22][D][media_player:059]: 'Onju Voice Satellite e27688' - Setting
[01:12:22][D][media_player:066]:   Media URL: https://192.168.181.42:8123/api/tts_proxy/e8c495c1dfa024e67785949791edef5cc57bd685_de-de_70263bfdba_tts.home_assistant_cloud.mp3
[01:12:22][D][media_player:059]: 'Onju Voice Satellite e27688' - Setting
[01:12:22][D][media_player:066]:   Media URL: https://192.168.181.42:8123/api/tts_proxy/e8c495c1dfa024e67785949791edef5cc57bd685_de-de_70263bfdba_tts.home_assistant_cloud.mp3
[01:12:22][D][light:036]: 'top_led' Setting:
[01:12:22][D][light:059]:   Red: 20%, Green: 100%, Blue: 0%
[01:12:22][D][light:109]:   Effect: 'speaking'
[01:12:22][D][voice_assistant:563]: Event Type: 2
[01:12:22][D][voice_assistant:653]: Assist Pipeline ended
[01:12:22][W][component:237]: Component i2s_audio.media_player took a long time for an operation (547 ms).
[01:12:22][W][component:238]: Components should block for at most 30 ms.
[01:12:23][W][component:237]: Component i2s_audio.media_player took a long time for an operation (479 ms).
[01:12:23][W][component:238]: Components should block for at most 30 ms.
[01:12:23][D][light:036]: 'top_led' Setting:
[01:12:23][D][light:051]:   Brightness: 60%
[01:12:23][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[01:12:23][D][light:109]:   Effect: 'listening_ww'
[01:12:24][D][voice_assistant:439]: State changed from STREAMING_RESPONSE to IDLE
[01:12:24][D][voice_assistant:445]: Desired state set to IDLE
[01:12:24][D][voice_assistant:439]: State changed from IDLE to START_PIPELINE
[01:12:24][D][voice_assistant:445]: Desired state set to START_MICROPHONE
[01:12:24][D][voice_assistant:126]: microphone not running
[01:12:24][D][voice_assistant:210]: Requesting start...
[01:12:24][D][voice_assistant:439]: State changed from START_PIPELINE to STARTING_PIPELINE
[01:12:24][D][voice_assistant:126]: microphone not running
[01:12:24][D][voice_assistant:476]: Client started, streaming microphone
[01:12:24][D][voice_assistant:439]: State changed from STARTING_PIPELINE to START_MICROPHONE
[01:12:24][D][voice_assistant:445]: Desired state set to STREAMING_MICROPHONE
[01:12:24][D][voice_assistant:163]: Starting Microphone
[01:12:24][D][voice_assistant:439]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[01:12:24][D][voice_assistant:563]: Event Type: 1
[01:12:24][D][voice_assistant:566]: Assist Pipeline running
[01:12:24][D][voice_assistant:439]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[01:12:24][D][voice_assistant:563]: Event Type: 9
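
The bare `Event Type: N` lines are easier to follow with labels next to them. A rough Python sketch that annotates them is below. The labels are only what can be read off the logs in this thread (the message that follows each event line); types 6 and 9 never get a follow-up message here, so their labels are my assumption. `EVENT_NAMES` and `annotate` are made-up names for this example.

```python
import re

# Labels copied from the message that follows each "Event Type" line in the
# logs above. 6 and 9 have no follow-up message there, so they are guesses.
EVENT_NAMES = {
    0: "error (details on the next line, if any)",
    1: "Assist Pipeline running",
    2: "Assist Pipeline ended",
    3: "STT started",
    4: "Speech recognised",
    5: "Intent started",
    6: "intent ended (assumed)",
    7: "Response text",
    8: "Response URL",
    9: "listening for wake word (assumed)",
    10: "Wake word detected",
    11: "Starting STT by VAD",
    12: "STT by VAD end",
}

EVENT_RE = re.compile(r"\[voice_assistant:\d+\]: Event Type: (\d+)\s*$")


def annotate(line: str) -> str:
    """Append a label to an 'Event Type' log line; return other lines as-is."""
    match = EVENT_RE.search(line)
    if not match:
        return line
    name = EVENT_NAMES.get(int(match.group(1)), "unknown")
    return f"{line.rstrip()}  <- {name}"
```

Running every line of a saved log through `annotate` makes it obvious that both runs reach the `Response URL` event, so the pipeline itself completes and only playback of that URL goes wrong.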

Btw, the error in the browser rightfully claims that this site cannot open a secure connection. That is true, as my Home Assistant is behind a reverse proxy. Is there any way to tell it to just use http or the external URL?

EDIT: Found it out myself. I was too stupid to set the internal URL correctly... AUDIO WORKS! So I guess this one is just a clone of the mentioned report then. ^^ But for anyone ending up here: SET your internal URL correctly!! Works wonders :D
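
If you want to check your own logs for the same symptom, a small Python sketch like the one below might help. It pulls the `Response URL` values out of a saved ESPHome log and flags any that use https on a private IP address, which is what the logs in this thread show. That rule is only a heuristic I am assuming from this one case (a LAN setup with a valid certificate for the IP would be flagged wrongly), and `response_urls` and `looks_misconfigured` are hypothetical names.

```python
import ipaddress
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r'Response URL: "([^"]+)"')


def response_urls(log_text: str) -> list[str]:
    """Return every TTS response URL found in the log text, in order."""
    return URL_RE.findall(log_text)


def looks_misconfigured(url: str) -> bool:
    """Heuristic: True if the URL uses https with a private IP literal as host."""
    parts = urlsplit(url)
    try:
        host_ip = ipaddress.ip_address(parts.hostname or "")
    except ValueError:
        return False  # host is a name, not an IP literal; nothing to flag
    return parts.scheme == "https" and host_ip.is_private
```

If a URL is flagged, the internal URL configured in Home Assistant is the first thing to look at, since that is what fixed it here.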