tetele / onju-voice-satellite

An ESPHome config for the Onju Voice which makes it a Home Assistant voice satellite
MIT License
72 stars, 10 forks

Wake word detected only once. #51

Open nagydavid opened 2 weeks ago

nagydavid commented 2 weeks ago

Flavor

MicroWakeWord

Describe the issue

After installing the latest update, wake word detection works only once after boot, and then stops responding.

Reproduction steps

Install the MicroWakeWord flavor.

Debug logs

INFO ESPHome 2024.5.5
INFO Reading configuration /config/esphome/voice-assist-living-room.yaml...
INFO Starting log output from 192.168.50.36 using esphome API
INFO Successfully connected to voice-assist-living-room @ 192.168.50.36 in 0.056s
INFO Successful handshake with voice-assist-living-room @ 192.168.50.36 in 0.056s
[21:50:43][I][app:100]: ESPHome version 2024.5.5 compiled on Jun  9 2024, 20:24:15
[21:50:43][I][app:102]: Project tetele.onju_voice_satellite version 1.0.0
[21:50:43][C][wifi:580]: WiFi:
[21:50:43][C][wifi:408]:   Local MAC: 68:B6:B3:20:C0:74
[21:50:43][C][wifi:413]:   SSID: [redacted]
[21:50:43][C][wifi:416]:   IP Address: 192.168.50.36
[21:50:43][C][wifi:420]:   BSSID: [redacted]
[21:50:43][C][wifi:421]:   Hostname: 'voice-assist-living-room'
[21:50:43][C][wifi:423]:   Signal strength: -53 dB ▂▄▆█
[21:50:43][C][wifi:427]:   Channel: 8
[21:50:43][C][wifi:428]:   Subnet: 255.255.255.0
[21:50:43][C][wifi:429]:   Gateway: 192.168.50.1
[21:50:43][C][wifi:430]:   DNS1: 0.0.0.0
[21:50:43][C][wifi:431]:   DNS2: 0.0.0.0
[21:50:43][C][logger:185]: Logger:
[21:50:43][C][logger:186]:   Level: DEBUG
[21:50:43][C][logger:188]:   Log Baud Rate: 115200
[21:50:43][C][logger:189]:   Hardware UART: USB_CDC
[21:50:43][C][template.number:050]: Template Number 'Touch threshold percentage'
[21:50:43][C][template.number:051]:   Optimistic: YES
[21:50:43][C][template.number:052]:   Update Interval: never
[21:50:43][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[21:50:43][C][esp32_rmt_led_strip:176]:   Pin: 11
[21:50:43][C][esp32_rmt_led_strip:177]:   Channel: 0
[21:50:43][C][esp32_rmt_led_strip:202]:   RGB Order: GRB
[21:50:43][C][esp32_rmt_led_strip:203]:   Max refresh rate: 0
[21:50:43][C][esp32_rmt_led_strip:204]:   Number of LEDs: 6
[21:50:43][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Disable wake word'
[21:50:43][C][gpio.binary_sensor:016]:   Pin: GPIO38
[21:50:43][C][light:103]: Light 'leds'
[21:50:43][C][light:105]:   Default Transition Length: 0.0s
[21:50:43][C][light:106]:   Gamma Correct: 2.80
[21:50:43][C][light:103]: Light 'left_led'
[21:50:43][C][light:105]:   Default Transition Length: 0.1s
[21:50:43][C][light:106]:   Gamma Correct: 2.80
[21:50:43][C][light:103]: Light 'top_led'
[21:50:43][C][light:105]:   Default Transition Length: 0.1s
[21:50:43][C][light:106]:   Gamma Correct: 2.80
[21:50:43][C][light:103]: Light 'right_led'
[21:50:43][C][light:105]:   Default Transition Length: 0.1s
[21:50:43][C][light:106]:   Gamma Correct: 2.80
[21:50:43][C][template.switch:068]: Template Switch 'Use Wake Word'
[21:50:43][C][template.switch:091]:   Restore Mode: restore defaults to ON
[21:50:43][C][template.switch:057]:   Optimistic: YES
[21:50:43][C][psram:020]: PSRAM:
[21:50:43][C][psram:021]:   Available: YES
[21:50:43][C][psram:024]:   Size: 8191 KB
[21:50:43][C][esp32_touch:073]: Config for ESP32 Touch Hub:
[21:50:43][C][esp32_touch:074]:   Meas cycle: 0.80ms
[21:50:43][C][esp32_touch:075]:   Sleep cycle: 2.00ms
[21:50:43][C][esp32_touch:095]:   Low Voltage Reference: 0.8V
[21:50:43][C][esp32_touch:115]:   High Voltage Reference: 2.4V
[21:50:43][C][esp32_touch:135]:   Voltage Attenuation: 0V
[21:50:43][C][esp32_touch:169]:   Filter mode: IIR_16
[21:50:43][C][esp32_touch:170]:   Debounce count: 2
[21:50:43][C][esp32_touch:171]:   Noise threshold coefficient: 0
[21:50:43][C][esp32_touch:172]:   Jitter filter step size: 0
[21:50:43][C][esp32_touch:191]:   Smooth level: IIR_2
[21:50:43][C][esp32_touch:213]:   Denoise grade: BIT8
[21:50:43][C][esp32_touch:245]:   Denoise capacitance level: L0
[21:50:43][C][esp32_touch:260]:   Touch Pad 'volume_down'
[21:50:43][C][esp32_touch:261]:     Pad: T4
[21:50:43][C][esp32_touch:262]:     Threshold: 383719
[21:50:43][C][esp32_touch:260]:   Touch Pad 'volume_up'
[21:50:43][C][esp32_touch:261]:     Pad: T2
[21:50:43][C][esp32_touch:262]:     Threshold: 476477
[21:50:43][C][esp32_touch:260]:   Touch Pad 'action'
[21:50:43][C][esp32_touch:261]:     Pad: T3
[21:50:43][C][esp32_touch:262]:     Threshold: 600358
[21:50:43][C][captive_portal:088]: Captive Portal:
[21:50:43][C][mdns:115]: mDNS:
[21:50:43][C][mdns:116]:   Hostname: voice-assist-living-room
[21:50:43][C][ota:096]: Over-The-Air Updates:
[21:50:43][C][ota:097]:   Address: 192.168.50.36:3232
[21:50:43][C][ota:100]:   Using Password.
[21:50:43][C][ota:103]:   OTA version: 2.
[21:50:43][C][api:139]: API Server:
[21:50:43][C][api:140]:   Address: 192.168.50.36:6053
[21:50:43][C][api:142]:   Using noise encryption: YES
[21:50:43][C][improv_serial:032]: Improv Serial:
[21:50:43][C][audio:214]: Audio:
[21:50:43][C][audio:236]:   External DAC channels: 1
[21:50:43][C][audio:237]:   I2S DOUT Pin: 12
[21:50:43][C][audio:238]:   Mute Pin: GPIO21
[21:50:44][D][light:036]: 'top_led' Setting:
[21:50:44][D][light:051]:   Brightness: 60%
[21:50:44][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[21:50:44][D][light:109]:   Effect: 'listening_ww'
[21:50:50][D][api:102]: Accepted 192.168.50.188
[21:50:50][D][api.connection:1321]: Home Assistant 2024.6.1 (192.168.50.188): Connected successfully
[21:50:50][D][voice_assistant:502]: State changed from IDLE to START_MICROPHONE
[21:50:50][D][voice_assistant:508]: Desired state set to START_PIPELINE
[21:50:50][D][voice_assistant:220]: Starting Microphone
[21:50:50][D][ring_buffer:024]: Created ring buffer with size 16384
[21:50:50][D][voice_assistant:502]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[21:50:50][D][voice_assistant:502]: State changed from STARTING_MICROPHONE to START_PIPELINE
[21:50:50][D][voice_assistant:274]: Requesting start...
[21:50:50][D][voice_assistant:502]: State changed from START_PIPELINE to STARTING_PIPELINE
[21:50:50][D][voice_assistant:523]: Client started, streaming microphone
[21:50:50][D][voice_assistant:502]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[21:50:50][D][voice_assistant:508]: Desired state set to STREAMING_MICROPHONE
[21:50:50][D][voice_assistant:625]: Event Type: 1
[21:50:50][D][voice_assistant:628]: Assist Pipeline running
[21:50:50][D][voice_assistant:625]: Event Type: 9
[21:50:55][D][voice_assistant:625]: Event Type: 0
[21:50:55][D][voice_assistant:625]: Event Type: 2
[21:50:55][D][voice_assistant:715]: Assist Pipeline ended
[21:50:55][D][voice_assistant:502]: State changed from STREAMING_MICROPHONE to IDLE
[21:50:55][D][voice_assistant:508]: Desired state set to IDLE
[21:50:55][D][voice_assistant:502]: State changed from IDLE to START_MICROPHONE
[21:50:55][D][voice_assistant:508]: Desired state set to START_PIPELINE
[21:50:55][D][voice_assistant:220]: Starting Microphone
[21:50:55][D][voice_assistant:502]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[21:50:55][D][voice_assistant:502]: State changed from STARTING_MICROPHONE to START_PIPELINE
[21:50:55][D][voice_assistant:274]: Requesting start...
[21:50:55][D][voice_assistant:502]: State changed from START_PIPELINE to STARTING_PIPELINE
[21:50:55][D][voice_assistant:523]: Client started, streaming microphone
[21:50:55][D][voice_assistant:502]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[21:50:55][D][voice_assistant:508]: Desired state set to STREAMING_MICROPHONE
[21:50:55][D][voice_assistant:625]: Event Type: 1
[21:50:55][D][voice_assistant:628]: Assist Pipeline running
[21:50:55][D][voice_assistant:625]: Event Type: 9
[21:50:55][D][light:036]: 'top_led' Setting:
[21:50:55][D][light:051]:   Brightness: 60%
[21:50:55][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[21:50:55][D][light:085]:   Transition length: 0.1s
[21:50:56][D][voice_assistant:625]: Event Type: 10
[21:50:56][D][voice_assistant:634]: Wake word detected
[21:50:56][D][voice_assistant:625]: Event Type: 3
[21:50:56][D][voice_assistant:639]: STT started
[21:50:56][D][light:036]: 'top_led' Setting:
[21:50:56][D][light:051]:   Brightness: 100%
[21:50:56][D][light:059]:   Red: 100%, Green: 100%, Blue: 100%
[21:50:56][D][light:109]:   Effect: 'listening'
[21:50:59][D][voice_assistant:625]: Event Type: 11
[21:50:59][D][voice_assistant:779]: Starting STT by VAD
[21:51:01][D][voice_assistant:625]: Event Type: 12
[21:51:01][D][voice_assistant:783]: STT by VAD end
[21:51:01][D][voice_assistant:502]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[21:51:01][D][voice_assistant:508]: Desired state set to AWAITING_RESPONSE
[21:51:01][D][voice_assistant:502]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[21:51:01][D][light:036]: 'top_led' Setting:
[21:51:01][D][light:051]:   Brightness: 70%
[21:51:01][D][light:059]:   Red: 0%, Green: 20%, Blue: 100%
[21:51:01][D][light:109]:   Effect: 'processing'
[21:51:01][D][voice_assistant:502]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[21:51:03][D][voice_assistant:625]: Event Type: 4
[21:51:03][D][voice_assistant:653]: Speech recognised as: " Turn off the light in the kitchen."
[21:51:03][D][voice_assistant:625]: Event Type: 5
[21:51:03][D][voice_assistant:658]: Intent started
[21:51:03][D][voice_assistant:625]: Event Type: 6
[21:51:03][D][voice_assistant:625]: Event Type: 7
[21:51:03][D][voice_assistant:681]: Response: "Turned off the lights"
[21:51:03][D][voice_assistant:625]: Event Type: 8
[21:51:03][D][voice_assistant:701]: Response URL: "http://192.168.50.188:8123/api/tts_proxy/85d43b448ab715eae17c0361864a34ff749eb14a_en-us_4e62813ccd_tts.piper.mp3"
[21:51:03][D][voice_assistant:502]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[21:51:03][D][voice_assistant:508]: Desired state set to STREAMING_RESPONSE
[21:51:03][D][media_player:061]: 'Voice Assist Living Room' - Setting
[21:51:03][D][media_player:068]:   Media URL: http://192.168.50.188:8123/api/tts_proxy/85d43b448ab715eae17c0361864a34ff749eb14a_en-us_4e62813ccd_tts.piper.mp3
[21:51:03][D][media_player:074]:  Announcement: yes
[21:51:03][D][media_player:061]: 'Voice Assist Living Room' - Setting
[21:51:03][D][media_player:068]:   Media URL: http://192.168.50.188:8123/api/tts_proxy/85d43b448ab715eae17c0361864a34ff749eb14a_en-us_4e62813ccd_tts.piper.mp3
[21:51:03][D][light:036]: 'top_led' Setting:
[21:51:03][D][light:059]:   Red: 20%, Green: 100%, Blue: 0%
[21:51:03][D][light:109]:   Effect: 'speaking'
[21:51:03][D][voice_assistant:625]: Event Type: 2
[21:51:03][D][voice_assistant:715]: Assist Pipeline ended
[21:51:04][W][component:237]: Component i2s_audio.media_player took a long time for an operation (524 ms).
[21:51:04][W][component:238]: Components should block for at most 30 ms.
[21:51:06][W][component:237]: Component i2s_audio.media_player took a long time for an operation (371 ms).
[21:51:06][W][component:238]: Components should block for at most 30 ms.
[21:51:06][W][component:237]: Component i2s_audio.media_player took a long time for an operation (369 ms).
[21:51:06][W][component:238]: Components should block for at most 30 ms.
[21:51:06][D][light:036]: 'top_led' Setting:
[21:51:06][D][light:051]:   Brightness: 60%
[21:51:06][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[21:51:06][D][light:109]:   Effect: 'listening_ww'
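
For anyone comparing logs, here is a minimal sketch for pulling the `voice_assistant` state transitions out of a log like the one above. It is not part of ESPHome or this repo; the function names `state_transitions` and `last_state` are made up for illustration, and the regex assumes the `State changed from X to Y` line format seen in this log. In the excerpt above, the last recorded transition is to `STREAMING_RESPONSE`, and no transition back to `IDLE` appears after the second "Assist Pipeline ended", unlike the first cycle at 21:50:55.

```python
import re

# Matches ESPHome voice_assistant debug lines such as:
# [21:50:55][D][voice_assistant:502]: State changed from STREAMING_MICROPHONE to IDLE
# (format assumed from the log in this issue)
STATE_CHANGE = re.compile(
    r"\[(?P<time>\d{2}:\d{2}:\d{2})\]\[D\]\[voice_assistant:\d+\]: "
    r"State changed from (?P<old>\w+) to (?P<new>\w+)"
)


def state_transitions(log_text):
    """Return (time, old_state, new_state) tuples in log order."""
    return [
        (m.group("time"), m.group("old"), m.group("new"))
        for m in STATE_CHANGE.finditer(log_text)
    ]


def last_state(log_text):
    """Return the state named in the final transition, or None if there is none."""
    transitions = state_transitions(log_text)
    return transitions[-1][2] if transitions else None


# A few lines copied from the log above.
sample = """\
[21:51:01][D][voice_assistant:502]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[21:51:03][D][voice_assistant:502]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[21:51:03][D][voice_assistant:715]: Assist Pipeline ended
"""

print(last_state(sample))
```

Running it over a full saved log (for example the output of `esphome logs`) shows whether the component ever returns to `IDLE` after the response is played.
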
tetele commented 2 weeks ago

Seems like a duplicate of #46. Please reopen if that's not the case.

nagydavid commented 2 weeks ago

Hi @tetele, I don't think it is the same problem. After the first wake word is detected and the command is executed, nothing works once the warning appears. The same happens if the wake word is disabled and I trigger the pipeline by touch.

tetele commented 2 weeks ago

Can you please try this fix and see if it works?

nagydavid commented 2 weeks ago

I have tried it, also with cleaned build files. I reinstalled openWakeWord, and the same thing happens. After the first wake word detection, I get this warning:

[13:06:05][W][component:237]: Component i2s_audio.media_player took a long time for an operation (525 ms).
[13:06:05][W][component:238]: Components should block for at most 30 ms.
[13:06:07][W][component:237]: Component i2s_audio.media_player took a long time for an operation (371 ms).
[13:06:07][W][component:238]: Components should block for at most 30 ms.
[13:06:08][W][component:237]: Component i2s_audio.media_player took a long time for an operation (370 ms).
[13:06:08][W][component:238]: Components should block for at most 30 ms.
[13:06:08][D][light:036]: 'top_led' Setting:
[13:06:08][D][light:051]:   Brightness: 60%
[13:06:08][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[13:06:08][D][light:109]:   Effect: 'listening_ww'

and nothing happens.

nagydavid commented 2 weeks ago

I have reverted to ESPHome 2023.4.2. The openWakeWord pipeline works; MicroWakeWord does not.