Closed ther3zz closed 5 months ago
3. see error in logs
What error? I don't see any error in the logs.
Also, please provide the full logs. Power cycle your device, reproduce the issue and then copy and paste the ESPHome logs from the device boot up to that point.
Here are the full logs:
INFO Successful handshake with office-onju-2a44d8 @ 192.168.2. in 0.058s
[17:08:48][I][app:102]: ESPHome version 2024.3.2 compiled on Apr 15 2024, 16:49:05
[17:08:48][I][app:104]: Project tetele.onju_voice_satellite version 1.0.0
[17:08:48][C][wifi:580]: WiFi:
[17:08:48][C][wifi:408]: Local MAC: DC
[17:08:48][C][wifi:413]: SSID: [redacted]
[17:08:48][C][wifi:416]: IP Address: 192.168.2.
[17:08:48][C][wifi:420]: BSSID: [redacted]
[17:08:48][C][wifi:421]: Hostname: 'office-onju-2a44d8'
[17:08:48][C][wifi:423]: Signal strength: -48 dB ▂▄▆█
[17:08:48][C][wifi:427]: Channel: 11
[17:08:48][C][wifi:428]: Subnet: 255.255.255.0
[17:08:48][C][wifi:429]: Gateway: 192.168.2.1
[17:08:48][C][wifi:430]: DNS1: 0.0.0.0
[17:08:48][C][wifi:431]: DNS2: 0.0.0.0
[17:08:48][C][logger:166]: Logger:
[17:08:48][C][logger:167]: Level: DEBUG
[17:08:48][C][logger:169]: Log Baud Rate: 115200
[17:08:48][C][logger:170]: Hardware UART: USB_CDC
[17:08:48][C][template.number:050]: Template Number 'Touch threshold percentage'
[17:08:48][C][template.number:051]: Optimistic: YES
[17:08:48][C][template.number:052]: Update Interval: never
[17:08:48][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[17:08:48][C][esp32_rmt_led_strip:176]: Pin: 11
[17:08:48][C][esp32_rmt_led_strip:177]: Channel: 0
[17:08:48][C][esp32_rmt_led_strip:202]: RGB Order: GRB
[17:08:48][C][esp32_rmt_led_strip:203]: Max refresh rate: 0
[17:08:48][C][esp32_rmt_led_strip:204]: Number of LEDs: 6
[17:08:48][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Disable wake word'
[17:08:48][C][gpio.binary_sensor:016]: Pin: GPIO38
[17:08:49][C][light:103]: Light 'leds'
[17:08:49][C][light:105]: Default Transition Length: 0.0s
[17:08:49][C][light:106]: Gamma Correct: 2.80
[17:08:49][C][light:103]: Light 'left_led'
[17:08:49][C][light:105]: Default Transition Length: 0.1s
[17:08:49][C][light:106]: Gamma Correct: 2.80
[17:08:49][C][light:103]: Light 'top_led'
[17:08:49][C][light:105]: Default Transition Length: 0.1s
[17:08:49][C][light:106]: Gamma Correct: 2.80
[17:08:49][C][light:103]: Light 'right_led'
[17:08:49][C][light:105]: Default Transition Length: 0.1s
[17:08:49][C][light:106]: Gamma Correct: 2.80
[17:08:49][C][template.switch:068]: Template Switch 'Use Wake Word'
[17:08:49][C][template.switch:091]: Restore Mode: restore defaults to ON
[17:08:49][C][template.switch:057]: Optimistic: YES
[17:08:49][C][esp32_touch:073]: Config for ESP32 Touch Hub:
[17:08:49][C][esp32_touch:074]: Meas cycle: 0.80ms
[17:08:49][C][esp32_touch:075]: Sleep cycle: 2.00ms
[17:08:49][C][esp32_touch:095]: Low Voltage Reference: 0.8V
[17:08:49][C][esp32_touch:115]: High Voltage Reference: 2.4V
[17:08:49][C][esp32_touch:135]: Voltage Attenuation: 0V
[17:08:49][C][esp32_touch:169]: Filter mode: IIR_16
[17:08:49][C][esp32_touch:170]: Debounce count: 2
[17:08:49][C][esp32_touch:171]: Noise threshold coefficient: 0
[17:08:49][C][esp32_touch:172]: Jitter filter step size: 0
[17:08:49][C][esp32_touch:191]: Smooth level: IIR_2
[17:08:49][C][esp32_touch:213]: Denoise grade: BIT8
[17:08:49][C][esp32_touch:245]: Denoise capacitance level: L0
[17:08:49][C][esp32_touch:260]: Touch Pad 'volume_down'
[17:08:49][C][esp32_touch:261]: Pad: T4
[17:08:49][C][esp32_touch:262]: Threshold: 529174
[17:08:49][C][esp32_touch:260]: Touch Pad 'volume_up'
[17:08:49][C][esp32_touch:261]: Pad: T2
[17:08:49][C][esp32_touch:262]: Threshold: 501357
[17:08:49][C][esp32_touch:260]: Touch Pad 'action'
[17:08:49][C][esp32_touch:261]: Pad: T3
[17:08:49][C][esp32_touch:262]: Threshold: 690433
[17:08:49][C][captive_portal:088]: Captive Portal:
[17:08:49][C][mdns:115]: mDNS:
[17:08:49][C][mdns:116]: Hostname: office-onju-2a44d8
[17:08:49][C][ota:096]: Over-The-Air Updates:
[17:08:49][C][ota:097]: Address: 192.168.2. :3232
[17:08:49][C][ota:100]: Using Password.
[17:08:49][C][ota:103]: OTA version: 2.
[17:08:49][C][api:139]: API Server:
[17:08:49][C][api:140]: Address: 192.168.2.176:6053
[17:08:49][C][api:142]: Using noise encryption: YES
[17:08:49][C][improv_serial:032]: Improv Serial:
[17:08:49][C][audio:203]: Audio:
[17:08:49][C][audio:225]: External DAC channels: 1
[17:08:49][C][audio:226]: I2S DOUT Pin: 12
[17:08:49][C][audio:227]: Mute Pin: GPIO21
[17:08:49][D][voice_assistant:523]: Event Type: 0
[17:08:49][D][voice_assistant:523]: Event Type: 2
[17:08:49][D][voice_assistant:613]: Assist Pipeline ended
[17:08:49][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to IDLE
[17:08:49][D][voice_assistant:422]: Desired state set to IDLE
[17:08:49][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE
[17:08:49][D][voice_assistant:422]: Desired state set to START_MICROPHONE
[17:08:49][D][voice_assistant:202]: Requesting start...
[17:08:49][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE
[17:08:49][D][voice_assistant:437]: Client started, streaming microphone
[17:08:49][D][voice_assistant:416]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[17:08:49][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE
[17:08:49][D][voice_assistant:523]: Event Type: 1
[17:08:49][D][voice_assistant:526]: Assist Pipeline running
[17:08:49][D][voice_assistant:523]: Event Type: 9
[17:08:49][D][light:036]: 'top_led' Setting:
[17:08:49][D][light:051]: Brightness: 60%
[17:08:49][D][light:059]: Red: 100%, Green: 0%, Blue: 100%
[17:08:49][D][light:085]: Transition length: 0.1s
[17:08:51][D][voice_assistant:523]: Event Type: 10
[17:08:51][D][voice_assistant:532]: Wake word detected
[17:08:51][D][voice_assistant:523]: Event Type: 3
[17:08:51][D][voice_assistant:537]: STT started
[17:08:51][D][light:036]: 'top_led' Setting:
[17:08:51][D][light:051]: Brightness: 100%
[17:08:51][D][light:059]: Red: 100%, Green: 100%, Blue: 100%
[17:08:51][D][light:109]: Effect: 'listening'
[17:08:53][D][voice_assistant:523]: Event Type: 11
[17:08:53][D][voice_assistant:677]: Starting STT by VAD
[17:08:53][D][voice_assistant:523]: Event Type: 12
[17:08:53][D][voice_assistant:681]: STT by VAD end
[17:08:53][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[17:08:53][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE
[17:08:53][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[17:08:53][D][light:036]: 'top_led' Setting:
[17:08:53][D][light:051]: Brightness: 70%
[17:08:53][D][light:059]: Red: 0%, Green: 20%, Blue: 100%
[17:08:53][D][light:109]: Effect: 'processing'
[17:08:53][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[17:08:55][D][voice_assistant:523]: Event Type: 4
[17:08:55][D][voice_assistant:551]: Speech recognised as: " What time is it?"
[17:08:55][D][voice_assistant:523]: Event Type: 5
[17:08:55][D][voice_assistant:556]: Intent started
[17:08:55][D][voice_assistant:523]: Event Type: 6
[17:08:55][D][voice_assistant:523]: Event Type: 7
[17:08:55][D][voice_assistant:579]: Response: "Sorry, I couldn't understand that"
[17:08:55][D][voice_assistant:523]: Event Type: 8
[17:08:55][D][voice_assistant:599]: Response URL: "https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3"
[17:08:55][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[17:08:55][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE
[17:08:55][D][media_player:059]: 'Office Onju 2a44d8' - Setting
[17:08:55][D][media_player:066]: Media URL: https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3
[17:08:55][D][media_player:059]: 'Office Onju 2a44d8' - Setting
[17:08:55][D][media_player:066]: Media URL: https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3
[17:08:55][D][light:036]: 'top_led' Setting:
[17:08:55][D][light:059]: Red: 20%, Green: 100%, Blue: 0%
[17:08:55][D][light:109]: Effect: 'speaking'
[17:08:55][D][voice_assistant:523]: Event Type: 2
[17:08:55][D][voice_assistant:613]: Assist Pipeline ended
[17:08:56][W][component:232]: Component i2s_audio.media_player took a long time for an operation (521 ms).
[17:08:56][W][component:233]: Components should block for at most 30 ms.
[17:08:56][W][component:232]: Component i2s_audio.media_player took a long time for an operation (504 ms).
[17:08:56][W][component:233]: Components should block for at most 30 ms.
[17:08:56][D][light:036]: 'top_led' Setting:
[17:08:56][D][light:051]: Brightness: 60%
[17:08:56][D][light:059]: Red: 100%, Green: 0%, Blue: 100%
[17:08:56][D][light:109]: Effect: 'listening_ww'
[17:08:58][D][voice_assistant:416]: State changed from STREAMING_RESPONSE to IDLE
[17:08:58][D][voice_assistant:422]: Desired state set to IDLE
[17:08:58][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE
[17:08:58][D][voice_assistant:422]: Desired state set to START_MICROPHONE
[17:08:58][D][voice_assistant:118]: microphone not running
[17:08:58][D][voice_assistant:202]: Requesting start...
[17:08:58][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE
[17:08:58][D][voice_assistant:118]: microphone not running
[17:08:58][D][voice_assistant:437]: Client started, streaming microphone
[17:08:58][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE
[17:08:58][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE
[17:08:58][D][voice_assistant:155]: Starting Microphone
[17:08:58][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[17:08:58][D][voice_assistant:523]: Event Type: 1
[17:08:58][D][voice_assistant:526]: Assist Pipeline running
[17:08:58][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[17:08:58][D][voice_assistant:523]: Event Type: 9
The specific error I see popping up is:
[17:08:56][W][component:232]: Component i2s_audio.media_player took a long time for an operation (521 ms).
[17:08:56][W][component:233]: Components should block for at most 30 ms.
[17:08:56][W][component:232]: Component i2s_audio.media_player took a long time for an operation (504 ms).
[17:08:56][W][component:233]: Components should block for at most 30 ms.
It appears just after the media URLs are printed in the logs.
That's a common warning (not error), and I don't think it's the culprit.
Just making sure: are you certain you've properly inserted the speaker connector back into the PCB?
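For reference, if the repeated `component` warnings make the log hard to read, the ESPHome logger can raise the level for a single tag. This is a minimal sketch, assuming the tag is `component` as it appears in the log lines above. It only hides the warning and does nothing about the missing audio.

```yaml
logger:
  level: DEBUG
  logs:
    # Only show errors from the "component" tag, hiding the
    # "took a long time for an operation" warnings.
    component: ERROR
```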
Gotcha, that was the only thing that stood out to me in the logs...
Yeah I actually just opened it back up to confirm... It's plugged in correctly
Here's the config I tried
substitutions:
  name: "office-onju"
  friendly_name: "Office Onju"
  project_version: "1.0.0"
  device_description: "Onju Voice Satellite with ESPHome software and microWakeWord"
esphome:
  name: "${name}"
  friendly_name: "{$friendly_name}"
  comment: "${device_description}"
  name_add_mac_suffix: true
  project:
    name: tetele.onju_voice_satellite
    version: "${project_version}"
  min_version: 2024.3.0
  platformio_options:
    board_build.flash_mode: dio
    build_flags: "-DBOARD_HAS_PSRAM"
    board_build.arduino.memory_type: qio_opi
  on_boot:
    then:
      - light.turn_on:
          id: top_led
          effect: slow_pulse
          red: 100%
          green: 60%
          blue: 0%
      - wait_until:
          condition:
            wifi.connected:
      - light.turn_on:
          id: top_led
          effect: pulse
          red: 0%
          green: 100%
          blue: 0%
      - wait_until:
          condition:
            api.connected:
      - light.turn_on:
          id: top_led
          effect: none
          red: 0%
          green: 100%
          blue: 0%
      - delay: 1s
      - script.execute: reset_led
dashboard_import:
  package_import_url: github://tetele/onju-voice-satellite/esphome/onju-voice-microwakeword.yaml@main
esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf
psram:
  mode: octal
  speed: 80MHz
# Enable logging
logger:
# Allow OTA updates
ota:
  password: "some password"
# Allow provisioning Wi-Fi via serial
improv_serial:
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: 192.168.1.
    gateway: 192.168.2.1
    subnet: 255.255.255.0
  ap:
    ssid: "Office-Onju Fallback Hotspot"
    password: "some password"
# In combination with the `ap` this allows the user
# to provision wifi credentials to the device via WiFi AP.
captive_portal:
api:
  encryption:
    key: "some key"
  services:
    - service: start_va
      then:
        - voice_assistant.start
    - service: stop_va
      then:
        - voice_assistant.stop
    - service: notification_on
      then:
        - script.execute: turn_on_notification
    - service: notification_clear
      then:
        - script.execute: clear_notification
globals:
  - id: thresh_percent
    type: float
    initial_value: "0.03"
    restore_value: false
  - id: touch_calibration_values_left
    type: uint32_t[5]
    restore_value: false
  - id: touch_calibration_values_center
    type: uint32_t[5]
    restore_value: false
  - id: touch_calibration_values_right
    type: uint32_t[5]
    restore_value: false
  - id: notification
    type: bool
    restore_value: false
interval:
  - interval: 1s
    then:
      - script.execute:
          id: calibrate_touch
          button: 0
      - script.execute:
          id: calibrate_touch
          button: 1
      - script.execute:
          id: calibrate_touch
          button: 2
i2s_audio:
  - i2s_lrclk_pin: GPIO13
    i2s_bclk_pin: GPIO18
micro_wake_word:
  #model: okay_nabu
  model: hey_jarvis
  # model: alexa
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
speaker:
  - platform: i2s_audio
    id: onju_out
    dac_type: external
    i2s_dout_pin: GPIO12
microphone:
  - platform: i2s_audio
    id: onju_microphone
    i2s_din_pin: GPIO17
    adc_type: external
    pdm: false
voice_assistant:
  id: va
  microphone: onju_microphone
  speaker: onju_out
  use_wake_word: false
  on_listening:
    - light.turn_on:
        id: top_led
        blue: 100%
        red: 100%
        green: 100%
        brightness: 100%
        effect: listening
  on_stt_vad_end:
    - light.turn_on:
        id: top_led
        blue: 100%
        red: 0%
        green: 20%
        brightness: 70%
        effect: processing
  on_tts_end:
    - light.turn_on:
        id: top_led
        blue: 0%
        red: 20%
        green: 100%
        effect: speaking
  on_end:
    - delay: 500ms
    - wait_until:
        not:
          speaker.is_playing: onju_out
    - script.execute: reset_led
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - delay: 200ms
          - micro_wake_word.start
  on_client_connected:
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - micro_wake_word.start:
  on_client_disconnected:
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - voice_assistant.stop:
          - micro_wake_word.stop:
  on_error:
    - light.turn_on:
        id: top_led
        blue: 0%
        red: 100%
        green: 0%
        effect: none
    - delay: 1s
    - script.execute: reset_led
number:
  - platform: template
    name: "Touch threshold percentage"
    id: touch_threshold_percentage
    update_interval: never
    entity_category: config
    initial_value: 0.75
    min_value: 0.25
    max_value: 5
    step: 0.05
    optimistic: true
    on_value:
      then:
        - lambda: !lambda |-
            id(thresh_percent) = 0.01 * x;
esp32_touch:
  setup_mode: false
  sleep_duration: 2ms
  measurement_duration: 800us
  low_voltage_reference: 0.8V
  high_voltage_reference: 2.4V
  filter_mode: IIR_16
  debounce_count: 2
  noise_threshold: 0
  jitter_step: 0
  smooth_mode: IIR_2
  denoise_grade: BIT8
  denoise_cap_level: L0
binary_sensor:
  - platform: esp32_touch
    id: volume_down
    pin: GPIO4
    threshold: 539000
  - platform: esp32_touch
    id: volume_up
    pin: GPIO2
    threshold: 580000
  - platform: esp32_touch
    id: action
    pin: GPIO3
    threshold: 751000
    on_click:
      - if:
          condition:
            or:
              - switch.is_off: use_wake_word
              - binary_sensor.is_on: mute_switch
          then:
            - logger.log:
                tag: "action_click"
                format: "Voice assistant is running: %s"
                args: ['id(va).is_running() ? "yes" : "no"']
            - if:
                condition: speaker.is_playing
                then:
                  - speaker.stop
            - if:
                condition: voice_assistant.is_running
                then:
                  - voice_assistant.stop:
                else:
                  - voice_assistant.start:
          else:
            - logger.log:
                tag: "action_click"
                format: "Voice assistant was running with wake word detection enabled. Starting continuously"
            - if:
                condition: speaker.is_playing
                then:
                  - speaker.stop
            - voice_assistant.stop
            - delay: 1s
            - script.execute: reset_led
            - script.wait: reset_led
            - voice_assistant.start_continuous:
  - platform: gpio
    id: mute_switch
    pin:
      number: GPIO38
      mode: INPUT_PULLUP
    name: Disable wake word
    on_press:
      - script.execute: turn_off_wake_word
    on_release:
      - script.execute: turn_on_wake_word
light:
  - platform: esp32_rmt_led_strip
    id: leds
    pin: GPIO11
    chipset: SK6812
    num_leds: 6
    rgb_order: grb
    rmt_channel: 0
    default_transition_length: 0s
    gamma_correct: 2.8
  - platform: partition
    id: left_led
    segments:
      - id: leds
        from: 0
        to: 0
    default_transition_length: 100ms
  - platform: partition
    id: top_led
    segments:
      - id: leds
        from: 1
        to: 4
    default_transition_length: 100ms
    effects:
      - pulse:
          name: pulse
          transition_length: 250ms
          update_interval: 250ms
      - pulse:
          name: slow_pulse
          transition_length: 1s
          update_interval: 2s
      - addressable_twinkle:
          name: listening_ww
          twinkle_probability: 1%
      - addressable_twinkle:
          name: listening
          twinkle_probability: 45%
      - addressable_scan:
          name: processing
          move_interval: 80ms
      - addressable_flicker:
          name: speaking
          intensity: 35%
  - platform: partition
    id: right_led
    segments:
      - id: leds
        from: 5
        to: 5
    default_transition_length: 100ms
script:
  - id: reset_led
    then:
      - if:
          condition:
            - lambda: return id(notification);
          then:
            - light.turn_on:
                id: top_led
                blue: 100%
                red: 100%
                green: 0%
                brightness: 100%
                effect: slow_pulse
          else:
            - if:
                condition:
                  and:
                    - switch.is_on: use_wake_word
                    - binary_sensor.is_off: mute_switch
                then:
                  - light.turn_on:
                      id: top_led
                      blue: 100%
                      red: 100%
                      green: 0%
                      brightness: 60%
                      effect: listening_ww
                else:
                  - light.turn_off: top_led
  - id: turn_on_notification
    then:
      - lambda: id(notification) = true;
      - script.execute: reset_led
  - id: clear_notification
    then:
      - lambda: id(notification) = false;
      - script.execute: reset_led
  - id: turn_on_wake_word
    then:
      - if:
          condition:
            and:
              - binary_sensor.is_off: mute_switch
              - switch.is_on: use_wake_word
          then:
            - micro_wake_word.start
            - if:
                condition:
                  speaker.is_playing:
                then:
                  - speaker.stop:
            - script.execute: reset_led
          else:
            - logger.log:
                tag: "turn_on_wake_word"
                format: "Trying to start listening for wake word, but %s"
                args:
                  [
                    'id(mute_switch).state ? "mute switch is on" : "use wake word toggle is off"',
                  ]
                level: "INFO"
  - id: turn_off_wake_word
    then:
      - micro_wake_word.stop
      - script.execute: reset_led
  - id: calibrate_touch
    parameters:
      button: int
    then:
      - lambda: |-
          static uint8_t thresh_indices[3] = {0, 0, 0};
          static uint32_t sums[3] = {0, 0, 0};
          static uint8_t qsizes[3] = {0, 0, 0};
          static uint16_t consecutive_anomalies_per_button[3] = {0, 0, 0};
          uint32_t newval;
          uint32_t* calibration_values;
          switch(button) {
            case 0:
              newval = id(volume_down).get_value();
              calibration_values = id(touch_calibration_values_left);
              break;
            case 1:
              newval = id(action).get_value();
              calibration_values = id(touch_calibration_values_center);
              break;
            case 2:
              newval = id(volume_up).get_value();
              calibration_values = id(touch_calibration_values_right);
              break;
            default:
              ESP_LOGE("touch_calibration", "Invalid button ID (%d)", button);
              return;
          }
          if(newval == 0) return;
          //ESP_LOGD("touch_calibration", "[%d] qsize %d, sum %d, thresh_index %d, consecutive_anomalies %d", button, qsizes[button], sums[button], thresh_indices[button], consecutive_anomalies_per_button[button]);
          //ESP_LOGD("touch_calibration", "[%d] New value is %d", button, newval);
          if(qsizes[button] == 5) {
            float avg = float(sums[button])/float(qsizes[button]);
            if((fabs(float(newval)-avg)/avg) > id(thresh_percent)) {
              consecutive_anomalies_per_button[button]++;
              //ESP_LOGD("touch_calibration", "[%d] %d anomalies detected.", button, consecutive_anomalies_per_button[button]);
              if(consecutive_anomalies_per_button[button] < 10)
                return;
            }
          }
          //ESP_LOGD("touch_calibration", "[%d] Resetting consecutive anomalies counter.", button);
          consecutive_anomalies_per_button[button] = 0;
          if(qsizes[button] == 5) {
            //ESP_LOGD("touch_calibration", "[%d] Queue full, removing %d.", button, id(touch_calibration_values)[thresh_indices[button]]);
            sums[button] -= (uint32_t) *(calibration_values+thresh_indices[button]);// id(touch_calibration_values)[thresh_indices[button]];
            qsizes[button]--;
          }
          *(calibration_values+thresh_indices[button]) = newval;
          sums[button] += newval;
          qsizes[button]++;
          thresh_indices[button] = (thresh_indices[button] + 1) % 5;
          //ESP_LOGD("touch_calibration", "[%d] Average value is %d", button, sums[button]/qsizes[button]);
          uint32_t newthresh = uint32_t((sums[button]/qsizes[button]) * (1.0 + id(thresh_percent)));
          //ESP_LOGD("touch_calibration", "[%d] Setting threshold %d", button, newthresh);
          switch(button) {
            case 0:
              id(volume_down).set_threshold(newthresh);
              break;
            case 1:
              id(action).set_threshold(newthresh);
              break;
            case 2:
              id(volume_up).set_threshold(newthresh);
              break;
            default:
              ESP_LOGE("touch_calibration", "Invalid button ID (%d)", button);
              return;
          }
switch:
  - platform: template
    name: Use Wake Word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - script.execute: turn_on_wake_word
    on_turn_off:
      - script.execute: turn_off_wake_word
  - platform: gpio
    id: dac_mute
    restore_mode: ALWAYS_OFF
    pin:
      number: GPIO21
      inverted: True
And here's the current config I'm using:
packages:
  esphome.voice-assistant: github://tetele/onju-voice-satellite/esphome/onju-voice.yaml@main
esphome:
  name: office-onju
  friendly_name: Office Onju
#micro_wake_word:
#  model: hey_jarvis
#esp32:
#  board: esp32-s3-devkitc-1
#  framework:
#    type: arduino
# Enable logging
logger:
# Enable Home Assistant API
api:
  encryption:
    key: "some key"
ota:
  password: "some password"
wifi:
  ssid: "some wifi"
  password: "some password"
  #ssid: !secret wifi_ssid
  #password: !secret wifi_password
  #manual_ip:
  #  static_ip: 192.168.2.
  #  gateway: 192.168.2.1
  #  subnet: 255.255.255.0
  manual_ip:
    static_ip: 192.168.1.
    gateway: 192.168.1.1
    subnet: 255.255.255.0
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Office-Onju Fallback Hotspot"
    password: "some password"
captive_portal:
Hi,
Just got my boards in the mail and am having the same experience.
Have tried both the micro wake word branch and the normal.
With the micro wake word branch, no matter what I do, it never picks up the wake word; it only works on "touch".
Hoping the non-wake-word version would be more stable, I installed that. It does pick up the "ok nabu" wake word using the pipeline, and occasionally sound does come out.
When the TTS works I see:
[20:24:54][D][voice_assistant:579]: Response: "20:24 Mountain Time, sir."
[20:24:54][D][voice_assistant:523]: Event Type: 8
[20:24:54][D][voice_assistant:599]: Response URL: "http://10.19.15.100:8123/api/tts_proxy/23a1fb0f92188d42f4f8333babbd21b0fa62e1ce_en-gb_add2e9951e_tts.piper.mp3"
[20:24:54][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[20:24:54][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE
[20:24:54][D][media_player:059]: 'onju-office' - Setting
[20:24:54][D][media_player:066]: Media URL: http://10.19.15.100:8123/api/tts_proxy/23a1fb0f92188d42f4f8333babbd21b0fa62e1ce_en-gb_add2e9951e_tts.piper.mp3
[20:24:54][D][media_player:059]: 'onju-office' - Setting
[20:24:54][D][media_player:066]: Media URL: http://10.19.15.100:8123/api/tts_proxy/23a1fb0f92188d42f4f8333babbd21b0fa62e1ce_en-gb_add2e9951e_tts.piper.mp3
[20:24:54][D][light:036]: 'top_led' Setting:
[20:24:54][D][light:059]: Red: 20%, Green: 100%, Blue: 0%
[20:24:54][D][light:109]: Effect: 'speaking'
[20:24:54][D][voice_assistant:523]: Event Type: 2
[20:24:54][D][voice_assistant:613]: Assist Pipeline ended
[20:24:54][D][light:036]: 'top_led' Setting:
[20:24:54][D][light:109]: Effect: 'show_volume'
[20:24:54][W][component:232]: Component i2s_audio.media_player took a long time for an operation (549 ms).
[20:24:54][W][component:233]: Components should block for at most 30 ms.
[20:24:54][W][component:232]: Component i2s_audio.media_player took a long time for an operation (66 ms).
[20:24:54][W][component:233]: Components should block for at most 30 ms.
[20:24:55][W][component:232]: Component i2s_audio.media_player took a long time for an operation (56 ms).
[20:24:55][W][component:233]: Components should block for at most 30 ms.
[20:24:55][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:24:55][W][component:233]: Components should block for at most 30 ms.
[20:24:55][W][component:232]: Component i2s_audio.media_player took a long time for an operation (56 ms).
[20:24:55][W][component:233]: Components should block for at most 30 ms.
When it doesn't work I see:
[20:25:18][D][voice_assistant:579]: Response: "I'm quite well, thank you for asking! How can I assist you today?"
[20:25:18][D][voice_assistant:523]: Event Type: 8
[20:25:18][D][voice_assistant:599]: Response URL: "http://10.19.15.100:8123/api/tts_proxy/86d0bfd2e659d831fd2a25f256e719186a9ad4a0_en-gb_add2e9951e_tts.piper.mp3"
[20:25:18][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[20:25:18][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE
[20:25:18][D][media_player:059]: 'onju-office' - Setting
[20:25:18][D][media_player:066]: Media URL: http://10.19.15.100:8123/api/tts_proxy/86d0bfd2e659d831fd2a25f256e719186a9ad4a0_en-gb_add2e9951e_tts.piper.mp3
[20:25:18][D][media_player:059]: 'onju-office' - Setting
[20:25:18][D][media_player:066]: Media URL: http://10.19.15.100:8123/api/tts_proxy/86d0bfd2e659d831fd2a25f256e719186a9ad4a0_en-gb_add2e9951e_tts.piper.mp3
[20:25:18][D][light:036]: 'top_led' Setting:
[20:25:18][D][light:059]: Red: 20%, Green: 100%, Blue: 0%
[20:25:18][D][light:109]: Effect: 'speaking'
[20:25:18][D][voice_assistant:523]: Event Type: 2
[20:25:18][D][voice_assistant:613]: Assist Pipeline ended
[20:25:19][W][component:232]: Component i2s_audio.media_player took a long time for an operation (533 ms).
[20:25:19][W][component:233]: Components should block for at most 30 ms.
[20:25:19][W][component:232]: Component i2s_audio.media_player took a long time for an operation (66 ms).
[20:25:19][W][component:233]: Components should block for at most 30 ms.
[20:25:19][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:19][W][component:233]: Components should block for at most 30 ms.
[20:25:19][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:19][W][component:233]: Components should block for at most 30 ms.
[20:25:20][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:20][W][component:233]: Components should block for at most 30 ms.
[20:25:20][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:20][W][component:233]: Components should block for at most 30 ms.
[20:25:20][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:20][W][component:233]: Components should block for at most 30 ms.
[20:25:21][W][component:232]: Component i2s_audio.media_player took a long time for an operation (56 ms).
[20:25:21][W][component:233]: Components should block for at most 30 ms.
[20:25:21][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:21][W][component:233]: Components should block for at most 30 ms.
[20:25:21][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:21][W][component:233]: Components should block for at most 30 ms.
[20:25:21][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:21][W][component:233]: Components should block for at most 30 ms.
[20:25:22][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:22][W][component:233]: Components should block for at most 30 ms.
[20:25:22][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:22][W][component:233]: Components should block for at most 30 ms.
[20:25:22][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:22][W][component:233]: Components should block for at most 30 ms.
[20:25:23][W][component:232]: Component i2s_audio.media_player took a long time for an operation (57 ms).
[20:25:23][W][component:233]: Components should block for at most 30 ms.
[20:25:23][W][component:232]: Component i2s_audio.media_player took a long time for an operation (58 ms).
[20:25:23][W][component:233]: Components should block for at most 30 ms.
[20:25:24][W][component:232]: Component i2s_audio.media_player took a long time for an operation (511 ms).
[20:25:24][W][component:233]: Components should block for at most 30 ms.
[20:25:24][W][component:232]: Component i2s_audio.media_player took a long time for an operation (509 ms).
[20:25:24][W][component:233]: Components should block for at most 30 ms.
Looking at the YAML file (the non-wake-word one), this is the only obviously suspicious thing. Since I can get it to work sometimes, I am confident that the speaker works and is
No smoking gun, but for what it's worth, I edited the same YAML until it was just a media_player: no touch, no LEDs, no voice assistant, etc. Once it was in this state, I could finally stream a WAV file to it without error.
I will try to add the YAML back piece by piece until I figure out which part it does not like. For what it's worth, I had even taken the voice_assistant functionality out and left the media_player there with all of the peripherals, and it still refused to play media.
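For reference, a stripped-down config of the kind described above might look roughly like the following. This is a sketch, not the exact file that was tested: the pin numbers come from the config and logs earlier in this thread (LRCLK GPIO13, BCLK GPIO18, DOUT GPIO12, mute GPIO21), while the entity name, `mode: mono`, and the inverted mute pin are assumptions. The `esphome`, Wi-Fi, API and OTA sections are omitted.

```yaml
esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: arduino  # the i2s_audio media_player requires the Arduino framework

i2s_audio:
  - i2s_lrclk_pin: GPIO13
    i2s_bclk_pin: GPIO18

media_player:
  - platform: i2s_audio
    name: "Onju speaker test"  # hypothetical entity name
    dac_type: external
    i2s_dout_pin: GPIO12
    mode: mono                 # assumption: single speaker
    mute_pin:
      number: GPIO21
      inverted: true           # assumption, mirrors the dac_mute switch above
```

Adding touch, LEDs and the voice assistant back one block at a time from this baseline should show which section stops playback.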
@ther3zz why did you select the "openWakeWord" flavor when you opened the issue if you're using the microWakeWord config? The difference between the two is quite significant.
Also, how are you playing anything else apart from voice responses with the microWakeWord config? That version does not expose a media_player. Are you sure you're using the config you pasted?
@cowboyrushforth thanks for debugging this! I'm not sure it's the same issue, as you're also reporting the wake word not working. How confident are you in your network setup and the fact that your satellite has good WiFi signal?
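One way to check the signal strength over time is ESPHome's `wifi_signal` sensor, which reports the RSSI to Home Assistant. This is a minimal sketch; the entity name and update interval are illustrative assumptions.

```yaml
sensor:
  - platform: wifi_signal
    name: "Onju WiFi Signal"  # hypothetical entity name
    update_interval: 60s
```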
@tetele yes, I believe it's the same issue. Let's focus on the non-micro wake word config:
Later I will try to provide more details, try additional Onju boards, and isolate the issue further. I see that in the past there were non-wake-word versions that use "speaker" instead of media_player, so I will probably experiment with that too.
Apologies, I should have been clearer... I've tried both micro wake word and open wake word configs. I started with the micro wake word config and ran into the same issues described by @cowboyrushforth but since it was listed as beta, I switched to the open wake word config.
I'm also seeing the same issues with the open wake word config that @cowboyrushforth mentioned (wake word not picked up, no audio).
Regarding the wake word issue, it works one time and then stops. I either have to toggle the physical mute switch or toggle the "use wake word" switch in Home Assistant to get it to start listening again.
During this test I was using the wake word and it wasn't recognized until I toggled the physical mute switch (I don't see anything that stands out; the logs just don't show the wake word being triggered):
[08:26:56][D][light:036]: 'top_led' Setting:
[08:26:56][D][light:051]: Brightness: 60%
[08:26:56][D][light:059]: Red: 100%, Green: 0%, Blue: 100%
[08:26:56][D][light:109]: Effect: 'listening_ww'
[08:26:57][D][voice_assistant:416]: State changed from STREAMING_RESPONSE to IDLE
[08:26:57][D][voice_assistant:422]: Desired state set to IDLE
[08:26:57][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE
[08:26:57][D][voice_assistant:422]: Desired state set to START_MICROPHONE
[08:26:57][D][voice_assistant:118]: microphone not running
[08:26:57][D][voice_assistant:202]: Requesting start...
[08:26:57][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE
[08:26:57][D][voice_assistant:118]: microphone not running
[08:26:57][D][voice_assistant:437]: Client started, streaming microphone
[08:26:57][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE
[08:26:57][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE
[08:26:57][D][voice_assistant:155]: Starting Microphone
[08:26:57][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[08:26:57][D][voice_assistant:523]: Event Type: 1
[08:26:57][D][voice_assistant:526]: Assist Pipeline running
[08:26:57][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[08:26:57][D][voice_assistant:523]: Event Type: 9
[08:30:18][I][ota:117]: Boot seems successful, resetting boot loop counter.
[08:30:18][D][esp32.preferences:114]: Saving 1 preferences to flash...
[08:30:18][D][esp32.preferences:143]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
[08:30:45][D][binary_sensor:036]: 'Disable wake word': Sending state ON
[08:30:45][D][voice_assistant:516]: Signaling stop...
[08:30:45][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[08:30:45][D][voice_assistant:422]: Desired state set to IDLE
[08:30:45][D][light:036]: 'top_led' Setting:
[08:30:45][D][light:047]: State: OFF
[08:30:45][D][light:085]: Transition length: 0.1s
[08:30:45][D][light:091]: Effect: 'None'
[08:30:45][D][voice_assistant:523]: Event Type: 0
[08:30:45][E][voice_assistant:653]: Error: no_wake_word - No wake word detected
[08:30:45][D][voice_assistant:516]: Signaling stop...
[08:30:45][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOP_MICROPHONE
[08:30:45][D][voice_assistant:422]: Desired state set to IDLE
[08:30:45][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[08:30:45][D][light:036]: 'top_led' Setting:
[08:30:45][D][light:047]: State: ON
[08:30:45][D][light:059]: Red: 100%, Green: 0%, Blue: 0%
[08:30:45][D][light:085]: Transition length: 0.1s
[08:30:45][D][voice_assistant:523]: Event Type: 2
[08:30:45][D][voice_assistant:613]: Assist Pipeline ended
[08:30:45][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to IDLE
[08:30:45][D][light:036]: 'top_led' Setting:
[08:30:45][D][light:047]: State: OFF
[08:30:45][D][light:085]: Transition length: 0.1s
[08:30:46][D][light:036]: 'top_led' Setting:
[08:30:46][D][light:085]: Transition length: 0.1s
[08:30:50][D][binary_sensor:036]: 'Disable wake word': Sending state OFF
[08:30:50][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE
[08:30:50][D][voice_assistant:422]: Desired state set to START_MICROPHONE
[08:30:50][D][light:036]: 'top_led' Setting:
[08:30:50][D][light:047]: State: ON
[08:30:50][D][light:051]: Brightness: 60%
[08:30:50][D][light:059]: Red: 100%, Green: 0%, Blue: 100%
[08:30:50][D][light:109]: Effect: 'listening_ww'
[08:30:50][D][voice_assistant:118]: microphone not running
[08:30:50][D][voice_assistant:202]: Requesting start...
[08:30:50][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE
[08:30:50][D][voice_assistant:437]: Client started, streaming microphone
[08:30:50][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE
[08:30:50][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE
[08:30:50][D][voice_assistant:155]: Starting Microphone
[08:30:50][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[08:30:50][D][voice_assistant:523]: Event Type: 1
[08:30:50][D][voice_assistant:526]: Assist Pipeline running
[08:30:50][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[08:30:50][D][voice_assistant:523]: Event Type: 9
[08:30:53][D][voice_assistant:523]: Event Type: 10
[08:30:53][D][voice_assistant:532]: Wake word detected
[08:30:53][D][voice_assistant:523]: Event Type: 3
[08:30:53][D][voice_assistant:537]: STT started
[08:30:53][D][light:036]: 'top_led' Setting:
[08:30:53][D][light:051]: Brightness: 100%
[08:30:53][D][light:059]: Red: 100%, Green: 100%, Blue: 100%
[08:30:53][D][light:109]: Effect: 'listening'
[08:30:54][D][voice_assistant:523]: Event Type: 11
[08:30:54][D][voice_assistant:677]: Starting STT by VAD
[08:30:55][D][voice_assistant:523]: Event Type: 12
[08:30:55][D][voice_assistant:681]: STT by VAD end
[08:30:55][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[08:30:55][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE
[08:30:55][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[08:30:55][D][light:036]: 'top_led' Setting:
[08:30:55][D][light:051]: Brightness: 70%
[08:30:55][D][light:059]: Red: 0%, Green: 20%, Blue: 100%
[08:30:55][D][light:109]: Effect: 'processing'
[08:30:55][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[08:30:56][D][voice_assistant:523]: Event Type: 4
[08:30:56][D][voice_assistant:551]: Speech recognised as: " What time is it?"
[08:30:56][D][voice_assistant:523]: Event Type: 5
[08:30:56][D][voice_assistant:556]: Intent started
[08:30:56][D][voice_assistant:523]: Event Type: 6
[08:30:56][D][voice_assistant:523]: Event Type: 7
[08:30:56][D][voice_assistant:579]: Response: "Sorry, I couldn't understand that"
[08:30:56][D][voice_assistant:523]: Event Type: 8
[08:30:56][D][voice_assistant:599]: Response URL: "https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3"
[08:30:56][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[08:30:56][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE
[08:30:56][D][media_player:059]: 'Office Onju 2a44d8' - Setting
[08:30:56][D][media_player:066]: Media URL: https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3
[08:30:56][D][media_player:059]: 'Office Onju 2a44d8' - Setting
[08:30:56][D][media_player:066]: Media URL: https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3
[08:30:56][D][light:036]: 'top_led' Setting:
[08:30:56][D][light:059]: Red: 20%, Green: 100%, Blue: 0%
[08:30:56][D][light:109]: Effect: 'speaking'
[08:30:56][D][voice_assistant:523]: Event Type: 2
[08:30:57][D][voice_assistant:613]: Assist Pipeline ended
[08:30:57][W][component:232]: Component i2s_audio.media_player took a long time for an operation (522 ms).
[08:30:57][W][component:233]: Components should block for at most 30 ms.
[08:30:58][W][component:232]: Component i2s_audio.media_player took a long time for an operation (504 ms).
[08:30:58][W][component:233]: Components should block for at most 30 ms.
[08:30:58][D][light:036]: 'top_led' Setting:
[08:30:58][D][light:051]: Brightness: 60%
[08:30:58][D][light:059]: Red: 100%, Green: 0%, Blue: 100%
[08:30:58][D][light:109]: Effect: 'listening_ww'
[08:30:59][D][voice_assistant:416]: State changed from STREAMING_RESPONSE to IDLE
[08:30:59][D][voice_assistant:422]: Desired state set to IDLE
[08:30:59][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE
[08:30:59][D][voice_assistant:422]: Desired state set to START_MICROPHONE
[08:30:59][D][voice_assistant:118]: microphone not running
[08:30:59][D][voice_assistant:202]: Requesting start...
[08:30:59][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE
[08:30:59][D][voice_assistant:118]: microphone not running
[08:30:59][D][voice_assistant:118]: microphone not running
[08:30:59][D][voice_assistant:437]: Client started, streaming microphone
[08:30:59][D][voice_assistant:416]: State changed from STARTING_PIPELINE to START_MICROPHONE
[08:30:59][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE
[08:30:59][D][voice_assistant:155]: Starting Microphone
[08:30:59][D][voice_assistant:416]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[08:30:59][D][voice_assistant:523]: Event Type: 1
[08:30:59][D][voice_assistant:526]: Assist Pipeline running
[08:30:59][D][voice_assistant:416]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE
[08:30:59][D][voice_assistant:523]: Event Type: 9
After more hours of debugging than I would like to admit, I have come to a conclusion...
During my debugging and assembly I mostly had the unit fully assembled. But at various points, to triple- or quadruple-check things or to use a serial cable (for example, to go back and forth between arduino and esp-idf, which you can't do over OTA), I ran the unit in various states of assembly and disassembly.
At one point I started to think that perhaps I had a bad AC adapter, because I had a refurbished unit.
After more debugging I realized it works on the AC adapter OR on USB, BUT ONLY IF IT'S NOT ASSEMBLED.
Then I looked at the inside of a new unit (I bought a couple, and luckily one was brand new and one was refurbished off Amazon).
In the new unit the plate that covers the PCBA is PLASTIC. In the refurbished ones, the PLATE IS METAL.
Finally, my eureka moment: I re-assembled it with the plastic plate, or removed the metal plate, and it all works.
So in short: if you got a metal plate that holds the PCB down, just remove it and re-assemble. It must quietly short something in the audio circuitry. The reason I thought it was working with some code and not with other code had nothing to do with the code, and everything to do with the state of the physical assembly and whether that silly metal plate was installed.
Hope this helps someone!
Very interesting, thanks for taking the time to debug this @cowboyrushforth!
The only issue I was aware of regarding the metal plate is that it would sometimes make the unit not boot at all. Apparently, there's a bit of conductive foam that's causing all the headache.
However, if that is indeed the problem that @ther3zz is facing, I can add a note to the README.
Mine doesn't have that foam and the plate is plastic instead of metal. I also disassembled it and have the board and the speaker connected together (the board is still within the top part, though) and I'm still not hearing any playback. I tried powering it through USB and through the AC adapter and it still doesn't work.
Just to be thorough, you also have the secondary PCB connected (the one where the mute switch is), yes? Because if that is not connected, the device will be set to mute.
Also, do you have "V3" of the onju device PCBA? Did you get it from PCBWay?
Finally, you could try something super simple like this to see if it plays (you will need to create a startup.h sound file for this, per the instructions here: https://esphome.io/guides/audio_clips_for_i2s.html).
If you run this, you will end up with a button on the device page to play a sound. This is the simplest barebones way to see if you can produce any sound.
substitutions:
  name: onju-voice-office
  friendly_name: onju-office
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
esphome:
  name: "${name}"
  name_add_mac_suffix: false
  friendly_name: "${friendly_name}"
  min_version: 2023.11.6
  includes:
    - startup.h
  on_boot:
    priority: 600
    then:
      - output.turn_off: set_low_speaker
  platformio_options:
    board_build.flash_mode: dio
    build_flags: "-DBOARD_HAS_PSRAM"
    board_build.arduino.memory_type: qio_opi
esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: arduino
switch:
  - platform: gpio
    pin: GPIO21
    name: "speaker enable"
    id: speakeren
    restore_mode: ALWAYS_ON
psram:
  mode: octal
  speed: 80MHz
logger:
  level: VERY_VERBOSE
ota:
improv_serial:
captive_portal:
api:
  encryption:
    key: YOUR_ENCRYPTION_KEY
i2s_audio:
  - i2s_lrclk_pin: GPIO13
    i2s_bclk_pin: GPIO18
    id: theaudio
output:
  - platform: gpio
    pin:
      number: GPIO12
      allow_other_uses: true
    id: set_low_speaker
speaker:
  - platform: i2s_audio
    dac_type: external
    i2s_audio_id: theaudio
    i2s_dout_pin:
      number: GPIO12
      allow_other_uses: true
    id: foobar
    mode: mono
button:
  - platform: template
    name: Play Sound
    id: playsound
    icon: "mdi:emoticon-outline"
    on_press:
      - logger.log: "Button pressed"
      - speaker.play:
          id: foobar
          data: !lambda return startup_raw;
Yeah, I made sure to plug in the mute board and power it through the AC adapter. Yup, I got it through PCBWay and it's the v3 board...
I'm having difficulty getting the whole conversion working properly on Windows... would you mind sharing the file you've used? Thank you!
Never mind, I got something to work. I can confirm that the audio does indeed play back on the speaker when I click the button.
Here are some verbose logs of when it's supposed to be playing back the audio:
[12:02:02][VV][api.service:964]: on_voice_assistant_event_response: VoiceAssistantEventResponse {
event_type: VOICE_ASSISTANT_STT_END
data: VoiceAssistantEventData {
name: 'text'
value: ' What time is it?'
}
}
[12:02:02][D][voice_assistant:563]: Event Type: 4
[12:02:02][D][voice_assistant:591]: Speech recognised as: " What time is it?"
[12:02:02][VV][scheduler:032]: set_timeout(name='', timeout=0)
[12:02:02][VV][scheduler:226]: Running timeout '' with interval=0 last_execution=40940 (now=40942)
[12:02:02][VV][api.service:964]: on_voice_assistant_event_response: VoiceAssistantEventResponse {
event_type: VOICE_ASSISTANT_INTENT_START
}
[12:02:02][D][voice_assistant:563]: Event Type: 5
[12:02:02][D][voice_assistant:596]: Intent started
[12:02:02][VV][scheduler:032]: set_timeout(name='', timeout=0)
[12:02:02][VV][scheduler:226]: Running timeout '' with interval=0 last_execution=40949 (now=40951)
[12:02:02][VV][esp32_rmt_led_strip:095]: Writing RGB values to bus...
[12:02:02][VV][api.service:964]: on_voice_assistant_event_response: VoiceAssistantEventResponse {
event_type: VOICE_ASSISTANT_INTENT_END
data: VoiceAssistantEventData {
name: 'conversation_id'
value: ''
}
}
[12:02:02][D][voice_assistant:563]: Event Type: 6
[12:02:02][VV][scheduler:032]: set_timeout(name='', timeout=0)
[12:02:02][VV][scheduler:226]: Running timeout '' with interval=0 last_execution=41032 (now=41035)
[12:02:02][VV][api.service:964]: on_voice_assistant_event_response: VoiceAssistantEventResponse {
event_type: VOICE_ASSISTANT_TTS_START
data: VoiceAssistantEventData {
name: 'text'
value: 'Sorry, I couldn't understand that'
}
}
[12:02:02][D][voice_assistant:563]: Event Type: 7
[12:02:02][D][voice_assistant:619]: Response: "Sorry, I couldn't understand that"
[12:02:02][VV][scheduler:032]: set_timeout(name='', timeout=0)
[12:02:02][VV][scheduler:226]: Running timeout '' with interval=0 last_execution=41042 (now=41045)
[12:02:02][VV][api.service:964]: on_voice_assistant_event_response: VoiceAssistantEventResponse {
event_type: VOICE_ASSISTANT_TTS_END
data: VoiceAssistantEventData {
name: 'url'
value: 'https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3'
}
}
[12:02:02][D][voice_assistant:563]: Event Type: 8
[12:02:02][D][voice_assistant:639]: Response URL: "https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3"
[12:02:02][VV][scheduler:032]: set_timeout(name='', timeout=0)
[12:02:02][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[12:02:02][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[12:02:02][VV][scheduler:226]: Running timeout '' with interval=0 last_execution=41053 (now=41058)
[12:02:02][D][media_player:059]: 'Office Onju 2a44d8' - Setting
[12:02:02][D][media_player:066]: Media URL: https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3
[12:02:02][D][media_player:059]: 'Office Onju 2a44d8' - Setting
[12:02:02][D][media_player:066]: Media URL: https://homeassistant.my.domain/api/tts_proxy/dae2cdcb27a1d1c3b07ba2c7db91480f9d4bfd8f_en-us_7238ee98e6_marytts.mp3
[12:02:02][D][light:036]: 'top_led' Setting:
[12:02:02][D][light:059]: Red: 20%, Green: 100%, Blue: 0%
[12:02:02][D][light:109]: Effect: 'speaking'
[12:02:02][VV][api.service:964]: on_voice_assistant_event_response: VoiceAssistantEventResponse {
event_type: VOICE_ASSISTANT_RUN_END
}
[12:02:02][D][voice_assistant:563]: Event Type: 2
[12:02:02][D][voice_assistant:653]: Assist Pipeline ended
[12:02:02][VV][scheduler:032]: set_timeout(name='', timeout=0)
[12:02:02][VV][api.service:324]: send_media_player_state_response: MediaPlayerStateResponse {
key: 3307342432
state: MEDIA_PLAYER_STATE_PLAYING
volume: 1
muted: NO
}
[12:02:02][W][component:237]: Component i2s_audio.media_player took a long time for an operation (543 ms).
[12:02:02][W][component:238]: Components should block for at most 30 ms.
[12:02:02][VV][scheduler:226]: Running timeout '' with interval=0 last_execution=41078 (now=41626)
[12:02:02][VV][scheduler:032]: set_timeout(name='', timeout=100)
[12:02:02][VV][scheduler:226]: Running interval 'update' with interval=1000 last_execution=40185 (now=41626)
[12:02:02][VV][esp32_rmt_led_strip:095]: Writing RGB values to bus...
[12:02:02][VV][scheduler:032]: set_timeout(name='playing', timeout=2000)
[12:02:02][VV][esp32_rmt_led_strip:095]: Writing RGB values to bus...
[12:02:02][VV][scheduler:032]: set_timeout(name='playing', timeout=2000)
[12:02:03][VV][api.service:324]: send_media_player_state_response: MediaPlayerStateResponse {
key: 3307342432
state: MEDIA_PLAYER_STATE_IDLE
volume: 1
muted: NO
}
[12:02:03][W][component:237]: Component i2s_audio.media_player took a long time for an operation (472 ms).
[12:02:03][W][component:238]: Components should block for at most 30 ms.
[12:02:03][VV][scheduler:226]: Running timeout '' with interval=100 last_execution=41628 (now=42115)
That's great that you got something! So maybe it's not a hardware issue. Is "homeassistant.my.domain" what actually appears in your logs, and is it internally resolvable? For my ESPHome config I use the IP and port of my Home Assistant server, so for me it's 10.19.15.100:8123, not homeassistant.my.domain. That's the only thing that looks suspicious to me in your logs.
Yeah, I was really worried I had messed the hardware up during the replacement process!
Yeah, that's just a placeholder. The real logs contain the actual domain, which is internally accessible. That being said... IT'S ALWAYS DNS!!! Looks like ESPHome isn't using my 2 internal DNS servers (Home Assistant itself is)... when I manually specified the DNS servers in the onju config, it started working.
Thank you for your help!
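For anyone landing here later: the exact DNS settings were not posted, so the following is only a minimal sketch of what manually specifying DNS servers in the ESPHome config might look like. It assumes static addressing through the `manual_ip` block of the `wifi` component, which is where ESPHome accepts `dns1` and `dns2`. The gateway and subnet match the boot log at the top of this issue (which also shows DNS1 and DNS2 as 0.0.0.0); the static IP and both DNS addresses are placeholders.

```yaml
# Sketch only: static_ip, dns1 and dns2 are hypothetical placeholders,
# not values taken from this thread.
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: 192.168.2.50   # hypothetical address for the satellite
    gateway: 192.168.2.1      # gateway as reported in the boot log
    subnet: 255.255.255.0     # subnet as reported in the boot log
    dns1: 192.168.2.10        # hypothetical internal DNS server 1
    dns2: 192.168.2.11        # hypothetical internal DNS server 2
```

Note that `manual_ip` gives the device a fixed address instead of a DHCP lease, so pick an address outside your DHCP pool.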
So it looks like I can get responses back when asking it to turn things on/off, but it's not playing back TTS generated from the Media section and it does not seem to play media files (though I only tried with an .m4a).
EDIT: So I just figured this out. I noticed that while speaking to the assistant I would get responses, but if I attempted to play back an MP3 or TTS via media it would not work. You have to turn off the Wake Word switch and then you can play stuff back.
If you have an automation that plays back specific TTS or MP3s, you can set an action to turn off the wake word switch -> delay 1 second -> play back TTS/MP3 -> delay for the length of the audio played -> turn on the switch.
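The sequence above could be sketched as a Home Assistant script roughly like this. It is an illustration, not a tested config: the script name, both entity IDs, the media URL and the 5 second clip length are all hypothetical and need to be replaced with your own values.

```yaml
# Home Assistant script sketch. All entity IDs, the media URL and the
# clip length are hypothetical placeholders.
script:
  onju_play_clip:
    alias: "Play a clip on the Onju satellite"
    sequence:
      # 1. Turn off the wake word switch (the workaround described above)
      - service: switch.turn_off
        target:
          entity_id: switch.office_onju_use_wake_word
      # 2. Give the device a second to stop listening
      - delay: "00:00:01"
      # 3. Play the clip on the satellite's media player
      - service: media_player.play_media
        target:
          entity_id: media_player.office_onju
        data:
          media_content_id: "http://192.168.2.20:8123/local/chime.mp3"
          media_content_type: music
      # 4. Wait for roughly the length of the audio
      - delay: "00:00:05"
      # 5. Turn the wake word switch back on
      - service: switch.turn_on
        target:
          entity_id: switch.office_onju_use_wake_word
```

The fixed delay in step 4 is the simplest option; it has to be at least as long as the clip, otherwise the switch is turned back on while audio is still playing.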
Yeah, in my understanding of the current ESPHome audio frameworks, a lot of codecs and features are not supported. I think basic MP3 and WAV are the only reliable formats, and from my understanding HTTPS does not work. There are some upstream patches in the esphome/esphome repo that look to improve a lot of this, but integrating them takes some work.
Flavor
OpenWakeWord or no wake word
Checklist
Describe the issue
No audio plays back either via the voice assistant or from the media player
Reproduction steps
Debug logs