Sonoff S31s with total_daily_energy enabled reboot

tschundler commented 2 years ago

The problem

The devices work fine with total_daily_energy disabled. But with total_daily_energy enabled, they periodically reboot. I added an uptime measurement, and the time between reboots isn't consistent: as little as a few minutes and as long as many hours. But I have noticed it reboots sooner on devices using more power.

Is there some sort of overflow or divide by zero happening?

It is measuring Watt-hours, not kWh. I suspect this might occur 1000x less often if I switch to kWh, but I haven't tested that yet.
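A minimal sketch of the kWh variant mentioned above, assuming the usual ESPHome approach of scaling the published value with a `multiply` filter. The `unit_of_measurement`, `accuracy_decimals`, and `filters` values are illustrative and not taken from the reporter's config. As far as I understand, the filter scales only the published state, so the component would still accumulate in Wh internally, which may mean this test does not change the internal arithmetic at all.

```yaml
sensor:
- platform: total_daily_energy
  name: S31 2 Daily Energy
  power_id: s31_2_power
  # Illustrative values, not from the original config:
  unit_of_measurement: kWh
  accuracy_decimals: 3
  filters:
  # Convert the published value from Wh to kWh
  - multiply: 0.001
```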

Which version of ESPHome has the issue?

v2021.8.2

What type of installation are you using?

pip

Which version of Home Assistant has the issue?

2021.8.8

What platform are you using?

ESP8266

Board

Sonoff S31

Component causing the issue

total_daily_energy

Example YAML snippet

`esphome config s31_2.yaml`

substitutions:
  device_prefix: s31_2
  device_hostname: s31-2
  device_title: S31 2
esphome:
  name: s31-2
  platform: ESP8266
  board: esp01_1m
  board_flash_mode: dout
  esp8266_restore_from_flash: true
  arduino_version: platformio/espressif8266@2.6.2
  build_path: s31-2
  platformio_options: {}
  includes: []
  libraries: []
  name_add_mac_suffix: false
captive_portal: {}
debug: {}
logger:
  baud_rate: 0
  tx_buffer_size: 512
  deassert_rts_dtr: false
  hardware_uart: UART0
  level: DEBUG
  logs: {}
  esp8266_store_log_strings_in_flash: true
web_server:
  port: 80
  css_url: https://esphome.io/_static/webserver-v1.min.css
  js_url: https://esphome.io/_static/webserver-v1.min.js
status_led:
  pin:
    number: 13
    mode: OUTPUT
    inverted: false
binary_sensor:
- platform: gpio
  pin:
    number: 0
    mode: INPUT_PULLUP
    inverted: true
  id: s31_2_button
  disabled_by_default: true
  internal: true
  on_press:
  - then:
    - light.toggle:
        id: s31_2_relayl
  name: s31_2_button
- platform: status
  name: S31 2 Status
  disabled_by_default: false
  device_class: connectivity
output:
- platform: gpio
  id: s31_2_relay
  pin:
    number: 12
    mode: OUTPUT
    inverted: false
light:
- platform: binary
  name: S31 2 Relay
  output: s31_2_relay
  id: s31_2_relayl
  disabled_by_default: false
  restore_mode: RESTORE_DEFAULT_OFF
uart:
- rx_pin: 3
  baud_rate: 4800
  rx_buffer_size: 256
  stop_bits: 1
  data_bits: 8
  parity: NONE
text_sensor:
- platform: template
  name: S31 2 Uptime Human
  id: s31_2_uptime_human
  icon: mdi:clock-start
  internal: false
  disabled_by_default: true
  update_interval: 60s
sensor:
- platform: uptime
  name: S31 2 Uptime
  id: s31_2_uptime_sensor
  update_interval: 60s
  internal: false
  disabled_by_default: true
  on_raw_value:
  - then:
    - text_sensor.template.publish:
        id: s31_2_uptime_human
        state: !lambda |-
          int seconds = round(id(s31_2_uptime_sensor).raw_state);
          int days = seconds / (24 * 3600);
          seconds = seconds % (24 * 3600);
          int hours = seconds / 3600;
          seconds = seconds % 3600;
          int minutes = seconds /  60;
          seconds = seconds % 60;
          return (
            (days ? String(days) + "d " : "") +
            (hours ? String(hours) + "h " : "") +
            (minutes ? String(minutes) + "m " : "") +
            (String(seconds) + "s")
          ).c_str();
  force_update: false
  unit_of_measurement: s
  icon: mdi:timer-outline
  accuracy_decimals: 0
  state_class: ''
- platform: cse7766
  current:
    name: S31 2 Current
    internal: false
    disabled_by_default: true
    force_update: false
    unit_of_measurement: A
    accuracy_decimals: 2
    device_class: current
    state_class: measurement
  voltage:
    name: S31 2 Voltage
    internal: false
    disabled_by_default: true
    force_update: false
    unit_of_measurement: V
    accuracy_decimals: 1
    device_class: voltage
    state_class: measurement
  power:
    name: S31 2 Power
    id: s31_2_power
    internal: false
    disabled_by_default: false
    force_update: false
    unit_of_measurement: W
    accuracy_decimals: 1
    device_class: power
    state_class: measurement
  update_interval: 60s
- platform: total_daily_energy
  name: S31 2 Daily Energy
  power_id: s31_2_power
  disabled_by_default: false
  force_update: false
  device_class: energy
  state_class: measurement
  last_reset_type: auto
  min_save_interval: 0s
time:
- platform: sntp
  servers:
  - XXREDACTEDXX
  - north-america.pool.ntp.org
  timezone: PST8PDT7,M3.2.0/2,M11.1.0/2
  update_interval: 15min
wifi:
  ap:
    ssid: S31 2 Setup
    password: !secret 'wifi_password'
    ap_timeout: 1min
  enable_mdns: true
  domain: .local
  reboot_timeout: 15min
  power_save_mode: NONE
  fast_connect: false
  output_power: 20.0
  networks:
  - ssid: XXREDACTEDXX
    password: XXREDACTEDXX
    priority: 0.0
  use_address: s31-2.local
mqtt:
  broker: 192.168.1.180
  username: ''
  password: ''
  discovery: true
  port: 1883
  discovery_retain: true
  discovery_prefix: homeassistant
  topic_prefix: s31-2
  keepalive: 15s
  reboot_timeout: 15min
  birth_message:
    topic: s31-2/status
    payload: online
    qos: 0
    retain: true
  will_message:
    topic: s31-2/status
    payload: offline
    qos: 0
    retain: true
  shutdown_message:
    topic: s31-2/status
    payload: offline
    qos: 0
    retain: true
  log_topic:
    topic: s31-2/debug
    qos: 0
    retain: true
ota:
  password: XXREDACTEDXX
  safe_mode: true
  port: 8266
  reboot_timeout: 5min
  num_attempts: 10

Anything in the logs that might be useful for us?

boot after update:

[11:34:25][I][ota:275]: OTA update finished!
[11:34:37][I][mqtt:212]: MQTT Connected!
[11:34:37][C][sntp:022]: Setting up SNTP...
[11:34:37][I][app:106]: ESPHome version 2021.8.2 compiled on Aug 28 2021, 11:34:02
[11:34:37][C][status_led:019]: Status LED:
[11:34:37][C][status_led:020]:   Pin: GPIO13 (Mode: OUTPUT)
[11:34:37][D][sntp:061]: Synchronized time: Sat Aug 28 11:34:37 2021
[11:34:37][C][wifi:499]: WiFi:
[11:34:37][C][wifi:359]:   SSID: XXREDACTEDXX
[11:34:37][C][wifi:360]:   IP Address: 192.168.1.37
[11:34:37][C][wifi:362]:   BSSID: XXREDACTEDXX
[11:34:37][C][wifi:363]:   Hostname: 's31_4'
[11:34:37][C][wifi:367]:   Signal strength: -48 dB ▂▄▆█
[11:34:37][C][wifi:371]:   Channel: 8
[11:34:37][C][wifi:372]:   Subnet: 255.255.255.0
[11:34:37][C][wifi:373]:   Gateway: 192.168.1.1
[11:34:37][C][wifi:374]:   DNS1: 192.168.1.1
[11:34:37][C][wifi:375]:   DNS2: 75.75.75.75
[11:34:37][C][template.text_sensor:020]: Template Sensor 'S31 4 Uptime Human'
[11:34:37][C][template.text_sensor:020]:   Icon: 'mdi:clock-start'
[11:34:37][C][uptime.sensor:030]: Uptime Sensor 'S31 4 Uptime'
[11:34:37][C][uptime.sensor:030]:   State Class: ''
[11:34:37][C][uptime.sensor:030]:   Unit of Measurement: 's'
[11:34:37][C][uptime.sensor:030]:   Accuracy Decimals: 0
[11:34:37][C][uptime.sensor:030]:   Icon: 'mdi:timer-outline'
[11:34:37][C][logger:189]: Logger:
[11:34:37][C][logger:190]:   Level: DEBUG
[11:34:37][C][light:097]: Light 'S31 4 Relay'
[11:34:37][C][status:034]: Status Binary Sensor 'S31 4 Status'
[11:34:37][C][status:034]:   Device Class: 'connectivity'
[11:34:37][C][cse7766:170]: CSE7766:
[11:34:37][C][cse7766:171]:   Update Interval: 60.0s
[11:34:37][C][cse7766:172]:   Voltage 'S31 4 Voltage'
[11:34:37][C][cse7766:172]:     Device Class: 'voltage'
[11:34:37][C][cse7766:172]:     State Class: 'measurement'
[11:34:37][C][cse7766:172]:     Unit of Measurement: 'V'
[11:34:37][C][cse7766:172]:     Accuracy Decimals: 1
[11:34:37][C][cse7766:173]:   Current 'S31 4 Current'
[11:34:37][C][cse7766:173]:     Device Class: 'current'
[11:34:37][C][cse7766:173]:     State Class: 'measurement'
[11:34:37][C][cse7766:173]:     Unit of Measurement: 'A'
[11:34:37][C][cse7766:173]:     Accuracy Decimals: 2
[11:34:37][C][cse7766:174]:   Power 'S31 4 Power'
[11:34:37][C][total_daily_energy:023]: Total Daily Energy 'S31 4 Daily Energy'
[11:34:37][C][ota:029]: Over-The-Air Updates:
[11:34:37][C][ota:030]:   Address: s31_4.local:8266
[11:34:37][C][ota:032]:   Using Password.
[11:34:37][C][mqtt:061]: MQTT:
[11:34:37][C][mqtt:063]:   Server Address: 192.168.1.180:1883 (192.168.1.180)
[11:34:37][C][mqtt:064]:   Username: ''
[11:34:37][C][mqtt:065]:   Client ID: 's31_4-a4cf12b7f4ed'
[11:34:37][C][mqtt:067]:   Discovery prefix: 'homeassistant'
[11:34:37][C][mqtt:068]:   Discovery retain: YES
[11:34:37][C][mqtt:070]:   Topic Prefix: 's31_4'
[11:34:37][C][mqtt:072]:   Log Topic: 's31_4/debug'
[11:34:37][C][mqtt:075]:   Availability: 's31_4/status'
[11:34:37][C][sntp:044]: SNTP Time:
[11:34:37][C][sntp:045]:   Server 1: XXREDACTEDXX
[11:34:37][C][sntp:046]:   Server 2: 'north-america.pool.ntp.org'
[11:34:37][C][sntp:047]:   Server 3: ''
[11:34:37][C][sntp:048]:   Timezone: 'PST8PDT7,M3.2.0/2,M11.1.0/2'
[11:34:37][C][mqtt.binary_sensor:018]: MQTT Binary Sensor 'S31 4 Status':
[11:34:37][C][mqtt.text_sensor:025]: MQTT Text Sensor 'S31 4 Uptime Human':
[11:34:37][C][mqtt.text_sensor:026]:   State Topic: 's31_4/sensor/s31_4_uptime_human/state'
[11:34:37][C][mqtt.sensor:024]: MQTT Sensor 'S31 4 Uptime':
[11:34:37][C][mqtt.sensor:028]:   State Topic: 's31_4/sensor/s31_4_uptime/state'
[11:34:37][C][mqtt.sensor:024]: MQTT Sensor 'S31 4 Voltage':
[11:34:37][C][mqtt.sensor:028]:   State Topic: 's31_4/sensor/s31_4_voltage/state'
[11:34:38][C][mqtt.sensor:024]: MQTT Sensor 'S31 4 Current':
[11:34:38][C][mqtt.sensor:028]:   State Topic: 's31_4/sensor/s31_4_current/state'
[11:34:38][C][mqtt.sensor:024]: MQTT Sensor 'S31 4 Power':
[11:34:38][C][mqtt.sensor:028]:   State Topic: 's31_4/sensor/s31_4_power/state'
[11:34:38][C][mqtt.sensor:024]: MQTT Sensor 'S31 4 Daily Energy':
[11:34:38][C][mqtt.sensor:028]:   State Topic: 's31_4/sensor/s31_4_daily_energy/state'
[11:34:38][C][mqtt.binary_sensor:018]: MQTT Binary Sensor 's31_4_button':
[11:34:38][C][mqtt.binary_sensor:019]:   State Topic: 's31_4/binary_sensor/s31_4_button/state'
[11:34:38][D][debug:023]: ESPHome version 2021.8.2
[11:34:38][D][debug:025]: Free Heap Size: 28768 bytes
[11:34:38][D][debug:053]: Flash Chip: Size=1024kB Speed=40MHz Mode=DOUT
[11:34:38][D][debug:190]: Chip ID: 0x00B7F4ED
[11:34:38][D][debug:191]: SDK Version: 2.2.2-dev(38a443e)
[11:34:38][D][debug:192]: Core Version: 2_7_4
[11:34:38][D][debug:193]: Boot Version=31 Mode=1
[11:34:38][D][debug:194]: CPU Frequency: 80
[11:34:38][D][debug:195]: Flash Chip ID=0x001640EF
[11:34:38][D][debug:196]: Reset Reason: Software/System restart
[11:34:38][D][debug:197]: Reset Info: Software/System restart
[11:35:19][D][text_sensor:015]: 'S31 4 Uptime Human': Sending state '55s'
[11:35:19][D][sensor:131]: 'S31 4 Uptime': Sending state 55.04700 s with 0 decimals of accuracy
[11:35:19][D][cse7766:147]: Got voltage=116.7V current=1.5A power=155.6W
[11:35:19][D][sensor:131]: 'S31 4 Voltage': Sending state 116.73358 V with 1 decimals of accuracy
[11:35:19][D][sensor:131]: 'S31 4 Current': Sending state 1.45838 A with 2 decimals of accuracy
[11:35:19][D][sensor:131]: 'S31 4 Power': Sending state 155.55841 W with 1 decimals of accuracy
[11:35:19][D][sensor:131]: 'S31 4 Daily Energy': Sending state 2.16027 Wh with 3 decimals of accuracy
[11:36:19][D][text_sensor:015]: 'S31 4 Uptime Human': Sending state '1m 55s'

last entries before reboot:

[12:29:19][D][text_sensor:015]: 'S31 4 Uptime Human': Sending state '54m 55s'
[12:29:19][D][sensor:131]: 'S31 4 Uptime': Sending state 3295.04395 s with 0 decimals of accuracy
[12:29:19][D][cse7766:147]: Got voltage=114.7V current=1.6A power=172.6W
[12:29:19][D][sensor:131]: 'S31 4 Voltage': Sending state 114.74612 V with 1 decimals of accuracy
[12:29:19][D][sensor:131]: 'S31 4 Current': Sending state 1.62676 A with 2 decimals of accuracy
[12:29:19][D][sensor:131]: 'S31 4 Power': Sending state 172.58455 W with 1 decimals of accuracy
[12:29:19][D][sensor:131]: 'S31 4 Daily Energy': Sending state 156.92946 Wh with 3 decimals of accuracy
[12:30:19][D][text_sensor:015]: 'S31 4 Uptime Human': Sending state '55m 55s'
[12:30:19][D][sensor:131]: 'S31 4 Uptime': Sending state 3355.04590 s with 0 decimals of accuracy
[12:30:19][D][cse7766:147]: Got voltage=109.9V current=1.7A power=172.9W
[12:30:19][D][sensor:131]: 'S31 4 Voltage': Sending state 109.86960 V with 1 decimals of accuracy
[12:30:19][D][sensor:131]: 'S31 4 Current': Sending state 1.70081 A with 2 decimals of accuracy
[12:30:19][D][sensor:131]: 'S31 4 Power': Sending state 172.86736 W with 1 decimals of accuracy
[12:30:19][D][sensor:131]: 'S31 4 Daily Energy': Sending state 159.81029 Wh with 3 decimals of accuracy

reboot:

[12:31:03][I][mqtt:212]: MQTT Connected!
[12:31:03][C][sntp:022]: Setting up SNTP...
[12:31:03][I][app:106]: ESPHome version 2021.8.2 compiled on Aug 28 2021, 11:34:02
[12:31:03][C][status_led:019]: Status LED:
[12:31:03][C][status_led:020]:   Pin: GPIO13 (Mode: OUTPUT)
[12:31:03][D][sntp:061]: Synchronized time: Sat Aug 28 12:31:03 2021
[12:31:03][C][wifi:499]: WiFi:
[12:31:03][C][uart_esp8266:075]: UART Bus:
[12:31:03][C][uart_esp8266:080]:   RX Pin: GPIO3
[12:31:03][C][uart_esp8266:081]:   RX Buffer Size: 256
[12:31:03][C][uart_esp8266:083]:   Baud Rate: 4800 baud
[12:31:03][C][uart_esp8266:084]:   Data Bits: 8
[12:31:03][C][uart_esp8266:085]:   Parity: NONE
[12:31:03][C][uart_esp8266:086]:   Stop bits: 1
[12:31:03][C][uart_esp8266:088]:   Using hardware serial interface.
[12:31:03][C][gpio.output:010]: GPIO Binary Output:
[12:31:03][C][gpio.output:011]:   Pin: GPIO12 (Mode: OUTPUT)
[12:31:03][C][light:097]: Light 'S31 4 Relay'
[12:31:03][C][cse7766:170]: CSE7766:
[12:31:03][C][cse7766:171]:   Update Interval: 60.0s
[12:31:03][C][cse7766:172]:   Voltage 'S31 4 Voltage'
[12:31:03][C][cse7766:172]:     Device Class: 'voltage'
[12:31:03][C][cse7766:172]:     State Class: 'measurement'
[12:31:03][C][cse7766:172]:     Unit of Measurement: 'V'
[12:31:03][C][cse7766:172]:     Accuracy Decimals: 1
[12:31:03][C][captive_portal:148]: Captive Portal:
[12:31:04][C][web_server:152]: Web Server:
[12:31:04][C][web_server:153]:   Address: s31_4.local:80
[12:31:04][C][ota:029]: Over-The-Air Updates:
[12:31:04][C][ota:030]:   Address: s31_4.local:8266
[12:31:04][C][ota:032]:   Using Password.
[12:31:04][C][mqtt:061]: MQTT:
[12:31:04][C][mqtt:063]:   Server Address: 192.168.1.180:1883 (192.168.1.180)
[12:31:04][C][mqtt:064]:   Username: ''
[12:31:04][C][mqtt:065]:   Client ID: 's31_4-a4cf12b7f4ed'
[12:31:04][C][mqtt:067]:   Discovery prefix: 'homeassistant'
[12:31:04][C][mqtt:068]:   Discovery retain: YES
[12:31:04][C][mqtt:070]:   Topic Prefix: 's31_4'
[12:31:04][C][mqtt:072]:   Log Topic: 's31_4/debug'
[12:31:04][C][mqtt:075]:   Availability: 's31_4/status'
[12:31:04][C][sntp:044]: SNTP Time:
[12:31:04][C][sntp:045]:   Server 1:XXREDACTEDXX
[12:31:04][C][sntp:046]:   Server 2: 'north-america.pool.ntp.org'
[12:31:04][C][sntp:047]:   Server 3: ''
[12:31:04][C][mqtt.sensor:024]: MQTT Sensor 'S31 4 Voltage':
[12:31:04][C][mqtt.sensor:028]:   State Topic: 's31_4/sensor/s31_4_voltage/state'
[12:31:04][C][mqtt.sensor:024]: MQTT Sensor 'S31 4 Current':
[12:31:04][C][mqtt.sensor:028]:   State Topic: 's31_4/sensor/s31_4_current/state'
[12:31:04][C][mqtt.sensor:024]: MQTT Sensor 'S31 4 Power':
[12:31:04][C][mqtt.sensor:028]:   State Topic: 's31_4/sensor/s31_4_power/state'
[12:31:04][C][mqtt.sensor:024]: MQTT Sensor 'S31 4 Daily Energy':
[12:31:04][C][mqtt.sensor:028]:   State Topic: 's31_4/sensor/s31_4_daily_energy/state'
[12:31:04][C][mqtt.binary_sensor:018]: MQTT Binary Sensor 's31_4_button':
[12:31:04][C][mqtt.binary_sensor:019]:   State Topic: 's31_4/binary_sensor/s31_4_button/state'
[12:31:04][D][debug:023]: ESPHome version 2021.8.2
[12:31:04][D][debug:025]: Free Heap Size: 28128 bytes
[12:31:04][D][debug:053]: Flash Chip: Size=1024kB Speed=40MHz Mode=DOUT
[12:31:04][D][debug:190]: Chip ID: 0x00B7F4ED
[12:31:04][D][debug:191]: SDK Version: 2.2.2-dev(38a443e)
[12:31:04][D][debug:192]: Core Version: 2_7_4
[12:31:04][D][debug:193]: Boot Version=31 Mode=1
[12:31:04][D][debug:194]: CPU Frequency: 80
[12:31:04][D][debug:195]: Flash Chip ID=0x001640EF
[12:31:04][D][debug:196]: Reset Reason: Software Watchdog
[12:31:04][D][debug:197]: Reset Info: Fatal exception:4 flag:3 (Software Watchdog) epc1:0x402594ee epc2:0x00000000 epc3:0x00000000 excvaddr:0x00000000 depc:0x00000000
[12:31:28][D][cse7766:147]: Got voltage=116.2V current=1.3A power=147.7W
[12:31:28][D][sensor:131]: 'S31 4 Voltage': Sending state 116.20379 V with 1 decimals of accuracy
[12:31:28][D][sensor:131]: 'S31 4 Current': Sending state 1.32737 A with 2 decimals of accuracy
[12:31:28][D][sensor:131]: 'S31 4 Power': Sending state 147.70035 W with 1 decimals of accuracy
[12:31:28][D][sensor:131]: 'S31 4 Daily Energy': Sending state 161.14886 Wh with 3 decimals of accuracy

Additional information

No response

tschundler commented 2 years ago

Updated with logs. I took a look at the total_daily_energy code, and there isn't even division by anything other than constants, so this is very odd. But it's very reproducible: if I comment out the total_daily_energy sensor, I get > 1 day of uptime. Re-enable it, and I get reboots.

Maybe I'll look at the diff of the generated code.

tschundler commented 2 years ago

The code diff looks as expected:

$ diff s31_4/src/esphome.h s31_4.dp/src/esphome.h
59a60
> #include "esphome/components/total_daily_energy/total_daily_energy.h"

$ diff s31_4/src/main.cpp s31_4.dp/src/main.cpp
46a47,48
> total_daily_energy::TotalDailyEnergy *total_daily_energy_totaldailyenergy;
> mqtt::MQTTSensorComponent *mqtt_mqttsensorcomponent_5;
491a494,518
>   // sensor.total_daily_energy:
>   //   platform: total_daily_energy
>   //   name: S31 4 Daily Energy
>   //   power_id: s31_4_power
>   //   disabled_by_default: false
>   //   mqtt_id: mqtt_mqttsensorcomponent_5
>   //   force_update: false
>   //   device_class: energy
>   //   state_class: measurement
>   //   last_reset_type: auto
>   //   id: total_daily_energy_totaldailyenergy
>   //   time_id: sntp_sntpcomponent
>   //   min_save_interval: 0s
>   total_daily_energy_totaldailyenergy = new total_daily_energy::TotalDailyEnergy();
>   App.register_component(total_daily_energy_totaldailyenergy);
>   App.register_sensor(total_daily_energy_totaldailyenergy);
>   total_daily_energy_totaldailyenergy->set_name("S31 4 Daily Energy");
>   total_daily_energy_totaldailyenergy->set_disabled_by_default(false);
>   total_daily_energy_totaldailyenergy->set_device_class("energy");
>   total_daily_energy_totaldailyenergy->set_state_class(sensor::STATE_CLASS_MEASUREMENT);
>   total_daily_energy_totaldailyenergy->set_last_reset_type(sensor::LAST_RESET_TYPE_AUTO);
>   total_daily_energy_totaldailyenergy->set_force_update(false);
>   mqtt_mqttsensorcomponent_5 = new mqtt::MQTTSensorComponent(total_daily_energy_totaldailyenergy);
>   App.register_component(mqtt_mqttsensorcomponent_5);
>   total_daily_energy_totaldailyenergy->set_parent(s31_4_power);
509a537,538
>   total_daily_energy_totaldailyenergy->set_time(sntp_sntpcomponent);
>   total_daily_energy_totaldailyenergy->set_min_save_interval(0);

tschundler commented 2 years ago

Next reboot logs: another watchdog reset.

[13:57:44][D][text_sensor:015]: 'S31 4 Uptime Human': Sending state '48m 44s'
[13:57:44][D][sensor:131]: 'S31 4 Uptime': Sending state 2924.18799 s with 0 decimals of accuracy
[13:58:05][D][sensor:131]: 'S31 4 Power': Sending state 172.48734 W with 1 decimals of accuracy
[13:58:05][D][sensor:131]: 'S31 4 Daily Energy': Sending state 411.97653 Wh with 3 decimals of accuracy

..reboot..

[14:00:13][I][mqtt:212]: MQTT Connected!
...
[14:00:14][D][debug:196]: Reset Reason: Software Watchdog
[14:00:14][D][debug:197]: Reset Info: Fatal exception:4 flag:3 (Software Watchdog) epc1:0x40003b4d epc2:0x00000000 epc3:0x00000000 excvaddr:0x00000000 depc:0x00000000
...
[14:00:54][D][text_sensor:015]: 'S31 4 Uptime Human': Sending state '49s'
...
[14:00:57][D][sensor:131]: 'S31 4 Power': Sending state 159.15079 W with 1 decimals of accuracy
[14:00:57][D][sensor:131]: 'S31 4 Daily Energy': Sending state 417.18314 Wh with 3 decimals of accuracy

Wh in the first reboot: ~157. Wh in the next reboot: ~262.

So it's not a rollover thing, since those are very different amounts. Also, it is hitting the WDT, not something else.

tschundler commented 2 years ago

Since I'm using NTP, I wonder if this is related to #2003 / #2299

...adding memory debugging. I wouldn't expect to see a WDT reset for running out of RAM, though. And why does it seem to die faster on the devices pulling more energy? (Maybe it's a coincidence and it's really related to packet drops? The device drawing the most power is closest to the WiFi router.)

I'll add free memory logging.
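One way such free-memory logging could be wired up, as a sketch rather than the configuration actually used in this thread: an `interval` component that logs the free heap from a lambda. The `heap` log tag and the 60s period are assumptions; `ESP.getFreeHeap()` comes from the Arduino core for the ESP8266.

```yaml
interval:
- interval: 60s
  then:
  - lambda: |-
      // Assumed tag "heap" and period; adjust as needed.
      ESP_LOGD("heap", "Free heap: %u bytes", (unsigned) ESP.getFreeHeap());
```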

tschundler commented 2 years ago

[17:22:42][V][mqtt:402]: Publish(topic='s31-5/sensor/s31_5_power/state' payload='157.4' retain=1)
[17:22:59][D][heap:128]: Free heap: 27144 bytes
[17:23:09][D][text_sensor:015]: 'S31 5 Uptime Human': Sending state '5h 8m 7s'
[17:23:09][V][mqtt:402]: Publish(topic='s31-5/sensor/s31_5_uptime_human/state' payload='5h 8m 7s' retain=1)
[17:23:09][V][sensor:037]: 'S31 5 Uptime': Received new state 18487.203125
[17:23:09][D][sensor:131]: 'S31 5 Uptime': Sending state 18487.20312 s with 0 decimals of accuracy
[17:23:09][V][mqtt:402]: Publish(topic='s31-5/sensor/s31_5_uptime/state' payload='18487' retain=1)
[17:25:30][I][mqtt:212]: MQTT Connected!
[17:25:30][C][sntp:022]: Setting up SNTP...

...so not a RAM thing. Though I will try changing from NTP to HASS as the time source.

tschundler commented 2 years ago

Not NTP either - switched to homeassistant API as the time source, and still have regular WDT reboots.
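For reference, a sketch of what swapping the time source might look like, assuming the `homeassistant` time platform together with the native API component it relies on. The timezone is copied from the original `sntp` block; the exact config used for this test was not posted.

```yaml
# The homeassistant time platform gets its time over the native API.
api: {}
time:
- platform: homeassistant
  timezone: PST8PDT7,M3.2.0/2,M11.1.0/2
```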

Now trying disabling MQTT completely... maybe it is related to sending more data than it did before?

tschundler commented 2 years ago

With MQTT disabled, and only using Native API, the device is now up > 24h without reboot. So the problem seems to be related to MQTT. This also means I have a solution - don't use MQTT.

But there is still a bug, and somehow it affects some devices more than others. Maybe it's related to power usage, but I now suspect it's more likely related to WiFi. The bug might be with MQTT or might be with WiFi itself.

I suppose I could tcpdump / wireshark to see if there are incomplete transmissions or something. I took a quick look, and the biggest packet sizes for MQTT writes are ~501 bytes (excluding header, with daily power) vs 455 bytes (excluding header, without daily power). So it is far from the MTU limit. Most of the content is debug messaging for the power monitoring. Looking at a device running the native protobuf API, packets are up to 449 bytes. ...is it a problem only with packets > ~500 bytes?

Maybe next test is MQTT on, daily power on, debug logging off.
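A sketch of one way to turn debug logging off for that test, assuming it means raising the logger level so that `[D]` lines are no longer produced. The thread does not say which setting was changed, so `level: INFO` is an assumption; `baud_rate: 0` is kept from the original config.

```yaml
logger:
  baud_rate: 0
  # Assumed setting: drops DEBUG and VERBOSE messages,
  # including those forwarded to the MQTT log topic.
  level: INFO
```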

tschundler commented 2 years ago

MQTT on / debug logging off still failed, and the device rebooted. So it's not related to packet size. Also an experiment leaving API only running longer led to a reboot.

tschundler commented 2 years ago

And it still seems very tied to how much energy is used, which makes it all the more confusing. One device that I forgot I had enabled it on rebooted a few times today, after being up for days while switched off. Now that it's switched on, it rebooted once at ~380 Wh and again ~90 Wh later.

Maybe it's not related to how much power is used but rather how often the number is updated? No reboots when the number stays at 0?
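One hedged way to probe the update-frequency idea, using only an option that already appears in the posted config: raise `min_save_interval` from `0s` so the running total is persisted less often. The `5min` value is arbitrary, and nothing in this thread confirms that saving the value is what triggers the watchdog; this only changes how often a changing total is written.

```yaml
sensor:
- platform: total_daily_energy
  name: S31 2 Daily Energy
  power_id: s31_2_power
  # Assumption for the experiment: save at most every 5 minutes
  # instead of on every update (the original config uses 0s).
  min_save_interval: 5min
```

If the reboots become rarer with a longer interval, that would point toward the save path; if not, the hypothesis can be dropped.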

tschundler commented 2 years ago

Oops, I wasn't clear in the last comments.

> Also an experiment leaving API only running longer led to a reboot.

That was a reference to MQTT off. It still rebooted. It just took over a day (maybe by luck, who knows).

twasilczyk commented 5 months ago

I had the same problem: the device didn't boot after flashing with the example YAML snippet. Disabling `time:` and `platform: total_daily_energy` made it connect to WiFi and operate just fine. I didn't dig deeper to figure out which one of the two helped.
