ratgdo / esphome-ratgdo

ratgdo for ESPHome
GNU General Public License v2.0

device regularly becomes unresponsive for several minutes #246

Open marcone opened 8 months ago

marcone commented 8 months ago

I'm using the ratgdo 2.53i standalone, not integrated with any home automation setup. I flashed it with the ESPHome firmware and have a script that polls the REST API every few seconds to get the current state and automatically close the door when it's been accidentally left open.

What I'm noticing is that the ratgdo frequently stops responding, sometimes only briefly, sometimes for minutes at a time. Pinging the device shows the same behavior: it regularly stops responding to pings for up to a few minutes and then resumes, and these interruptions coincide with the REST API becoming unresponsive. I can ping the GDO itself as well as other devices on the same Wi-Fi access point (I have a dedicated AP in the garage) without issue.

(As I was typing the above, it stopped responding again and stayed unresponsive for 6 minutes.)

rlowens commented 8 months ago

How often does this happen? Can you plug in and capture the serial logs?

Also, have you disabled the Home Assistant "api:" on your firmware, since you aren't using Home Assistant? The device will automatically reboot if it cannot connect to Home Assistant after 15 minutes by default. https://esphome.io/components/api.html

To disable the api, you will need to compile your own firmware with ESPHome. You'll need to install ESPHome itself somewhere and then create a device .yaml for the ratgdo and compile and flash that new firmware.

Here's what the device .yaml could look like:

substitutions:
  name: ratgdo
  friendly_name: Garage
packages:
  ratgdo.esphome: github://ratgdo/esphome-ratgdo/v25iboard.yaml@main
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

#custom modifications using https://esphome.io/guides/configuration-types.html#remove
api: !remove

marcone commented 8 months ago

I did not edit any YAML files or build my own firmware; I just flashed it from https://ratgdo.github.io/esphome-ratgdo/ (choosing "security + 2.0" and "ratgdo v2.5x").

I do see messages about it rebooting in the log on the web UI, however the nonresponsive periods are much more frequent than that, e.g. just in the last half hour it was unresponsive from 8:29-8:31, 8:33-8:40, 8:43-8:44, 8:46-8:48, 8:50-8:51 and 8:54-8:55.

I'll see about getting serial logs. Might have to go buy a really long USB cable first.

PaulWieland commented 8 months ago

This is probably because the ESPHome firmware reboots when it's not connected to HA. Look at reboot_timeout on https://esphome.io/components/api.html

marcone commented 8 months ago

But doesn't that reboot only happen every 15 minutes? The nonresponsive periods happen way more frequently than that, and their duration varies a lot too.

marcone commented 7 months ago

I attached the ratgdo to a Raspberry Pi so I could capture serial logs while the ratgdo was attached to the opener. I've attached two logs: "ping.log" is the log of a script that pings the ratgdo every second. When it receives a response it logs "alive", and when it doesn't receive a ping response it logs "unreachable". "serial.log" is the serial log, with each line prefixed by the timestamp of the time it was read, so it can be correlated with the ping log.

Some things that stood out to me:

serial.txt ping.txt

chriscrowe commented 6 months ago

I have this issue too... Curious if you've found any resolution. For now I've dialed my reboot_timeout values way down to try to force the ratgdo to restart itself when it becomes unresponsive.
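
A minimal sketch of what lowering those timeouts might look like in a custom device .yaml, assuming the firmware is built from the ratgdo package as in rlowens's example above. The 5min values are hypothetical; both options default to 15min in ESPHome.

```yaml
# Shorter reboot timeouts (hypothetical values; the defaults are 15min).
api:
  # Reboot if no API client (e.g. Home Assistant) connects within this time.
  reboot_timeout: 5min

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # Reboot if there is no Wi-Fi connection for this long.
  reboot_timeout: 5min
```

These keys are merged on top of the settings the package provides, so the rest of the ratgdo configuration stays unchanged.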

marcone commented 5 months ago

Haven't found a solution/workaround. Since it happens so frequently, starting shortly after boot, rebooting often wouldn't really help me much either, since then I'd just be waiting for it to finish its reboot and reconnect to the network.

marcone commented 5 months ago

On a whim I tried updating the firmware again (from 2024.4.2 to 2024.5.0) and it is so much worse now. The web interface isn't even usable anymore: every time I reload the page it takes over a minute to load just a partial mostly-empty page, another minute or more for the actual information to load, and the ratgdo is unresponsive to pings for most of this time.

jgstroud commented 5 months ago

This sounds similar to an issue I'm seeing with a couple of devices running the HomeKit firmware. I'm curious whether it becomes responsive again if you disconnect the device from the GDO. In those cases, just disconnecting from the GDO made it fully responsive again right away.

calisro commented 5 months ago

Yeah, I've been having this issue too. It constantly disconnects, whereas it used to be really stable. It's not rebooting. In the logs I see a disconnect, and then it reconnects a short time later.

pdbennett commented 4 months ago

I've also been troubleshooting this issue. Glad to find this thread, as I've been pulling my hair out. Has anyone made any progress resolving it?

WillCodeForCats commented 4 months ago

I just set up a ratgdo today and was seeing the JSON error message below, followed by a reboot. It was crashing so often that it warned it was going into safe mode. I was testing it with combinations of remotes and Home Assistant commands, and it would crash after almost every open/close cycle.

Could not allocate memory for JSON document! Requested 504 bytes, largest free heap block: 504 bytes

Removing web_server: from config helped mine to stop crashing.
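
For firmware built from the ratgdo package, removing the web server might look like the fragment below. This is a sketch that assumes the package defines web_server: and that nothing else in it depends on that component; it uses the same !remove mechanism as the api: example earlier in this thread. Note that this also removes the REST API, so it would not suit a standalone setup that polls the device over HTTP.

```yaml
packages:
  ratgdo.esphome: github://ratgdo/esphome-ratgdo/v25iboard.yaml@main

# Drop the web server inherited from the package, which may free heap memory.
web_server: !remove
```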

marcone commented 3 weeks ago

I just flashed version 2024.10.0, and so far it seems to be behaving as it should. I haven't seen any of those "Could not allocate memory for JSON document" failures yet; instead it just reboots voluntarily every 15 minutes, as described in rlowens's comment above. It would be nice if that were an option that could be turned on or off in the web interface. Alternatively, if someone could point me at a guide that explains how to build my own image, that'd be awesome.
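
A sketch of a device .yaml that keeps the web interface and REST API but turns off the 15-minute reboot, assuming the same package and substitutions as in rlowens's example above. In ESPHome, setting reboot_timeout to 0s disables the reboot. The file name garage.yaml used below is hypothetical.

```yaml
substitutions:
  name: ratgdo
  friendly_name: Garage
packages:
  ratgdo.esphome: github://ratgdo/esphome-ratgdo/v25iboard.yaml@main
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

# Keep the API component from the package, but never reboot
# just because no Home Assistant client is connected.
api:
  reboot_timeout: 0s
```

With ESPHome installed, running `esphome run garage.yaml` compiles the firmware and uploads it to the device.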