Closed broth-itk closed 1 month ago
Happened again:
I saw the unit connected to WiFi. As soon as I initiated a "Disconnect" from my wireless infrastructure, it reconnected and was properly available.
There was no need to reboot or similar.
Have there been any changes to the WiFi code in the last release? I don't remember having had the issue before.
Let's see what happens with the new 24.8.1 release, I'll let you know. Maybe it's just a bug in the backend libs somewhere.
It just happened again:
IP connection is down, no connection to wireless infrastructure... The red LED blinked every 5 seconds, indicating that OpenDTU was still running somehow.
After resetting power, all back to normal. Strange.
Has this been corrected with the latest version (wifi reconnect issue)?
I wonder how I can get the unit back online without being on site... hm
I think this might still be related to some MQTT buffer overload / heap fragmentation. Without further USB serial logs from around the time the problem occurs, i.e. from shortly before the connection starts to drop and while it is being lost, this is hard to debug.
Though the comments in #2185 by @Kroki0815 here https://github.com/tbnobody/OpenDTU/issues/2185#issuecomment-2269008410 and by @jstammi here https://github.com/tbnobody/OpenDTU/issues/2185#issuecomment-2269617579 might shed some light on your issue too.
First I'm going to install that latest update to see if it helps. As I'm on vacation right now this will be in 2 weeks since I need to power cycle. Maybe a short power cut might help ;-)
USB serial debugging is the next step.
Thanks!
Have you tried another ESP32? I have experienced similar effects on different projects, even with simple stuff using esphome. The effect showed up on some boards but not on others running the same firmware. Most boards come back when soft rebooted remotely once they reappear after a short outage, and run stable for a while afterwards. Some don't and need to be powered off. I think the quality of the chips may vary too much...
fwiw, I experienced the same failure mode, no mqtt enabled though.
Kicking / Blocking it from Wifi allowed it to reconnect and got it unstuck (no reboot required).
v24.8.5 "uptime":965588
@broth-itk are you back from your holidays, and have you already had time to upgrade to the latest version and do some serial logging?
Follow the link to the documentation to set up USB / serial logging: https://www.opendtu.solar/firmware/howto/serial_console/
@broth-itk hi Bernhard, there is a working PR for remote logging in #1819 / #2292, though you may need to build and flash the image yourself as it is not merged into master yet. Maybe this helps to monitor your OpenDTU and analyse this issue?
Additionally, newer versions export heap statistics under the ${prefix}/dtu/heap/ topic in case this is a memory issue.
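To judge whether the exported heap values point towards a leak, it can help to look at the trend rather than at single readings. The following is only a minimal sketch: it assumes you have already collected (timestamp in seconds, free heap in bytes) pairs from the heap topic by whatever MQTT client you use. The exact subtopic names and payload format are not shown in this thread, so the function name and the input shape here are hypothetical.

```python
from typing import Sequence, Tuple


def heap_slope_bytes_per_hour(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of free heap over time, in bytes per hour.

    `samples` holds (timestamp_seconds, free_heap_bytes) pairs. How these
    pairs are collected from MQTT is left out on purpose (assumption: the
    caller has already parsed the payloads into numbers).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n

    # Variance of the timestamps; zero means all samples share one timestamp.
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must differ")

    # Covariance between time and free heap, then slope in bytes/second.
    cov = sum((t - mean_t) * (h - mean_h) for t, h in samples)
    return cov / var_t * 3600.0  # convert bytes/second to bytes/hour


if __name__ == "__main__":
    # Made-up readings: free heap shrinking by about 1000 bytes per hour.
    readings = [(0, 100_000), (3600, 99_000), (7200, 98_000)]
    print(heap_slope_bytes_per_hour(readings))
```

A slope that stays clearly negative over many hours would be consistent with a leak, while a value hovering around zero would not. Fragmentation will not necessarily show up in free heap alone.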
@stefan123t @ranma Thanks for the PR and the syslog enhancement! This is very much appreciated and will help a lot to gather information from the unit.
I compiled the code & webapp and from what I can tell it looks fine. Tomorrow I am going to see how it behaves when there are more logs generated from the unit.
I am going to close this issue since it has not happened again since one of the recent updates. Maybe it was related to the recent WiFi issue? Heap monitoring is very valuable as well, since it allows tracking down a potential memory leak.
What happened?
Yesterday OpenDTU stopped publishing data to MQTT. The web interface was somewhat laggy, and eventually I managed to reboot the unit. Afterwards everything started to work as normal.
This is the second or third time it happened. The first two events required a power cycle to get all back to normal.
To Reproduce Bug
No indication that the issue is reproducible. Looks like a memory leak or similar.
Expected Behavior
Well, the system works with no outage :)
There is already another case where the implementation of a watchdog is discussed: https://github.com/tbnobody/OpenDTU/issues/693
Although I think the best would be to solve the root cause, a Watchdog would help to recover from these situations.
At the same time, remote logging would help to collect valuable system information like memory usage to track leaks, see https://github.com/tbnobody/OpenDTU/issues/1819
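To illustrate the watchdog idea mentioned above: the core of such a mechanism is a timer that gets "fed" whenever something healthy happens (for example a successful MQTT publish) and is considered expired when no feed arrived within a timeout. The sketch below is purely illustrative and is not the design discussed in #693 or anything from the OpenDTU firmware. The class name, method names and timeout value are hypothetical.

```python
import time


class StalenessWatchdog:
    """Tiny software watchdog: expires if not fed within `timeout_s`."""

    def __init__(self, timeout_s: float, now: float) -> None:
        self.timeout_s = timeout_s
        self.last_fed = now

    def feed(self, now: float) -> None:
        # Call this whenever the monitored activity succeeded.
        self.last_fed = now

    def expired(self, now: float) -> bool:
        # True once more than `timeout_s` passed since the last feed.
        return now - self.last_fed > self.timeout_s


if __name__ == "__main__":
    # Hypothetical timeout of 5 minutes, driven by a monotonic clock.
    wd = StalenessWatchdog(timeout_s=300.0, now=time.monotonic())
    wd.feed(time.monotonic())
    if wd.expired(time.monotonic()):
        print("no activity within timeout, recovery action would go here")
```

Passing the current time in explicitly keeps the logic easy to test. What the recovery action should be (reconnecting WiFi, rebooting, or something else) is exactly the open question in this issue.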
Install Method
Pre-Compiled binary from GitHub
What git-hash/version of OpenDTU?
v24.6.29
Relevant log/trace output
No response
Anything else?
No response