Egyras / HeishaMon

Panasonic Aquarea air-water H, J, K and L series protocol decrypt

Heishamon crash during connecting webserver. Version Alpha-0776a05, build #537 on Heishamon V4. #494

Open McMagellan opened 3 weeks ago

McMagellan commented 3 weeks ago

As mentioned last week, there have been problems with the web server connection since version 3.5. Yesterday I was able to document two such events.

At 6:10 p.m. I opened Heishamon in a new browser tab, with no other connection to it open. While the page is loading it looks as shown in the picture: the uptime still shows 19 hours and 45 minutes, but the page is frozen.

IMG_8490

After closing this tab and opening a new tab, the page opens but with a restarted uptime.

Screenshot 2024-06-09 at 23-52-43 Heisha monitor

After switching to the console page, I could see that no rules were active.

IMG_8492

After I rebooted in the main menu, the rules worked again.

The rules worked correctly until the reset, as can be seen in the Grafana graph. When idle (no heating mode), the quiet level is briefly set as a heartbeat every 30 minutes.

Screenshot 2024-06-09 at 23-46-24 Basisgrafik Moritz an Raspi4 V1 (Panasonic Aquarea 5KW Jeisha Monoblock) - Dashboards - Grafana

At first glance, this malfunction has nothing to do with the Rules Engine. However, I think it is possible that there could be a conflict in memory usage (bigger rule) or interrupts.

Ideally, the page would show the initialization steps as they run instead of just "Loading...". One detail: I use Firefox as a browser, and when the page freezes, the right mouse button temporarily stops working.

At 11:50 p.m. I repeated the process and saw the same behavior (see image). The uptime here had reached 5 hours and 40 minutes, which corresponds exactly to the time elapsed since 6:10 p.m.

Screenshot 2024-06-09 at 23-52-05 Heisha monitor

This process is not always reproducible and appears to occur sporadically.

I have not tested the behavior without an active rule or when establishing a URL connection with the Set XX command.

stumbaumr commented 3 weeks ago

According to https://github.com/IgorYbema/HeishaMon/pull/121#issuecomment-2086564359 you could hook a serial console to the HeishaMon PCB and get logging just before the reboot. My HeishaMon is outside with the Jeisha, so it is rather difficult to place a computer there, but maybe a Raspi could work...
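A minimal sketch of what the capturing side on a Raspi could look like, assuming the serial output already arrives as text lines (for example piped in from the serial device). The function name `timestamp_lines` and its parameters are hypothetical and not part of HeishaMon; the device name, baud rate and wiring are not covered here and depend on the actual setup.

```python
from datetime import datetime
from typing import Callable, Iterable, TextIO


def timestamp_lines(
    source: Iterable[str],
    sink: TextIO,
    now: Callable[[], datetime] = datetime.now,
) -> int:
    """Copy each line from source to sink with a timestamp prefix.

    Returns the number of lines written. The sink is flushed after every
    line so that the last output before a crash is not lost in a buffer.
    """
    count = 0
    for raw in source:
        # Serial consoles often send CR/LF; normalise to a single newline.
        line = raw.rstrip("\r\n")
        sink.write(f"{now().isoformat(timespec='seconds')} {line}\n")
        sink.flush()
        count += 1
    return count
```

It could be called with `sys.stdin` as the source and an open log file as the sink, so that whatever the board prints just before a reboot ends up in the file with the time it was received.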

McMagellan commented 3 weeks ago

If I were a developer, I would try to implement some kind of internal error memory in the non-volatile area. It does not have to be permanent and could be activated in the settings as required: for example, around 1000 lines to which status messages with a time stamp are written on a rolling basis. You could then display the entire content in a browser tab by calling e.g. http://192.168.178.82/errorlog and save it as a txt file if necessary. It would be important to capture the information that arises while the web server is down. After the historical data, the current messages could then be output live. All this without additional hardware.

But I'm not a developer.
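The rolling error memory described above can be modelled roughly as follows. This is only a sketch of the data structure in Python, not firmware code: the class name `RollingLog` and its methods are hypothetical, the log lives in RAM here, and writing it to non-volatile storage or serving it under a path such as `/errorlog` is not shown.

```python
from collections import deque
from datetime import datetime
from typing import Callable, Deque


class RollingLog:
    """Keep only the most recent `capacity` timestamped status lines."""

    def __init__(
        self,
        capacity: int = 1000,
        now: Callable[[], datetime] = datetime.now,
    ) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen discards the oldest entry once it is full.
        self._lines: Deque[str] = deque(maxlen=capacity)
        self._now = now

    def write(self, message: str) -> None:
        """Append one status message, prefixed with the current time."""
        stamp = self._now().isoformat(timespec="seconds")
        self._lines.append(f"{stamp} {message}")

    def dump(self) -> str:
        """Return the whole log, oldest entry first, as plain text."""
        return "\n".join(self._lines)

    def __len__(self) -> int:
        return len(self._lines)
```

With a capacity of 1000, the 1001st message pushes out the first one, so memory use stays bounded while the most recent history before a crash remains available.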

geduxas commented 3 weeks ago

There is hardware for this: openserial (a serial logger), a device with an SD card. You could easily attach it to the Heishamon serial output and store everything on the SD card.

McMagellan commented 2 weeks ago

Next day, next crash!

My rule set has grown to around 300 lines, and because the temperature dropped to 6° that night, I left the HP running. The last rule update with a crash was at around 00:16. After that there was no console connection until the next morning at 7:20 a.m. The first call at 7:20 led to the already familiar crash behavior. In the screenshot you can see that the uptime is 7 hours and 3 minutes but the window is frozen. There were no MQTT reconnects either, and the Wi-Fi connection is excellent. After reopening the tab, the uptime has also restarted and the rules are no longer running. After a reboot the rules worked again.

Screenshot 2024-06-12 at 07-20-35 Heisha monitor

In the Grafana graph you can see that the rules worked perfectly the whole time until 7:20 (after I had previously removed all 32 print lines from the ruleset to save resources).

Screenshot 2024-06-12 at 09-37-57 Basisgrafik Max an Raspi3 V1 (Panasonic Aquarea 5KW Jeisha Monoblock) - Dashboards - Grafana

What could be the cause and how can I help troubleshoot the problem?

P.S.: SetCurves and the new trigger design work excellently. It would be a shame if my use of rules came to an end again, like last year (ver 3.2.3), when the variables were lost for an unknown reason. There were no longer any problems with parsing.