Blueforcer / awtrix3

Custom firmware for the Ulanzi Smart Pixel clock or self made awtrix. Getting started is easy as 1-2-3
https://blueforcer.github.io/awtrix3/

Possible memory leak leading to instability #144

Closed geofffarmer closed 1 year ago

geofffarmer commented 1 year ago

Bug report

Describe the bug

I've been using my TC001 with AWTRIX LIGHT for a while and I've found that if the device is left on for several hours it becomes unstable. The animation may become jerky, or the display freezes or goes off. Only a two-button restart fixes it.

Additional information

To Reproduce

I can send you my automation that 'drives' my unit if that helps.

Expected behavior

The unit runs for a very long time without crashing.

Screenshots

Here is a screenshot from Home Assistant showing how the free memory runs down over time and jumps back up on restart.

[Screenshot: 2023-05-30 18_39_19-Home Assistant]

Blueforcer commented 1 year ago

Please post the raw JSON you're sending, not the automation.

geofffarmer commented 1 year ago

Thanks, @Blueforcer

Here is some output from the Mosquitto MQTT integration in HA:

publish 1.txt publish 2.txt publish 3.txt

Maybe I'm pushing messages to the display too quickly. Or, maybe it's because I'm removing custom apps that haven't been created yet. Maybe I need to rework my automation to only publish when a change occurs but, when I turn the display on I want it configured straight away - not wait for a change. Maybe I need to keep track of which apps I've published to see which ones I can remove.
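The bookkeeping described above (publish only on change, and remember which apps have been published so that only those get removed) could be sketched roughly as follows. This is a minimal illustration, not the automation from this thread: `AppPublisher` and the `publish` hook are hypothetical names, and the topic layout `<prefix>/custom/<app>` with an empty payload for removal is an assumption about the AWTRIX MQTT interface.

```python
import json
from typing import Callable, Dict


class AppPublisher:
    """Publish custom-app payloads only when they change, and track published apps."""

    def __init__(self, publish: Callable[[str, str], None], prefix: str = "awtrix") -> None:
        self._publish = publish          # hypothetical hook, e.g. a thin wrapper around an MQTT client
        self._prefix = prefix            # assumed MQTT prefix of the device
        self._last: Dict[str, str] = {}  # app name -> last payload that was sent

    def _topic(self, app: str) -> str:
        # Assumed topic layout for custom apps.
        return f"{self._prefix}/custom/{app}"

    def update(self, app: str, payload: dict) -> bool:
        """Send the payload unless it is identical to the last one sent for this app."""
        body = json.dumps(payload, sort_keys=True)
        if self._last.get(app) == body:
            return False
        self._publish(self._topic(app), body)
        self._last[app] = body
        return True

    def remove(self, app: str) -> bool:
        """Remove an app, but only if it was actually published before."""
        if app not in self._last:
            return False
        self._publish(self._topic(app), "")  # assumption: an empty payload deletes the app
        del self._last[app]
        return True

    def resend_all(self) -> int:
        """Re-publish every known app, e.g. right after the display is switched on."""
        for app, body in self._last.items():
            self._publish(self._topic(app), body)
        return len(self._last)
```

Calling `resend_all()` when the display turns on would cover the "configured straight away" case without publishing on every automation tick.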

Thanks for your help.

Blueforcer commented 1 year ago

The message frequency shouldn't be a problem: many users publish sensor data which updates every 1/2 seconds. I also ran a benchmark with 10k messages per hour without problems. But maybe you're using an uncommon function which causes a memory overflow. I will check it in the next few days.

Blueforcer commented 1 year ago

Why are you sending to the /apps topic without any changes? I highly recommend sending it once at boot, or disabling the native apps permanently via the onscreen menu.

Blueforcer commented 1 year ago

Also, please check the new V0.67.

geofffarmer commented 1 year ago

Hi @Blueforcer

I've been looking at what causes memory to be leaked away. Firstly, sending to the /apps topic doesn't seem to be a problem (I am running 0.67)

What does seem to cause a loss of memory is the Transition control. Turning Transition on or off (regardless of whether it is already on or off) seems to lose 20 bytes. My problem was that I was turning this off every 10 seconds while the display was off, to stop the app name being recorded in HA's history. I'm not doing this now, but you may want to look at it.
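To put the reported numbers in perspective, the arithmetic can be sketched as below. It simply assumes that every switch call loses exactly the 20 bytes observed above and that the calls arrive every 10 seconds; the real loss per call may vary.

```python
BYTES_PER_TOGGLE = 20    # loss per Transition switch call, as observed above
TOGGLE_INTERVAL_S = 10   # the automation called the switch every 10 seconds


def leaked_bytes(seconds: int) -> int:
    """Bytes lost after `seconds` of toggling at the fixed interval."""
    toggles = seconds // TOGGLE_INTERVAL_S
    return toggles * BYTES_PER_TOGGLE


per_hour = leaked_bytes(3600)       # 360 toggles * 20 bytes
per_day = leaked_bytes(24 * 3600)   # 8640 toggles * 20 bytes
print(per_hour, per_day)            # 7200 172800
```

At roughly 7 kB per hour, a loss of this size would be consistent with instability appearing after several hours rather than immediately.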

Also, removing an app that is published, or when a notification is removed, sometimes seems to result in a loss of memory, but I'm still looking at this to get more detail.

Anyway, thanks for an awesome piece of firmware!

Blueforcer commented 1 year ago

How do you control the transition?

geofffarmer commented 1 year ago

Hi @Blueforcer

Sorry for the delay. I'm using the call service switch.turn_on and switch.turn_off.

lExplLicit commented 1 year ago

Hi all, I am also facing this issue. When the clock has been on for a while (1-2 days), it loses the connection to my MQTT server. I cannot send commands any more and the device needs to be restarted manually. I'm only using 6 custom apps and a few notifications, which are sent out every 3 minutes (example: gas prices, 3 minutes later the weather forecast, 3 minutes later stock prices). I'm doing this because I do not want to see this information permanently.

The RAM usage looks similar to the screenshot above: [image]

Blueforcer commented 1 year ago

Unfortunately, this doesn't help much in finding a bug. Hundreds of users run AWTRIX Light with more than 6 apps without a memory leak, so I assume you're using functions (or JSON keys) which aren't used by most other users. To find a bug, I need all the raw JSON and API requests you're sending to AWTRIX so I can reproduce it. It would also be easier if you disabled functions and automations one by one to localize the bug.

lExplLicit commented 1 year ago

Alright, that makes sense. I will try to pin down the problem in the next few days

Totte23 commented 1 year ago

Facing the same issue. The clock loses its MQTT connection after a few hours. I am running 6 custom apps, to which I publish text updates every 30 seconds. Of the standard apps, only the Eyes app is running. As a workaround I tried a restart via MQTT call every 10 minutes, but it still happens about twice a day. I am running ioBroker with the MQTT adapter, which does not show any clue in the log file.

Could I see a log file on the ESP file system as well?

Is there some kind of watchdog available that restarts the system every time the MQTT connection is dead?
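Whether the firmware itself offers such a watchdog is not answered in this thread. A controller-side watchdog could be sketched as follows, assuming the clock normally publishes something periodically over MQTT (for example a status message) and assuming a reboot can be triggered through some other channel. `MqttWatchdog`, the `reboot` hook, and the timeout are hypothetical.

```python
import time
from typing import Callable


class MqttWatchdog:
    """Trigger a reboot hook when no MQTT message has been seen for too long."""

    def __init__(
        self,
        reboot: Callable[[], None],
        timeout_s: float = 120.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._reboot = reboot        # hypothetical hook; must not depend on the dead MQTT link
        self._timeout_s = timeout_s  # assumed tolerance, tune to the device's publish interval
        self._clock = clock
        self._last_seen = clock()

    def on_message(self) -> None:
        """Call this from the MQTT message callback for any message from the clock."""
        self._last_seen = self._clock()

    def check(self) -> bool:
        """Call periodically; returns True if a reboot was triggered."""
        if self._clock() - self._last_seen <= self._timeout_s:
            return False
        self._reboot()
        # Restart the timer so the device has time to come back before the next reboot.
        self._last_seen = self._clock()
        return True
```

Because the MQTT link is the thing that fails, the `reboot` hook would need to use a different path, such as an HTTP request to the device.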

Totte23 commented 1 year ago

I am getting closer. I switched off the standard app "eyes", and since then (nearly 20 hours) there has been no further issue with the MQTT connection. Could there be an issue with "eyes" that leads to this?

Blueforcer commented 1 year ago

@Totte23 If only the MQTT connection gets lost but the rest is still running, I doubt this is a memory leak.

Totte23 commented 1 year ago

> Alright, that makes sense. I will try to pin down the problem in the next few days

Have you been able to pin it down?

Totte23 commented 1 year ago

> @Totte23 If only the MQTT connection gets lost, but the rest is still running. I doubt this is a memory leakage

Still facing the issue. No idea how to get closer.

lExplLicit commented 1 year ago

> > Alright, that makes sense. I will try to pin down the problem in the next few days
>
> Have you been able to pin it down?

Unfortunately not. I have tried completely wiping and reinstalling awtrix-light, and I tried using only specific applications, but the clock still stops responding after some time. For now, I restart it twice a day.

Totte23 commented 1 year ago

Just to let you know: I changed all code from MQTT to the HTTP POST method. I will let you know if the issue still occurs.

Totte23 commented 1 year ago

I am now doing all custom app updates via HTTP POST. Since then, there have been no more connection breakdowns like the ones I had with MQTT. I used the ioBroker MQTT adapter; maybe that is part of the problem. With HTTP POST the issue no longer occurs.
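A custom app update over HTTP POST could look roughly like the sketch below. The path `/api/custom?name=...` matches the adapter log quoted elsewhere in this thread; the host address, the helper names, and the `Content-Type` header are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_custom_app_request(host: str, name: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that updates one custom app."""
    query = urllib.parse.urlencode({"name": name})
    url = f"http://{host}/api/custom?{query}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed; not confirmed in this thread
    )


def send(request: urllib.request.Request, timeout_s: float = 5.0) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for error responses such as 500.
    """
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status


# 192.0.2.10 is a placeholder address; replace it with the clock's IP.
request = build_custom_app_request("192.0.2.10", "pvpower", {"text": "128 W", "icon": "44625"})
```

Keeping the request construction separate from `send` makes it easy to log or inspect the exact JSON before it goes to the device.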

Totte23 commented 1 year ago

> How do you control the transition?

I did several tests in the past few days. I noticed that I run into this issue when automatic brightness control is switched off.

lExplLicit commented 1 year ago

> > How do you control the transition?
>
> I did several tests in the past days. I noticed that I run into this issue when automatic brightness control is switched off.

I have it turned on and still have the problem. The next step is to try HTTP instead of MQTT.

Blueforcer commented 1 year ago

I'm not sure MQTT causes the problem. 99% of all users use MQTT without any issues. Did you try removing one custom app after another? I'm thinking of a particular custom app JSON key which causes this.

lExplLicit commented 1 year ago

> I'm not sure if mqtt causes the problem. 99% of all Users uses mqtt without any issues. Did you try to remove one custom app after another? I'm thinking of a special CustomApp jsonkey wich causes this.

Strangely, even with all apps and notifications disabled, the RAM usage gets higher and higher over time.
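One way to turn "higher and higher over time" into a number is to fit a straight line through logged memory readings. The sketch below uses an ordinary least-squares slope; the function name is hypothetical and the sample values in the usage note are made up, not measurements from this thread.

```python
from typing import Sequence, Tuple


def trend_bytes_per_hour(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of (timestamp_seconds, bytes) samples, in bytes per hour.

    Works for either used or free memory: a steadily positive slope for used
    memory (or negative for free memory) suggests a leak.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_b = sum(b for _, b in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must differ")
    cov = sum((t - mean_t) * (b - mean_b) for t, b in samples)
    return cov / var_t * 3600.0
```

For example, hourly free-memory readings of 100000, 92800 and 85600 bytes would give a slope of -7200 bytes per hour. Comparing the slope with individual features enabled and disabled is one way to narrow down what drives the growth.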

Blueforcer commented 1 year ago

In this state, please have a look at the serial console.

Blueforcer commented 1 year ago

Any news here?

lExplLicit commented 1 year ago

No, sorry, I didn't investigate further.

baetzmr commented 1 year ago

I have a similar issue here, with some warning messages in ioBroker after the instance has been running for a couple of hours. I use 8 custom apps showing some stats from my home automation. After a while I get a warning:

2023-08-11 07:09:07.516 - debug: awtrix-light.0 (22401) sending "POST" request to "/api/custom?name=pvpower" with data: {"text":"128 W","textCase":2,"background":"#000000","color":"#faf303","icon":"44625"}

2023-08-11 07:09:07.583 - warn: awtrix-light.0 (22401) received 500 response from /api/custom?name=pvpower with content: "ErrorParsingJson"

I recorded that the free RAM was at 67 kB at this moment. When I delete one or two custom apps, the free memory goes up to 78 kB and it runs without any warnings.

Is there a minimum free RAM limit?

Blueforcer commented 1 year ago

Any further information on this topic?

lExplLicit commented 1 year ago

None from my side. I'm still restarting the clock twice a day because of rising memory usage. Have you been able to reproduce this behavior, @Blueforcer? [image]

Blueforcer commented 1 year ago

Without being able to reproduce the problem, it is not possible for me to find a bug.

Blueforcer commented 1 year ago

Based on the lack of feedback, I assume that the bug was fixed in one of the latest versions. Please feel free to reopen the issue if it still persists and you are able to reproduce it.

vogtmh commented 3 months ago

I had a similar issue, could successfully reproduce and fix it. Long story short: I didn't know about the blink or fade functionality included in Awtrix, so I used a loop in a Home Assistant automation to turn the lights on and off every second, using a custom component which defaulted to QoS of 1.

The solution was to remove the custom component and use the built-in mqtt.publish instead, defining a QoS of 0 for all commands. I also replaced the loops with single mqtt.publish commands, where I set the indicators to blink or fade.
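The shape of such a single command could be sketched as below. This is not the automation from this thread: the topic layout `<prefix>/indicator<N>` and the `blink` / `fade` keys (interval in milliseconds) are assumptions about the AWTRIX indicator interface, and `indicator_message` is a hypothetical helper that only builds the message so that any MQTT client can send it.

```python
import json
from typing import List, NamedTuple, Optional


class MqttMessage(NamedTuple):
    topic: str
    payload: str
    qos: int


def indicator_message(
    prefix: str,
    indicator: int,
    color: List[int],
    blink_ms: Optional[int] = None,
    fade_ms: Optional[int] = None,
) -> MqttMessage:
    """Build one QoS 0 message that lets the firmware do the blinking or fading."""
    if indicator not in (1, 2, 3):
        raise ValueError("indicator must be 1, 2 or 3 (assumed range)")
    if blink_ms is not None and fade_ms is not None:
        raise ValueError("choose either blink or fade, not both")
    body: dict = {"color": color}
    if blink_ms is not None:
        body["blink"] = blink_ms  # assumed key: blink interval in ms
    if fade_ms is not None:
        body["fade"] = fade_ms    # assumed key: fade interval in ms
    topic = f"{prefix}/indicator{indicator}"
    return MqttMessage(topic, json.dumps(body), 0)  # QoS 0, as described above


msg = indicator_message("awtrix", 1, [255, 0, 0], blink_ms=500)
```

With a client such as paho-mqtt this could then be sent once with `client.publish(msg.topic, msg.payload, qos=msg.qos)`, instead of toggling the indicator from a loop every second.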

All problems are gone since then. Thanks for this AWesome custom firmware! :)