veebch / btcticker

ePaper Cryptocurrency Ticker
GNU General Public License v3.0

Crash after too many reload cycles #37

Closed lukma99 closed 3 years ago

lukma99 commented 3 years ago

I have to restart the script every few days because it crashes. Below are the last lines of the log. It seems that an image is opened on every cycle and never closed properly.

INFO:root:Got price for the last 7 days from CoinGecko
DEBUG:PIL.PngImagePlugin:STREAM b'IHDR' 16 13
DEBUG:PIL.PngImagePlugin:STREAM b'tEXt' 41 57
DEBUG:PIL.PngImagePlugin:STREAM b'pHYs' 110 9
DEBUG:PIL.PngImagePlugin:STREAM b'IDAT' 131 3226
INFO:root:Getting token Image from Image directory
INFO:root:[Errno 24] Too many open files: '/home/pi/btcticker-main_19032021/images/thebean.bmp'

veebch commented 3 years ago

This would explain the occasional out of memory message that people have been getting. I've just updated the code to close stuff after use.

Pull the latest version and see if it helps.
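
Errno 24 means the process has hit its limit on open file descriptors. The following is a minimal sketch of the "close after use" pattern using only the standard library; it is not the ticker's actual code. `read_image_bytes` and the temporary file are hypothetical stand-ins for loading the token image on every refresh.

```python
import os
import tempfile


def read_image_bytes(path):
    """Read a file and guarantee that its descriptor is released.

    Hypothetical helper: stands in for loading a token image each refresh.
    """
    # The with-block closes the file even if read() raises.
    with open(path, "rb") as handle:
        return handle.read()


# Throwaway file standing in for an image such as images/thebean.bmp.
fd, demo_path = tempfile.mkstemp(suffix=".bmp")
os.write(fd, b"not-a-real-bitmap")
os.close(fd)

try:
    # More iterations than a common per-process descriptor limit (often 1024).
    for _ in range(3000):
        data = read_image_bytes(demo_path)
    print("read", len(data), "bytes on every cycle without leaking descriptors")
finally:
    os.remove(demo_path)
```

Pillow images support the same pattern: `with Image.open(path) as im:` closes the underlying file when the block exits.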

lukma99 commented 3 years ago

On startup:

INFO:root:Getting token Image from Image directory
DEBUG:root:e-Paper busy
DEBUG:root:e-Paper busy release
DEBUG:root:Horizontal
DEBUG:root:e-Paper busy
DEBUG:root:e-Paper busy release
DEBUG:PIL.Image:Error closing: 'Image' object has no attribute 'fp'
DEBUG:root:spi end
DEBUG:root:close 5V, Module enters 0 power consumption ...
Traceback (most recent call last):
  File "btcticker.py", line 370, in <module>
    main()
  File "btcticker.py", line 323, in main
    key1state = GPIO.input(key1)
RuntimeError: Please set pin numbering mode using GPIO.setmode(GPIO.BOARD) or GPIO.setmode(GPIO.BCM)

The GPIO error is new. I had it in an older version as well, but you fixed it a few weeks ago. EDIT: I checked the commit from Mar 17. With that version I don't get any errors on startup.

veebch commented 3 years ago

The epd sleep was the culprit. I've commented it out. I will add it back in a way that doesn't cause the bug. I think it is healthier for the epaper to snooze between updates

lukma99 commented 3 years ago

Now it doesn't crash, but it still gives this message at the end of every cycle:

DEBUG:PIL.Image:Error closing: 'Image' object has no attribute 'fp'

veebch commented 3 years ago

So my blanket application of image closes may have been overkill :) Updated!

lukma99 commented 3 years ago

No more error messages! I will leave it running for a week and close the issue, if it is really solved :)

veebch commented 3 years ago

Promising! Fingers crossed this works!

veebch commented 3 years ago

I just made a small update: saving the screen image appears to trigger the garbage collection that avoids the out-of-memory errors.

zerzer04 commented 3 years ago

I'm having the same issue as well.

I tried this approach with "deepcopy": https://stackoverflow.com/questions/29234413/too-many-open-files-error-when-opening-and-loading-images-in-pillow

I will also run it for a while to see if it works.

lukma99 commented 3 years ago

Update: It still crashes after some time, but this time without a warning, just "Killed". Is it the same for you, @zerzer04? I have set the refresh rate to half a minute to test it more quickly. It works for approximately half a day, and then I have to restart the script every time.

Last lines of log before exit:

INFO:root:Got Live Data From CoinGecko
INFO:root:https://api.coingecko.com/api/v3/coins/bitcoin/market_chart/range?vs_currency=eur&from=1616627049&to=1617231849
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.coingecko.com:443
DEBUG:urllib3.connectionpool:https://api.coingecko.com:443 "GET /api/v3/coins/bitcoin/market_chart/range?vs_currency=eur&from=1616627049&to=1617231849 HTTP/1.1" 200 None
INFO:root:Got price for the last 7 days from CoinGecko
DEBUG:PIL.PngImagePlugin:STREAM b'IHDR' 16 13
DEBUG:PIL.PngImagePlugin:STREAM b'tEXt' 41 57
DEBUG:PIL.PngImagePlugin:STREAM b'pHYs' 110 9
DEBUG:PIL.PngImagePlugin:STREAM b'IDAT' 131 2336
Killed

veebch commented 3 years ago

This sounds like it is being killed for using too much memory. If you run 'top' to monitor the process, you should be able to see whether it is getting bloated.
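
As a complement to watching `top`, the script could log its own memory use after each refresh. This is a rough sketch using the standard `resource` module (Unix only); `log_memory` and its `cycle` counter are hypothetical names, not part of btcticker. Note that `ru_maxrss` is the peak resident size, so a leak shows up as a peak that keeps climbing from cycle to cycle.

```python
import logging
import resource


def peak_rss_kib():
    """Return this process's peak resident set size.

    On Linux ru_maxrss is reported in kilobytes (macOS reports bytes).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def log_memory(cycle):
    """Log peak memory after a refresh; `cycle` is a hypothetical counter."""
    kib = peak_rss_kib()
    logging.info("cycle %d: peak RSS %.1f MiB", cycle, kib / 1024)
    return kib


logging.basicConfig(level=logging.INFO)
log_memory(0)
```

Calling `log_memory` at the end of each loop iteration would put the memory trend right next to the existing log lines.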

zerzer04 commented 3 years ago

For me it has been running stably for 3 days already. The only change I made to the code is replacing every "Image.open(file)" with "copy.deepcopy(Image.open(file))"...

veebch commented 3 years ago

It looks like PIL is the bit that is pushing it to the point where it is killed

lukma99 commented 3 years ago

Will try the same with deepcopy

veebch commented 3 years ago

I've also just updated with some manual cleanup of variables PIL uses. If that doesn't work, I'd guess it's a PIL issue. Using systemd to restart the script if it is killed will take care of things.

Morgawr commented 3 years ago

I've definitely had issues running the script in the last few days, as it keeps thrashing my SD card due to hitting swap (it actually ended up killing one of my SD cards before I realized what was going on). I noticed that with default settings I couldn't keep it on for longer than 1 day, as the Pi would just OOM and freeze and I'd have to restart.

Right now I've just "fixed" the issue with a quick workaround: I removed the while loop from the Python script and run a while loop in bash instead, which starts the Python script for a single run that then quits automatically. Obviously, with this solution the buttons don't really work.

I'll probably take a deeper look when I have more time if it isn't solved by then 👍
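
The workaround described above used a bash loop. The same idea can be sketched in Python with `subprocess`, assuming the ticker has been changed to do one refresh and exit. `run_cycles` and `demo_command` are hypothetical, and the demo command is a stand-in rather than the real script.

```python
import subprocess
import sys
import time


def run_cycles(command, cycles, pause_seconds):
    """Run `command` as a fresh process `cycles` times, pausing in between.

    Returns the list of exit codes. Each child's memory is returned to the
    OS when it exits, so a leak cannot accumulate across refreshes.
    """
    exit_codes = []
    for _ in range(cycles):
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        time.sleep(pause_seconds)
    return exit_codes


# Stand-in for a single-shot ticker run. The real command would be something
# like [sys.executable, "btcticker.py"] with the script's while loop removed.
demo_command = [sys.executable, "-c", "print('one refresh')"]
print(run_cycles(demo_command, cycles=2, pause_seconds=0.1))
```

As noted in the comment, the trade-off is that nothing is listening for button presses between runs.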

lukma99 commented 3 years ago

Getting this error when running dmesg -T | grep -E -i -B100 'killed process':

[Fri Apr  2 03:48:15 2021] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=python3,pid=22314,uid=1000
[Fri Apr  2 03:48:15 2021] Out of memory: Killed process 22314 (python3) total-vm:988808kB, anon-rss:873956kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:976kB oom_score_adj:0

Don't know if it is my system now or a common thing :/
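
To make the kernel message easier to read, here is a small sketch that pulls the counters out of the "Killed process" line quoted above. `parse_oom_line` is a hypothetical helper, and the regular expressions assume the line format shown in this log. For this line, anon-rss works out to roughly 853 MiB held by the python3 process when it was killed.

```python
import re

# The "Killed process" line from the dmesg output above.
OOM_LINE = (
    "[Fri Apr  2 03:48:15 2021] Out of memory: Killed process 22314 (python3) "
    "total-vm:988808kB, anon-rss:873956kB, file-rss:0kB, shmem-rss:0kB, "
    "UID:1000 pgtables:976kB oom_score_adj:0"
)


def parse_oom_line(line):
    """Extract pid, process name and memory counters (in kB) from an OOM line."""
    match = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if match is None:
        return None
    # Every "name:<digits>kB" pair, e.g. anon-rss:873956kB.
    counters = {
        name: int(value)
        for name, value in re.findall(r"([\w-]+):(\d+)kB", line)
    }
    return {"pid": int(match.group(1)), "name": match.group(2), **counters}


info = parse_oom_line(OOM_LINE)
print(info["name"], info["pid"], round(info["anon-rss"] / 1024), "MiB resident")
```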

zerzer04 commented 3 years ago

Actually, the issue doesn't seem to be with Pillow at all, but with the display's file descriptor.

Running lsof -p on the script's PID shows that each time the refresh happens, the script opens a new /dev/spidev0.0 file descriptor and doesn't close it.

This is caused by "epd.Init_4Gray()" being called before every screen update.

Calling epd.Init_4Gray() only once, right before the "while" loop, seems to solve the issue.

I'm now running the script again with a refresh rate of 30 seconds to test this.

P.S. Pillow does close the file after Image.open (once the image data has been loaded), so no issue there.
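
The effect described in this comment can be reproduced without the hardware. This is a sketch under stated assumptions: `FakeDisplay` is a hypothetical stand-in for the e-paper driver, `os.devnull` replaces /dev/spidev0.0, and the descriptor count is read from /proc/self/fd, which exists only on Linux.

```python
import os


def open_fd_count():
    """Count this process's open descriptors via /proc (Linux only)."""
    return len(os.listdir("/proc/self/fd"))


class FakeDisplay:
    """Hypothetical stand-in for the e-paper driver.

    Mirrors the behaviour described above: every init() opens a new
    descriptor and leaves the previous one open.
    """

    def __init__(self):
        self.handles = []

    def init(self):
        self.handles.append(os.open(os.devnull, os.O_RDWR))

    def close(self):
        for fd in self.handles:
            os.close(fd)
        self.handles.clear()


def leaked_after(cycles, init_every_cycle):
    """Return how many descriptors `cycles` refreshes leave open."""
    display = FakeDisplay()
    before = open_fd_count()
    if not init_every_cycle:
        display.init()  # initialise once, before the loop
    for _ in range(cycles):
        if init_every_cycle:
            display.init()  # leaky pattern: re-initialise on every refresh
        # ... the frame would be drawn here ...
    leaked = open_fd_count() - before
    display.close()
    return leaked


print("init inside loop:", leaked_after(50, init_every_cycle=True))
print("init before loop:", leaked_after(50, init_every_cycle=False))
```

With the initialisation inside the loop, the count grows by one per refresh until the process hits its descriptor limit. With a single initialisation it stays constant.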

veebch commented 3 years ago

I've moved the initialization out of the loop. Thanks!

TIL how to use lsof