derf / feh

a fast and light image viewer
https://feh.finalrewind.org
Other
1.51k stars 159 forks source link

Potential memory leak in feh 3.7 #622

Open sgreadly opened 3 years ago

sgreadly commented 3 years ago

I noticed today Feh stopped running. Checking why, I can see a lot of OOM messages, and feh's RSS seems to be high (slowly increasing too).

This is on a Raspberry Pi Zero W, so valgrind may not work as per my previous memleak issue #553 . The leak itself is also slower than the previous issue. This time ~30 days for it to trigger OOM.

`/Downloads/feh-3.7/src $ ./feh --version feh version 3.7 Compile-time switches: curl exif help verscmp xinerama

$ uptime 13:20:17 up 33 days, 17:55, 3 users, load average: 0.03, 0.05, 0.01 `

Edit: Can't seem to paste the domes output here without it messing up the newlines. Instead, attaching:

feh_mem.txt

Edit 2: I compiled 3.7.1 and enabled debug to see if it also has the same symptoms. But can you remind me how to log the debug logs you need?

Let me know if there's anything else you need.

Ta.

derf commented 3 years ago

Is it still the same command line as in the previous issue?

As for debug logs, you can enable them by passing the --debug switch to a debug build of feh.

sgreadly commented 3 years ago

Yes, same command.

export DISPLAY=:0; /home/pi/Downloads/feh/src/feh -qrYzZFD 120.0 --auto-rotate --zoom fill /home/pi/Pictures/frame/

And thanks, will enable debug (correctly this time) and monitor it until it happens again.

export DISPLAY=:0; /home/pi/Downloads/feh/src/feh -qrYzZFD 120.0 --auto-rotate --zoom fill /home/pi/Pictures/frame/ --debug > /var/tmp/feh.log

sgreadly commented 3 years ago

Sorry for the delay,

It seems to have ran out of memory on the 21st.

feh.log.gz feh-dmesg.log

The past 2 days htop shows RES at around 50MB, spiking up to 80MB then hovering around 70MB, down to 50MB again. Possibly when it's loading a new image.

However, swap is at 90% full and Mem is 103/240M

pi@frame:/var/tmp $ free -m total used free shared buff/cache available Mem: 239 101 50 0 88 89 Swap: 99 89 10

` CPU[||||||||||||||||||||||||||||||||||||| 37.7%] Tasks: 58, 56 thr; 1 running Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 102M/240M] Load average: 0.27 0.15 0.10 Swp[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||89.4M/100.0M] Uptime: 22 days, 00:46:44

PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command 3490 pi 20 0 72308 49688 1488 S 21.6 20.2 2h53:17 /home/pi/Downloads/feh/src/feh -qrYzZFD 120.0 --auto-rotate --zoom fill /home/pi/Pictures/frame/ --debug`

I'm wondering if maybe one of the images is too big for it to load? Tipping over.

Edit: My biggest image is 65MB, and it seems to be loaded successfully a few times. For example,

pi@frame:~/Pictures/frame $ ls -lShr | tail -5 -rwxr-xr-x 1 pi pi 39M Dec 18 2020 IMG_9977.png -rwxr-xr-x 1 pi pi 44M Dec 18 2020 IMG_9649.png -rwxr-xr-x 1 pi pi 45M Dec 18 2020 IMG_9674.png -rwxr-xr-x 1 pi pi 59M Dec 18 2020 IMG_9771.png -rwxr-xr-x 1 pi pi 65M Dec 18 2020 IMG_9558.png

winwidget.c +814 winwidget_loadimage : filename /home/pi/Pictures/frame/IMG_9558.png imlib.c +347 feh_load_image : filename is /home/pi/Pictures/frame/IMG_9558.png, image is 0x7fe6c0 imlib.c +489 feh_load_image : Loaded ok winwidget.c +462 winwidget_render_image : winwidget_render_image resize 1 force_alias 0 im 10678x3676 winwidget.c +550 winwidget_render_image : sx: 2073 sy: 0 sw: 6535 sh: 3676 dx: 0 dy: 0 dw: 1920 dh: 1080 zoom: 0.293798 winwidget.c +555 winwidget_render_image : winwidget_render(): winwid->im_angle = 0.000000 winwidget.c +371 winwidget_update_title : winwid->name = feh [409 of 602] - /home/pi/Pictures/frame/IMG_9558.png

Though I don't seem to see a timestamp in the debug logs to help correlate with dmesg's

Edit2: Is there anyway to copy logs/lines of text here and let it accept the new lines? Very irritating..

sgreadly commented 3 years ago

@derf

Crashed again today (28th). Previous instances feh was still alive but this time it crashed - have to restart it. Attaching dmesg output of past few days and feh's debug log.

feh version 3.7.1 Compile-time switches: curl debug exif help verscmp xinerama

feh.log.gz feh-dmesg.log

sgreadly commented 2 years ago

@derf was wondering if you managed to look into this? Issue still persists :/

Anything I can help with or more data you need ? Ta.

derf commented 2 years ago

Sorry for the delay. I've been meaning to set up a raspberry pi to try and reproduce the issue (cause feh really shouldn't behave that way – running slideshows on raspis and similar its one of its main use cases, in fact), but couldn't get around to it yet.

Thanks for the debug log and dmesg output! Apart from the OOM kill, I don't see anything out of the ordinary in them, so it seems to be a pretty nasty memory leak.

Hopefully I'll have time to reproduce the issue over the next days. Please poke me again if you don't hear from me.

sgreadly commented 2 years ago

Thanks! No worries :) take your time, this bug is a slow one to trigger..

Note, you'll need to drop the CPU frequency on the Pi Zero W otherwise it'll randomly panic / crash.

I dropped mine to 900Mhz and it's been running fine since.

Add arm_freq=900 to your /boot/config.txt and reboot.

along with

derf commented 2 years ago

All right. I can reproduce OOM crashes on a RasPi 1. It doesn't look like a memory leak – my current assumption is that changes in feh, Imlib2, one of its loaders, or your files lead to an increase in memory consumption, making feh more susceptible to crashes on memory-constrained systems such as RasPi 0 / 1. I'll keep investigating.

A bit of background: When changing images, feh loads the new image, displays it, and only then releases memory for the previous image. So there's a small time frame during which feh needs to hold two raw images in memory. A raw image has three bytes per pixel, so that's already 120MB for a 40MP image. Looking at feh.log.gz, I see that the last crash happens right at the transition between two large images: IMG_0469.jpeg (4272x2848) → IMG_9771.png (11552x3694).

So once feh's randomized filelist encounters a pair of large images, it's too much.

kiwigeoff commented 7 months ago

Exactly the same issue has started to happen to me. I'm using FEH in a headless fashion to drive a photo frame and this bug is now causing the frame to cease displaying photos after several days of uptime Largest image size is 15M Folder contains 96 images Command Line: /usr/local/bin/feh --auto-zoom --quiet --full-screen --slideshow-delay 60.0 --auto-reload --randomize /home/photoframe/Pictures

Feh version: feh version 3.10 Compile-time switches: curl inotify verscmp xinerama