Beep6581 / RawTherapee

A powerful cross-platform raw photo processing program
https://rawtherapee.com
GNU General Public License v3.0

Multithreaded noise reduction leaks memory #4416

Open PkmX opened 6 years ago

PkmX commented 6 years ago

To reproduce:

On my system (Arch Linux, i7-8550U, 4 cores, 8 threads), each save makes RawTherapee use 200~400 MB more memory until it gets killed by the OOM killer. Setting the number of threads to 1 produces little or no leakage.

Version: 5.3-656-gb8440087
Branch: makepkg
Commit: b8440087
Commit date: 2018-02-28
Compiler: cc 7.3.0
Processor: x86_64
System: Linux
Bit depth: 64 bits
Gtkmm: V3.22.2
Lensfun: V0.3.2.0
Build type: Release
Build flags: -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=c++11  -Werror=unused-label -fopenmp -Werror=unknown-pragmas -Wall -Wno-unused-result -Wno-deprecated-declarations -O3 -DNDEBUG -ftree-vectorize
Link flags: -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now
OpenMP support: ON
MMAP support: ON

PkmX commented 6 years ago

Note: even with the number of threads set to 1, enabling the wavelet module also leaks about 200 MB of memory per save.

heckflosse commented 6 years ago

@PkmX Cannot reproduce

Version: 5.3-657-gbad28bb0
Branch: rcd-speedup
Commit: bad28bb0
Commit date: 2018-02-27
Compiler: gcc 7.3.0
Processor: undefined
System: Windows
Bit depth: 64 bits
Gtkmm: V3.22.0
Lensfun: V0.3.2.0
Build type: Release
Build flags:  -std=c++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas -Wall -Wno-unused-result -Wno-deprecated-declarations -O3 -DNDEBUG -ftree-vectorize
Link flags:  -march=native
OpenMP support: ON
MMAP support: ON

Floessie commented 6 years ago

Confirmed, though the leak varies. I'll try to find something with the sanitizer...

gaaned92 commented 6 years ago

Win10 RT 5.4-RC1

Cannot reproduce after processing 100 raw photos with noise reduction and use of 8 threads.

Floessie commented 6 years ago

No leak with ASAN in debug and release build. Rebuilding now to check RSS rather than using free...

Floessie commented 6 years ago

No leak regarding RSS here. Sorry, but there must be something different about your setup, @PkmX, and my first impression was wrong.

user@stretchtest:~$ ps aux --sort -rss | grep rawtherapee
user      6546 35.2 11.4 1720624 698264 pts/0  Sl+  15:27   0:14 ./release/rawtherapee
user@stretchtest:~$ ps aux --sort -rss | grep rawtherapee
user      6546  142 13.8 2088372 847636 pts/0  Sl+  15:27   2:41 ./release/rawtherapee
user@stretchtest:~$ ps aux --sort -rss | grep rawtherapee
user      6546  115 14.9 2124804 914816 pts/0  Sl+  15:27   4:04 ./release/rawtherapee
user@stretchtest:~$ ps aux --sort -rss | grep rawtherapee
user      6546 70.2 16.5 2165784 1009576 pts/0 Sl+  15:27   4:56 ./release/rawtherapee
user@stretchtest:~$ ps aux --sort -rss | grep rawtherapee
user      6546 31.7 16.3 2198568 1001604 pts/0 Sl+  15:27   6:09 ./release/rawtherapee
user@stretchtest:~$ ps aux --sort -rss | grep rawtherapee
user      6546 42.3 16.5 2231352 1010284 pts/0 Sl+  15:27   8:45 ./release/rawtherapee
user@stretchtest:~$ ps aux --sort -rss | grep rawtherapee
user      6546 44.6 17.4 2354260 1066284 pts/0 Sl+  15:27   9:40 ./release/rawtherapee
user@stretchtest:~$ ps aux --sort -rss | grep rawtherapee
user      6546 47.3 15.9 2296920 974952 pts/0  Sl+  15:27  10:48 ./release/rawtherapee

Best, Flössie

PkmX commented 6 years ago

Hmm, I can reliably reproduce it here and the RSS goes up after each save.

$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 8249 559692 rawtherapee
$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 8249 1354844 rawtherapee
$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 8249 1685356 rawtherapee
$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 8249 2436524 rawtherapee
$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 8249 2858164 rawtherapee

For the record, I'm applying noise reduction + wavelet to a 24MP JPEG image. The settings in the performance tab are:

Beep6581 commented 6 years ago

@PkmX can you upload the JPG and your options file using filebin.net?

PkmX commented 6 years ago

On mobile now. I can probably upload later. However, I can reproduce the leak with any JPG or RAW, so the image likely doesn't matter. You can download the X-T2 compressed RAF (which I process these days) or the Sony a6000's ARW from raw.pixls.us. I tested both and they show the same leak.

heckflosse commented 6 years ago

@PkmX

Repeatedly export the image.

This can be done in two ways:

1) add it to the queue n times, then process the queue

or

2) use "Save as" n times

I don't know how you export, but can you try the other way too? Maybe we are just testing differently...

PkmX commented 6 years ago

@heckflosse Both methods leak memory from what I tested.

options file (this is the default generated by Rawtherapee), pp3 and RAF:

Start:

$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
17836 215344 rawtherapee

Add to queue 10 times and start processing:

  PID   RSS COMMAND
17836 2830928 rawtherapee

Process another 10 from queue again:

  PID   RSS COMMAND
17836 3656940 rawtherapee

For comparison, with number of threads = 1:

  PID   RSS COMMAND
 8486 212652 rawtherapee

After exporting 10:

  PID   RSS COMMAND
 8486 425828 rawtherapee
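
For a rough sense of scale, the readings above can be turned into an average growth per export. The sketch below is only illustrative arithmetic on the posted `ps` numbers: the helper name `growth_per_export_mib` is made up for this example, and it assumes that `ps` reports RSS in KiB.

```cpp
// Average RSS growth per export in MiB, given two RSS readings in KiB
// (the unit assumed for `ps -o rss` here) taken before and after a batch.
// Hypothetical helper for illustration; not part of RawTherapee.
double growth_per_export_mib(long rss_before_kib, long rss_after_kib, int exports)
{
    return static_cast<double>(rss_after_kib - rss_before_kib) / 1024.0 / exports;
}
```

With the numbers above, the default multithreaded run gives `growth_per_export_mib(215344, 2830928, 10)`, roughly 255 MiB per export, while the single-threaded run gives `growth_per_export_mib(212652, 425828, 10)`, roughly 21 MiB per export.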

Floessie commented 6 years ago

@PkmX I tried it with your PP3 and options, but still no luck:

user@stretchtest:~$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 1337 742484 rawtherapee
user@stretchtest:~$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 1337 745612 rawtherapee
user@stretchtest:~$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 1337 748928 rawtherapee
user@stretchtest:~$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 1337 748924 rawtherapee
user@stretchtest:~$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 1337 751120 rawtherapee
user@stretchtest:~$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
 1337 749932 rawtherapee

As I assume you are familiar with compiling RT, could you give ASAN a try?

$ mkdir asan-test
$ cd asan-test
$ cmake .. -DWITH_LTO=OFF -DCMAKE_BUILD_TYPE=debug -DPROC_TARGET_NUMBER=2 -DBUILD_BUNDLE=ON -DBINDIR=. -DDATADIR=. -DCACHE_NAME_SUFFIX=5-dev -DWITH_SAN=address
$ ASAN_OPTIONS=new_delete_type_mismatch=0 ./debug/rawtherapee

There are always some small leaks from libfontconfig and the like, but we're interested in the big ones. :smile:

Best, Flössie

PkmX commented 6 years ago

ASan/LSan aren't really useful for this kind of leak (they only show a few KB leaked here and there). The memory is likely not leaked in the strict sense (there is still a reference to it somewhere), but it is being retained unnecessarily, which increases RSS over time.

I will see if I can find anything useful with valgrind --tool=massif.

PkmX commented 6 years ago

I can't reproduce the leak when using valgrind --tool=massif, and after some experimentation I managed to figure out the culprit: glibc malloc's per-thread arenas + heap fragmentation.

I modified RawTherapee to call malloc_stats after exporting; here is what I got after a few saves:

system bytes     = 1698066432
in use bytes     =  944735568

system bytes     = 2301599744
in use bytes     =  944242192

system bytes     = 2517671936
in use bytes     =  944738768

Notice that the in-use bytes stay almost constant, but the total size requested from the system keeps increasing after every save. So glibc is not returning memory to the system as it should, and from RawTherapee's point of view (or that of ASan/valgrind) the app is not leaking memory.
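
The instrumentation described above might look roughly like the following. This is a minimal sketch, not the actual patch: `report_heap_after_export` and `retained_bytes` are hypothetical names, and where such a hook would be called from inside RawTherapee is not shown in this thread. The only real API used is glibc's `malloc_stats()`, which is a glibc extension declared in `<malloc.h>`.

```cpp
#include <malloc.h>   // glibc extension: malloc_stats()
#include <cstdint>
#include <cstdio>

// Memory the allocator has obtained from the kernel but is not currently
// handing out to the application (free chunks kept inside the arenas).
std::uint64_t retained_bytes(std::uint64_t system_bytes, std::uint64_t in_use_bytes)
{
    return system_bytes > in_use_bytes ? system_bytes - in_use_bytes : 0;
}

// Hypothetical hook, meant to be called once an export has finished.
void report_heap_after_export(int export_index)
{
    std::fprintf(stderr, "--- allocator state after export %d ---\n", export_index);
    // Prints "system bytes" and "in use bytes" per arena and in total to stderr.
    malloc_stats();
}
```

For the last pair of totals quoted above, `retained_bytes(2517671936, 944738768)` comes to 1572933168 bytes, i.e. about 1.5 GiB held by the allocator without being in use.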

Taking a closer look, glibc allocates 64 arenas on my system (8 hardware threads × 8, which is the default arena limit on 64-bit systems), and some of them have a few hundred MB allocated with very little in use:

Arena 40:
system bytes     =  164974592
in use bytes     =     128736

It is not clear to me why glibc isn't actively trimming the extra space in these arenas. My speculation is that RawTherapee's particular allocation pattern triggers heap fragmentation, and that this is amplified by the number of arenas. The allocation sizes also happen to be below the threshold at which glibc switches to mmap for allocating memory.

The workaround is to set MALLOC_ARENA_MAX=1 or MALLOC_MMAP_THRESHOLD_=4096 in the environment; with either one, RSS no longer increases after each save:

$ ps -C rawtherapee -o pid,rss,comm
  PID   RSS COMMAND
15280 704216 rawtherapee

  PID   RSS COMMAND
15280 698940 rawtherapee

  PID   RSS COMMAND
15280 700412 rawtherapee
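
The same two knobs can also be set from code through glibc's `mallopt()`. The sketch below is only an assumption about how an application could apply them itself; the function names are hypothetical and nothing in this thread says RawTherapee does this. Such calls would presumably need to run early at start-up, before worker threads have created their arenas.

```cpp
#include <malloc.h>   // glibc extension: mallopt(), M_ARENA_MAX, M_MMAP_THRESHOLD

// Counterpart of MALLOC_ARENA_MAX=1: all threads share a single arena.
// Hypothetical helper; returns true if glibc accepted the setting.
bool use_single_arena()
{
    return mallopt(M_ARENA_MAX, 1) == 1;
}

// Counterpart of MALLOC_MMAP_THRESHOLD_=<bytes>: requests of at least this
// size are served by mmap() and handed back to the kernel when freed.
// Hypothetical helper; returns true if glibc accepted the setting.
bool set_mmap_threshold(int bytes)
{
    return mallopt(M_MMAP_THRESHOLD, bytes) == 1;
}
```

As with the environment variables, only one of the two is needed, e.g. `use_single_arena()` or `set_mmap_threshold(4096)`. Both trade allocator speed for a smaller footprint, so the performance impact should be measured.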

Any thoughts?

(For the record, I'm using glibc 2.26.)

ff2000 commented 6 years ago

I might be affected by this, too, with two differences: 1) simply opening and closing an unedited RAW file with default settings (no NR, no sharpening, no wavelets, ...) also increases memory usage:

# enter a directory
ps -C rawtherapee -o pid,rss,comm
12198 186060 rawtherapee

# open one image
12198 628208 rawtherapee

# close the image
12198 322536 rawtherapee

and 2) the workaround with those env variables (here: MALLOC_ARENA_MAX=1) is of limited use:

# enter dir
12459 173028 rawtherapee

# open one img
12459 612140 rawtherapee

# close img
12459 230904 rawtherapee

Furthermore it degrades performance quite severely.

I have to say that I had similar issues in the past with several applications. tmux collected memory like mad when the xterm it was running in was resized; the environment variables for glibc didn't help, and I had to patch tmux to call malloc_trim manually. gcc used enormous amounts of memory, so I had to reboot into a plain terminal whenever I wanted to do a Gentoo update. And clang, used as the completion backend in QtCreator/YouCompleteMe/..., was unusable... But all this magically settled at some point, and they all behave well now. So I currently don't know what is happening with RT. I don't think there is a (big) memory leak; others would have detected it, I think. Probably it's just the way it allocates memory that triggers the problem here on my machine. I currently use glibc-2.26, kernel 4.16.16-gentoo.
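
A manual `malloc_trim` call of the kind mentioned for tmux could be sketched as below. The function name and the idea of calling it after an editor tab is closed or an export finishes are assumptions made for illustration; `malloc_trim()` itself is a glibc extension from `<malloc.h>`, and whether it helps in RawTherapee's case has not been tested in this thread.

```cpp
#include <malloc.h>   // glibc extension: malloc_trim()

// Hypothetical hook for the point where an editor tab or export job is torn down.
// Asks glibc to return free heap pages to the kernel and reports whether
// any memory was actually released.
bool release_free_heap_pages()
{
    // pad = 0: keep no extra slack at the top of the heap.
    return malloc_trim(0) == 1;
}
```

A return value of `false` only means that glibc found nothing to give back at that moment; it is not an error.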

tribut commented 5 years ago

I'm seeing quite a severe increase in memory usage as well (using the AppImage, both 5.4 stable and -dev, on Ubuntu bionic). Additionally, the OOM killer seems unable to deal with it, so the whole system freezes when it finally runs out of memory, with SysRq-REISUB being the only way out. I'm not sure whether it is related to multithreaded NR, though.

thirtythreeforty commented 5 years ago

Preloading jemalloc solves the glibc arena issue for me just as well as tuning the environment variable does:

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 rtgui/rawtherapee

further suggesting that it is glibc's arenas causing the issue.

afontenot commented 5 years ago

@thirtythreeforty I got a segfault after running RawTherapee for a while with jemalloc. I'm confused by that because I thought it was supposed to be a drop-in replacement for malloc. I'm trying to see if I can recreate the segfault with the glibc malloc, but so far nothing.

thirtythreeforty commented 5 years ago

Post the backtrace, if you can. Either the jemalloc or the RT devs will probably consider that a bug.

afontenot commented 5 years ago

Here it is; debug build of the latest -dev.

jemalloc bt.txt

Floessie commented 5 years ago

@afontenot Is it thread 1 that segfaulted?

The preloaded jemalloc hangs on exit here. Could not reproduce a SEGV, yet.

#0  0x00007ffff52a729c in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1  0x00007ffff52a0714 in __GI___pthread_mutex_lock (mutex=0x7ffff0a04ec0) at ../nptl/pthread_mutex_lock.c:80
#2  0x00007ffff7d6c0ec in  () at /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#3  0x00007ffff7d93030 in  () at /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#4  0x00007ffff7d1846f in free () at /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#5  0x00007ffff6fd4568 in g_closure_unref () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#6  0x00007ffff6fef2a4 in g_signal_handlers_destroy () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#7  0x00007ffff6fd93dd in  () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#8  0x00007ffff6fd9da3 in g_object_unref () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#9  0x00007ffff75dcfe9 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#10 0x00007ffff6fd9e12 in g_object_unref () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#11 0x00007ffff76a4d50 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#12 0x00007ffff761b136 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#13 0x00007ffff6fd4b91 in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#14 0x00007ffff6fe86a6 in  () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#15 0x00007ffff6ff125e in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16 0x00007ffff6ff191f in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#17 0x00007ffff782f30c in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#18 0x00007ffff6fdb5d8 in g_object_run_dispose () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#19 0x00007ffff76a4d50 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#20 0x00007ffff761b136 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#21 0x00007ffff6fd4b91 in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#22 0x00007ffff6fe86a6 in  () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#23 0x00007ffff6ff125e in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#24 0x00007ffff6ff191f in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#25 0x00007ffff782f30c in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#26 0x00007ffff6fdb5d8 in g_object_run_dispose () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#27 0x00007ffff6850ec8 in Gtk::Object::_release_c_instance() () at /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#28 0x00007ffff67af397 in Gtk::Grid::~Grid() () at /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#29 0x00007ffff67af429 in Gtk::Grid::~Grid() () at /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#30 0x0000555555857b60 in DiagonalCurveEditorSubGroup::~DiagonalCurveEditorSubGroup() (this=0x7fffd47dbb00, __in_chrg=<optimized out>) at /home/user/src/rawtherapee/rtgui/diagonalcurveeditorsubgroup.cc:380
#31 0x0000555555857c6c in DiagonalCurveEditorSubGroup::~DiagonalCurveEditorSubGroup() (this=0x7fffd47dbb00, __in_chrg=<optimized out>) at /home/user/src/rawtherapee/rtgui/diagonalcurveeditorsubgroup.cc:383
#32 0x0000555555845f9e in CurveEditorGroup::~CurveEditorGroup() (this=0x7fffd4d73a00, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/curveeditorgroup.cc:46
#33 0x000055555584604e in CurveEditorGroup::~CurveEditorGroup() (this=0x7fffd4d73a00, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/curveeditorgroup.cc:47
#34 0x0000555555b9f16c in ToneCurve::~ToneCurve() (this=0x7fffd4a6ef80, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/tonecurve.cc:221
#35 0x0000555555b9f304 in ToneCurve::~ToneCurve() (this=0x7fffd4a6ef80, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/tonecurve.cc:222
#36 0x000055555596c770 in ExpanderBox::~ExpanderBox() (this=0x7fffd3aaf4d0, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/guiutils.h:155
#37 0x000055555596c7ea in ExpanderBox::~ExpanderBox() (this=0x7fffd3aaf4d0, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/guiutils.h:156
#38 0x00007ffff6ed4b5b in g_datalist_clear () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#39 0x00007ffff6fd9e12 in g_object_unref () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#40 0x00007ffff75d01d0 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#41 0x00007ffff678e7a8 in Gtk::Container_Class::forall_vfunc_callback(_GtkContainer*, int, void (*)(_GtkWidget*, void*), void*) () at /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#42 0x00007ffff761b136 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#43 0x00007ffff6fd4b91 in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#44 0x00007ffff6fe86a6 in  () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#45 0x00007ffff6ff125e in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#46 0x00007ffff6ff191f in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#47 0x00007ffff782f30c in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#48 0x00007ffff6fdb5d8 in g_object_run_dispose () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#49 0x00007ffff75d01d0 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#50 0x00007ffff678e7a8 in Gtk::Container_Class::forall_vfunc_callback(_GtkContainer*, int, void (*)(_GtkWidget*, void*), void*) () at /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#51 0x00007ffff761b136 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#52 0x00007ffff6fd4b91 in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#53 0x00007ffff6fe86a6 in  () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#54 0x00007ffff6ff125e in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#55 0x00007ffff6ff191f in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#56 0x00007ffff782f30c in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#57 0x00007ffff6fd9da3 in g_object_unref () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#58 0x00007ffff76196b9 in gtk_container_remove () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#59 0x00007ffff7766250 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#60 0x00007ffff678f0a5 in Gtk::Container_Class::remove_callback_normal(_GtkContainer*, _GtkWidget*) () at /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#61 0x00007ffff6fd7fef in g_cclosure_marshal_VOID__OBJECTv () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#62 0x00007ffff6fd4eb6 in  () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#63 0x00007ffff6ff132d in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#64 0x00007ffff6ff191f in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#65 0x00007ffff76196a6 in gtk_container_remove () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#66 0x00007ffff782f233 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#67 0x00007ffff6fdb5d8 in g_object_run_dispose () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#68 0x00007ffff77666b1 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#69 0x00007ffff6fd4b91 in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#70 0x00007ffff6fe86a6 in  () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#71 0x00007ffff6ff125e in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#72 0x00007ffff6ff191f in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#73 0x00007ffff782f30c in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#74 0x00007ffff6fdb5d8 in g_object_run_dispose () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#75 0x00007ffff77090de in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#76 0x00007ffff761b136 in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#77 0x00007ffff6fd4c7d in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#78 0x00007ffff6fe86a6 in  () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#79 0x00007ffff6ff125e in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#80 0x00007ffff6ff191f in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#81 0x00007ffff782f30c in  () at /usr/lib/x86_64-linux-gnu/libgtk-3.so.0
#82 0x00007ffff6fdb5d8 in g_object_run_dispose () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#83 0x00007ffff6850ec8 in Gtk::Object::_release_c_instance() () at /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#84 0x00007ffff67cbffc in Gtk::Notebook::~Notebook() () at /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#85 0x00007ffff67cc069 in Gtk::Notebook::~Notebook() () at /usr/lib/x86_64-linux-gnu/libgtkmm-3.0.so.1
#86 0x0000555555bb1240 in ToolPanelCoordinator::~ToolPanelCoordinator() (this=0x7ffff0928000, __in_chrg=<optimized out>) at /home/user/src/rawtherapee/rtgui/toolpanelcoord.cc:292
#87 0x00005555557868c3 in BatchToolPanelCoordinator::~BatchToolPanelCoordinator() (this=0x7ffff0928000, __in_chrg=<optimized out>) at /home/user/src/rawtherapee/rtgui/batchtoolpanelcoord.h:30
#88 0x000055555578695c in BatchToolPanelCoordinator::~BatchToolPanelCoordinator() (this=0x7ffff0928000, __in_chrg=<optimized out>) at /home/user/src/rawtherapee/rtgui/batchtoolpanelcoord.h:30
#89 0x000055555592ebbd in FilePanel::~FilePanel() (this=0x7ffff090d880, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/filepanel.cc:163
#90 0x000055555592eca6 in FilePanel::~FilePanel() (this=0x7ffff090d880, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/filepanel.cc:164
#91 0x0000555555b4b507 in RTWindow::~RTWindow() (this=0x7ffff090cc80, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/rtwindow.cc:459
#92 0x0000555555b4b5e8 in RTWindow::~RTWindow() (this=0x7ffff090cc80, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at /home/user/src/rawtherapee/rtgui/rtwindow.cc:463
#93 0x00005555559fe54a in std::default_delete<RTWindow>::operator()(RTWindow*) const (this=0x7fffffffdaf8, __ptr=0x7ffff090cc80) at /usr/include/c++/8/bits/unique_ptr.h:81
#94 0x00005555559fe429 in std::unique_ptr<RTWindow, std::default_delete<RTWindow> >::~unique_ptr() (this=0x7fffffffdaf8, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/unique_ptr.h:274
#95 0x00005555559fdcb9 in main(int, char**) (argc=1, argv=0x7fffffffe1b8) at /home/user/src/rawtherapee/rtgui/main.cc:560

afontenot commented 5 years ago

@Floessie I think it was thread 1, but I'm not sure how I could confirm that if the backtrace itself doesn't. I don't do much debugging of multithreaded applications. If there's additional information I can give you by crashing it again, let me know. I can probably reproduce the crash.

Floessie commented 5 years ago

@afontenot Normally, when GDB catches a SEGV, a bt or bt full will show the stack trace of the crashed thread.

"If there's additional information I can give you by crashing it again, let me know. I can probably reproduce the crash."

That would be helpful. :+1:

afontenot commented 5 years ago

@Floessie I generated the backtrace with

thread apply all bt full

Which is what's suggested here: https://rawpedia.rawtherapee.com/How_to_write_useful_bug_reports

The backtrace for thread 1 is the last one in my log. I think that's the one that got the SEGV, given that gdb has it marked as the current thread. I can try to generate another crash though.

Beep6581 commented 5 years ago

info threads spits out a list, where one is marked with an asterisk - that is the one which crashed. In this case, it is thread 1.

  Id   Target Id                                           Frame 
* 1    Thread 0x7ffff11e03c0 (LWP 3329) "rawtherapee"      0x00007ffff7d9f94b in ?? () from /usr/lib/libjemalloc.so
  2    Thread 0x7ffff051a700 (LWP 3333) "rawtherapee"      futex_wait (val=24, addr=0x7ffff0c42f44) at /build/gcc/src/gcc/libgomp/config/linux/x86/futex.h:44
  3    Thread 0x7fffefd19700 (LWP 3334) "rawtherapee"      futex_wait (val=24, addr=0x7ffff0c42f44) at /build/gcc/src/gcc/libgomp/config/linux/x86/futex.h:44
...

Floessie commented 5 years ago

@Beep6581 Yep, this is what @afontenot already mentioned.

So the main() thread is deep down in destructing GTK objects when it hits a SEGV in libjemalloc (!). I honestly think this isn't RT's fault.

@afontenot Another trace could either back this up or contradict it. I'd be glad if you could provoke another crash under GDB.

ff2000 commented 5 years ago

I need to investigate this issue further. It has become so bad that I can't edit a single image in one go: I have to close RT and lose the whole editing history. It takes only a few minutes for RT to eat up my 8 GB of RAM. So I had some fun yesterday and spent a few hours running RT under valgrind.

memleak.zip

This is with the LocalLab branch. There seem to be some "definitely lost" bytes.

One thing I have only just realised: most memory seems to be lost when panning around at 100%. I just activated a whole lot of tools. Panning around then has a higher peak usage, but that drops back down to about the level I see when panning a non-edited file with the neutral profile. That memory also isn't freed when the editor for that image is closed.

// edit: ah, RT with jemalloc doesn't crash when the editor is opened directly (passing the file via the command line). This keeps memory usage down. So it could just be that the tile management isn't friendly to glibc's memory management.