Beep6581 / RawTherapee

A powerful cross-platform raw photo processing program
https://rawtherapee.com
GNU General Public License v3.0

RT 5.4-650-gc9e6a18f crashes on start #4721

Open vivo75 opened 6 years ago

vivo75 commented 6 years ago

For a long time, RT has often (but not always) crashed on start, in a manner very similar to issue #4117. The verbose option does not change the behaviour, and building with debug info makes the heisenbug disappear. The LANG variable is set to "it_IT.UTF-8". Looking at issue #4140 made me try changing the initialization order: in my tests, LensFun DB initialization has been moved outside the omp parallel sections and /before/ all other initializations.

This small change (which can impact startup times) seems to give a much more reliable RT startup.

Patch applied by the rawtherapee-9999 ebuild:

--- ./rtengine/init.cc.orig     2018-08-06 19:02:57.522594783 -0000
+++ ./rtengine/init.cc  2018-08-06 19:04:41.391452565 -0000
@@ -48,20 +48,16 @@
     PerceptualToneCurve::init();
     RawImageSource::init();

-#ifdef _OPENMP
-#pragma omp parallel sections
-#endif
-{
-#ifdef _OPENMP
-#pragma omp section
-#endif
-{
     if (s->lensfunDbDirectory.empty() || Glib::path_is_absolute(s->lensfunDbDirectory)) {
         LFDatabase::init(s->lensfunDbDirectory);
     } else {
         LFDatabase::init(Glib::build_filename(baseDir, s->lensfunDbDirectory));
     }
-}
+
+#ifdef _OPENMP
+#pragma omp parallel sections
+#endif
+{
 #ifdef _OPENMP
 #pragma omp section
 #endif
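
The reordering in the patch above can be illustrated with a minimal, self-contained sketch. It is not RawTherapee's code: the `sketch` namespace and the functions `initLensDatabase`, `initStepA`, `initStepB` and `initAll` are hypothetical stand-ins. The only point is the shape of the change: one initialization runs serially on the calling thread, and only afterwards do the remaining steps run in `omp parallel sections`. Whether this ordering addresses the crash is not established here.

```cpp
#include <atomic>
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical state standing in for the lens database.
std::string lensDbPath;
bool lensDbReady = false;
std::atomic<int> parallelStepsDone{0};

// Stand-in for the serial initialization moved out of the parallel region.
void initLensDatabase(const std::string& dir) {
    lensDbPath = dir;
    lensDbReady = true;
}

// Stand-ins for independent initializations that may run concurrently.
// Each one only checks that the serial step has already finished.
void initStepA() { assert(lensDbReady); ++parallelStepsDone; }
void initStepB() { assert(lensDbReady); ++parallelStepsDone; }

// Runs the serial step first, then the independent steps in parallel.
// Returns the number of parallel steps that completed.
int initAll(const std::string& lensDbDir) {
    parallelStepsDone = 0;
    initLensDatabase(lensDbDir);  // completes before any worker thread starts

#ifdef _OPENMP
#pragma omp parallel sections
#endif
    {
#ifdef _OPENMP
#pragma omp section
#endif
        { initStepA(); }
#ifdef _OPENMP
#pragma omp section
#endif
        { initStepB(); }
    }
    return parallelStepsDone.load();
}

}  // namespace sketch
```

Without `-fopenmp` the guarded pragmas disappear and the same code runs sequentially, which mirrors the `#ifdef _OPENMP` guards used in the patch.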

Backtrace without debug info:

(gdb) bt full
#0  0x00007ffff6e7394c in gtk_tree_store_append () from /usr/lib64/libgtk-3.so.0
No symbol table info available.
#1  0x00007ffff4fca1fc in Gtk::TreeStore::append() () from /usr/lib64/libgtkmm-3.0.so.1
No symbol table info available.
#2  0x0000555555ca145f in LensProfilePanel::LFDbHelper::fillLensfunLenses() ()
No symbol table info available.
#3  0x0000555555ca203b in LensProfilePanel::LFDbHelper::LFDbHelper() [clone ._omp_fn.0] ()
No symbol table info available.
#4  0x00007ffff29f385e in gomp_thread_start (xdata=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/team.c:120
        team = 0x55555630a600
        task = 0x55555630b020
        data = <optimized out>
        thr = <optimized out>
        pool = 0x5555562f7b30
        local_fn = 0x555555ca1f80 <LensProfilePanel::LFDbHelper::LFDbHelper() [clone ._omp_fn.0]>
        local_data = 0x7fffffffc820
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6
No symbol table info available.

Version information:

Version: 5.4-650-gc9e6a18f
Branch: 5.4-650-gc9e6a18f
Commit: c9e6a18f
Commit date: 2018-08-06
Compiler: x86_64-pc-linux-gnu-gcc 8.2.0
Processor: Intel(R)\ Xeon(R)\ CPU\ E5-2630\ v4\ @\ 2.20GHz
System: Linux
Bit depth: 64 bits
Gtkmm: V3.22.2
Lensfun: V0.3.2.0
Build type: Gentoo
Build flags: -O3 -march=corei7 -pipe -mindirect-branch=thunk -flto=4 -fuse-linker-plugin -fno-fat-lto-objects -grecord-gcc-switches -frecord-gcc-switches -std=c++11  -Werror=unused-label -fopenmp -Werror=unknown-pragmas -Wall -Wno-unused-result -Wno-deprecated-declarations  -ftree-vectorize
Link flags: -Wl,-O1,--sort-common,--hash-style=gnu,--as-needed,-z,now -O3 -march=corei7 -pipe -mindirect-branch=thunk -flto=4 -fuse-linker-plugin -fno-fat-lto-objects -grecord-gcc-switches -frecord-gcc-switches
OpenMP support: yes
MMAP support: ON

All threads backtrace:

(gdb) thread apply all bt

Thread 10 (Thread 0x7fffdf187700 (LWP 14358)):
#0  0x00007ffff7b029c3 in poll () from /lib64/libc.so.6
#1  0x00007ffff644f290 in g_main_context_iterate.isra () from /usr/lib64/libglib-2.0.so.0
#2  0x00007ffff64502d2 in g_main_loop_run () from /usr/lib64/libglib-2.0.so.0
#3  0x00007ffff2276896 in gdbus_shared_thread_func.lto_priv () from /usr/lib64/libgio-2.0.so.0
#4  0x00007ffff641cfbb in g_thread_proxy () from /usr/lib64/libglib-2.0.so.0
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6

Thread 9 (Thread 0x7fffdf988700 (LWP 14357)):
#0  0x00007ffff7b029c3 in poll () from /lib64/libc.so.6
#1  0x00007ffff644f290 in g_main_context_iterate.isra () from /usr/lib64/libglib-2.0.so.0
#2  0x00007ffff644f35c in g_main_context_iteration () from /usr/lib64/libglib-2.0.so.0
#3  0x00007ffff644f3a1 in glib_worker_main () from /usr/lib64/libglib-2.0.so.0
#4  0x00007ffff641cfbb in g_thread_proxy () from /usr/lib64/libglib-2.0.so.0
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x7fffe4886700 (LWP 14356)):
#0  futex_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/x86/futex.h:44
#1  do_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/wait.h:67
#2  gomp_team_barrier_wait_end (bar=<optimized out>, state=16)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:112
#3  0x00007ffff29f60aa in gomp_team_barrier_wait_final (bar=bar@entry=0x55555630a680)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:136
#4  0x00007ffff29f386a in gomp_thread_start (xdata=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/team.c:121
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7fffe5087700 (LWP 14355)):
#0  0x00007ffff6e7394c in gtk_tree_store_append () from /usr/lib64/libgtk-3.so.0
#1  0x00007ffff4fca1fc in Gtk::TreeStore::append() () from /usr/lib64/libgtkmm-3.0.so.1
#2  0x0000555555ca145f in LensProfilePanel::LFDbHelper::fillLensfunLenses() ()
#3  0x0000555555ca203b in LensProfilePanel::LFDbHelper::LFDbHelper() [clone ._omp_fn.0] ()
#4  0x00007ffff29f385e in gomp_thread_start (xdata=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/team.c:120
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x7fffe5888700 (LWP 14354)):
#0  futex_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/x86/futex.h:44
#1  do_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/wait.h:67
#2  gomp_team_barrier_wait_end (bar=<optimized out>, state=16)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:112
#3  0x00007ffff29f60aa in gomp_team_barrier_wait_final (bar=bar@entry=0x55555630a680)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:136
#4  0x00007ffff29f386a in gomp_thread_start (xdata=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/team.c:121
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7fffe6089700 (LWP 14353)):
#0  futex_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/x86/futex.h:44
#1  do_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/wait.h:67
#2  gomp_team_barrier_wait_end (bar=<optimized out>, state=16)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:112
#3  0x00007ffff29f60aa in gomp_team_barrier_wait_final (bar=bar@entry=0x55555630a680)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:136
#4  0x00007ffff29f386a in gomp_thread_start (xdata=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/team.c:121
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7fffe688a700 (LWP 14352)):
#0  futex_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/x86/futex.h:44
#1  do_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/wait.h:67
#2  gomp_team_barrier_wait_end (bar=<optimized out>, state=16)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:112
#3  0x00007ffff29f60aa in gomp_team_barrier_wait_final (bar=bar@entry=0x55555630a680)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:136
#4  0x00007ffff29f386a in gomp_thread_start (xdata=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/team.c:121
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7fffe708b700 (LWP 14351)):
#0  futex_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/x86/futex.h:44
#1  do_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/wait.h:67
#2  gomp_team_barrier_wait_end (bar=<optimized out>, state=16)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:112
#3  0x00007ffff29f60aa in gomp_team_barrier_wait_final (bar=bar@entry=0x55555630a680)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:136
#4  0x00007ffff29f386a in gomp_thread_start (xdata=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/team.c:121
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7fffe788c700 (LWP 14350)):
#0  futex_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/x86/futex.h:44
#1  do_wait (val=16, addr=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/wait.h:67
#2  gomp_team_barrier_wait_end (bar=<optimized out>, state=16)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:112
#3  0x00007ffff29f60aa in gomp_team_barrier_wait_final (bar=bar@entry=0x55555630a680)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:136
#4  0x00007ffff29f386a in gomp_thread_start (xdata=<optimized out>)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/team.c:121
#5  0x00007ffff0891933 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff7b0e58f in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7ffff7f5ea40 (LWP 14346)):
#0  0x00007ffff29f5f3a in do_spin (val=16, addr=0x55555630a684)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/x86/futex.h:130
#1  do_wait (val=16, addr=0x55555630a684)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/wait.h:66
#2  gomp_team_barrier_wait_end (bar=0x55555630a680, state=16)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:112
#3  0x00007ffff29f60aa in gomp_team_barrier_wait_final (bar=bar@entry=0x55555630a680)
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/config/linux/bar.c:136
#4  0x00007ffff29f4c69 in gomp_team_end ()
    at /var/tmp/portage2/portage/sys-devel/gcc-8.2.0/work/gcc-8.2.0/libgomp/team.c:877
#5  0x0000555555c90ad2 in LensProfilePanel::LFDbHelper::LFDbHelper() ()
#6  0x0000555555ca4fe5 in LensProfilePanel::LensProfilePanel() ()
#7  0x00005555558414a1 in ToolPanelCoordinator::ToolPanelCoordinator(bool) [clone .constprop.649] ()
#8  0x0000555555e3fc2f in BatchToolPanelCoordinator::BatchToolPanelCoordinator(FilePanel*) ()
#9  0x0000555555d304ca in FilePanel::FilePanel() ()
#10 0x0000555555beb9e7 in RTWindow::RTWindow() ()
#11 0x0000555555ca0964 in (anonymous namespace)::create_rt_window() ()
#12 0x000055555581f33c in main ()
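
In the traces above, Thread 7 is inside gtk_tree_store_append under gomp_thread_start, so the tree store is being filled from an OpenMP worker thread started by the LFDbHelper constructor, while Thread 1 (the main thread) waits at the team barrier in the same constructor. GTK documents that it is not thread-safe and should be used only from the main thread. Whether that is the cause of this crash is not confirmed in this report. The sketch below only shows one common way to avoid the pattern: workers collect plain data, and the thread that owns the model appends the rows afterwards. `UiModel`, `queryCameras`, `queryLenses` and `fillModels` are hypothetical stand-ins; no gtkmm API is used.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

namespace sketch {

// Stand-in for a GUI model such as a tree store. It remembers the thread
// that created it and asserts that rows are appended only from that thread.
class UiModel {
public:
    UiModel() : owner_(std::this_thread::get_id()) {}

    void append(const std::string& row) {
        assert(std::this_thread::get_id() == owner_ &&
               "UI model touched from a non-owning thread");
        rows_.push_back(row);
    }

    std::size_t size() const { return rows_.size(); }

private:
    std::thread::id owner_;
    std::vector<std::string> rows_;
};

// Hypothetical database queries: pure data, no GUI calls.
std::vector<std::string> queryCameras() { return {"Camera A", "Camera B"}; }
std::vector<std::string> queryLenses() { return {"Lens X", "Lens Y", "Lens Z"}; }

// Must be called on the thread that created both models.
void fillModels(UiModel& cameras, UiModel& lenses) {
    std::vector<std::string> cameraRows;
    std::vector<std::string> lensRows;

    // Workers only gather data; each section writes its own vector.
#ifdef _OPENMP
#pragma omp parallel sections
#endif
    {
#ifdef _OPENMP
#pragma omp section
#endif
        { cameraRows = queryCameras(); }
#ifdef _OPENMP
#pragma omp section
#endif
        { lensRows = queryLenses(); }
    }

    // The parallel region has ended, so this runs on the calling thread only.
    for (const auto& row : cameraRows) { cameras.append(row); }
    for (const auto& row : lensRows) { lenses.append(row); }
}

}  // namespace sketch
```

The trade-off is that the appends are serialized, so only the query part benefits from the parallel sections.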

heckflosse commented 6 years ago

@vivo75

"This small change (which can impact startup times) seem to give a much more reliable RT startup"

Does it give a much more reliable or a reliable startup?

vivo75 commented 6 years ago

@heckflosse RT never crashed on start after the change, but I've started it no more than 10 times since then. However, before the change it failed on at least half of the startups.

ff2000 commented 6 years ago

@vivo75 How do you build "with debug info"? I have "-ggdb" in C{XX}FLAGS in my make.conf and activated "splitdebug" in FEATURES. That AFAIK results in the same binary as when it was built without debug symbols but stores those in a different location - and tools can load them in case of a crash. I think you should be able to reproduce the crash and get the exact point of failure in RT sources.

Beep6581 commented 6 years ago

@ff2000 see http://rawpedia.rawtherapee.com/Linux#CMake

vivo75 commented 6 years ago

@ff2000 It's not the same binary, and it does not run at the same speed. This can be very important when race conditions are involved, and I suspect that is the case here. My debug flags are:

CFLAGS="-O3 -march=corei7 -pipe -mindirect-branch=thunk -grecord-gcc-switches -frecord-gcc-switches -g3 -ggdb -ggnu-pubnames -fvar-tracking-assignments"
CXXFLAGS="-O3 -march=corei7 -pipe -mindirect-branch=thunk -grecord-gcc-switches -frecord-gcc-switches -g3 -ggdb -ggnu-pubnames -fvar-tracking-assignments"

My usual flags are:

FLTO="-flto=4 -fuse-linker-plugin -fno-fat-lto-objects"
FGRAPHITE=""
GCCDEBUG="-grecord-gcc-switches -frecord-gcc-switches"
CFLAGS="-O3 -march=corei7 -pipe -mindirect-branch=thunk ${FLTO} ${FGRAPHITE} ${GCCDEBUG}"
CXXFLAGS="${CFLAGS}"

ff2000 commented 6 years ago

@vivo75 My main point was to add "splitdebug" to your FEATURES variable in make.conf. You end up with a stripped binary that should be the same as if it had been built without debugging symbols.

vivo75 commented 6 years ago

@heckflosse, I had two crashes today, even with the patch applied, on the new version 5.4-660-g6d19fae0.