Beep6581 / RawTherapee

A powerful cross-platform raw photo processing program
https://rawtherapee.com
GNU General Public License v3.0
2.77k stars 313 forks source link

Crash while changing chrominance denoising method #3410

Closed da-phil closed 8 years ago

da-phil commented 8 years ago
Branch: master
Version: 4.2.1024
Changeset: db6b43e3c0c6d18cae33d48e80935dc99a620be2
Compiler: cc 5.4.0
Processor: x86_64
System: Linux
Bit depth: 64 bits
Gtkmm: V2.24.4
Build type: release
Build flags: -std=c++11 -std=gnu++11  -Werror=unused-label -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG
Link flags:   
OpenMP support: ON
MMAP support: ON

OS: Ubuntu 16.04.1 LTS

Very rarely I'm able to crash RT when I change the chrominance denoising method between the entries, it's very similar to https://github.com/Beep6581/RawTherapee/issues/3379. Unfortunately I was only able to reproduce the crash with my release build. I immediately should have made a debug one...

Maybe the backtrace is still of help:

....
[New Thread 0x7ffeb083a700 (LWP 1254)]
[New Thread 0x7ffeb183c700 (LWP 1255)]
[New Thread 0x7ffeb103b700 (LWP 1256)]
[New Thread 0x7ffeb203d700 (LWP 1257)]
[New Thread 0x7ffeb5043700 (LWP 1258)]

Thread 69741 "rawtherapee" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffeb183c700 (LWP 1255)]
0x0000000000a4844d in rtengine::ImProcFunctions::MadRgb(float*, int) ()
(gdb) backtrace
#0  0x0000000000a4844d in rtengine::ImProcFunctions::MadRgb(float*, int) ()
#1  0x0000000000a52b01 in rtengine::ImProcFunctions::WaveletDenoiseAllAB(rtengine::wavelet_decomposition&, rtengine::wavelet_decomposition&, float*, float (*) [3], float, bool, bool, bool) [clone ._omp_fn.25] ()
#2  0x00007ffff35d88ac in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
#3  0x00007ffff31956fa in start_thread (arg=0x7ffeb183c700) at pthread_create.c:333
#4  0x00007ffff2ecbb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Hopefully I'll be able to reproduce the crash with a debug build!

Floessie commented 8 years ago

@heckflosse With an 1:1 image area I get a datalen of 177786 here, so it's definitely OOB here. Shall I clamp the loop with && count < 65536 or is something else mathematically needed? I'd also like to convert that int* histo to a std::vector<> or std::array<>, okay?

Will have a look on the weekend.

Best, Flössie

Edit: Sorry, misread the code a bit. It could be OOB, if histo[median] yields low enough values for count to stay below datalen / 2 while median goes above 65535. Don't know without testing further, if this can happen. So it's better to clamp median then instead of count.

heckflosse commented 8 years ago

@Floessie Have a look at the loop which fills the histo[] array. After that loop the sum of all entries in histo[] will be = datalen, correct? That means in the while loop we won't reach an array access with median > 65535, correct? After the loop median can be 65536, but then the histo[] array will be accessed with median-1 which then also may not be larger than 65535. I still don't see where an OOB may happen here. Or do I miss something?

Ingo

heckflosse commented 8 years ago

@Floessie Another thing I don't understand:

#0  0x0000000000a4844d in rtengine::ImProcFunctions::MadRgb(float*, int) ()
#1  0x0000000000a52b01 in rtengine::ImProcFunctions::WaveletDenoiseAllAB(

There is no call to MadRgb inside WaveletDenoiseAllAB. It's inside ShrinkAllAB which is called inside WaveletDenoiseAllAB. Is that because it got optimized?

Ingo

Floessie commented 8 years ago

@heckflosse

Is that because it got optimized?

Hm, or the stack frame got corrupted. I placed a SEGV in MadRgb() (the name speaks for itself) and the release backtrace showed another but right path. But maybe the other path gets inlined.

I still don't see where an OOB may happen here. Or do I miss something?

Yes, you opened my eyes. No, you are right. So it can't be histo's fault, and as no members are accessed, DataList would be the next to have an eye on...

heckflosse commented 8 years ago

@Floessie It will access OOB if datalen == 0

heckflosse commented 8 years ago

@Floessie MadRgb() (the name speaks for itself). Before I optimized the denoise stuff the function also calculated the max of the histogram, but it wasn't used, so I removed it. At that time the function name was MadMax :)

Ingo

heckflosse commented 8 years ago

@Floessie I pushed a fix for the possible buffer underrun.

heckflosse commented 8 years ago

I changed the title to avoid pictures without noses ;)

da-phil commented 8 years ago

haha @heckflosse, I even wrote it in the text :-P

yesterday I was really tired, sorry for providing an inaccurate build info, I already copy and pasted the build info from my debug build I made after I could reproduce the crash during a gdb session of my release build, but then I couldn't reproduce it with the release build anymore, so I just changed 'debug' by 'release' in the build info field above. the actual build-flags of my release build were: Build flags: -std=c++11 -std=gnu++11 -Werror=unused-label -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG so optimisation might have really played a role, because my debug build didn't have the -O3 optimisation flag.

da-phil commented 8 years ago

I could reproduce it again with a release build, it seems that it's not reproducible with debug builds

Backtrace:

Thread 27949 "rawtherapee" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffef3a71700 (LWP 17772)]
0x0000000000a4079d in rtengine::ImProcFunctions::MadRgb(float*, int) [clone .part.43] ()
(gdb) backtrace
#0  0x0000000000a4079d in rtengine::ImProcFunctions::MadRgb(float*, int) [clone .part.43] ()
#1  0x0000000000a52c16 in rtengine::ImProcFunctions::WaveletDenoiseAllAB(rtengine::wavelet_decomposition&, rtengine::wavelet_decomposition&, float*, float (*) [3], float, bool, bool, bool) [clone ._omp_fn.25] ()
#2  0x00007ffff35d88ac in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
#3  0x00007ffff31956fa in start_thread (arg=0x7ffef3a71700) at pthread_create.c:333
#4  0x00007ffff2ecbb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
heckflosse commented 8 years ago

@da-phil Can you reproduce it with the fix from dc4bbe9 ?

da-phil commented 8 years ago

@heckflosse jup, I pulled the new code before I continued testing.

Now I could reproduce the crash with a debug build, I really needed to add -O3 to the compiler flags in order to reproduce. Here's the backtrace:

Thread 208344 "rawtherapee" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffeeca63700 (LWP 4290)]
0x0000000000a4101d in rtengine::ImProcFunctions::MadRgb (DataList=0x7fff184fb1f0, datalen=datalen@entry=17010, this=0xba050e8) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2194
2194            histo[min(65535, abs(static_cast<int>(DataList[i])))]++;
(gdb) backtrace
#0  0x0000000000a4101d in rtengine::ImProcFunctions::MadRgb (DataList=0x7fff184fb1f0, datalen=datalen@entry=17010, this=0xba050e8) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2194
#1  0x0000000000a52e36 in rtengine::ImProcFunctions::MadRgb (datalen=17010, DataList=<optimised out>, this=0xba050e8) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2178
#2  rtengine::ImProcFunctions::ShrinkAllAB (madaab=0x0, madCalculated=false, madL=<optimised out>, denoiseMethodRgb=<optimised out>, autoch=<optimised out>, useNoiseCCurve=true, noisevar_ab=0.0199999996, noisevarchrom=0x7fff182ab660, dir=1, 
    level=4, buffer=0x7ffeeca62bc0, WaveletCoeffs_ab=..., WaveletCoeffs_L=..., this=0xba050e8) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2759
#3  rtengine::ImProcFunctions::WaveletDenoiseAllAB () at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2620
#4  0x00007ffff35d88ac in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
#5  0x00007ffff31956fa in start_thread (arg=0x7ffeeca63700) at pthread_create.c:333
#6  0x00007ffff2ecbb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
heckflosse commented 8 years ago

@da-phil great, looking

heckflosse commented 8 years ago

@da-phil was that the result of bt full?

Can you please post the output of 'bt full' next time you can reproduce the crash? 'bt full' shows more info.

Ingo

da-phil commented 8 years ago

oh sh*t... that was the normal backtrace, not bt full :fearful: I've to try again to reproduce the crash...

da-phil commented 8 years ago

well, fortunately it's reasonably easy to make it crash ;)

there you go:

Thread 130432 "rawtherapee" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffeed7d2700 (LWP 9233)]
0x0000000000a4101d in rtengine::ImProcFunctions::MadRgb (DataList=0x7fffb47ed838, datalen=datalen@entry=17010, this=0xb7440c8) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2194
2194            histo[min(65535, abs(static_cast<int>(DataList[i])))]++;
(gdb) bt full
#0  0x0000000000a4101d in rtengine::ImProcFunctions::MadRgb (DataList=0x7fffb47ed838, datalen=datalen@entry=17010, this=0xb7440c8) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2194
        histo = 0x7fff54034e50
        i = <optimised out>
        median = <optimised out>
        count = <optimised out>
        count_ = <optimised out>
#1  0x0000000000a52e36 in rtengine::ImProcFunctions::MadRgb (datalen=17010, DataList=<optimised out>, this=0xb7440c8) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2178
No locals.
#2  rtengine::ImProcFunctions::ShrinkAllAB (madaab=0x0, madCalculated=false, madL=<optimised out>, denoiseMethodRgb=<optimised out>, autoch=<optimised out>, useNoiseCCurve=true, noisevar_ab=0.0199999996, noisevarchrom=0x7fffb42a35e0, dir=2, level=6, 
    buffer=0x7ffeed7d1bc0, WaveletCoeffs_ab=..., WaveletCoeffs_L=..., this=0xb7440c8) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2759
        sfaveabd = 0x7fff54013930
        H_ab = 126
        madab = <optimised out>
        sfaveab = 0x7fff54002e60
        WavCoeffs_L = 0x7fffb40e4c90
        eps = 0.00999999978
        W_ab = 135
        WavCoeffs_ab = <optimised out>
        mad_L = 68644.7812
#3  rtengine::ImProcFunctions::WaveletDenoiseAllAB () at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2620
        lvl = 6
        dir = 2
        buffer = {0x7fff54002de0, 0x7fff54013830, 0x7fff54024300}
        WaveletCoeffs_L = <optimised out>
        WaveletCoeffs_ab = <optimised out>
        noisevarchrom = <optimised out>
        madL = <optimised out>
        noisevar_ab = <optimised out>
        useNoiseCCurve = <optimised out>
        autoch = <optimised out>
        denoiseMethodRgb = <optimised out>
        this = <optimised out>
        maxlvl = <optimised out>
        maxWL = <optimised out>
        maxHL = <optimised out>
        memoryAllocationFailed = <optimised out>
#4  0x00007ffff35d88ac in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
No symbol table info available.
#5  0x00007ffff31956fa in start_thread (arg=0x7ffeed7d2700) at pthread_create.c:333
        __res = <optimised out>
        pd = 0x7ffeed7d2700
        now = <optimised out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140732882822912, 4148319383595073801, 0, 140732924771455, 8388608, 140736217103344, -4148923040227573495, -4148329061473107703}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, 
              cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimised out>
        pagesize_m1 = <optimised out>
        sp = <optimised out>
        freesize = <optimised out>
        __PRETTY_FUNCTION__ = "start_thread"
#6  0x00007ffff2ecbb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
No locals.
heckflosse commented 8 years ago

@da-phil Can you give a clear instruction to crash? I'll run it through valgrind then (though that will take a whole day)

Ingo

da-phil commented 8 years ago

@heckflosse Actually I couldn't reproduce the crash with only changing between chrominance deno(i)sing modes, I need to open some magnification windows while changing. Open a file and do the following steps:

  1. create magnification window
  2. change chrominance denoising mode e to the "Auto multi-zones"
  3. create a new magnification window
  4. change to "Preview multi-zones"
  5. repeat from step 1 and close some of those magnification windows from time to time

I could reproduce a crash several times already by creating a new magnification window while the picture is re-rendered after changing the chrominance denoising mode.

I also noticed that you first need to create a magnification window in order for the "Automatic global" mode to change the chrominance parameters which will be shown when switching back to the "Manual" mode.

da-phil commented 8 years ago

It crashed again, this time even without magnification window, only turning "Auto multi-zones" mode on, wait a second, then turn "Preview multi-zones" mode on and save image. This time it crashed while saving the image. Here is the short form, I attached the bt full output as txt file: rt-crash-bt.txt

Thread 210 "rawtherapee" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc27fc700 (LWP 17450)]
0x0000000000a48523 in rtengine::ImProcFunctions::Mad (this=<optimised out>, DataList=<optimised out>, datalen=4011312) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2159
2159            histo[min(255, abs(static_cast<int>(DataList[i])))]++;
(gdb) bt 
#0  0x0000000000a48523 in rtengine::ImProcFunctions::Mad (this=<optimised out>, DataList=<optimised out>, datalen=4011312) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2159
#1  0x0000000000a5261d in rtengine::ImProcFunctions::ShrinkAllAB (madaab=0x0, madCalculated=false, madL=<optimised out>, denoiseMethodRgb=<optimised out>, autoch=<optimised out>, useNoiseCCurve=true, 
    noisevar_ab=0.0199999996, noisevarchrom=0x7fff70f77d50, dir=1, level=1, buffer=0x7fffc27f8a70, WaveletCoeffs_ab=..., WaveletCoeffs_L=..., this=0x7fffc27fb420)
    at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2757
#2  rtengine::ImProcFunctions::WaveletDenoiseAllAB () at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2620
#3  0x00007ffff35cfe5f in GOMP_parallel () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
#4  0x0000000000a6005f in rtengine::ImProcFunctions::WaveletDenoiseAllAB (denoiseMethodRgb=<optimised out>, autoch=<optimised out>, useNoiseCCurve=<optimised out>, noisevar_ab=<optimised out>, 
    madL=<optimised out>, noisevarchrom=<optimised out>, WaveletCoeffs_ab=..., WaveletCoeffs_L=..., this=<optimised out>) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2601
#5  rtengine::ImProcFunctions::RGB_denoise () at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:1193
#6  0x00007ffff35cfe5f in GOMP_parallel () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
#7  0x0000000000a4df4e in rtengine::ImProcFunctions::RGB_denoise (this=this@entry=0x7fffc27fb420, kall=kall@entry=2, src=src@entry=0x7fff70003ac0, dst=dst@entry=0x7fff70003ac0, calclum=<optimised out>, 
    calclum@entry=0x7fff70003c90, ch_M=ch_M@entry=0x7fff70002870, max_r=0x7fff700029b0, max_b=0x7fff70002a60, isRAW=true, dnparams=..., expcomp=expcomp@entry=1.5800769925117493, noiseLCurve=..., 
    noiseCCurve=..., chaut=@0x7fffc27fa400: 3.22674195e-38, redaut=@0x7fffc27fa470: -63.9108887, blueaut=@0x7fffc27fa4e0: 1.40129846e-45, maxredaut=@0x7fffc27fa550: -6.02895109e+24, 
    maxblueaut=@0x7fffc27fa5c0: 1.26116862e-44, nresi=@0x7fffc27fa730: 1.40129846e-45, highresi=@0x7fffc27fa7b0: -63.900116) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:824
#8  0x00000000009a3533 in rtengine::processImage (pjob=0xd1f6800, errorCode=<optimised out>, pl=0x28e9ee0, tunnelMetaData=<optimised out>, flush=<optimised out>)
    at /home/phil/code/rawtherapee-git/rtengine/simpleprocess.cc:714
#9  0x000000000059389f in sigc::internal::signal_emit0<rtengine::IImage16*, sigc::nil>::emit (impl=0xd1f85b0) at /usr/include/sigc++-2.0/sigc++/signal.h:689
#10 sigc::signal0<rtengine::IImage16*, sigc::nil>::emit (this=0xd1f82a0) at /usr/include/sigc++-2.0/sigc++/signal.h:2681
#11 ProgressConnector<rtengine::IImage16*>::workingThread (this=0xd1f82a0) at /home/phil/code/rawtherapee-git/rtgui/progressconnector.h:86
#12 0x00007ffff6d3d67d in ?? () from /usr/lib/x86_64-linux-gnu/libglibmm-2.4.so.1
#13 0x00007ffff6fd8bc5 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#14 0x00007ffff31956fa in start_thread (arg=0x7fffc27fc700) at pthread_create.c:333
#15 0x00007ffff2ecbb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Floessie commented 8 years ago

Cannot make it crash with both of your methods. :(

@heckflosse Okay, so datalen doesn't seem to be the problem. So the question boils down to: Why is there a race (or something else) on DataList? From what I saw in the code, only that memoryAllocationFailed pattern makes me a bit uneasy. I don't know if those nested GOMP_parallel() calls are problematic, but creation and destruction of wavelet_decomposition instances seem to be placed right.

da-phil commented 8 years ago

@heckflosse & @Floessie Okay I guess I can reliably trigger the crash by 3 simple things:

  1. Set chrominance denoising method to "Manual" and set all 3 parameter below to 0
  2. Set chrominance denoising method to "Preview multi-zones"
  3. Save file (to JPG, TIF etc.)

I have the suspicion that those zeros are used for calculating the preview and lead to weird operations like division by zero or anything like that. Does that make sense?

In case the bug can only be triggered together with other settings I provided a *.pp3 profile which causes the crash: P8010001.ORF.pp3.zip

Here are two more backtraces in case they still help you:

Thread 91170 "rawtherapee" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffef27e4700 (LWP 6213)]
0x0000000000a48523 in rtengine::ImProcFunctions::Mad (this=<optimised out>, 
    DataList=<optimised out>, datalen=4011312)
    at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2159
2159            histo[min(255, abs(static_cast<int>(DataList[i])))]++;
(gdb) bt full
#0  0x0000000000a48523 in rtengine::ImProcFunctions::Mad (this=<optimised out>, 
    DataList=<optimised out>, datalen=4011312)
    at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2159
        i = <optimised out>
        histo = {4591, 4492, 4613, 4582, 4689, 4628, 4572, 4505, 4476, 4349, 4341, 4353, 4257, 
          4261, 4098, 4084, 4047, 3874, 3945, 3824, 3599, 3622, 3637, 3475, 3322, 3342, 3084, 
          3064, 3004, 2891, 2840, 2690, 2697, 2618, 2529, 2418, 2405, 2221, 2166, 2209, 1996, 
          1993, 1934, 1839, 1735, 1671, 1658, 1592, 1498, 1418, 1471, 1342, 1214, 1255, 1094, 
          1108, 1094, 1042, 1006, 934, 930, 833, 787, 808, 725, 690, 643, 662, 637, 590, 594, 553, 
          518, 524, 489, 432, 432, 445, 388, 372, 365, 348, 311, 296, 320, 280, 280, 288, 234, 
          245, 248, 211, 228, 169, 168, 164, 193, 156, 170, 130, 167, 140, 115, 126, 115, 110, 
          112, 109, 100, 84, 96, 82, 81, 54, 82, 81, 77, 69, 78, 75, 59, 51, 46, 49, 47, 44, 33, 
          50, 55, 35, 38, 42, 31, 29, 27, 18, 28, 20, 29, 23, 23, 16, 20, 14, 22, 18, 22, 16, 21, 
          17, 20, 10, 20, 8, 12, 9, 10, 10, 13, 4, 9, 12, 10, 9, 7, 5, 9, 8, 8, 10, 7, 4, 6, 8, 2, 
          4, 3, 4, 4, 4, 4, 2, 3, 1, 6, 5, 8, 1, 1, 4, 3, 7, 3, 1, 5, 2, 3, 2, 1, 3...}
        median = <optimised out>
        count = <optimised out>
        count_ = <optimised out>
#1  0x0000000000a5261d in rtengine::ImProcFunctions::ShrinkAllAB (madaab=0x0, madCalculated=false, 
    madL=<optimised out>, denoiseMethodRgb=<optimised out>, autoch=<optimised out>, 
    useNoiseCCurve=true, noisevar_ab=0.0199999996, noisevarchrom=0x7ffedc000030, dir=1, level=1, 
    buffer=0x7ffef27e3bc0, WaveletCoeffs_ab=..., WaveletCoeffs_L=..., this=0x7ffeee7db420)
    at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2757
        sfaveabd = 0x7fff24f4f700
        H_ab = 2316
        madab = <optimised out>
        sfaveab = 0x7fff24002130
        WavCoeffs_L = 0x7fff1405c030
        eps = 0.00999999978
        W_ab = 1732
        WavCoeffs_ab = <optimised out>
        mad_L = 6460.82617
#2  rtengine::ImProcFunctions::WaveletDenoiseAllAB ()
    at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2620
        lvl = 1
        dir = 1
        buffer = {0x7fff240020b0, 0x7fff24f4f600, 0x7fff25e9cbd0}
        WaveletCoeffs_L = <optimised out>
        WaveletCoeffs_ab = <optimised out>
        noisevarchrom = <optimised out>
        madL = <optimised out>
        noisevar_ab = <optimised out>
        useNoiseCCurve = <optimised out>
        autoch = <optimised out>
        denoiseMethodRgb = <optimised out>
        this = <optimised out>
        maxlvl = <optimised out>
        maxWL = <optimised out>
        maxHL = <optimised out>
        memoryAllocationFailed = <optimised out>
#3  0x00007ffff35d88ac in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
No symbol table info available.
#4  0x00007ffff31956fa in start_thread (arg=0x7ffef27e4700) at pthread_create.c:333
        __res = <optimised out>
        pd = 0x7ffef27e4700
        now = <optimised out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140732966782720, -4926994844410220113, 0, 
                140732899624703, 8388608, 140733529451904, 4927545710359835055, 
                4926984694572347823}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, 
            data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimised out>
        pagesize_m1 = <optimised out>
        sp = <optimised out>
        freesize = <optimised out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007ffff2ecbb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

And another one:

Thread 92117 "rawtherapee" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffef57fa700 (LWP 3644)]
0x0000000000a48523 in rtengine::ImProcFunctions::Mad (this=<optimised out>, 
    DataList=<optimised out>, datalen=4011312)
    at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2159
2159            histo[min(255, abs(static_cast<int>(DataList[i])))]++;
(gdb) bt full
#0  0x0000000000a48523 in rtengine::ImProcFunctions::Mad (this=<optimised out>, DataList=<optimised out>, datalen=4011312) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2159
        i = <optimised out>
        histo = {19849, 19611, 18945, 18030, 16964, 15639, 14304, 13252, 12455, 11285, 10450, 9650, 8945, 8049, 7698, 7025, 6493, 5997, 5761, 5277, 4922, 4534, 4315, 4026, 3843, 3472, 3303, 3126, 2982, 2779, 2640, 2479, 2335, 2241, 2076, 1929, 1801, 1815, 1692, 1531, 
          1504, 1368, 1272, 1240, 1147, 1092, 1080, 943, 907, 893, 810, 769, 686, 683, 635, 598, 562, 522, 459, 491, 456, 425, 397, 348, 300, 307, 305, 268, 245, 224, 241, 230, 204, 178, 183, 174, 166, 161, 140, 142, 95, 95, 105, 87, 81, 95, 75, 85, 69, 63, 46, 49, 50, 
          68, 48, 32, 33, 37, 26, 28, 32, 18, 21, 28, 29, 18, 19, 23, 15, 15, 24, 10, 10, 6, 8, 15, 8, 2, 9, 10, 3, 5, 8, 5, 4, 8, 8, 7, 1, 1, 8, 3, 4, 2, 3, 3, 4, 3, 0, 2, 2, 1, 0, 0, 3, 3, 0, 3, 2, 2, 2, 1, 5, 0, 0, 0, 1, 1, 3, 2, 0, 2, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 
          0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 3, 0, 0, 0, 1, 0, 0...}
        median = <optimised out>
        count = <optimised out>
        count_ = <optimised out>
#1  0x0000000000a5261d in rtengine::ImProcFunctions::ShrinkAllAB (madaab=0x0, madCalculated=false, madL=<optimised out>, denoiseMethodRgb=<optimised out>, autoch=<optimised out>, useNoiseCCurve=true, noisevar_ab=0.0199999996, noisevarchrom=0x7ffedc000030, dir=1, 
    level=0, buffer=0x7ffef57f9bc0, WaveletCoeffs_ab=..., WaveletCoeffs_L=..., this=0x7ffef47f7420) at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2757
        sfaveabd = 0x7fff98f4fd40
        H_ab = 2316
        madab = <optimised out>
        sfaveab = 0x7fff98002770
        WavCoeffs_L = 0x7fffc0002430
        eps = 0.00999999978
        W_ab = 1732
        WavCoeffs_ab = <optimised out>
        mad_L = 3854.9502
#2  rtengine::ImProcFunctions::WaveletDenoiseAllAB () at /home/phil/code/rawtherapee-git/rtengine/FTblockDN.cc:2620
        lvl = 0
        dir = 1
        buffer = {0x7fff980026f0, 0x7fff98f4fc40, 0x7fff99e9d210}
        WaveletCoeffs_L = <optimised out>
        WaveletCoeffs_ab = <optimised out>
        noisevarchrom = <optimised out>
        madL = <optimised out>
        noisevar_ab = <optimised out>
        useNoiseCCurve = <optimised out>
        autoch = <optimised out>
        denoiseMethodRgb = <optimised out>
        this = <optimised out>
        maxlvl = <optimised out>
        maxWL = <optimised out>
        maxHL = <optimised out>
        memoryAllocationFailed = <optimised out>
#3  0x00007ffff35d88ac in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
No symbol table info available.
#4  0x00007ffff31956fa in start_thread (arg=0x7ffef57fa700) at pthread_create.c:333
        __res = <optimised out>
        pd = 0x7ffef57fa700
        now = <optimised out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140733017204480, -2843665688640782642, 0, 140733000402687, 8388608, 140736420072016, 2843090642867420878, 2843639631854141134}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, 
              cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimised out>
        pagesize_m1 = <optimised out>
        sp = <optimised out>
        freesize = <optimised out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007ffff2ecbb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
heckflosse commented 8 years ago

@da-phil Thanks for the new backtraces. Unfortunately all the useful information is optimized out :(

@Floessie Now we at least know that the crash occurs at this line:

histo[min(255, abs(static_cast<int>(DataList[i])))]++;

There are only two possible reasons for the crash:

  1. size of Datalist is smaller than datalen
  2. min(255, abs(static_cast<int>(DataList[i]))) gives a value outside [0;255]

To check 1. I placed a printf where the Datalist is allocated and one on Mad(). Allocation of Datalist was always >= datalen

For 2. : @da-phil reports that it's only reproducible using -o3. May it be a compiler bug in gcc 5.4? I use gcc 5.3 and can't reproduce. What do you @Floessie use?

Beep6581 commented 8 years ago

I can reproduce the crash. softproofing branch, I ran RT, Preferences > Performance & Quality, set "Tool mode" to "Expert". Restart RT. File Browser, right-click on amsterdam.pef, "Processing profile operations > Clear", then "Processing profile operations > Apply > (Neutral)", open photo. Enable Noise Reduction, change "Chrominance - Master" from 15 to 0. Change "Method" to "Preview multi-zones", ctrl+s, save immediately as JPEG to /tmp/, OK, crash.

softproofing - RelWithDebInfo https://bpaste.net/show/04fa8f4fbeb0

I'm compiling a master debug build now.

Beep6581 commented 8 years ago

master - debug https://bpaste.net/show/3c47c9c09df4

da-phil commented 8 years ago

@heckflosse Now I can also crash it with a debug build without optimization turned on. Now I also hope the interesting stuff is not optimized out. Here's the build info:

Branch: master
Version: 4.2.1026
Changeset: b04596ec605e7c4febc09c9559b65020306ab249
Compiler: cc 5.4.0
Processor: x86_64
System: Linux
Bit depth: 64 bits
Gtkmm: V2.24.4
Build type: debug
Build flags: -std=c++11 -std=gnu++11  -Werror=unused-label -fopenmp -Werror=unknown-pragmas -g
Link flags:   
OpenMP support: ON
MMAP support: ON

Backtrace: https://bpaste.net/show/736abf8fe441

heckflosse commented 8 years ago

@Beep6581 Imho your crash is a different one! Can you reproduce it also in master branch?

da-phil commented 8 years ago

@heckflosse can you tell me where exactly you would put those printf statements or even better create a new "testing" branch which can be pulled by @Beep6581 and me to see the output before we crash RT?

heckflosse commented 8 years ago

@da-phil : I'm already inspecting your latest bt full. Thanks for your effort!!!

Ingo

heckflosse commented 8 years ago

@da-phi If I'm not able to detect the reason for the crash by your latest bt (which has a lot more informations) I'll tell you where to put the printfs tomorrow, ok? Give me some time, please!

Ingo

Floessie commented 8 years ago

@heckflosse I'm on 6.1.1.

Morgan's backtrace is interesting, because a NaN is used as index. As you've already said, this is a different problem, but consider the case, that DataList also holds a NaN. Converting that to an int is undefined behaviour and maybe GCC 5 accounts for that by not taking any precautions when it comes to abs(). Speaking of abs(): Are we sure, that this calls abs(int) and not std::abs(float) for some weird reason (e.g. wrong include)?

HTH, Flössie

heckflosse commented 8 years ago

@Beep6581 I can reproduce your crash.

heckflosse commented 8 years ago

3ddb417 fixes the crash @Beep6581 reported

heckflosse commented 8 years ago

@Floessie Now I think I know what happened. As you already wrote, converting NaN to an int is undefined behaviour. If it's done by SSE units the result is 0x80000000 (-2147483648). For this value the result of abs(int) is undefined too. On gcc 5.3 abs(-2147483648) returns -2147483648. This leads to the OOB access. Maybe the crash @da-phil reported is also fixed with my commit. We'll see.

Floessie commented 8 years ago

@heckflosse Ingo, your conclusions are fantastic! :+1: I failed to realize, that 2147483648 won't fit into an int. Now I'm excited, if that really was the case, but it's very likely.

da-phil commented 8 years ago

Commit 3ddb417 seemed to fix the crash, well done @heckflosse!

However I still wonder why the backtrace of @Beep6581 looks different to mine, although the cause for the crash was the same!?

heckflosse commented 8 years ago

@da-phil That's a thing I also wonder about. But do we know whether you used the same settings? Even a small difference in settings can make the crash appear at a different line of code.

heckflosse commented 8 years ago

@da-phil There's at least one thing which could explain the different backtraces: In case you denoised non raw files for example whereas @Beep6581 denoised a raw file

da-phil commented 8 years ago

@heckflosse actually I denoised a raw file too. how did you come to the conclusion I'd denoise a non-raw file? :D

heckflosse commented 8 years ago

@da-phil I didn' come to that conclusion, but it could explain the different backtrace

heckflosse commented 8 years ago

I close the issue now because it seems to be solved by 3ddb417