piratenpanda closed this issue 2 years ago
perhaps the issue is related to your "-dirty" build.
remove the build directory and the executables directory and rebuild.
That does not change the version name, and darktable instantly crashed afterwards, so I'm afraid that's not the problem.
Just had another crash where bilateral.cc was the last item in the backtrace:
#0 0x00007fff70c20908 in process._omp_fn.1(void) ()
at /home/panda/Downloads/dtcompile/darktable/src/iop/bilateral.cc:224
pos = {16.9730854, 180.494263, -nan(0x400000), 1.06619692, 33.872673}
val = {-nan(0x400000), 0.213239372, 0.169363365, 1}
i = 83
in = 0xffc000003f049e60
thread = 3
index = 18428729679490842625
j = 456156224
ivoid = <optimized out>
roi_in = 0x7fffc5fd6520
ch = <optimized out>
Python Exception <class 'gdb.MemoryError'>: Cannot access memory at address 0xffc00000ffc00010
#1 0x00007ffff74cf6c6 in gomp_thread_start (xdata=<optimized out>)
at /build/gcc/src/gcc/libgomp/team.c:125
team = 0x7fffb001f310
task = 0x7fffb001fad8
data = <optimized out>
pool = 0x7fffb001f240
local_fn = 0x7fff70c207f0 <process._omp_fn.1(void)>
local_data = 0x7fffc5fd5fa0
#2 0x00007ffff753c259 in start_thread () at /usr/lib/libpthread.so.0
#3 0x00007ffff21e45e3 in clone () at /usr/lib/libc.so.6
If the version name still says "dirty", nothing was accomplished, and you have not confirmed whether the problem remains.
I removed /opt/darktable, did a complete fresh git checkout, ran submodule init, and switched to the CR3 branch. What else do you suggest to avoid leftovers from older builds? I don't see what else I can do, tbh.
something is failing if you still have a "dirty" build.
It also shows "dirty" for https://aur.archlinux.org/packages/darktable-cr3-git, so I don't know where this comes from.
It's dirty because of the switch to the CR3 rawspeed branch.
This issue did not get any activity in the past 60 days and will be closed in 365 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.
This still happens with a clean compilation from the latest git, and it happens frequently when using the surface blur module:
0x00007fffa5380908 in process._omp_fn.1(void) () at /home/panda/Downloads/dtcompile/darktable/src/iop/bilateral.cc:224
224 float pos[5] = { i * sigma[0], j * sigma[1], in[0] * sigma[2], in[1] * sigma[3], in[2] * sigma[4] };
(gdb) bt full
#0 0x00007fffa5380908 in process._omp_fn.1(void) ()
at /home/panda/Downloads/dtcompile/darktable/src/iop/bilateral.cc:224
pos = {223.440048, 37.7170868, nan(0x400000), 16.0017223, 14.6885347}
val = {nan(0x400000), 0.0800086111, 0.0734426752, 1}
i = -1077966591
in = 0x7fc000007fc00010
thread = 1
index = 2143289345
j = 1722680320
ivoid = <optimized out>
roi_in = 0x7fffcd7d7900
ch = <optimized out>
Python Exception <class 'gdb.MemoryError'>: Cannot access memory at address 0x7fc000007fc00010
#1 0x00007ffff73026c6 in gomp_thread_start (xdata=<optimized out>)
at /build/gcc/src/gcc/libgomp/team.c:125
team = 0x7fffb801d950
task = 0x7fffb801df68
data = <optimized out>
pool = 0x7fffb801d880
local_fn = 0x7fffa53807f0 <process._omp_fn.1(void)>
local_data = 0x7fffcd7d7380
#2 0x00007ffff72ce259 in start_thread () at /usr/lib/libpthread.so.0
#3 0x00007ffff1eed5e3 in clone () at /usr/lib/libc.so.6
I am wondering if this might be related to change #6158.
Before that change, omp_get_max_threads() was called here and space for that many threads was allocated within the PermutohedralLattice object. Then, in the following for loop, omp_get_thread_num() would be called and return a value from 0 to omp_get_max_threads() - 1.
But after the #6158 change, omp_get_num_procs() is called instead, via dt_get_num_threads(). While that value would typically be the same as omp_get_max_threads(), it's not guaranteed. If it does return a value less than omp_get_max_threads(), for whatever reason, then not enough space will be allocated within the PermutohedralLattice object on the stack, and some nearby local variables on the stack will be overwritten, resulting in a crash similar to the ones above.
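Here is a minimal, self-contained OpenMP sketch of that mismatch (hypothetical names, not darktable's actual code), assuming per-thread storage sized with omp_get_num_procs() but indexed with omp_get_thread_num(), which may legitimately reach omp_get_max_threads() - 1. Build with -fopenmp:

#include <cstdio>
#include <vector>
#include <omp.h>

int main()
{
  // Stand-in for the per-thread buffers inside PermutohedralLattice:
  // sized by the processor count, as dt_get_num_threads() would report
  // after the change discussed above.
  const int nslots = omp_get_num_procs();
  std::vector<int> per_thread(nslots, 0);

  // Simulate a configuration where OpenMP runs more threads than
  // processors (e.g. darktable started with "-t <procs + 1>").
  omp_set_num_threads(nslots + 1);

  #pragma omp parallel
  {
    const int tid = omp_get_thread_num(); // 0 .. omp_get_max_threads() - 1
    if(tid >= nslots)
      // In the real code this index would write past the allocation and
      // clobber neighbouring stack data, matching the corrupted locals
      // in the backtraces above.
      std::printf("thread %d has no slot (only %d allocated)\n", tid, nslots);
    else
      per_thread[tid]++; // safe only while tid < nslots
  }
  return 0;
}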
I can force a crash like this by running darktable with the command line parameter "-t N" to set the max threads. For example, I have a 6-core hyperthreaded CPU, so omp_get_num_procs() and omp_get_max_threads() both return 12. If I run darktable with "-t 13", it crashes when I enable the surface blur module.
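For reference, a tiny standalone check (again just a sketch, not darktable code) shows that forcing the thread count above the processor count makes omp_get_max_threads() exceed omp_get_num_procs(), which is exactly the condition the "-t 13" test triggers:

#include <cstdio>
#include <omp.h>

int main()
{
  std::printf("procs: %d, max threads: %d\n",
              omp_get_num_procs(), omp_get_max_threads());

  // Request one more thread than there are processors, roughly what
  // "darktable -t 13" does on a 12-way machine.
  omp_set_num_threads(omp_get_num_procs() + 1);

  std::printf("after override -> procs: %d, max threads: %d\n",
              omp_get_num_procs(), omp_get_max_threads());
  return 0;
}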
Reverting this change, dt still crashes as soon as I enable the surface blur module, but with related/the same errors:
#0 0x00007fff9a28698f in process._omp_fn.1(void) () at /home/panda/Downloads/dtcompile/darktable/src/iop/bilateral.cc:227
i = 1311
in = 0x7fff1c927200
thread = 2
index = 754460
j = 2145180196
ivoid = <optimized out>
roi_in = 0x7fdcda247fdcda24
ch = <optimized out>
sigma = {0.413977683, 0.413977683, 200, 200, 200}
lattice =
{nData = 1160720, nThreads = 4, scaleFactor = 0x7fffb403ee00, canonical = 0x7fffb40416a0, replay = 0x7fff44670010, hashTables = 0x7fffb4043b88}
Python Exception <class 'gdb.MemoryError'>: Cannot access memory at address 0x120
lattice =
#1 0x00007ffff73026c6 in gomp_thread_start (xdata=<optimized out>)
at /build/gcc/src/gcc/libgomp/team.c:125
team = 0x7fffb401d950
task = 0x7fffb401e040
data = <optimized out>
pool = 0x7fffb401d880
local_fn = 0x7fff9a286800 <process._omp_fn.1(void)>
local_data = 0x7fffcdfd8380
#2 0x00007ffff72ce259 in start_thread () at /usr/lib/libpthread.so.0
#3 0x00007ffff1eed5e3 in clone () at /usr/lib/libc.so.6
Could this potentially be an OpenMP issue?
Doesn't seem to happen without OpenCL; at least I haven't managed to get it to crash without it.
As per https://discuss.pixls.us/t/amd-opencl-problems-in-surface-blur-darktable-module/28507/13, it seems like NaNs in RCD, which cause green artifacts for me, will make the surface blur module crash. I'll open another issue for that.
Describe the bug/issue
While working on pictures, switching to another picture, or just opening the first image, darktable has been crashing for a while now with the same error in Permutohedral.h:485 (see attached stack trace).
To Reproduce
I really don't know how to reproduce it; as far as I can tell it happens randomly. Restarting darktable and running the same steps that led to the crash does not crash darktable again.
Stack trace
Platform
Additional context