darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

coredump with llvm/clang #17578

Closed mabod closed 3 weeks ago

mabod commented 3 weeks ago

### Describe the bug

darktable 4.8.1 on EndeavourOS, built with clang/llvm, crashes.

### Steps to reproduce

On EndeavourOS or Arch Linux:

1. In /etc/makepkg.conf add

    export CC=clang
    export CXX=clang++
    CFLAGS="-march=x86-64-v3 -O2 -pipe -fno-plt -fexceptions \
            -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security \
            -fstack-clash-protection -fcf-protection" # -fpermissive
    CXXFLAGS="$CFLAGS -Wp,-D_GLIBCXX_ASSERTIONS"
    LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -fuse-ld=lld"
    LTOFLAGS="-flto=auto"
    RUSTFLAGS="-C opt-level=2 -C target-cpu=native"


2. `makepkg -si`

3. Open darktable and open an image in darkroom

### Expected behavior

_No response_

### Logfile | Screenshot | Screencast

    ╰─# coredumpctl info 1024678
               PID: 1024678 (darktable)
               UID: 1000 (matthias)
               GID: 1000 (matthias)
            Signal: 11 (SEGV)
         Timestamp: Wed 2024-10-02 12:06:18 CEST (36s ago)
      Command Line: darktable
        Executable: /usr/bin/darktable
     Control Group: /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-8b8822d0-a6a>
              Unit: user@1000.service
         User Unit: vte-spawn-8b8822d0-a6a6-4da3-bfa7-28dbdb6a23a9.scope
             Slice: user-1000.slice
         Owner UID: 1000 (matthias)
           Boot ID: f9c7f7cbbe9b4b31a1b4e0f645726786
        Machine ID: 4bd88beaa35549b5922de02c8064cbf1
          Hostname: rakete
           Storage: /var/lib/systemd/coredump/core.darktable.1000.f9c7f7cbbe9b4b31a1b4e0f645726786.1024678.1727863578000000.zst>
      Size on Disk: 110.3M
           Message: Process 1024678 (darktable) of user 1000 dumped core.

            Module [dso] without build-id.
            Module [dso] without build-id.
            Module [dso] without build-id.
            Module [dso] without build-id.
            Module [dso] without build-id.
            Module [dso] without build-id.
            Module [dso] without build-id.
            Module [dso] without build-id.
            Module [dso] without build-id.
            Stack trace of thread 1024690:
            #0  0x0000000000000000 n/a (n/a + 0x0)
            #1  0x0000000000000000 n/a (n/a + 0x0)
            ELF object binary architecture: AMD x86-64

Console output: [darktable-crash-console-output.txt](https://github.com/user-attachments/files/17227934/darktable-crash-console-output.txt)

Backtrace file: [darktable_bt_6ANRU2.txt](https://github.com/user-attachments/files/17227917/darktable_bt_6ANRU2.txt)

### Commit

_No response_

### Where did you obtain darktable from?

downloaded from www.darktable.org

### darktable version

4.8.1

### What OS are you using?

Linux

### What is the version of your OS?

Endeavouros (rolling)

### Describe your system?

    System:
      Kernel: 6.6.53-273.1-tkg-eevdf arch: x86_64 bits: 64
      Desktop: GNOME v: 47.0 Distro: EndeavourOS
    Machine:
      Type: Desktop System: Gigabyte product: X570 AORUS ULTRA v: -CF serial:
      Mobo: Gigabyte model: X570 AORUS ULTRA serial:
      UEFI: American Megatrends LLC. v: F38 date: 03/22/2024
    CPU:
      Info: 12-core model: AMD Ryzen 9 5900X bits: 64 type: MT MCP cache: L2: 6 MiB
      Speed (MHz): avg: 3597 min/max: 550/4951
      cores: 1: 3597 2: 3597 3: 3597 4: 3597 5: 3597 6: 3597 7: 3597 8: 3597
        9: 3597 10: 3597 11: 3597 12: 3597 13: 3597 14: 3597 15: 3597 16: 3597
        17: 3597 18: 3597 19: 3597 20: 3597 21: 3597 22: 3597 23: 3597 24: 3597
    Graphics:
      Device-1: Advanced Micro Devices [AMD/ATI] Navi 23 [Radeon RX 6650 XT / 6700S 6800S] driver: amdgpu v: kernel
      Device-2: Logitech Brio 100 driver: snd-usb-audio,uvcvideo type: USB
      Display: x11 server: X.org v: 1.21.1.13 with: Xwayland v: 24.1.2
        driver: X: loaded: amdgpu dri: radeonsi gpu: amdgpu
        resolution: 1: 2560x1440~60Hz 2: 1920x1080~60Hz
      API: EGL v: 1.5 drivers: kms_swrast,radeonsi,swrast platforms: gbm,x11,surfaceless,device
      API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 24.2.3-arch1.1
        renderer: AMD Radeon RX 6650 XT (radeonsi navi23 LLVM 18.1.8 DRM 3.54 6.6.53-273.1-tkg-eevdf)



### Are you using OpenCL GPU in darktable?

Yes

### If yes, what is the GPU card and driver?

AMD Radeon RX 6650 XT

### Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip

_No response_
ralfbrown commented 3 weeks ago

Unfortunately there is no line-number information in the backtrace, but it seems to be crashing while firing off parallel threads in OpenMP to process the for() loop in dt_view_image_get_surface. Can you try running

OMP_THREAD_LIMIT=1 OMP_NUM_THREADS=1 darktable -t 1

(this will be slow since parallelized code will only be using one core)

mabod commented 3 weeks ago

No, that does not work.

╰─# OMP_THREAD_LIMIT=1 OMP_NUM_THREADS=1 darktable -t 1
     0.0011 [dt_init --threads] using 1 threads for openmp parallel sections

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.archlinux.org>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
[New LWP 1091371]
...
backtrace written to /tmp/darktable_bt_VEVIU2.txt
zsh: segmentation fault (core dumped)  OMP_THREAD_LIMIT=1 OMP_NUM_THREADS=1 darktable -t 1
ralfbrown commented 3 weeks ago

That means the problem is almost certainly an incompatibility between one of your custom compilation options and the OpenMP library, since we've ruled out multiple threads somehow clobbering each other. Start by building with CFLAGS and LDFLAGS commented out and (if that works) adding back one option at a time.
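
To make "adding back one option at a time" concrete, here is a minimal dry-run sketch. It only prints the CFLAGS line to try at each step, using the flags from the report; the helper name `cumulative_flags` is made up for illustration, and rebuilding and testing after each step is left to you.

```shell
#!/bin/sh
# Flags copied from the CFLAGS line in the report above.
FLAGS="-march=x86-64-v3 -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection"

# Hypothetical helper: print the flag set for each step, one per line.
# Step 1 is the first flag alone, step 2 the first two flags, and so on.
cumulative_flags() {
  acc=""
  for f in "$@"; do
    acc="${acc:+$acc }$f"
    printf '%s\n' "$acc"
  done
}

# Word splitting of $FLAGS is intentional here.
cumulative_flags $FLAGS | while IFS= read -r cflags; do
  # Paste this line into /etc/makepkg.conf, rebuild, and test.
  printf 'CFLAGS="%s"\n' "$cflags"
done
```

The first step that crashes points at the flag added in that step. Keeping the original order means `-Wformat` is already present when `-Werror=format-security` is added.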

mabod commented 3 weeks ago

I did compile with one compiler option at a time and the coredump occurs with option "-fno-plt". This is reproducible.

This is what the GCC documentation says about it (https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#index-fno-plt):

-fno-plt

    Do not use the PLT for external function calls in position-independent code. Instead, load the callee address at call sites from the GOT and branch to it. This leads to more efficient code by eliminating PLT stubs and exposing GOT loads to optimizations. On architectures such as 32-bit x86 where PLT stubs expect the GOT pointer in a specific register, this gives more register allocation freedom to the compiler. Lazy binding requires use of the PLT; with -fno-plt all external symbols are resolved at load time.

    Alternatively, the function attribute noplt can be used to avoid calls through the PLT for specific external functions.

    In position-dependent code, a few targets also convert calls to functions that are marked to not use the PLT to use the GOT instead.

But I did not find this option in the clang documentation.
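
Whether or not the flag is documented, a quick probe shows whether a given compiler accepts it at all. This is a rough sketch, not something from the darktable build: the helper name `accepts_flag` is made up, and it only checks that the compiler driver takes the flag without complaint, not that the resulting code behaves.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if compiler $1 accepts flag $2 when
# checking a trivial C program read from stdin. -fsyntax-only avoids
# writing any output file; -Werror makes argument warnings fatal.
accepts_flag() {
  printf 'int main(void) { return 0; }\n' |
    "$1" "$2" -Werror -fsyntax-only -x c - 2>/dev/null
}

for cc in clang gcc; do
  if ! command -v "$cc" >/dev/null 2>&1; then
    echo "$cc: not installed"
  elif accepts_flag "$cc" -fno-plt; then
    echo "$cc: accepts -fno-plt"
  else
    echo "$cc: rejects -fno-plt"
  fi
done
```

The output depends on which compilers and versions are installed, so none is shown here.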

mabod commented 3 weeks ago

Now that I have a working LLVM-compiled darktable, I ran a benchmark and compared the time spent in pixelpipe with gcc vs. clang.

clang creates the slower binary: average= 3.520 secs +/- 0.064 secs (time in pixelpipe). The gcc binary is faster: average= 2.997 secs +/- 0.002 secs (time in pixelpipe).
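
For anyone who wants to reproduce this kind of "average +/- deviation" summary from their own runs, here is a small sketch. It assumes one pixelpipe timing per line on stdin and uses the sample standard deviation (n-1 denominator); how the numbers above were actually computed is not stated in this thread, and the helper name `mean_stddev` is made up.

```shell
#!/bin/sh
# Hypothetical helper: read one timing in seconds per line from stdin
# and print the mean and the sample standard deviation.
mean_stddev() {
  awk '
    NF { n++; sum += $1; sumsq += $1 * $1 }   # skip blank lines
    END {
      if (n == 0) { print "no samples"; exit 1 }
      mean = sum / n
      var = (n > 1) ? (sumsq - sum * sum / n) / (n - 1) : 0
      if (var < 0) var = 0                     # guard against rounding error
      printf "average= %.3f secs +/- %.3f secs\n", mean, sqrt(var)
    }'
}

# Placeholder timings for illustration, not measurements from this thread.
printf '%s\n' 1.0 2.0 3.0 | mean_stddev
# prints: average= 2.000 secs +/- 1.000 secs
```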

Anyways, you can close this issue if you want or you can use it for further clang discussion. I am good.

victoryforce commented 3 weeks ago

> clang creates the slower binary: average= 3.520 secs +/- 0.064 secs (time in pixelpipe). The gcc binary is faster: average= 2.997 secs +/- 0.002 secs (time in pixelpipe).

Yes, and for the darktable code this speed relationship has been observed for a long time. clang tends to compile slightly faster with the same compile options, but the clang binary is always noticeably slower (outside the statistical error).

> Anyways, you can close this issue if you want or you can use it for further clang discussion. I am good.

You can further discuss clang here regardless of the issue status, but since this is not a darktable bug, I'm closing the issue.