acts-project / traccc

Demonstrator tracking chain on accelerators
Mozilla Public License 2.0

Clusterization Bugfix, main branch (2024.09.21.) #708

Closed krasznaa closed 2 months ago

krasznaa commented 2 months ago

This PR is just here to help in the explanation of a bug that I'll open an issue about shortly.

stephenswat commented 2 months ago

Looks like you found it. :+1:

krasznaa commented 2 months ago

Indeed. No thanks to cuda-gdb with that one... :frowning: Even though in Debug mode we don't ask for any optimizations from nvcc, the compiled code still kept behaving very weirdly... :confused:

It was the debug SYCL build, run on the host CPU, that finally let me understand the issue, since there gdb-oneapi was actually giving me meaningful values for the variables in the problematic thread. :thinking:

So we may not want to get rid of SYCL all too soon, after everything that's been said recently.

stephenswat commented 2 months ago

Closes #709.

stephenswat commented 2 months ago

> Indeed. No thanks to cuda-gdb with that one... 😦 Even though in Debug mode we don't ask for any optimizations from nvcc, the compiled code still kept behaving very weirdly... 😕

Yes, I have been seeing this too; cuda-gdb (and to a lesser extent compute-sanitizer) has been very unhelpful lately. I wonder if some update on NVIDIA's end broke the tools, or if we're somehow using the wrong compiler flags? :thinking:

krasznaa commented 2 months ago

Maybe we need to pass -O0 explicitly now, since CMake doesn't add it automatically? :thinking:

Host compilers of course don't do any optimizations unless asked to, but maybe nvcc has now adopted icpx's behaviour: if you don't ask for anything, it tries to use some aggressive optimization to "help you"... :confused:
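
For reference, a minimal sketch of what asking for this explicitly could look like in CMake. It assumes the CUDA language is already enabled at this point and that the Debug flags are not overridden elsewhere; whether the traccc build already does something equivalent has not been checked here. Per the nvcc documentation, -O sets the optimization level for host code, while -G generates debug information for device code and turns off device-code optimizations.

```cmake
# Sketch only: append explicit "no optimization" flags to CUDA Debug builds.
# Assumption: this runs after project(... LANGUAGES CUDA) or
# enable_language(CUDA), so CMAKE_CUDA_FLAGS_DEBUG is already initialized.
#   -O0 : no optimization for the host side of the CUDA sources
#   -G  : device debug info, with device-code optimizations disabled
string(APPEND CMAKE_CUDA_FLAGS_DEBUG " -O0 -G")

# Print the resulting flags so the configure log shows what nvcc will get.
message(STATUS "CUDA Debug flags: ${CMAKE_CUDA_FLAGS_DEBUG}")
```

These flags only take effect when configuring with -DCMAKE_BUILD_TYPE=Debug (or selecting the Debug configuration in a multi-config generator).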