Open ax3l opened 7 years ago
Building with CUDA_KEEP_FILES=ON and CUDA_SHOW_CODELINES=ON reveals that the error is triggered somewhere around src/picongpu/include/plugins/PhaseSpace/PhaseSpaceFunctors.hpp:131, struct FunctorBlock
(careful, it may also be the PTX lines below that: see snippet.txt)
grep -B 200 -A 500 "_ZN5PMacc6nvidia16gpuEntryFunctionINS_[... see above ...]_" build_picongpu/nvcc_tmp/main.ptx > cut.txt
: snippet.txt
Quick hack: reduce the shared memory of the phase space functor to 16 KB by setting maxShared = 16*1024
(this should halve the momentum (p) resolution for the user-selected range, from 1024 to 512 bins)
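The relation between the shared memory budget and the number of momentum bins can be sketched as below. This is only an illustration: the 32 KB default is inferred from the 1024 to 512 halving stated above, and the per-bin footprint (one float for each of 8 cells along the spatial axis of a super cell) as well as the names bytesPerValue, cellsPerSuperCell and momentumBins are assumptions, not taken from PhaseSpaceFunctors.hpp.

```cpp
#include <cstddef>

// Assumption (hypothetical, not from the PIConGPU source): every momentum bin
// stores one float per cell along the spatial axis of a super cell.
constexpr std::size_t bytesPerValue = sizeof(float); // assumed value type
constexpr std::size_t cellsPerSuperCell = 8;         // assumed super cell extent

// Number of momentum bins that fit into a given shared memory budget (bytes).
constexpr std::size_t momentumBins(std::size_t maxSharedBytes)
{
    return maxSharedBytes / (bytesPerValue * cellsPerSuperCell);
}

// Under these assumptions, halving the budget halves the bin count.
static_assert(momentumBins(32 * 1024) == 1024, "assumed default: 32 KB budget");
static_assert(momentumBins(16 * 1024) == 512, "quick hack: 16 KB budget");
```

Whatever the real per-bin footprint is, the bin count scales linearly with maxShared, which is why the hack costs exactly a factor of 2 in resolution.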
@PrometheusPi reports that compiling the Bremsstrahlung example with -G even works without the above hack.
We have two ways to mitigate this issue:
a) find the underlying cause, i.e. which part of cuSTL can no longer be optimized when compiling with -G
b) reduce the phase space size in momentum when compiling with -G
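Option b) could be sketched roughly like this, assuming the budget is a compile-time constant. nvcc defines the macro __CUDACC_DEBUG__ when device code is compiled with -G; the 32 KB value for regular builds is inferred from the bin counts quoted above, and the placement of this constant is hypothetical, not the actual PIConGPU code.

```cpp
#include <cstddef>

// Sketch only: pick a smaller shared memory budget for device-debug builds.
// __CUDACC_DEBUG__ is defined by nvcc when compiling with -G.
#if defined(__CUDACC_DEBUG__)
constexpr std::size_t maxShared = 16 * 1024; // debug build: half the momentum bins
#else
constexpr std::size_t maxShared = 32 * 1024; // assumed default for regular builds
#endif
```

This would keep the full momentum resolution in release builds, while debug builds silently run with half the bins, which is worth mentioning in the plugin documentation if this route is taken.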
device-side debug mode (somewhat related to #469)
Compiling with -DCUDA_NVCC_FLAGS_DEBUG="-g;-G" reveals the following compile issue:
The entry function in the default LWFA example is: