Closed by weqoll 1 year ago
Dear @weqoll, 0.5.0 is an ancient version; I suggest testing at least 0.6.0 or the current dev branch. The dev branch will soon be released as 0.7.0. Switching versions mostly requires only minor parameter changes.
Note that at the moment we do not support compiling PIConGPU with spack. You can still use it to manage your dependencies, but we are currently not updating our spack recipe.
This error could have several causes. You might be compiling for the wrong compute architecture, or there could be a driver issue. Running out of memory is also possible, because the Xserver runs on all of your GPUs. In addition, while the Xserver is active, any kernel that runs longer than 13 seconds under Linux is killed, which crashes the simulation. I suggest disabling the Xserver and using this machine only via a terminal, without a GUI. The Xserver is the most likely root of your issue and could explain why some simulations passed and others crashed.
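One way to check the Xserver and memory hypotheses is to ask `nvidia-smi` which GPUs currently drive a display and how much memory is already in use. The following is a minimal Python sketch, not part of PIConGPU. The helper names are hypothetical, and it assumes the installed driver supports the `display_active`, `memory.used`, and `memory.total` query fields (`nvidia-smi --help-query-gpu` lists what is available).

```python
import subprocess

# Assumed query fields; check `nvidia-smi --help-query-gpu` on your system.
QUERY = "index,name,display_active,memory.used,memory.total"


def _to_int(text):
    """Convert a numeric field to int; return None for values like '[N/A]'."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_gpu_rows(csv_text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, name, display, used, total = [f.strip() for f in line.split(",")]
        rows.append({
            "index": int(index),
            "name": name,
            "display_active": display.lower() == "enabled",
            "memory_used_mib": _to_int(used),
            "memory_total_mib": _to_int(total),
        })
    return rows


def gpus_with_display(rows):
    """Return the indices of GPUs that report an active display."""
    return [row["index"] for row in rows if row["display_active"]]


def query_gpus():
    """Run nvidia-smi once and return the parsed rows."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_rows(result.stdout)
```

Calling `gpus_with_display(query_gpus())` before starting a simulation shows which devices have a display attached and are therefore candidates for the kernel time limit described above.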
@weqoll any status updates? Did you encounter further issues? Did it work? If the suggestions by @psychocoderHPC resolved your problems, please remember to close the issue ;), thanks!
Sorry for leaving this issue in such a state; I wasn't able to work on it for some time. I'll reopen the issue if necessary.
Hello everyone!
My program, based on PIConGPU 0.5.0, sometimes gets an error message like this during calculations:
The main issue is the last line of this message, which reports that illegal instructions were encountered. The error reproduces randomly; the calculations in which I came across it have nothing in common.
I first ran into this during long-running calculations, so I suspected a hardware issue with the temperature of my GPUs. My attempts to debug it have not revealed the root cause. Moreover, the error now also occurs in short calculations, such as the one shown above.
GPU state at the time of termination:
The main explanation I can think of is a possible memory overflow. However, I had launched three identical calculations before, and they finished successfully. Could this be a memory leak?
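To tell a gradual leak apart from a random failure, one option is to log the used GPU memory at intervals while the simulation runs and check whether it keeps climbing. This is a minimal Python sketch with hypothetical helper names. It assumes `nvidia-smi` is on the `PATH` and supports the `memory.used` query field, and the growth threshold is an arbitrary illustrative value.

```python
import subprocess
import time


def sample_memory_used(gpu_index):
    """Return the used memory in MiB that nvidia-smi reports for one GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits", "-i", str(gpu_index)],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def collect_samples(gpu_index, count, interval_s):
    """Take `count` samples, waiting `interval_s` seconds between them."""
    samples = []
    for i in range(count):
        samples.append(sample_memory_used(gpu_index))
        if i < count - 1:
            time.sleep(interval_s)
    return samples


def is_steadily_growing(samples, min_increase_mib=50):
    """True if usage never drops and rises by at least `min_increase_mib` overall."""
    if len(samples) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_drops and (samples[-1] - samples[0]) >= min_increase_mib
```

For example, `is_steadily_growing(collect_samples(0, 20, 30.0))` samples GPU 0 every 30 seconds for about ten minutes. A flat series would suggest that the failure is not caused by memory growing over time.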
My program calculates the interaction between a laser pulse and neutral argon gas, with ionization during pulse propagation. Ionization is implemented in the PIC code with the ADKLin model. The PIConGPU version is 0.5.0.
Could you help me with this? Maybe you have some experience debugging such issues. Thanks in advance for your help!
Best regards, Egor Astashkin