STEllAR-GROUP / octotiger

Astrophysics program simulating the evolution of star systems based on the fast multipole method on adaptive Octrees
http://octotiger.stellar-group.org/
Boost Software License 1.0

Work around segfaults on Intel GPUs #486

Closed G-071 closed 5 months ago

G-071 commented 5 months ago

I encountered segfaults within the oneAPI runtime when trying to run Octo-Tiger on an Intel GPU Max 1100. This seems to happen when too many kernels are launched asynchronously (or in parallel) while the first kernel has not yet finished, which is normal behavior for Octo-Tiger, as it launches a massive number of compute kernels. My best guess is that the runtime initializes something on the first kernel launch, and that this initialization needs to be complete before more kernels are launched.

An easy workaround is to launch some empty, synchronous dummy kernels right at the beginning of Octo-Tiger. Curiously, this is required once per library (hydrolib and octolib). With this workaround the issue is resolved entirely, and we can finally run Octo-Tiger properly on the Intel GPUs! Other HPX applications that use oneAPI might need a similar workaround, i.e. one synchronous kernel at the start.
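As a rough illustration of the launch order only, here is a minimal sketch in standard C++. It is not the code from this PR and does not use oneAPI/SYCL: `LibraryQueue`, `warm_up`, `launch_async`, and `run_many_kernels` are hypothetical names, and `std::async` stands in for the device queue. The assumption it models is the guess above, that the first launch has to finish before the asynchronous launches begin.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one library's device queue (one instance for
// hydrolib, one for octolib). The real code would use a oneAPI/SYCL queue.
struct LibraryQueue {
    std::string name;
    bool warmed_up = false;

    explicit LibraryQueue(std::string n) : name(std::move(n)) {}

    // Launch `work` asynchronously and return a future for its result.
    template <typename F>
    auto launch_async(F work) {
        return std::async(std::launch::async, std::move(work));
    }

    // The workaround: submit one empty "dummy kernel" and block until it
    // has finished, so that any one-time setup triggered by the first
    // launch is done before the real asynchronous launches start.
    void warm_up() {
        launch_async([] {}).get();
        warmed_up = true;
    }
};

// Launch `count` kernels asynchronously and sum their results.
// Assumes warm_up() has already been called on this queue.
inline int run_many_kernels(LibraryQueue& q, int count) {
    assert(q.warmed_up && "call warm_up() once per library first");
    std::vector<std::future<int>> pending;
    for (int i = 0; i < count; ++i) {
        pending.push_back(q.launch_async([i] { return i; }));
    }
    int sum = 0;
    for (auto& f : pending) {
        sum += f.get();
    }
    return sum;
}
```

In this sketch, startup code would construct one `LibraryQueue` per library and call `warm_up()` on each before the first call to `run_many_kernels`, mirroring the "once per library" observation.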