Hi @jglaser, apologies for the lack of response. Could you confirm whether this issue has been fixed with the latest ROCm 6.0.2 (HIP 6.0.32831)? Thanks.
Closing the issue as it is stale and there has been no response from @jglaser. Please re-open if this issue still exists with the latest ROCm 6.0.2 (HIP 6.0.32831). Thanks.
When shared libraries containing GPU kernels are loaded (in Python, using `import ..`) after the GPU context has already been initialized and kernels have already been launched, HIP is unaware of the newly loaded kernels. The following Python script (requiring HOOMD-blue (hip branch) and hoomd-benchmarks (next branch)) demonstrates this issue. This may not be a minimal reproducer, but it is as minimal as I can currently provide.

I get the following error (after some output confirming that the first part of the script executes successfully):
If I uncomment the highlighted line, i.e., load the library `hpmc` containing the additional kernel symbols before executing any other kernel, the error goes away.

I observed this behavior with a Vega 20 GPU on a custom HIP branch, but it should also be reproducible with HIP 2.10 or `master`.
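
The workaround described above (load every kernel-containing library before the first kernel launch) could be sketched roughly as follows. This is only an illustration, not the reproducer script from this report: `preload_modules` and `KERNEL_MODULES` are hypothetical names, and the module names `hoomd` and `hoomd.hpmc` are an assumption about how the libraries are imported in a given installation.

```python
import importlib


def preload_modules(module_names):
    """Import each named module now and return the names that loaded.

    Modules that are not installed are skipped, so this sketch also runs
    on a machine without the GPU packages.
    """
    loaded = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Not available in this environment; skip it.
            continue
        loaded.append(name)
    return loaded


# Assumed module names: every package whose shared library carries GPU
# kernels should be listed here, *before* any device context is created
# or any kernel is launched.
KERNEL_MODULES = ["hoomd", "hoomd.hpmc"]
loaded = preload_modules(KERNEL_MODULES)

# ... only now initialize the GPU context and run the simulation ...
```

Importing up front only sidesteps the problem; it does not address HIP failing to register kernels from libraries loaded later.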