Closed matinraayai closed 1 week ago
@matinraayai Internal ticket has been created to investigate this issue. Thanks!
Hello @matinraayai,
Thanks for the clear issue report! However, the way this code is written is intentional.
If you notice this line:
executables[((ExecutableImpl*)executable)->id()] = nullptr;
in the DestroyExecutable function, you will see that the executables array is indexed into with the id() of the deleted object. So, if you removed this entry from the array entirely, you would need to reindex the ids of all subsequent entries before the next time this function is used. The entry is instead set to nullptr to avoid this reindexing step.
Does encountering a nullptr when iterating over the executables cause a problem?
Hello @matinraayai,
Thanks for the clear issue report! However, the way this code is written is intentional.
If you notice this line:
executables[((ExecutableImpl*)executable)->id()] = nullptr;
in the DestroyExecutable function, you will see that the executables array is indexed into with the id() of the deleted object. So, if you removed this entry from the array entirely, you would need to reindex the ids of all subsequent entries before the next time this function is used. The entry is instead set to nullptr to avoid this reindexing step.
Is indexing over a vector the best way to go? It seems a set or an unordered map might do a better job here; The only usages I've seen for executables
are either iterations over all entries, or removal of an entry (which the map data structures make easier). But this is an implementation detail.
Does encountering a nullptr when iterating over the executables cause a problem?
I believe so; The vendor loader API doesn't say anywhere that it might return hsa_executable_t{0}
and the user has to check to make sure the handles are valid.
It seems all other loader functions that iterate over executables, they check if the iteration element is a nullptr
before proceeding; I think the same check should be added to the executable iteration function of the loader API.
The vendor loader API doesn't say anywhere that it might return hsa_executable_t{0} and the user has to check to make sure the handles are valid. It seems all other loader functions that iterate over executables, they check if the iteration element is a nullptr before proceeding; I think the same check should be added to the executable iteration function of the loader API.
This is a good point, I am checking in with our internal team to see if this nullptr check can be patched in, or some documentation added.
Hi @matinraayai, a nullptr check has been added for consistency with other loader functions that iterate over executables - you'll see this code change in a future release of ROCm.
Thanks for bringing this up!
Problem Description
The destruction of an HSA
Executable
usinghsa_executable_destroy
will result in the loader deleting theExecutable
object previously allocated. However, instead of erasing the pointer in thestd::vector
ofExecutable
s currently loaded, it equates it to anullptr
: https://github.com/ROCm/ROCR-Runtime/blob/3f6ffc5b1167a43dc5e169db85655182a4c5947c/src/loader/executable.cpp#L242-L257After the destruction of any number of Executables, if one invokes the
hsa_ven_amd_loader_iterate_executables
function to iterate over all executables in HSA, the iteration also returns thenullptr
that was placed instead of the destroyed executables:https://github.com/ROCm/ROCR-Runtime/blob/3f6ffc5b1167a43dc5e169db85655182a4c5947c/src/loader/executable.cpp#L259-L276
This does not seem to be expected behavior. cc: @kzhuravl
Operating System
22.04
CPU
N/A
GPU
AMD Instinct MI100
Other
No response
ROCm Version
ROCm 6.0.0
ROCm Component
ROCR-Runtime
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response