Replicated Heap for the HIP module

StanfordLegion / legion

The Legion Parallel Programming System

https://legion.stanford.edu

Apache License 2.0

Replicated Heap for the HIP module #1476

Open eddy16112 opened 1 year ago

eddy16112 commented 1 year ago

hipHostRegister does not guarantee that registered memory has the same virtual address on the host and on the GPU. However, the current ReplicatedHeap implementation in the CUDA module relies on UVA, so we cannot simply copy that implementation into the HIP module. For now, this affects the MultiAffineAccessor.

muraj commented 1 year ago

@eddy16112 please keep in mind that NVIDIA CUDA cannot guarantee the same CPU and GPU address either; it just happens to be the case on the majority of Linux systems. See CU_DEVICE_ATTRIBUTE_CAN_USE_HOST_POINTER_FOR_REGISTERED_MEM for more information. For portability reasons, we should not assume that the pointer passed to cuMemHostRegister is the same as the GPU one.

eddy16112 commented 1 year ago

Yeah, I think we need to check if the addresses are identical, and if not, we will need to raise an error.

muraj commented 1 year ago

But neither you nor the user can control this, so you could get intermittent errors. Why do these need to be the same address? Why not just translate the address before using it on the GPU? You cannot assume the same address across non-local CPU processors, so can we not use the same logic between local GPU and CPU processors?
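A minimal sketch of the translation idea raised above, assuming a single host-registered region whose device-side base address is already known. `RegisteredRegion` and its members are hypothetical names for illustration and are not part of Realm or Legion; in practice the device base would have to be queried from the runtime after registration (for example with `cuMemHostGetDevicePointer` or `hipHostGetDevicePointer`), which this sketch does not do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record of one host-registered region (not a Realm type).
// device_base is assumed to have been obtained from the GPU runtime after
// registration; here it is just a plain number.
struct RegisteredRegion {
  std::uintptr_t host_base;    // address of the region as seen by the CPU
  std::uintptr_t device_base;  // address of the same region as seen by the GPU
  std::size_t size;            // length of the region in bytes

  // True when CPU and GPU see the region at the same virtual address,
  // i.e. the case the current CUDA implementation relies on.
  bool addresses_identical() const { return host_base == device_base; }

  // Translate a host address into the matching device address by offset.
  // Returns false and leaves `out` untouched if host_addr is outside the region.
  bool to_device(std::uintptr_t host_addr, std::uintptr_t &out) const {
    if (host_addr < host_base || host_addr - host_base >= size)
      return false;
    out = device_base + (host_addr - host_base);
    return true;
  }
};

// Convenience wrapper for call sites that already know the address is in
// range; it asserts instead of reporting failure.
inline std::uintptr_t to_device_checked(const RegisteredRegion &r,
                                        std::uintptr_t host_addr) {
  std::uintptr_t dev = 0;
  bool ok = r.to_device(host_addr, dev);
  assert(ok && "host address is outside the registered region");
  (void)ok;  // avoid an unused-variable warning when NDEBUG is defined
  return dev;
}
```

With this shape, `addresses_identical()` corresponds to the check-and-error approach, while `to_device()` works in both cases and degenerates to the identity mapping when the two bases match.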

eddy16112 commented 1 year ago

That is a question for @streichler. AFAIK, without unified addressing, it will be much more complicated to implement the replicated heap.

elliottslaughter commented 6 months ago

It appears that unified addresses are supported with ROCm 5.6.0 and above.

elliottslaughter commented 6 months ago

Is this resolved or is there more to do here?