ROCm / HIP

HIP: C++ Heterogeneous-Compute Interface for Portability
https://rocmdocs.amd.com/projects/HIP/
MIT License

IPC Support #1005

Closed FinnStokes closed 5 months ago

FinnStokes commented 5 years ago

I am attempting to port existing code in my research group from CUDA to HIP so that we can run it on AMD devices. Using hipify-clang, the conversion went quite smoothly, but when I try to compile it with hipcc I get a number of errors about the IPC calls, for example:

kernel_gpu.hip.cpp:255:26: error: variable has incomplete type 'ihipIpcEventHandle_t'
    ihipIpcEventHandle_t sent_eventhandle[buf_n];
                         ^
/opt/rocm/hip/include/hip/hcc_detail/hip_runtime_api.h:92:8: note: forward declaration of 'ihipIpcEventHandle_t'
struct ihipIpcEventHandle_t;
       ^
kernel_gpu.hip.cpp:256:26: error: variable has incomplete type 'ihipIpcEventHandle_t'
    ihipIpcEventHandle_t recvd_eventhandle[buf_n];
                         ^
/opt/rocm/hip/include/hip/hcc_detail/hip_runtime_api.h:92:8: note: forward declaration of 'ihipIpcEventHandle_t'
struct ihipIpcEventHandle_t;
       ^
kernel_gpu.hip.cpp:284:53: error: invalid application of 'sizeof' to an incomplete type 'ihipIpcEventHandle_t'
                comm_sendrecv((void *)&(sent_eventhandle[buf_xl]),sizeof(ihipIpcEventHandle_t),Nxp, (void *)&(recvd_eventhandle[buf_xr]), sizeof(ihipIpcEventHandle_t),Nxm,0);
                                                                  ^     ~~~~~~~~~~~~~~~~~~~~~~
/opt/rocm/hip/include/hip/hcc_detail/hip_runtime_api.h:92:8: note: forward declaration of 'ihipIpcEventHandle_t'
struct ihipIpcEventHandle_t;
       ^

Is IPC not supported by the hcc backend of HIP? https://rocm.github.io/ROCmMultiGPU.html suggests that it should be supported, but these errors and the following section of hip_runtime_api.h seem to indicate otherwise:

//TODO: IPC implementation

#define hipIpcMemLazyEnablePeerAccess 0

typedef struct ihipIpcMemHandle_t *hipIpcMemHandle_t;

//TODO: IPC event handle currently unsupported
struct ihipIpcEventHandle_t;
typedef struct ihipIpcEventHandle_t *hipIpcEventHandle_t;

Or is it just the IPC event handle that is unsupported, and I should refactor the code to not rely on that part of the API?
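For readers hitting the same diagnostics, the header excerpt above accounts for them: `ihipIpcEventHandle_t` is only forward-declared, so it is an incomplete type, and C++ does not allow arrays of, or `sizeof` on, an incomplete type. Only the pointer typedef is usable. The following is a minimal plain C++ sketch of that rule, not HIP code; `demo_ipc_event_handle`, `demo_ipc_event_handle_ptr`, `kBufN`, and `pointer_array_bytes` are hypothetical names chosen to mirror the quoted declarations.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the declarations quoted from hip_runtime_api.h.
// The struct is forward-declared only, so it is an incomplete type here.
struct demo_ipc_event_handle;                                 // mirrors: struct ihipIpcEventHandle_t;
typedef struct demo_ipc_event_handle* demo_ipc_event_handle_ptr;  // mirrors: hipIpcEventHandle_t

constexpr std::size_t kBufN = 4;  // hypothetical buffer count, like buf_n in the report

// A pointer to an incomplete type is itself a complete type, so declaring an
// array of such pointers and taking its size both compile.
inline std::size_t pointer_array_bytes() {
    demo_ipc_event_handle_ptr handles[kBufN] = {};  // OK: array of pointers
    return sizeof(handles);
}

// Uncommenting either line below reproduces the kind of error in the report:
//   demo_ipc_event_handle bad[kBufN];                    // variable has incomplete type
//   std::size_t n = sizeof(demo_ipc_event_handle);       // sizeof on an incomplete type
```

This only illustrates why the compiler rejects the hipified code. Switching to the pointer typedef would make the declarations compile, but it would not provide working event IPC, since the header marks the event handle as unsupported in this version.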

emankov commented 5 years ago

@bensander, it looks like IPC is currently not supported by HIP, so I've just erroneously added such support in hipify-clang.

philomat commented 3 years ago

I am encountering a similar problem with IPC support, also while porting a CUDA application to HIP. Is there any progress on this issue?

ppanchad-amd commented 6 months ago

@emankov Is IPC supported on the latest ROCm 6.1.0 (HIP 6.1)? Thanks!

emankov commented 6 months ago

IPC is supported to some degree by HIP and the HIPIFY tools. hipIpcEventHandle_t has been supported by both since HIP 3.5.0. As for correctness, cc @mangupta.