kingcrimsontianyu opened 5 years ago
I second this request.
Created an internal ticket SWDEV-180694 to track it. It'd be highly desirable to have this API implemented so that machine learning frameworks can schedule available GPU resources efficiently.
Relevant code in TensorFlow:
Without this function implemented in HIP, grid/block size selection on AMD hardware will always be sub-optimal.
I think this can be closed as of ROCm 2.7?
https://github.com/ROCm-Developer-Tools/HIP/blob/854768787ee9bbd6ed22b3e8fd0f139955a57e6a/src/hip_module.cpp#L1015
The HIP implementation is not comparable to the corresponding CUDA function, which takes a function so that the dynamic shared memory can be a function of the block size.
Cc: @nbeams
I would further clarify that we would like a HIP version of the driver API function cuOccupancyMaxPotentialBlockSize, which I believe corresponds to cudaOccupancyMaxPotentialBlockSizeVariableSMem in the runtime API.
I see this was left as a TODO in https://github.com/ROCm-Developer-Tools/HIP/pull/1943/files#diff-9ec4991aeca8528b60eaf6d00b089eecda171d49742e348561c957c5fa2000feR1328-R1342
@gargrahul Can you suggest a workaround?
Hello, I was wondering whether this is still being worked on. It's been two years since the last update here, and unless I'm making a fairly bad user error, it still doesn't work (it somehow breaks calls that occur before I even call it).
@kingcrimsontianyu @0x0015 Could you please test with the latest ROCm 6.1.0 (HIP 6.1)? Thanks!
@0x0015 Have you tried with the latest ROCm 6.1.2? Thanks!
The occupancy calculator API is an invaluable asset in CUDA. Unfortunately, hipOccupancyMaxPotentialBlockSize is only exposed on Nvidia GPUs for the time being. It would be immensely helpful if it were implemented for AMD GPUs.