Open kerwenwwer opened 2 months ago
Hi,
This would be very welcome indeed! The reason to use cublas is 2-fold:
Now my question is: do you happen to know whether hipblas is as efficient on NVIDIA as cublas is? One way to find out would be to simply benchmark both. If it is not, we might have to maintain two code paths: hipblas for AMD and cublas for NVIDIA. It would of course be much cleaner if the same implementation were optimal for both vendors.
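One way to run the comparison suggested above is to time the same GEMM call through each library and compare throughput. The sketch below is only a timing harness in plain C++. The names `gemm_flops`, `gflops` and `time_seconds` are hypothetical helpers, not part of gpu-burn. The callable passed in is assumed to wrap the cublas or hipblas GEMM call and to block until the GPU has finished (for example by synchronizing the device). Otherwise only the launch overhead is measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Floating-point operations for C = A * B with A (m x k) and B (k x n):
// one multiply and one add per inner-product term.
inline double gemm_flops(std::size_t m, std::size_t n, std::size_t k) {
    return 2.0 * static_cast<double>(m) * static_cast<double>(n) *
           static_cast<double>(k);
}

// Convert an operation count and a duration into GFLOP/s.
inline double gflops(double flops, double seconds) {
    assert(seconds > 0.0);
    return flops / seconds / 1e9;
}

// Average wall-clock seconds per call of `run` over `iters` calls.
// Assumption: `run` returns only after the work has completed.
template <typename F>
double time_seconds(F&& run, int iters) {
    assert(iters > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) {
        run();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count() / iters;
}
```

Running this once with a cublas-backed callable and once with a hipblas-backed callable on the same NVIDIA card, with identical matrix sizes, would give two GFLOP/s figures that can be compared directly.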
Thank you for your reply. I believe hipblas puts a similar level of stress on the GPU as its CUDA counterpart. What I've done so far is simply port your benchmark flow from the CUDA API to the HIP API. The main changes I made were:
- Replaced nvidia-smi with rocm-smi.
- No longer calling cuParamSetSize before launching a kernel function; instead, we use cuLaunchKernel.

The results of my current tests using hipblas with the same algorithm show that it can easily fully utilize the entire GPU on an AMD MI210. So I don't think supporting both hipblas and cublas is a problem: the code is written in basically the same way, and the only difference is the API names.
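For readers unfamiliar with the kernel launch change mentioned above: the older driver-API style set kernel arguments step by step (cuParamSetSize and related calls) before launching, whereas cuLaunchKernel takes all arguments at once as an array of pointers, one per kernel parameter. As far as I know, HIP's hipModuleLaunchKernel accepts the same kind of array. The sketch below builds only that array in plain C++, so it runs without a GPU. `CompareArgs` and `make_kernel_params` are hypothetical names, and the three fields are an assumed kernel signature, not necessarily the one gpu-burn uses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical argument bundle for a compare-style kernel (assumed signature).
struct CompareArgs {
    void*       results;      // device pointer to the result matrices
    void*       fault_count;  // device pointer to an error counter
    std::size_t iters;        // number of result matrices to compare
};

// Build the array that would be passed as the kernelParams argument:
// element i points at the host storage that holds kernel argument i.
// `args` must stay alive until the launch call has been made.
std::array<void*, 3> make_kernel_params(CompareArgs& args) {
    assert(args.iters > 0);
    return std::array<void*, 3>{{&args.results, &args.fault_count, &args.iters}};
}
```

The array's `data()` pointer is what would go into the kernelParams slot of the launch call, so no separate size-setting step is needed.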
However, there are some considerations for merging the code base:
Given these factors, if you want to merge the code bases, you may need to refactor gpu-burn. A better solution might involve separating the control and compute functions and using a build system that can accommodate both CUDA and HIP compilations.
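To make the idea of separating control and compute more concrete, here is a minimal sketch, assuming a small backend interface. All names (`ComputeBackend`, `CpuBackend`, `burn`) are hypothetical and not taken from gpu-burn. The CPU backend is only a stand-in so the example runs anywhere. In a real refactor the build system would compile exactly one GPU backend (one built on cublas or one built on hipblas) behind the same interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compute side: anything vendor-specific would live behind this interface.
class ComputeBackend {
public:
    virtual ~ComputeBackend() = default;
    virtual const char* name() const = 0;
    // C = A * B for square n x n row-major matrices.
    virtual void gemm(const std::vector<float>& a, const std::vector<float>& b,
                      std::vector<float>& c, std::size_t n) = 0;
};

// CPU stand-in (assumption for illustration only, not a GPU stress test).
class CpuBackend final : public ComputeBackend {
public:
    const char* name() const override { return "cpu-reference"; }
    void gemm(const std::vector<float>& a, const std::vector<float>& b,
              std::vector<float>& c, std::size_t n) override {
        assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                float sum = 0.0f;
                for (std::size_t k = 0; k < n; ++k) {
                    sum += a[i * n + k] * b[k * n + j];
                }
                c[i * n + j] = sum;
            }
        }
    }
};

// Control side: repeat the same multiplication and count the iterations
// whose result differs from the first one. It never touches a vendor API.
std::size_t burn(ComputeBackend& backend, std::size_t n, int iters) {
    std::vector<float> a(n * n, 1.0f), b(n * n, 2.0f), ref(n * n), c(n * n);
    backend.gemm(a, b, ref, n);
    std::size_t mismatches = 0;
    for (int i = 0; i < iters; ++i) {
        backend.gemm(a, b, c, n);
        if (c != ref) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

With this split, the control loop, temperature monitoring and reporting stay in ordinary C++ files, and only the backend source file needs to be compiled with the CUDA or HIP toolchain.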
Hello,
I've created an AMD-compatible fork of gpu-burn using HIP (Heterogeneous-Compute Interface for Portability). This version supports both NVIDIA and AMD GPUs, expanding the tool's utility across different hardware platforms.
Key features of the AMD-compatible version:
You can find the AMD-compatible version here: https://github.com/kerwenwwer/amd-gpu-burn
I'm interested in discussing the possibility of merging these changes back into the main repository to provide official support for both NVIDIA and AMD GPUs. I think it's best for the community 😁