ROCm / rocSPARSE

Next generation SPARSE implementation for ROCm platform
https://rocm.docs.amd.com/projects/rocSPARSE/en/latest/
MIT License

Question about reporting performance of AMD MI250 Accelerator #398

Closed: pmpakos closed this issue 3 months ago

pmpakos commented 3 months ago

Hello, I do not have an issue with rocSPARSE itself, but I do not know where else to ask.

When I write a simple program and run it on an AMD MI250 GPU, it runs on only one of the MI250's two GCDs. In rocm-smi I can see two "separate" GPUs, and only one of them is busy while the benchmark runs.

When benchmarking this GPU, should I rewrite my program to run on both GCDs, or leave it as it is and underutilize the available hardware?

What convention do you (the developers of rocSPARSE) follow when reporting rocSPARSE performance?

Thank you.

ntrost57 commented 3 months ago

Typically, we benchmark using a single GCD. But you could also use 2 GCDs if you prefer.
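
For anyone reporting numbers either way, here is a minimal sketch of how a benchmark driver might restrict a run to chosen GCDs and label the result with the GCD count. It is not how the rocSPARSE developers run their benchmarks. `HIP_VISIBLE_DEVICES` is the real ROCm environment variable for limiting visible devices. The helper names (`env_for_gcds`, `spmv_gflops`, `report`), the count of 2 flops per nonzero for SpMV, and the timings are assumptions made for illustration, not measured data.

```python
import os


def env_for_gcds(gcd_indices, base_env=None):
    """Return a copy of the environment that exposes only the given GCDs.

    HIP_VISIBLE_DEVICES limits which devices a HIP program can see; each
    MI250 GCD is enumerated as its own device index.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HIP_VISIBLE_DEVICES"] = ",".join(str(i) for i in gcd_indices)
    return env


def spmv_gflops(nnz, seconds):
    """GFLOP/s for one SpMV, assuming 2 flops (multiply + add) per nonzero."""
    return 2.0 * nnz / seconds / 1e9


def report(label, nnz_per_gcd, seconds_per_gcd):
    """Format a throughput line that states how many GCDs were used.

    When several GCDs run concurrently, the wall time is that of the
    slowest GCD, so the aggregate rate uses max(seconds_per_gcd).
    """
    total_nnz = sum(nnz_per_gcd)
    wall = max(seconds_per_gcd)
    n = len(nnz_per_gcd)
    return f"{label}: {spmv_gflops(total_nnz, wall):.1f} GFLOP/s on {n} GCD(s)"


if __name__ == "__main__":
    # Hypothetical timings, for illustration only (not measured data).
    print(env_for_gcds([0], base_env={})["HIP_VISIBLE_DEVICES"])
    print(report("single GCD", [50_000_000], [0.001]))
    print(report("full MI250", [25_000_000, 25_000_000], [0.0005, 0.00052]))
```

The environment returned by `env_for_gcds([0])` could be passed to `subprocess.run(..., env=...)` when launching a benchmark binary, so that only the first GCD is visible to it. Whichever setup is used, stating the GCD count next to each figure keeps single-GCD and full-card numbers from being compared by mistake.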