harrisonsz closed this issue 4 weeks ago
I've tested on an RX7900XTX, and there is still no improvement.
Hi @harrisonsz, an internal ticket has been created to investigate this issue. Thanks!
Hi @harrisonsz, thanks for pointing this out! Unfortunately, this is a known issue. I actually see a performance loss with the graph version of your code on ROCm 6.2 with a 7900XTX! Scaling the problem up to N = 1024 1024 100, the graph version outperforms the stream version by only 2%.
While I don't think we have any public-facing documentation about this, hipGraph currently does not provide as much of an advantage as CUDA graphs. We're working on improving this, although I am not aware of any definite timelines. I'll reach out to our internal teams to see if they have any additional information.
A quick update from the internal team: we have been making a lot of good progress with hipGraph performance, but we've been focused on the MI300, so for now many of the improvements are only seen there and not on Radeon systems like yours or my repro system.
Thank you for your reply. Do you have plans to also improve hipGraph on Radeon systems in the future?
Checking with the internal team to find out what our plans are on the Radeon front, I'll update here when I have that information.
We have plans for hipGraph performance improvement in general, but nothing targeting specific architectures, so how much of this improvement will be seen on Radeon systems is unknown at this time. I expect hipGraph performance on Radeon to improve in the future, although not necessarily as fast or as much as on MI cards.
Problem Description
GPU: RX6400 (I cannot find this model among the given GPU options)
I was trying to use hipGraph instead of hipStream to accelerate some computation, but I find that the performance difference between the stream and graph versions is minor. I tested the same program written with CUDA on an Nvidia GPU and there was a significant improvement, so I know for certain that my program is written correctly. My program ran on ROCm 5.6.0; I then upgraded to 5.7.0 and there was no difference in performance. I wonder in which version of ROCm hipGraph receives some optimization. Also, since I'm using a relatively outdated AMD GPU, the RX6400, I wonder if hipGraph only has a significant effect on certain models.
Operating System
Ubuntu 22.04.3 LTS (Jammy Jellyfish)
CPU
11th Gen Intel(R) Core(TM) i5-11400
GPU
AMD Radeon VII
ROCm Version
ROCm 5.7.0
ROCm Component
clr, HIP
Steps to Reproduce
I wrote two simple programs to test performance: one uses a stream and the other uses a graph. I uploaded them as .txt files because GitHub doesn't allow me to upload .cpp files. Simply rename them to .cpp, then compile and run the two programs to see the output. hip_only_stream.txt hip_using_graph.txt
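For readers without the attachments, a stream-versus-graph comparison like the one described above can be timed with a small host-side harness. The following is only a sketch in plain C++, not the code from the attached files: `average_ms` and `graph_gain_percent` are hypothetical helper names, and the two workloads are assumed to be supplied by the caller as callables that block until their GPU work has finished.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: average wall-clock milliseconds per call of `work`.
// `work` must block until its GPU work is complete; otherwise only the
// enqueue cost is measured, not the execution time.
double average_ms(const std::function<void()>& work, int warmup, int iters) {
    assert(iters > 0);
    for (int i = 0; i < warmup; ++i) work();  // keep one-time setup out of the timing
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) work();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iters;
}

// Hypothetical helper: gain of the graph version as a percentage of the
// stream time. A negative result means the graph version is slower.
double graph_gain_percent(double stream_ms, double graph_ms) {
    return (stream_ms - graph_ms) / stream_ms * 100.0;
}
```

In a HIP program, the stream callable would presumably launch its kernels and finish with `hipStreamSynchronize`, while the graph callable would call `hipGraphLaunch` on an already instantiated graph and then synchronize the same way. Keeping graph capture and instantiation outside the timed callable is a design choice that measures only the replay cost, which is the part a graph is meant to speed up.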
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response