Open vetter opened 10 years ago
Here is Mike's email from 2/3/14:
Hi Jeff,
Mayank Daga, who was sitting across from you during the meeting on Thursday, has been working on the SHOC benchmark suite and noticed that the wavefront/warp size is set to 32 irrespective of the underlying platform. This ends up favoring performance on NVIDIA GPUs but hurting performance on AMD GPUs and APUs, since NVIDIA GPUs have a warp size of 32 and AMD GPUs have a wavefront size of 64. As an example, changing the wavefront size from 32 to 64 increased performance by almost 80% for SPMV on the AMD Tahiti discrete GPU.
There is a simple code change to handle this issue. OpenCL exposes an API called clGetDeviceInfo() which enables one to query the vendor of the device using the CL_DEVICE_VENDOR parameter. The vendor name can be queried before setting the wavefront size, as shown below:
if Vendor = "AMD"
    VECTOR_SIZE = 64;
else if Vendor = "NV"
    VECTOR_SIZE = 32;
This should help make the SHOC benchmarks run well on both types of architectures and enable better performance comparisons. It should also be feasible to extend the code above for other architectures.
Do you think it would be fine to make this change? Mayank has offered to help make the change, if you are interested.
Thanks for considering this request. I have copied Mayank and Jonathan on this email in case you have any questions.
Mike
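For reference, the vendor check Mike describes could look roughly like the sketch below. This is only an illustration, not the change that went into SHOC: the helper name vector_size_for_vendor and the DEFAULT_VECTOR_SIZE macro are hypothetical, and the substrings it matches ("Advanced Micro Devices", "NVIDIA") are assumptions about what the drivers typically return for CL_DEVICE_VENDOR. Unknown vendors fall back to 32, the value the email says is used today.

```c
#include <string.h>

/* Fallback when the vendor is not recognised. 32 is the value the email
   says SHOC currently uses on every platform. */
#define DEFAULT_VECTOR_SIZE 32

/* Hypothetical helper: map an OpenCL vendor string to a wavefront/warp
   width. The matched substrings are assumptions about typical
   CL_DEVICE_VENDOR values, not an exhaustive list. */
static int vector_size_for_vendor(const char *vendor)
{
    if (vendor == NULL)
        return DEFAULT_VECTOR_SIZE;

    /* AMD GPUs: wavefront size of 64, per the email. */
    if (strstr(vendor, "Advanced Micro Devices") != NULL)
        return 64;

    /* NVIDIA GPUs: warp size of 32, per the email. */
    if (strstr(vendor, "NVIDIA") != NULL)
        return 32;

    /* Other architectures: keep the existing default. */
    return DEFAULT_VECTOR_SIZE;
}
```

In host code the vendor string would come from something like clGetDeviceInfo(device, CL_DEVICE_VENDOR, sizeof(buf), buf, NULL), with the result passed to the helper before the kernel is built. Keeping the mapping in one small function would also make it easy to extend to other architectures, as the email suggests.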
Has this issue been addressed yet or is help needed fixing it? best -
Peter, we are working on it but will not release any changes until we have done some performance testing.
Jeffrey S. Vetter | +1-865-356-1649 | http://ft.ornl.gov/~vetter | Sent from my mobile
On Jul 28, 2014, at 3:28 AM, "Peter Steinbach" <notifications@github.com> wrote:
Has this issue been addressed yet or is help needed fixing it? best -
I have addressed this issue (commit 0b05ea5dca) for Spmv as suggested by some of the AMD folks. We may need to check some of the other kernels more carefully to see if they would also be affected.
Are these findings (http://www.phoronix.com/scan.php?page=article&item=gpu-pro-opencl&num=2) related to this issue, i.e. do other parts of SHOC also need checking for AMD-specific changes?
If I understand the comments in the article about the FFT SP test correctly, it seems Michael has set SHOC to run with -s 1, which likely underutilizes newer, larger GPUs. So while we probably should set the default size to "-s 4" for this test case, the problem doesn't seem to be specific to AMD GPUs.
OK, apparently Michael will look into this as well. Just out of curiosity, why is this issue still open after almost 2 years?
see vetter's email from AMD