ddemidov / vexcl

VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP
http://vexcl.readthedocs.org
MIT License
702 stars 82 forks source link

Workaround for AMD SI GPUs #255

Closed ddemidov closed 6 years ago

ddemidov commented 6 years ago

Closes #254

The workaround (described in #254) may be enabled by either turning the VEXCL_AMD_SI_WORKAROUND cmake option ON, or defining VEXCL_AMD_SI_WORKAROUND preprocessor macro directly.

When enabled, and if the device name is 'Hainan', it launches a dummy kernel after the main one.

skn123 commented 6 years ago

Hainan name should not be mandatory. There may be other names also pertaining to AMD SI GPU's. It would be better to leave it to the prerogative of the user to check his/her graphic card before enabling that. BTW, how do I access this change? I can seem to do a git reset --hard for this commit. It says naths@naths-HP-Laptop-15-bs1xx:~/srcs/vexcl$ git reset --hard c42098745f5869d9d52f0b1760c53f44f95554d8 fatal: Could not parse object 'c42098745f5869d9d52f0b1760c53f44f95554d8'

https://wiki.gentoo.org/wiki/AMDGPU If you want to go with a naming convention.

ddemidov commented 6 years ago

how do I access this change?

This should work:

git fetch origin
git checkout amd-si-workaround

Hainan name should not be mandatory. There may be other names also pertaining to AMD SI GPU's.

We need to somehow select the affected GPUs though. What if you have both an SI and non-SI GPUs installed on the same system? Or, you package your compiled binary and you don't know what GPUs your users will have?

codecov[bot] commented 6 years ago

Codecov Report

Merging #255 into master will decrease coverage by 0.39%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master     #255     +/-   ##
=========================================
- Coverage   88.51%   88.11%   -0.4%     
=========================================
  Files         129      128      -1     
  Lines       15241    15235      -6     
=========================================
- Hits        13491    13425     -66     
- Misses       1750     1810     +60
Impacted Files Coverage Δ
vexcl/backend/opencl/source.hpp 0% <ø> (ø) :arrow_up:
vexcl/backend/opencl/kernel.hpp 57.77% <100%> (+1.96%) :arrow_up:
vexcl/fft/kernels.hpp 84.76% <0%> (-13.58%) :arrow_down:
tests/fft.cpp 95.94% <0%> (-4.06%) :arrow_down:
tests/scan_by_key.cpp 93.84% <0%> (-3.08%) :arrow_down:
vexcl/fft/plan.hpp 89.1% <0%> (-2.57%) :arrow_down:
vexcl/backend/jit/compiler.hpp 84.61% <0%> (-0.57%) :arrow_down:
vexcl/backend/jit/kernel.hpp 96.66% <0%> (-0.21%) :arrow_down:
vexcl/sort.hpp 97.66% <0%> (-0.17%) :arrow_down:
vexcl/backend/jit/device_vector.hpp 100% <0%> (ø) :arrow_up:
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update c0f9213...1ce08cb. Read the comment docs.

ddemidov commented 6 years ago

Ok, I went ahead and removed the check for the device name. Now the workaround is always enabled as long as VEXCL_AMD_SI_WORKAROUND is defined.