Closed GoogleCodeExporter closed 9 years ago
Also I am aware of issue 27, but it seems it does not apply to me (sorry for
the double post)
Original comment by alex.kar...@gmail.com
on 19 Feb 2012 at 8:59
Thanks for reporting this. We need to update the pragma handling for the NVidia
runtime. Clearly cl_amd_fp64 is an AMD-specific flag.
Sadly someone with access to a recent NVidia card and driver will have to debug
this.
My suggestion is that whoever steps up to fix this just walks through the code
which extracts the capabilities (using the debugger in Eclipse), determines
which capability the NVidia driver/card reports for fp64, and adds appropriate
tests and pragmas to the OpenCLWriter. This is probably a 20 minute fix for
someone with the appropriate hardware.
Gary
Original comment by frost.g...@gmail.com
on 19 Feb 2012 at 5:15
Alex, I just reread issue 27. Why do you think that it does not apply?
Original comment by frost.g...@gmail.com
on 19 Feb 2012 at 5:19
Actually I am probably mistaken about issue 27. I thought the cl_amd_fp64 thing
was already fixed and the issue was only regarding NaN.
I have the source code now and am looking into what you suggested. Can you tell
me in which class the code extracts the GPU capabilities?
Original comment by alex.kar...@gmail.com
on 19 Feb 2012 at 5:43
Sure, take a look at KernelRunner. There is a method that extracts the
capabilities of the device as a single string, then separates it into a
HashSet. There are helpers to check for specific required extensions (FP64
should be one). If you look at the OpenCL writer, it uses this information to
decide which pragmas to emit.
My guess is we can replace the AMD-specific one (required prior to OpenCL 1.1)
with a more general one. Take a look. If I get a chance to try an NVidia card
tomorrow I will take a look.
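For anyone following along, the capability check described above can be sketched roughly as follows. This is a standalone illustration, not the actual KernelRunner code: the class and method names (Fp64ExtensionSketch, parseExtensions, fp64Extension) are made up for the example, and it only assumes that the device reports its extensions as one space-separated string, as OpenCL's CL_DEVICE_EXTENSIONS query does.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; names are illustrative and not taken from the project.
class Fp64ExtensionSketch {

    /** Splits a space-separated extensions string into a set of names. */
    static Set<String> parseExtensions(String extensions) {
        Set<String> set = new HashSet<>();
        for (String name : extensions.trim().split("\\s+")) {
            if (!name.isEmpty()) {
                set.add(name);
            }
        }
        return set;
    }

    /**
     * Picks the double-precision extension to enable: prefer the vendor-neutral
     * cl_khr_fp64, fall back to the older AMD-specific cl_amd_fp64.
     */
    static Optional<String> fp64Extension(Set<String> capabilities) {
        if (capabilities.contains("cl_khr_fp64")) {
            return Optional.of("cl_khr_fp64");
        }
        if (capabilities.contains("cl_amd_fp64")) {
            return Optional.of("cl_amd_fp64");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Example extensions string; a real one comes from the driver.
        Set<String> caps = parseExtensions(
            "cl_khr_byte_addressable_store cl_khr_fp64 cl_nv_compiler_options");
        System.out.println(fp64Extension(caps).orElse("no fp64 extension reported"));
    }
}
```

Whether a fallback to cl_amd_fp64 is still needed depends on how old the targeted AMD drivers are; the sketch keeps it only to show the selection logic.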
Gary
Original comment by frost.g...@gmail.com
on 19 Feb 2012 at 11:10
Thanks, that is very helpful :). I found the capabilitiesSet so I will see what
I can get from there.
Also, just a short note, Steve's suggestion (issue 27):
KernelWriter.java
- writePragma("cl_amd_fp64", true);
+ writePragma("cl_khr_fp64", true);
makes doubles run correctly, but I assume this has undesired side-effects for
other cards.
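For context, the one-line change matters because the extension name ends up in a pragma at the top of the generated OpenCL source. The sketch below shows the kind of line such a helper presumably produces; the pragma syntax is standard OpenCL C, but the helper itself (PragmaLineSketch.pragmaLine) is a hypothetical stand-in and the real writePragma may be implemented differently.

```java
// Hypothetical stand-in for a writePragma-style helper; not the project's code.
class PragmaLineSketch {

    /** Builds the OpenCL C pragma that enables or disables an extension. */
    static String pragmaLine(String extension, boolean enable) {
        return "#pragma OPENCL EXTENSION " + extension + " : "
            + (enable ? "enable" : "disable");
    }

    public static void main(String[] args) {
        // Prints: #pragma OPENCL EXTENSION cl_khr_fp64 : enable
        System.out.println(pragmaLine("cl_khr_fp64", true));
    }
}
```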
Original comment by alex.kar...@gmail.com
on 19 Feb 2012 at 11:29
Actually that might just be the fix. So this change works for you on the
NVIDIA card?
I will try it with AMD. I suspect it might just be as simple as this.
Gary
Original comment by frost.g...@gmail.com
on 20 Feb 2012 at 12:54
Yes, that seems to do the trick for GTX480 and GTX580.
Original comment by alex.kar...@gmail.com
on 20 Feb 2012 at 1:02
This change also works for AMD cards (post OpenCL 1.1) so I will make this
change.
Gary
Original comment by frost.g...@gmail.com
on 20 Feb 2012 at 8:11
Alex, I just checked in the changes for this; it is basically Steve's suggestion
from issue 27.
This was added in SVN R#285.
We will leave issue 27 open because it also refers to the NaN issues.
Can you validate R#285?
Original comment by frost.g...@gmail.com
on 20 Feb 2012 at 8:22
Just checked latest svn (R#288) and it works :).
Original comment by alex.kar...@gmail.com
on 21 Feb 2012 at 12:03
Awesome. Thanks to Stephen for the fix.
Original comment by frost.g...@gmail.com
on 21 Feb 2012 at 3:21
Original issue reported on code.google.com by
alex.kar...@gmail.com
on 19 Feb 2012 at 8:57