Closed GoogleCodeExporter closed 9 years ago
cl_amd_fp64 is easy enough to fix:
KernelWriter.java
-
writePragma("cl_amd_fp64", true);
+
writePragma("cl_khr_fp64", true);
The NaNs are a bit more tricky though. Seems something strange going on with
the ByteBuffer to Double conversion (ByteBuffer.d8()), since some floating
point numbers convert OK, but others result in NaN.
e.g.
"double thing = -0.356563782;"
byte buffer = [-65, -42, -47, -16, -27, -88, 50, 91]
u8() returns -441961893
d8() returns NaN
but
"double another = 0.356;"
byte buffer = [6, 63, -42, -56, -76, 57, 88, 16]
u8() returns 4600084745787281506;
d8() returns 0.356
Original comment by stevecoo...@gmail.com
on 6 Dec 2011 at 8:34
[deleted comment]
Clearly from the #pragma name/value our AMD roots are showing.
Hopefully we can replace the amd specific pragma with a more neutral one. This
might just be just a string change in the pragma mapping.
Not sure about the NaN. Possibly when the pragma is correct the OpenCL compiler
will actually take a NaN (possibly but unlikely).
I don't think this will be hard, but it will require an NVidia card to
debug/test. Unfortunately that is not me at present.
Original comment by frost.g...@gmail.com
on 6 Dec 2011 at 1:22
Not sure if you saw my later comment but even with pragma fixed there are some
nans. This doesn't appear to be driver related since its in the java parsing
code. Could you try to reproduce with an amd card ?
Original comment by stevecoo...@gmail.com
on 6 Dec 2011 at 2:34
Looks like we should be using
http://docs.oracle.com/javase/6/docs/api/java/lang/Double.html#doubleToRawLongBi
ts%28double%29
Instead of
http://docs.oracle.com/javase/6/docs/api/java/lang/Double.html#doubleToLongBits%
28double%29
To preserve NaN
Original comment by frost.g...@gmail.com
on 6 Dec 2011 at 10:46
Actually the code in question is converting bits to double (not the other way
round). Not sure what the issue is with your ByteBuffer code, but I would
suggest you just use the nio libraries, e.g. the following change fixed this
particular issue:
double d8(int _offset) {
return (java.nio.ByteBuffer.wrap(bytes, _offset, 8).getDouble());
}
You could probably replace your whole ByteBuffer code with the nio one.
Original comment by stevecoo...@gmail.com
on 7 Dec 2011 at 1:38
The changes suggested above by Steve
KernelWriter.java
-
writePragma("cl_amd_fp64", true);
+
writePragma("cl_khr_fp64", true);
Have been made in the trunk SVN R#285
This will be left open to track the Nan stuff.
Gary
Original comment by frost.g...@gmail.com
on 20 Feb 2012 at 8:24
Steve
I know you submitted this a while back. I just took another look.
We fixed the pragma previously but clearly there was still an issue in
ByteBuffer.d8() as you indicated above. I finally took some time and found the
issue.
See this from the fix for issue #50.
I found the culprit and submitted a patch for this #r467
It was as Steve pointed out in his diagnosis of #27 something weird in
ByteBuffer.d8() which converts double constants from the classfile into the
disassembler.
Actually the fix was in ByteBuffer.u8() (we pull a long in and convert it's
bits to a double)
double d8(int _offset) {
return (Double.longBitsToDouble(u8(_offset)));
}
The previous version was
long u8(int _offset) {
return (((long) u4(_offset)) << 32 | u4(_offset + 4));
}
Sadly for values when the lower 32 bits had the most significant bit set this
was sign extending and causing the top 32 bits to all be 1's.
Now we have
long u8(int _offset) {
return ((u4(_offset)&0xffffffffL) << 32) |(u4(_offset + 4)&0xffffffffL);
}
This stops the sign extension from taking place.
This was my test case.
public class NaNTest{
public static void main(String[] args) {
Kernel k = new Kernel(){
@Override public void run() {
double gauss = (1.0e-10);
}
};
k.execute(1024);
}
}
Original comment by frost.g...@gmail.com
on 17 May 2012 at 4:32
Original issue reported on code.google.com by
stevecoo...@gmail.com
on 6 Dec 2011 at 3:48