Closed GoogleCodeExporter closed 9 years ago
here is the hardware report:
[ATI Radeon HD 6750M] [Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz]
CL_DEVICE_ADDRESS_BITS 32 64
CL_DEVICE_AVAILABLE true true
CL_DEVICE_COMPILER_AVAILABLE true true
CL_DEVICE_ENDIAN_LITTLE false true
CL_DEVICE_ERROR_CORRECTION_SUPPORT false false
CL_DEVICE_EXECUTION_CAPABILITIES Kernel Kernel
NativeKernel
CL_DEVICE_EXTENSIONS cl_APPLE_gl_sharing cl_khr_fp64
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store
cl_APPLE_gl_sharing
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE 0 64
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE 0 6291456
CL_DEVICE_GLOBAL_MEM_CACHE_TYPE None ReadWriteCache
CL_DEVICE_GLOBAL_MEM_SIZE 536870912 6442450944
CL_DEVICE_HOST_UNIFIED_MEMORY false false
CL_DEVICE_IMAGE2D_MAX_HEIGHT 8192 8192
CL_DEVICE_IMAGE2D_MAX_WIDTH 8192 8192
CL_DEVICE_IMAGE3D_MAX_DEPTH 0 2048
CL_DEVICE_IMAGE3D_MAX_HEIGHT 0 2048
CL_DEVICE_IMAGE3D_MAX_WIDTH 0 2048
CL_DEVICE_IMAGE_SUPPORT false true
CL_DEVICE_LOCAL_MEM_SIZE 32768 16384
CL_DEVICE_LOCAL_MEM_TYPE Local Global
CL_DEVICE_MAX_CLOCK_FREQUENCY 150 2200
CL_DEVICE_MAX_COMPUTE_UNITS 5 8
CL_DEVICE_MAX_CONSTANT_ARGS 8 8
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE 65536 65536
CL_DEVICE_MAX_MEM_ALLOC_SIZE 134217728 1610612736
CL_DEVICE_MAX_PARAMETER_SIZE 1024 4096
CL_DEVICE_MAX_READ_IMAGE_ARGS 0 128
CL_DEVICE_MAX_SAMPLERS 128 16
CL_DEVICE_MAX_WORK_GROUP_SIZE 1024 1
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS 3 3
CL_DEVICE_MAX_WORK_ITEM_SIZES [1024, 1024, 1024] [1, 1, 1]
CL_DEVICE_MAX_WRITE_IMAGE_ARGS 0 8
CL_DEVICE_MEM_BASE_ADDR_ALIGN 32768 1024
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE 128 128
CL_DEVICE_NAME ATI Radeon HD 6750M Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz
CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR n/a n/a
CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE n/a n/a
CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT n/a n/a
CL_DEVICE_NATIVE_VECTOR_WIDTH_INT n/a n/a
CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG n/a n/a
CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT n/a n/a
CL_DEVICE_OPENCL_C_VERSION OpenCL C 1.0 OpenCL C 1.0
CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR 16 16
CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE 0 2
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT 4 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT 4 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG 2 2
CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT 8 8
CL_DEVICE_PROFILE FULL_PROFILE FULL_PROFILE
CL_DEVICE_PROFILING_TIMER_RESOLUTION 37 1
CL_DEVICE_QUEUE_PROPERTIES ProfilingEnable ProfilingEnable
CL_DEVICE_SINGLE_FP_CONFIG InfNaN
RoundToNearest Denorm
InfNaN
RoundToNearest
CL_DEVICE_TYPE GPU CPU
CL_DEVICE_VENDOR AMD Intel
CL_DEVICE_VENDOR_ID 16915200 16909312
CL_DEVICE_VERSION OpenCL 1.0 OpenCL 1.0
CL_DRIVER_VERSION 1.0 1.0
CL_PLATFORM_EXTENSIONS
CL_PLATFORM_NAME Apple Apple
CL_PLATFORM_PROFILE FULL_PROFILE FULL_PROFILE
CL_PLATFORM_VENDOR Apple Apple
CL_PLATFORM_VERSION OpenCL 1.0 (Dec 26 2010 12:52:21) OpenCL 1.0 (Dec 26 2010 12:52:21)
Out of order queues support false false
cl_khr_byte_addressable_store false false
cl_khr_gl_sharing false false
cl_nv_compiler_options false false
cl_nv_device_attribute_query false false
Original comment by haku...@gmail.com
on 22 Jul 2011 at 2:32
Original comment by olivier.chafik@gmail.com
on 25 Jul 2011 at 8:10
Hello Paul,
Thanks for your detailed report :-)
The issue appears to come from CLDevice.getKernelsDefaultByteOrder(), which
didn't take the device's endianness into account (an old hack did, but for some
reason I commented it out at some point...).
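
To illustrate the idea (this is only a sketch in plain java.nio, not JavaCL's actual code): host buffers are given the byte order that the device reports through CL_DEVICE_ENDIAN_LITTLE, which in the hardware report above is false for the GPU and true for the CPU. The class and method names below are hypothetical and are not part of JavaCL.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class DeviceByteOrderSketch {
    // Hypothetical helper (not a JavaCL API): maps the value a device reports
    // for CL_DEVICE_ENDIAN_LITTLE to the matching java.nio byte order.
    static ByteOrder orderFor(boolean deviceEndianLittle) {
        return deviceEndianLittle ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
    }

    // Encodes one int the way a host buffer with the given order would store it.
    static byte[] encodeInt(int value, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(4).order(order);
        buffer.putInt(value);
        return buffer.array();
    }

    public static void main(String[] args) {
        // Flags taken from the report: GPU says false, CPU says true.
        System.out.println("GPU layout of 1: " + Arrays.toString(encodeInt(1, orderFor(false))));
        System.out.println("CPU layout of 1: " + Arrays.toString(encodeInt(1, orderFor(true))));
    }
}
```

If the host writes data in one order while the device interprets it in the other, every multi-byte value is read with its bytes reversed.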
I've committed a change that might fix the issue (revision #2227) and uploaded
a new 1.0-SNAPSHOT for both the JNA and BridJ versions of JavaCL (the latter is
still being uploaded as I write this, but the JNA version is already available
:-)).
It would be great if you could test it and let me know how it goes, as I don't
have access to any ATI-powered computer right now (I'm on an extended
vacation...).
Cheers
--
zOlive
Original comment by olivier.chafik@gmail.com
on 25 Jul 2011 at 9:10
To clarify: the 1.0-SNAPSHOT version is available through Maven or here (look
for the "-shaded"-suffixed jar):
http://nativelibs4java.sourceforge.net/maven/com/nativelibs4java/javacl/1.0-SNAPSHOT/
Original comment by olivier.chafik@gmail.com
on 25 Jul 2011 at 9:21
Hi Olivier,
I updated javacl to the file at the following link:
http://nativelibs4java.sourceforge.net/maven/com/nativelibs4java/javacl-jna/1.0-SNAPSHOT/
(I had to go looking for the JNA version, as that's what we were using, and it's
a big project with dependencies on some old asm.jar which BridJ was angry about.)
After trying to get it to work on 2 computers, it seems that my compiler just
does not want to open the archive file...
Sorry for the troubles again,
Paul
Original comment by haku...@gmail.com
on 26 Jul 2011 at 7:42
Hi Paul,
Sorry for the delay in re-uploading the fix (I'm on lengthy vacations...).
The JAR was indeed apparently corrupted (so much for on-the-go deployments from
cybercafés :-S), I've uploaded it again.
In any case, please note that it's relatively straightforward to build the
latest SVN version from sources : http://code.google.com/p/javacl/wiki/Build
(the files you're interested in will be in libraries/OpenCL-JNA/JavaCL/target
after the build completes)
Please let me know if you face other issues...
Cheers
--
zOlive
Original comment by olivier.chafik@gmail.com
on 2 Aug 2011 at 4:28
Hi Olivier,
I ran the same tests on the same computer as in the original report.
I ran the test from
http://pastebin.com/VmjfW4TB
with one minor change: every ByteOrder.LITTLE_ENDIAN is replaced with
OpenCLSingletonState.getContext().getByteOrder(), since that is more appropriate.
The output I get is as follows:
Using ATI Radeon HD 6750M
The max memory of this device is: 134217728
x should be 1 and is 4
y should be 2 and is 3
z should be 3 and is 2
w should be 4 and is 1
return value for if integer was 1 0
The results for the second test, from
http://pastebin.com/bwQAZWvH
(note that in this case I'm not using NIOUtils, just JavaCL's standard
defaults... nor is there actually any input):
Using ATI Radeon HD 6750M
The max memory of this device is: 134217728
The output is
0.0, 0.0, 0.0, 0.0, 8.9776E-41, 7.370973E-39,
The xx used for multiplication was
8.9776E-41, 8.9776E-41, 8.9776E-41, 8.9776E-41,
The constant (a) used for multiplication was
4.6006E-41, 9.0E-44, 2.3049E-41, 4.6007E-41,
If there's any more information you'd like please feel free to ask.
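
A quick way to check whether tiny values like these come from a byte-order mismatch is to reverse the bytes of the expected floats and compare. The sketch below uses only the standard library, and the class name is made up for illustration. Reversing the bytes of 125.0f (bits 0x42FA0000) gives the denormal with bits 0x0000FA42, which is consistent with the 8.9776E-41 shown above. In the same way, 1.0, 2.0, 3.0 and 4.0 map onto the values reported for the constant (a).

```java
class SwappedFloatCheck {
    // Returns the float whose 4 bytes are those of 'value' in reverse order,
    // i.e. what is read when the same bytes are interpreted with the
    // opposite endianness.
    static float swapBytes(float value) {
        int bits = Float.floatToIntBits(value);
        return Float.intBitsToFloat(Integer.reverseBytes(bits));
    }

    public static void main(String[] args) {
        // The values the test was expected to print.
        float[] expected = {125.0f, 1.0f, 2.0f, 3.0f, 4.0f};
        for (float v : expected) {
            System.out.println(v + " with reversed bytes reads as " + swapBytes(v));
        }
    }
}
```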
Original comment by haku...@gmail.com
on 4 Aug 2011 at 3:33
So, I'm not sure what's changed since Aug 4 (I don't think we've updated the
javacl jar since then), but this problem appears to be resolved...:
[System.out] - The output is
[System.out] - 125.0, 250.0, 375.0, 500.0, 125.0, 12500.0,
[System.out] - The xx used for multiplication was
[System.out] - 125.0, 125.0, 125.0, 125.0,
[System.out] - The constant (a) used for multiplication was
[System.out] - 1.0, 2.0, 3.0, 4.0,
Additionally, our real code is now running fine. So, I guess this appears to
be resolved?
(I work with hakuliu).
Original comment by yode...@gmail.com
on 31 Aug 2011 at 4:11
[deleted comment]
Hi Paul,
This is excellent news, thanks for the feedback !
I didn't deploy anything new since August 2nd, but maybe Maven got confused
with the corrupted jar somehow and didn't update the way it should have...
Please let me know if you run into this issue again on other platforms : I'll
reopen this ticket (and of course, feel free to open tickets for other issues
!).
Cheers
--
zOlive
(edited message: I couldn't see hakuliu's name in the report, with those
annoying abridged emails :-))
Original comment by olivier.chafik@gmail.com
on 31 Aug 2011 at 11:15
Original issue reported on code.google.com by
haku...@gmail.com
on 22 Jul 2011 at 2:21