Closed canhongluori closed 1 year ago
The easiest way to use JOCL should be to set up a Maven project and add a dependency on JOCL:
<dependency>
<groupId>org.jocl</groupId>
<artifactId>jocl</artifactId>
<version>2.0.4</version>
</dependency>
If this does not help, maybe you can provide further information about your setup, how you try to use JOCL, or what exactly does not work.
I have done this, but when I try to run the HistogramNVIDIA test, it fails.
What exactly is the error message? Or does it just crash? Or does it just say "TEST FAILED"? (It's difficult. Try to imagine you had to respond to such an issue ...)
Sorry sir, I forgot to post my test output. Here it is:
Starting...
Initializing data...
Initializing OpenCL...
Allocating OpenCL memory...
Initializing 256-bin OpenCL histogram...
...loading Histogram256.cl
...creating histogram256 program
...building histogram256 program
...creating histogram256 kernels
...allocating internal histogram256 buffer
Running 256-bin OpenCL histogram for 1048576 bytes...
Validating OpenCL results...
...reading back OpenCL results
...histogram256CPU()
256-bin histograms do not match!!!
Shutting down 256-bin OpenCL histogram...
TEST FAILED !!!
Shutting down...
Besides, when I try to run JOCLSample_2_0_SVM, it also fails. Here is the output:
Using device NVIDIA GeForce RTX 3060 Laptop GPU, version 3.0
At 0 got 0.0, setting 0.0
At 1 got 1.0, setting 2.0
At 2 got 4.0, setting 4.0
At 3 got 9.0, setting 6.0
At 4 got 16.0, setting 8.0
At 5 got 25.0, setting 10.0
At 6 got 36.0, setting 12.0
At 7 got 49.0, setting 14.0
At 8 got 64.0, setting 16.0
At 9 got 81.0, setting 18.0
Exception in thread "main" org.jocl.CLException: CL_INVALID_KERNEL_ARGS
at org.jocl.CL.checkResult(CL.java:825)
at org.jocl.CL.clEnqueueNDRangeKernel(CL.java:20954)
at org.jocl.samples.JOCLSample_2_0_SVM.main(JOCLSample_2_0_SVM.java:109)
The JOCLSample_2_0_SVM does not work for me either. It did work once, so I'll have to re-read the specification, and maybe see whether something in OpenCL changed that might explain this. Maybe SVM is not or no longer supported by NVIDIA...?
Regarding the HistogramNVIDIA sample: There doesn't seem to be any crash or so. So I could only make guesses until now.
Do other samples work in general? (I'd just like to know whether there's something fundamentally wrong, or whether NVIDIA just messed up something in that example...)
What happens when you add the output
System.out.println("CPU: " + Arrays.toString(h_HistogramCPU));
System.out.println("GPU: " + Arrays.toString(h_HistogramGPU));
before the line https://github.com/gpu/JOCLSamples/blob/40d5f8c2cc49017d7f5b13b882a7386fe4ddee2a/src/main/java/org/jocl/samples/HistogramNVIDIA.java#L117 ?
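Comparing two printed arrays of 256 values by eye is tedious, so a small summary may help. The following is only a sketch: `HistogramDiff` and its methods are hypothetical helpers that are not part of JOCL or the NVIDIA sample. It assumes that both histograms are plain `int[]` arrays of equal length, and that every input byte is counted exactly once, so that the sum of the CPU bins should equal the number of input bytes. The values in `main` are the first four bins of the arrays reported later in this thread.

```java
// Hypothetical helper, not part of JOCL or the NVIDIA sample.
class HistogramDiff {

    /** Returns the sum of all bin counts of the given histogram. */
    static long sum(int[] histogram) {
        long total = 0;
        for (int count : histogram) {
            total += count;
        }
        return total;
    }

    /** Returns the number of bins in which the two histograms differ. */
    static int countMismatches(int[] cpu, int[] gpu) {
        if (cpu.length != gpu.length) {
            throw new IllegalArgumentException(
                "Histogram lengths differ: " + cpu.length + " vs " + gpu.length);
        }
        int mismatches = 0;
        for (int i = 0; i < cpu.length; i++) {
            if (cpu[i] != gpu[i]) {
                mismatches++;
            }
        }
        return mismatches;
    }

    /** Builds a one-line summary of the differences between both histograms. */
    static String summarize(int[] cpu, int[] gpu) {
        long cpuSum = sum(cpu);
        long gpuSum = sum(gpu);
        return "bins=" + cpu.length
            + ", mismatching=" + countMismatches(cpu, gpu)
            + ", cpuSum=" + cpuSum
            + ", gpuSum=" + gpuSum
            + ", cpuSum-gpuSum=" + (cpuSum - gpuSum);
    }

    public static void main(String[] args) {
        // First four bins of the CPU and GPU arrays reported in this thread
        int[] cpu = { 4147, 4058, 4022, 4268 };
        int[] gpu = { 4088, 4006, 3979, 4223 };
        System.out.println(summarize(cpu, gpu));
    }
}
```

In the sample itself, one could call `HistogramDiff.summarize(h_HistogramCPU, h_HistogramGPU)` next to the two `println` calls. If the GPU sum turns out to be smaller than the number of input bytes, that would suggest that the kernel loses some increments, but this is only a guess about what such a summary could reveal.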
I have added this code after the line “System.out.println(PassFailFlag != 0 ? "256-bin histograms match\n" : "256-bin histograms do not match!!!\n" );”. The results are as follows:
Starting...
Initializing data...
Initializing OpenCL...
Allocating OpenCL memory...
Initializing 256-bin OpenCL histogram...
...loading Histogram256.cl
...creating histogram256 program
...building histogram256 program
...creating histogram256 kernels
...allocating internal histogram256 buffer
Running 256-bin OpenCL histogram for 1048576 bytes...
Validating OpenCL results...
...reading back OpenCL results
...histogram256CPU()
256-bin histograms do not match!!!
CPU: [4147, 4058, 4022, 4268, 4098, 4113, 4113, 4075, 4072, 4178, 4008, 4155, 4014, 4015, 4088, 4179, 4055, 4074, 4136, 4071, 4007, 4053, 4110, 4077, 4147, 4046, 4193, 4163, 4026, 4073, 4075, 4129, 4117, 4035, 4063, 4080, 4053, 4173, 4106, 4122, 4116, 4094, 4121, 4037, 4242, 4266, 4103, 4056, 4100, 4132, 4142, 4109, 4161, 4156, 4057, 4103, 4113, 4094, 4058, 4006, 4004, 4220, 4030, 4025, 4035, 4062, 4111, 4117, 4061, 4132, 4032, 4030, 4014, 4114, 4021, 3974, 4136, 4084, 4140, 4083, 4094, 4030, 4015, 4071, 4255, 4074, 4106, 4003, 4112, 3968, 4117, 4034, 4114, 4032, 4165, 4108, 4088, 4151, 4099, 4116, 4194, 3936, 4157, 4133, 4173, 4135, 4167, 4156, 4017, 4118, 3925, 4152, 4006, 4108, 4129, 4041, 4202, 4204, 4107, 4061, 4142, 4078, 4061, 4112, 4192, 4142, 4137, 4121, 4111, 4223, 4135, 4043, 3968, 4178, 4089, 4089, 4040, 4121, 4125, 4077, 4146, 4074, 4109, 4169, 4011, 4035, 4064, 4150, 4101, 4113, 4073, 4106, 4050, 4044, 4041, 3970, 4148, 4063, 4060, 4071, 4132, 4160, 4100, 4065, 4020, 3908, 4137, 4069, 4181, 4142, 4120, 4109, 4051, 4123, 4140, 4082, 4078, 3982, 3997, 4061, 4058, 4076, 4074, 4143, 4112, 4116, 4068, 4128, 4080, 4073, 4075, 3987, 4142, 4195, 4271, 4060, 3969, 4137, 4168, 4087, 4112, 4047, 4159, 4099, 4041, 4040, 4175, 4100, 4125, 4106, 4063, 3990, 4114, 4100, 4140, 4174, 4114, 4078, 4131, 4157, 4115, 4061, 4110, 4062, 4162, 4072, 4050, 4138, 4117, 4128, 4056, 4158, 4030, 4209, 4116, 4190, 4052, 4149, 4052, 4151, 4133, 4113, 4139, 4156, 4158, 4055, 4011, 4106, 4008, 4072, 4030, 4169, 4158, 4003, 4152, 4061] GPU: [4088, 4006, 3979, 4223, 4047, 4076, 4059, 4043, 4020, 4138, 3973, 4108, 3976, 3977, 4043, 4143, 4011, 4032, 4082, 4023, 3964, 4014, 4068, 4026, 4103, 4002, 4150, 4125, 3997, 4033, 4035, 4077, 4066, 3993, 4018, 4031, 4006, 4130, 4074, 4085, 4073, 4051, 4076, 4003, 4191, 4209, 4059, 4017, 4049, 4093, 4088, 4065, 4112, 4110, 4028, 4059, 4065, 4051, 4012, 3968, 3969, 4182, 3983, 3978, 3984, 4006, 4065, 4071, 4024, 4083, 3994, 3979, 3972, 4069, 3990, 
3929, 4095, 4048, 4096, 4046, 4041, 3991, 3978, 4021, 4210, 4025, 4064, 3956, 4071, 3928, 4077, 3993, 4073, 3993, 4127, 4064, 4050, 4112, 4062, 4075, 4141, 3905, 4111, 4095, 4128, 4086, 4126, 4103, 3966, 4069, 3892, 4108, 3963, 4069, 4090, 4004, 4161, 4158, 4050, 4021, 4107, 4025, 4016, 4070, 4153, 4093, 4098, 4079, 4059, 4174, 4095, 4005, 3935, 4135, 4051, 4042, 3998, 4085, 4079, 4024, 4095, 4032, 4075, 4115, 3984, 3980, 4024, 4096, 4052, 4072, 4039, 4066, 4015, 4000, 3996, 3934, 4112, 4027, 4013, 4016, 4082, 4115, 4060, 4020, 3985, 3864, 4095, 4026, 4140, 4093, 4083, 4075, 3999, 4067, 4099, 4031, 4041, 3932, 3966, 4023, 4016, 4015, 4038, 4094, 4073, 4083, 4029, 4076, 4037, 4030, 4032, 3950, 4094, 4156, 4223, 4023, 3929, 4096, 4128, 4047, 4061, 3997, 4104, 4052, 4003, 3995, 4133, 4078, 4076, 4060, 4006, 3939, 4059, 4048, 4093, 4124, 4066, 4034, 4081, 4113, 4068, 4019, 4070, 4014, 4112, 4030, 4005, 4100, 4076, 4085, 4010, 4111, 3992, 4165, 4077, 4144, 4020, 4107, 4018, 4106, 4089, 4062, 4090, 4105, 4105, 4021, 3974, 4067, 3979, 4027, 3990, 4123, 4111, 3962, 4100, 4012] Shutting down 256-bin OpenCL histogram...
TEST FAILED !!! Shutting down...
One interesting thing is that my computer is an NVIDIA RTX 3060 laptop, but when I try to run HistogramAMD, the test passes.
Besides JOCLSample_2_0_SVM and HistogramNVIDIA, JOCLSimpleLWJGL also failed. Here is the error message:
Exception in thread "AWT-EventQueue-0" java.lang.UnsatisfiedLinkError: no lwjgl64 in java.library.path: [C:\Program Files\Java\jdk-11.0.15\bin, C:\WINDOWS\Sun\Java\bin, C:\WINDOWS\system32, C:\WINDOWS, F:\Tecplot\Tecplot 360 EX 2020 R1\bin, C:\Program Files\Microsoft MPI\Bin\, C:\Program Files\Common Files\Oracle\Java\javapath, C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin, C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\libnvvp, C:\Windows\system32, C:\Windows, C:\Windows\System32\Wbem, C:\Windows\System32\WindowsPowerShell\v1.0\, C:\Windows\System32\OpenSSH\, C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common, C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR, C:\Program Files\NVIDIA Corporation\Nsight Compute 2022.1.1\, F:\MATLAB\R2022b\runtime\win64, F:\MATLAB\R2022b\bin, C:\WINDOWS\system32, C:\WINDOWS, C:\WINDOWS\System32\Wbem, C:\WINDOWS\System32\WindowsPowerShell\v1.0\, C:\WINDOWS\System32\OpenSSH\, F:\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.33.31629\bin\Hostx64\x64, C:\Program Files\CMake\bin, F:\Git\cmd, C:\Users\sibong\AppData\Local\Microsoft\WindowsApps, F:\Microsoft VS Code\bin, C:\Program Files\Java\jdk-11.0.15\bin, C:\Program Files\Java\jdk-11.0.15\jre\bin, C:\Users\sibong\AppData\Local\JetBrains\Toolbox\scripts, D:\apache-maven-3.8.6\bin, C:\Users\sibong\AppData\Local\Microsoft\WindowsApps, .]
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2662)
at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:827)
at java.base/java.lang.System.loadLibrary(System.java:1871)
at org.lwjgl.Sys$1.run(Sys.java:72)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at org.lwjgl.Sys.doLoadLibrary(Sys.java:66)
at org.lwjgl.Sys.loadLibrary(Sys.java:87)
at org.lwjgl.Sys.
Regarding LWJGL: You have to add the right LWJGL DLL to your library path. That's how LWJGL worked back then. An aside: LWJGL also has OpenCL support. Maybe you'll find that easier to use.
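To see why the `UnsatisfiedLinkError` appears, it can help to check which directories of `java.library.path` actually contain the native library. The following is a minimal sketch using only the Java standard library. The class `LibraryPathCheck` and the method `findInPath` are hypothetical names, and `lwjgl64` is the library name taken from the error message above.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of JOCL or LWJGL.
class LibraryPathCheck {

    /**
     * Returns all files with the given name that exist in one of the
     * directories of the given path list (separated by File.pathSeparator).
     */
    static List<Path> findInPath(String pathList, String fileName) {
        List<Path> hits = new ArrayList<>();
        for (String entry : pathList.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            try {
                Path candidate = Paths.get(entry, fileName);
                if (Files.isRegularFile(candidate)) {
                    hits.add(candidate);
                }
            } catch (InvalidPathException e) {
                // Skip entries that are not valid paths on this platform
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // "lwjgl64" is the name from the error message in this thread
        String libraryName = args.length > 0 ? args[0] : "lwjgl64";
        // Maps the name to the platform file name, e.g. "lwjgl64.dll" on Windows
        String fileName = System.mapLibraryName(libraryName);
        String pathList = System.getProperty("java.library.path", "");
        List<Path> hits = findInPath(pathList, fileName);
        if (hits.isEmpty()) {
            System.out.println(fileName + " was not found in java.library.path");
        } else {
            System.out.println(fileName + " was found at: " + hits);
        }
    }
}
```

If the file is not found, one option is to start the JVM with `-Djava.library.path=` pointing to the directory that contains the DLL. Which directory that is depends on where the LWJGL natives were unpacked, so it is left open here.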
Regarding the histogram: Look how cleverly designed the NVIDIA histogram kernel is: https://github.com/gpu/JOCLSamples/blob/master/src/main/resources/kernels/Histogram256.cl With constants. And inlined functions. And __attribute__ modifiers for the kernel. And some volatile sprinkled in here and there.
I have no idea what that kernel is doing, but apparently, it contains a bug. Depending on what you want to do, you could just ignore that, or create a pull request to fix it. I'm not going to try and debug code that was written by someone at NVIDIA.
At present, I want to use JOCL for GPU parallel computing. Is there any problem with that?
Can you compile and execute the "OpenCL 64-bin and 256-bin Histogram" sample from https://developer.nvidia.com/opencl and say whether it passes on your GPU?
I have not compiled the "OpenCL 64-bin and 256-bin Histogram" sample. Can you tell me how to compile it? Should I use cmake-gui or VS2019?
You can compile it with Visual Studio 2008 (or newer).
I'll attach the compiled binary here. This is an EXE file. You should not execute EXE files that someone just posts on the internet. You should not trust anybody. But I'll attach it here, and leave it to you.
Beyond that: When you want to use JOCL for GPU-based computations, then just do this. The fact that there may be a bug in a kernel that was taken from the NVIDIA samples does not matter at all.
Or to put it that way:
How should I fix this bug? Could this issue be marked as "resolved" when I just delete that sample? Just drop me a note, I can delete it at any time. I don't care.
Sorry for the late reply. I have executed the EXE that you gave me, but the HistogramNVIDIA test still failed.
The output of that EXE should basically be the same as for the "HistogramNVIDIA" example.
If the output of the EXE says that the computation "failed", then this shows that there is a bug in the NVIDIA sample, and you should send a bug report to NVIDIA.
If the output of the EXE says that the test "passed", then something is wrong in the HistogramNVIDIA sample. But I don't have any way (and certainly no time) to analyze and fix that right now.
I'd say that you should ignore the fact that the Histogram sample does not work. If you can not ignore that, then I'd recommend you to not use JOCL, but https://www.lwjgl.org/ instead. It also has support for OpenCL, and you can do GPU-based computations with LWJGL. Look at their samples to get started: https://github.com/LWJGL/lwjgl3/tree/master/modules/samples/src/test/java/org/lwjgl/demo/opencl
Thank you for helping me. I checked. I really don't need the histogram production function in JOCL. There may be a problem with NVIDIA's OpenCL API. The executable you gave me didn't find any gaps in the success of the output test
The executable you gave me didn't find any gaps in the success of the output test
Does this mean that the EXE computes the correct result?
If so, I'll have to look whether they changed something in the kernel. The results should be equal for the EXE and for JOCL.
Here is the log:
clGetPlatformID...
Get the Device info and select Device...
Using Device 0: NVIDIA GeForce RTX 3060 Laptop GPU
D:\chrome down load\oclHistogram_compiled_win64\oclHistogram.exe Starting...
Initializing data...
Initializing OpenCL...
Allocating OpenCL memory...
Initializing 64-bin OpenCL histogram...
...loading Histogram64.cl from file
As you can see, the log does not tell me whether the test succeeded or not.
Oh dear. Apparently it tries to load the .CL files at runtime. (And... does not print an error message when it does not find them... 🙄 )
Can you add the attached files into the same directory as the .EXE file, and try it again?
The output should be something like this:
[oclHistogram.exe] starting...
clGetPlatformID...
Get the Device info and select Device...
# of Devices Available = 1
Using Device 0: NVIDIA GeForce RTX 2070 SUPER
# of Compute Units = 40
oclHistogram.exe Starting...
Initializing data...
Initializing OpenCL...
Allocating OpenCL memory...
Initializing 64-bin OpenCL histogram...
...loading Histogram64.cl from file
...creating histogram64 program
...building histogram64 program
...creating histogram64 kernels
...allocating internal histogram64 buffer
Writing ptx to separate file: Histogram64.ptx ...
Running 64-bin OpenCL histogram for 67108864 bytes...
Validating 64-bin histogram OpenCL results...
...reading back OpenCL results
...histogram64CPU()
...comparing the results
...64-bin histograms match
Shutting down 64-bin OpenCL histogram
Initializing 256-bin OpenCL histogram...
...loading Histogram256.cl
...creating histogram256 program
...building histogram256 program
...creating histogram256 kernels
...allocating internal histogram256 buffer
Writing ptx to separate file: Histogram256.ptx ...
Running 256-bin OpenCL histogram for 67108864 bytes...
Validating 256-bin histogram OpenCL results...
...reading back OpenCL results
...histogram256CPU()
...comparing the results
...256-bin histograms match
Shutting down 256-bin OpenCL histogram
Shutting down...
[oclHistogram.exe] test results...
PASSED
> exiting in 3 seconds: 3...2...1...done!
The point is:
When it says 'FAILED', then ... there's a bug in the sample code.
When it says 'PASSED' for you, then I'll have to check why it does not work with JOCL.
(In the best case, it's just a bug in the kernel, and I "only" have to replace https://github.com/gpu/JOCLSamples/blob/master/src/main/resources/kernels/Histogram256.cl with the latest one from the sample, maybe with minor adjustments. But if that's not the reason, I'd have to spend more time for analyzing that, and ... that has low priority for me right now...)
The test failed, oh No
That's good news for me. So the reason for the test failure is not in JOCL, but in the kernel that was provided by NVIDIA.
What that means for you ... I'm not sure. But I'm a bit curious now: When you run the test in JOCL (from the comment above) multiple times, does it always print the same values in the
GPU: [4088, 4006, 3979, 4223, ...
output, or are these values different each time that you run it?
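Instead of comparing the printed output of several runs manually, the check could also be automated. The following is a rough sketch: `RepeatabilityCheck` and `isRepeatable` are hypothetical names, and the `Supplier<int[]>` is only a stand-in for whatever code runs the kernel and reads back the GPU histogram.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical helper, not part of JOCL or the NVIDIA sample.
class RepeatabilityCheck {

    /**
     * Runs the given computation the given number of times and returns
     * whether every run produced the same array as the first run.
     */
    static boolean isRepeatable(Supplier<int[]> computation, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be at least 1");
        }
        int[] first = computation.get();
        for (int i = 1; i < runs; i++) {
            if (!Arrays.equals(first, computation.get())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for the GPU histogram: always returns the same values.
        // In the real sample, this would run the kernel and read back the result.
        Supplier<int[]> computation = () -> new int[] { 4088, 4006, 3979, 4223 };
        System.out.println("Repeatable: " + isRepeatable(computation, 5));
    }
}
```

Values that change between runs would hint at a timing-dependent problem such as a race condition, while identical values would rather point to a systematic error. This interpretation is an assumption, not something that was verified for this kernel.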
The values are the same each time I run it.
One could try to dive deeper into this. Maybe some of the magic constants (WARP_COUNT, WARP_SIZE ...) in the NVIDIA kernel are not suitable for a "new" GPU like the RTX 3060? In any case: When the problem also appears in the .EXE that was compiled from the official NVIDIA sample, then this is not an issue of JOCL.
I have installed the NVIDIA OpenCL SDK and have the DLL file, but I cannot use it in IDEA.