PiRSquared17 / aparapi

Automatically exported from code.google.com/p/aparapi

OpenCL fails to compile when a Kernel is executed inside a loop #135

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a Kernel inside a for loop that iterates at least 49 times

for (int i = 0; i < 1000; i++) {
    System.out.println(i);
    Kernel kernel = new Kernel() {
        @Override
        public void run() {
        }
    };
    kernel.execute(1024);
}

What is the expected output? What do you see instead?

Expected:
1
2
3
...
...
997
998
999

Observed:
1
2
3
...
48
49
Dec 03, 2013 8:02:27 PM com.amd.aparapi.KernelRunner warnFallBackAndExecute
WARNING: Reverting to Java Thread Pool (JTP) for class pkg1.Main$1: OpenCL 
compile failed
50
Dec 03, 2013 8:02:27 PM com.amd.aparapi.KernelRunner warnFallBackAndExecute
WARNING: Reverting to Java Thread Pool (JTP) for class pkg1.Main$1: OpenCL 
compile failed

What version of the product are you using? On what operating system?

Aparapi_2013_01_23_windows_x86_64.zip
Windows 7 x64

Original issue reported on code.google.com by shortk...@gmail.com on 4 Dec 2013 at 1:06

GoogleCodeExporter commented 9 years ago
Two comments.
The first is an obvious one, so please don't be offended: if your run method
is *really* empty (you might just be hiding the gory details ;) ), it probably
will not compile.  I have never tested this.  If you were just showing the
structure (and your run method is indeed doing work), then ignore this
comment ;)

The second is that you want to hoist your kernel creation out of the loop. 

Kernel kernel = new Kernel() {
    @Override
    public void run() {
    }
};

for (int i = 0; i < 1000; i++) {
    System.out.println(i);
    kernel.execute(1024);
}

Otherwise your performance will suck.  We create OpenCL code for each new
instance of Kernel we see, so in your original loop we have to generate and
compile your OpenCL 1000 times.

Also, you are not disposing of any of the kernels in this tight loop, so it is
storing buffers, queues and all sorts of stuff behind the scenes.

Now you may have found a resource leak (when we do try to compile 1000 times!)
and that may be what is failing.  Try reusing your kernel and see if that
still gives you the same error.

Gary

Original comment by frost.g...@gmail.com on 4 Dec 2013 at 3:32
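
The two suggestions above (hoist the kernel out of the loop, and dispose of kernels that are no longer needed) can be sketched with a toy model. This is not Aparapi code: `ToyDevice`, `ToyKernel`, and the `MAX_LIVE` cap of 49 are hypothetical stand-ins, picked only to resemble the reported failure, and the sketch assumes (without verification) that undisposed kernels hold on to a limited resource.

```java
public class KernelReuseSketch {

    /** Toy device with a hard cap on live kernels (an assumption, not Aparapi behavior). */
    static final class ToyDevice {
        static final int MAX_LIVE = 49; // hypothetical limit, chosen to resemble the report
        int live = 0;                   // kernels created but not yet disposed
    }

    /** Hypothetical stand-in for a kernel that "compiles" lazily on first execute. */
    static final class ToyKernel implements AutoCloseable {
        private final ToyDevice device;
        private boolean compiled = false;
        private boolean disposed = false;

        ToyKernel(ToyDevice device) {
            this.device = device;
            device.live++;
        }

        /** Returns true if the toy compile succeeded, false if it would have fallen back. */
        boolean execute(int range) {
            if (!compiled && device.live > ToyDevice.MAX_LIVE) {
                return false; // models the "OpenCL compile failed" fallback
            }
            compiled = true;
            return true;
        }

        void dispose() {
            if (!disposed) {
                disposed = true;
                device.live--;
            }
        }

        @Override
        public void close() {
            dispose();
        }
    }

    /** Pattern from the report: a new kernel per iteration, never disposed. */
    public static int runLeaky(int iterations) {
        ToyDevice device = new ToyDevice();
        int ok = 0;
        for (int i = 0; i < iterations; i++) {
            ToyKernel kernel = new ToyKernel(device);
            if (kernel.execute(1024)) ok++;
        }
        return ok;
    }

    /** Suggested pattern: create the kernel once and reuse it. */
    public static int runHoisted(int iterations) {
        ToyDevice device = new ToyDevice();
        ToyKernel kernel = new ToyKernel(device);
        int ok = 0;
        for (int i = 0; i < iterations; i++) {
            if (kernel.execute(1024)) ok++;
        }
        kernel.dispose();
        return ok;
    }

    /** Alternative when a fresh kernel per iteration is required: dispose each one. */
    public static int runDisposing(int iterations) {
        ToyDevice device = new ToyDevice();
        int ok = 0;
        for (int i = 0; i < iterations; i++) {
            try (ToyKernel kernel = new ToyKernel(device)) {
                if (kernel.execute(1024)) ok++;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(runLeaky(1000));     // prints 49
        System.out.println(runHoisted(1000));   // prints 1000
        System.out.println(runDisposing(1000)); // prints 1000
    }
}
```

In this model only the leaky loop hits the cap; whether the real failure has the same cause is exactly what reusing the kernel would help confirm.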

GoogleCodeExporter commented 9 years ago
clBuildProgram() is not thread-safe, as I remember, so Java may be doing some
automatic optimization in the JIT compiler, and those parallel
clBuildProgram() calls may be blocking each other, since I suspect each kernel
calls clBuildProgram() when it is created. My Navier-Stokes advector/diffuser
had 50-60 kernels; building them all took 4 seconds, and OpenCL did not let me
build them on multiple threads.

Original comment by huseyin....@gmail.com on 14 Jan 2014 at 12:56
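
If the build step really cannot run concurrently (an assumption taken from the comment above and not verified here), one way to keep parallel kernel setup from colliding is to funnel every build through a single lock. The sketch below uses only the Java standard library; the body of the synchronized block is a hypothetical placeholder for the native compile call, not an Aparapi or OpenCL binding.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SerializedBuildSketch {

    /**
     * Submits kernelCount hypothetical builds to a pool of threadCount threads,
     * serializing the build step itself behind one lock.
     * Returns { number of builds completed, maximum builds observed in flight }.
     */
    public static int[] buildAll(int kernelCount, int threadCount) {
        final Object buildLock = new Object();
        final AtomicInteger inFlight = new AtomicInteger();
        final AtomicInteger maxInFlight = new AtomicInteger();
        final AtomicInteger built = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        for (int i = 0; i < kernelCount; i++) {
            pool.execute(() -> {
                synchronized (buildLock) {
                    // Track how many builds overlap; with the lock this stays at 1.
                    int now = inFlight.incrementAndGet();
                    maxInFlight.accumulateAndGet(now, Math::max);
                    // Placeholder: the (assumed non-thread-safe) native build would go here.
                    built.incrementAndGet();
                    inFlight.decrementAndGet();
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("builds did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for builds", e);
        }
        return new int[] { built.get(), maxInFlight.get() };
    }

    public static void main(String[] args) {
        int[] result = buildAll(60, 8);
        System.out.println(result[0] + " builds, max in flight " + result[1]);
    }
}
```

Serializing the builds does not make them faster; it only ensures they never overlap, so total build time is still the sum of the individual builds.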