uygn / aparapi

Automatically exported from code.google.com/p/aparapi
Other
0 stars 0 forks source link

Explicit write ignored if field is declared constant #115

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
When using explicit writes, a field annotated Constant is not updated when 
Kernel.put() is called. In Aparapi.cpp updateWriteEvents line 573, there is a 
comment that says: "the default behavior for Constant buffers is also that 
there is no write enqueued unless explicit". So I think the write to a constant 
field should be enqueued if an explicit write is requested. I've fixed this 
locally, but it seems to be related to two separate bugs.

When using the pre-built package (appears to use old sources), the bug is that 
the constant field is not updated despite an explicit write request. When 
building from the latest trunk, an additional bug is that the field is not 
considered constant (which make the problem appear to go away by not treating 
it as constant).

When I downloaded the trunk and compiled both the jar and dll, the problem went 
away. However, on closer inspection, the field was no longer being treated as 
Constant (likewise, annotating with Local seemed to be ignored as well). This 
seems to be a bug in com.amd.aparapi.internal.kernel.KernelRunner, due to using 
the wrong annotations. The class imports the Local and Constant annotations 
from com.amd.aparapi.opencl.OpenCL, when I think they should be imported from 
com.amd.aparapi.Kernel. Once this was changed, using the annotations to 
indicate a local/constant field seemed to work. The workaround for this would 
be to use the suffix instead of the annotation.

Remove:
  import com.amd.aparapi.opencl.OpenCL.Constant;
  import com.amd.aparapi.opencl.OpenCL.Local;
Add:
  import com.amd.aparapi.Kernel.Constant;
  import com.amd.aparapi.Kernel.Local;

After this was fixed, the original problem re-appeared (constant fields not 
being updated when explicit write requested). This seems to be a problem in 
Aparapi.cpp, where the write enqueue is skipped if the field is marked 
constant. Line 720 (processArgs):
    if (arg->needToEnqueueWrite() && !arg->isConstant())

After adding a check for an explicit write, the field seemed to be updated:
    if (arg->needToEnqueueWrite() && (!arg->isConstant() || arg->isExplicitWrite()))

The workaround here is to not declare a field as Constant if you need to update 
it between executions.

(unrelated note, I had some issues getting it to compile under Windows 64-bit 
due to sizeof long = 4. I have a set of changes to fix this, using long long in 
some places, or %p where addresses are being printed, if this is of any 
interest).

Original issue reported on code.google.com by paul.mi...@gmail.com on 11 Jun 2013 at 5:30

GoogleCodeExporter commented 9 years ago
Paul, thanks for your work root causing (and offering a fix for this). 

I am away on vacation, but will take a deeper look at this when I return. 

Thanks again for your contribution. 

Original comment by frost.g...@gmail.com on 11 Jun 2013 at 11:16

GoogleCodeExporter commented 9 years ago
No problem, this is only my 2nd/3rd day playing with GPU computing so I've
been pretty excited about it. It's taken a lot of work but I'm finally
seeing a significant speed-up over my plain non-GPU code.

Somewhat related, I added a simple fix for issue 61 (well, a big speed
boost in JTP mode anyway), and an RFE in issue 116 based on other changes
I've made. If I can help let me know :)

-Paul

Original comment by paul.mi...@gmail.com on 12 Jun 2013 at 2:11

GoogleCodeExporter commented 9 years ago
Ran across the same naming confusion problem in KernelWriter.java, the import 
specified the annotations from the OpenCL class instead of the Kernel class, 
which manifested itself as the error "clSetKernelArg() (local) failed invalid 
arg size". The generated OpenCL program declared them as __global instead of 
__local. As before, using the suffix is a workaround.

Original comment by paul.mi...@gmail.com on 12 Jun 2013 at 10:34

GoogleCodeExporter commented 9 years ago
Yes, I apologize for that bug, it was most likely my bug when we refactored the 
code into separate packages.

As a note, I am not entirely satisfied with how annotations were handled, even 
after the refactoring. Unfortunately, there are insufficient guidelines posted 
on how to properly name, package and manage Java annotations...everyone seems 
to do it differently. In an effort to keep annotations compatible, they were 
left with the same names, only moved around a little. I considered enforcing a 
new rule that all annotations needed to be uniquely named, so these kinds of 
confusions didn't happen, but thought maybe that would be better handled with a 
later patch.

Any thoughts on this?

Also a patch for your work would be great :) Thanks!

Original comment by ryan.lam...@gmail.com on 13 Jun 2013 at 4:57

GoogleCodeExporter commented 9 years ago
I don't think I can generate an accurate patch for KernelRunner.java, because I 
have numerous other changes to the file (this bugfix, changes for issue 61, and 
my own extensions for rfe/issue 116). For KernelWriter.java, I can generate a 
patch that includes the main change for issue 117 as well.

Regarding naming confusion, my "rule" is just not to import nested/inner 
classes, because their names are often somewhat reliant on their containing 
class (e.g. you have an inner class named "Data", which is unambiguous within 
the containing class but overly broad outside of it).

Original comment by paul.mi...@gmail.com on 13 Jun 2013 at 7:24

Attachments:

GoogleCodeExporter commented 9 years ago
Okay, here's a patch for KernelRunner.java with my extensions removed. It also 
includes an improvement for issue 61.

Original comment by paul.mi...@gmail.com on 13 Jun 2013 at 7:37

Attachments:

GoogleCodeExporter commented 9 years ago
Okay, here's a patch for Aparapi.cpp. This includes the minor change mentioned 
in issue 117, and changes I made to get it to compile under 64-bit Windows 
(fprintf and long long stuff).

Original comment by paul.mi...@gmail.com on 13 Jun 2013 at 7:53

Attachments:

GoogleCodeExporter commented 9 years ago
Paul

Thanks for these.  I plan on walking through these (started with the JTP 
changes) they look very promising. 

Ryan
I know what you mean WRT annotations.  I kind of like making them inner 
classes, but (as you will have discovered) I have tried multiple approaches. 
Any suggestions would indeed be welcome. 

Gary

Gary

Original comment by frost.g...@gmail.com on 17 Jun 2013 at 10:19

GoogleCodeExporter commented 9 years ago
Regarding the JTP change, it's optimized for the case where kernels are created 
for re-use, as opposed to repeated creation/disposal. In the case of the later, 
some of the benefit is lost because the thread pool is not re-used as 
effectively. It will be re-used for multiple passes, but not across kernels.

If using it across multiple kernel instantiations is desired, the thread pool 
could be made a static field (with synchronization for thread safety). The 
shutdown() would be removed from dispose(), and the call to 
newCachedThreadPool() could be supplied a ThreadFactory that supplied daemon 
threads (so that the JVM would not need to wait for the thread pool to expire 
before exiting naturally).

Original comment by paul.mi...@gmail.com on 18 Jun 2013 at 4:41

GoogleCodeExporter commented 9 years ago
Paul I have committed your patches for KernelRunner.java and Aparapi.cpp above, 
and yes  I do see the JTP performance improvement in some workloads.  This is a 
great fix thanks. 

Patches were committed as #r1280

I am still reviewing the kernelWriter change WRT this and issue #117

Gary

Original comment by frost.g...@gmail.com on 18 Jun 2013 at 6:26