Open GoogleCodeExporter opened 9 years ago
Paul, thanks for your work root-causing this (and offering a fix).
I am away on vacation, but will take a deeper look at this when I return.
Thanks again for your contribution.
Original comment by frost.g...@gmail.com
on 11 Jun 2013 at 11:16
No problem, this is only my 2nd/3rd day playing with GPU computing so I've
been pretty excited about it. It's taken a lot of work but I'm finally
seeing a significant speed-up over my plain non-GPU code.
Somewhat related, I added a simple fix for issue 61 (well, a big speed
boost in JTP mode anyway), and an RFE in issue 116 based on other changes
I've made. If I can help let me know :)
-Paul
Original comment by paul.mi...@gmail.com
on 12 Jun 2013 at 2:11
I ran across the same naming confusion in KernelWriter.java: the import
specified the annotations from the OpenCL class instead of the Kernel class,
which manifested itself as the error "clSetKernelArg() (local) failed invalid
arg size". The generated OpenCL program declared them as __global instead of
__local. As before, using the suffix is a workaround.
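To illustrate the kind of mix-up described above, here is a minimal sketch in plain Java. It is not Aparapi's code: `KernelSide`, `OpenCLSide`, `UserKernel` and `qualifierFor` are hypothetical stand-ins, and the lookup logic is only an assumed reading of how a writer might choose an address-space qualifier from an annotation. Two nested annotations share the simple name `Local`, and checking the wrong one silently falls through to `__global`.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical stand-in for the class whose annotations users apply.
final class KernelSide {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Local {}
}

// Hypothetical stand-in for a second class that declares an annotation
// with the same simple name.
final class OpenCLSide {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Local {}
}

// User code marks its buffer with the KernelSide annotation.
final class UserKernel {
    @KernelSide.Local
    int[] scratch = new int[64];
}

final class AddressSpaceDemo {
    // Picks a qualifier by checking for one specific annotation type.
    // Which "Local" is checked depends entirely on the type passed in,
    // just as it would depend on which "Local" an import resolved to.
    static String qualifierFor(String fieldName, Class<? extends Annotation> localType) {
        try {
            boolean isLocal = UserKernel.class
                    .getDeclaredField(fieldName)
                    .isAnnotationPresent(localType);
            return isLocal ? "__local" : "__global";
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no such field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        // Checking the annotation the user actually applied.
        System.out.println(qualifierFor("scratch", KernelSide.Local.class));
        // Checking the same-named annotation from the other class.
        System.out.println(qualifierFor("scratch", OpenCLSide.Local.class));
    }
}
```

Running `main` prints `__local` and then `__global`: nothing fails at compile time, the buffer is simply declared in the wrong address space.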
Original comment by paul.mi...@gmail.com
on 12 Jun 2013 at 10:34
Yes, I apologize for that bug; it was most likely mine, introduced when we
refactored the code into separate packages.
As a note, I am not entirely satisfied with how annotations were handled, even
after the refactoring. Unfortunately, there are insufficient guidelines posted
on how to properly name, package and manage Java annotations...everyone seems
to do it differently. In an effort to keep annotations compatible, they were
left with the same names, only moved around a little. I considered enforcing a
new rule that all annotations needed to be uniquely named, so these kinds of
confusions didn't happen, but thought maybe that would be better handled with a
later patch.
Any thoughts on this?
Also a patch for your work would be great :) Thanks!
Original comment by ryan.lam...@gmail.com
on 13 Jun 2013 at 4:57
I don't think I can generate an accurate patch for KernelRunner.java, because I
have numerous other changes to the file (this bugfix, changes for issue 61, and
my own extensions for rfe/issue 116). For KernelWriter.java, I can generate a
patch that includes the main change for issue 117 as well.
Regarding naming confusion, my "rule" is just not to import nested/inner
classes, because their names are often somewhat reliant on their containing
class (e.g. you have an inner class named "Data", which is unambiguous within
the containing class but overly broad outside of it).
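A small sketch of that rule, using made-up classes (`Inventory`, `Telemetry` and `QualifiedNamesDemo` are hypothetical and unrelated to Aparapi): both outer classes declare a nested `Data`, and referring to each through its container keeps every use unambiguous without importing either nested class.

```java
// Two unrelated classes that each declare a nested class named Data.
final class Inventory {
    static final class Data {
        final int count;
        Data(int count) { this.count = count; }
    }
}

final class Telemetry {
    static final class Data {
        final double latencyMs;
        Data(double latencyMs) { this.latencyMs = latencyMs; }
    }
}

final class QualifiedNamesDemo {
    // Neither Data is imported; the outer class name says which one is meant.
    static String describe(Inventory.Data stock, Telemetry.Data sample) {
        return stock.count + " items, " + sample.latencyMs + " ms";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Inventory.Data(3), new Telemetry.Data(1.5)));
    }
}
```

The cost is slightly longer type names; the benefit is that a reader never has to check the import list to find out which `Data` is in play.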
Original comment by paul.mi...@gmail.com
on 13 Jun 2013 at 7:24
Okay, here's a patch for KernelRunner.java with my extensions removed. It also
includes an improvement for issue 61.
Original comment by paul.mi...@gmail.com
on 13 Jun 2013 at 7:37
Okay, here's a patch for Aparapi.cpp. This includes the minor change mentioned
in issue 117, and changes I made to get it to compile under 64-bit Windows
(fprintf and long long stuff).
Original comment by paul.mi...@gmail.com
on 13 Jun 2013 at 7:53
Paul
Thanks for these. I plan on walking through them (I have started with the JTP
changes); they look very promising.
Ryan
I know what you mean WRT annotations. I kind of like making them inner
classes, but (as you will have discovered) I have tried multiple approaches.
Any suggestions would indeed be welcome.
Gary
Original comment by frost.g...@gmail.com
on 17 Jun 2013 at 10:19
Regarding the JTP change, it's optimized for the case where kernels are created
for re-use, as opposed to repeated creation/disposal. In the case of the latter,
some of the benefit is lost because the thread pool is not re-used as
effectively. It will be re-used for multiple passes, but not across kernels.
If using it across multiple kernel instantiations is desired, the thread pool
could be made a static field (with synchronization for thread safety). The
shutdown() would be removed from dispose(), and the call to
newCachedThreadPool() could be supplied a ThreadFactory that supplied daemon
threads (so that the JVM would not need to wait for the thread pool to expire
before exiting naturally).
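The shared-pool variant described above could look roughly like the following. This is a sketch under assumptions, not the committed patch: `SharedJtpPool` and its methods are hypothetical names, and a `static final` initializer is used here as one way to get thread-safe creation of the pool. `Executors.newCachedThreadPool(ThreadFactory)` and `Thread.setDaemon` are standard JDK calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder for a pool shared by every kernel instance.
final class SharedJtpPool {
    private static final AtomicInteger THREAD_IDS = new AtomicInteger();

    // Daemon threads do not keep the JVM alive, so no shutdown() is required
    // for the process to exit once the non-daemon threads have finished.
    private static final ThreadFactory DAEMON_FACTORY = runnable -> {
        Thread t = new Thread(runnable, "jtp-worker-" + THREAD_IDS.incrementAndGet());
        t.setDaemon(true);
        return t;
    };

    // Created once during class initialization, which the JVM makes thread-safe.
    private static final ExecutorService POOL =
            Executors.newCachedThreadPool(DAEMON_FACTORY);

    private SharedJtpPool() {}

    // Runs one "pass": n independent tasks on the shared pool, then joins them.
    // A kernel's dispose() would simply not touch POOL in this design.
    static int sumOfSquares(int n) {
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            final int v = i;
            Callable<Integer> task = () -> v * v;
            futures.add(POOL.submit(task));
        }
        int total = 0;
        try {
            for (Future<Integer> f : futures) {
                total += f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return total;
    }

    // Reports whether tasks really run on daemon threads.
    static boolean workerIsDaemon() {
        Callable<Boolean> probe = () -> Thread.currentThread().isDaemon();
        try {
            return POOL.submit(probe).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(3));
        System.out.println(workerIsDaemon());
    }
}
```

The trade-off is that a process-wide pool is never explicitly shut down; idle threads in a cached pool are reclaimed after the default keep-alive, and the daemon flag covers JVM exit.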
Original comment by paul.mi...@gmail.com
on 18 Jun 2013 at 4:41
Paul, I have committed your patches for KernelRunner.java and Aparapi.cpp above,
and yes, I do see the JTP performance improvement in some workloads. This is a
great fix, thanks.
Patches were committed as #r1280.
I am still reviewing the KernelWriter change WRT this and issue #117.
Gary
Original comment by frost.g...@gmail.com
on 18 Jun 2013 at 6:26
Original issue reported on code.google.com by
paul.mi...@gmail.com
on 11 Jun 2013 at 5:30