lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda
Other
287 stars 94 forks source link

start tagging kernels #1427

Closed jcosborn closed 3 months ago

jcosborn commented 7 months ago

This is an initial set of changes to start tagging kernel operations that require special resources (e.g. shared memory) or other treatment (e.g. block synchronization). This initial version only enables tagging for ThreadLocalCache and thread_array.

weinbe2 commented 4 months ago

Well, first of all, this is a very impressive design/implementation, and I like it as a whole.

I can see the approach going on here---doing all of the work at compile time---and I appreciate it for what it is. My other point of reference is the way Kokkos hands around its "scratch" memory pointers, putting all of the onus on the user to get it right, which has cost me a lot of time and sanity.

I'm good on all of the plumbing work (sans one max vs sum question, but since the code isn't breaking I'm sure there's something my brain can't handle right now). Nearly all of my comments were related to my thoughts on conventions and some requests for more detailed documentation. I emphasized conventions so much because they bounced around a bit in the different files---which I understand is just a byproduct of the initial development process---it just made it hard to follow what was going on at first, so I'd want to agree of this type of thing before it gets merged!

jcosborn commented 4 months ago

Thanks for taking time to review @weinbe2. I think I've addressed all the comments except for the issue of whether to make an alias for *this or pass it directly. I can change it to an alias if there is an overall preference to do so.

I added more documentation in the headers. Let me know if there is anything missing or could be made clearer.

I also merged in the latest develop so it should mostly ready now, pending final review.

weinbe2 commented 4 months ago

Thanks @jcosborn , I think the documentation you added more than suffices. It's made both the underlying assumptions and proper usage clearer. The PR is good to go in my book.

As for:

[...] except for the issue of whether to make an alias for *this or pass it directly.

I noted this elsewhere, but to make the thread easier to follow: I'm ultimately fine leaving it as *this.

@maddyscientist , do you want to do a last review?

maddyscientist commented 3 months ago

No objections from me with this PR. @weinbe2 did you do any run-time sanity checks as part of your review?

weinbe2 commented 3 months ago

cscs-ci run

maddyscientist commented 3 months ago

cscs-ci run