NVIDIA / cutlass

CUDA Templates for Linear Algebra Subroutines
Other
5.66k stars 971 forks source link

[FEA] Passing the copy type as a parameter to `copy` #1672

Open ZelboK opened 3 months ago

ZelboK commented 3 months ago

Kind of a small change. So I was looking at https://github.com/NVIDIA/cutlass/issues/1231

and I was wondering if it made sense to refactor the code so that it will accept the type of copy they want to rely on like so copy(gmem_tiled_copy, SM80_CP_ASYNC_CACHEALWAYS<float>{}, tAgA, tAsA) so that way you can have the user be aware of what's going on internally. Right now if you use DefaultCopy for example it'll dispatch to that instruction which requires the code to have more synchronization which is a little unintuitive.

You might be able to get the copy type inferred from gmem_tiled_copy but passing it as an explicit parameter is probably much easier.

I'm wondering if there's any flaws/problems with doing this? Perhaps the default should always be copy(gmem_tiled_copy, DefaultCopy{}, tAgA, tAsA).

I could work on a PR for implementing this if it's beneficial. Would like to know ahead of time if this is even desired or if there are problems with this approach before I commit my time to a PR though.

thakkarV commented 3 months ago

You can use UniversalCopy as a means of always dispatching to the SIMT load/store rather than cp async copies. A deeper refactor of this automatic dispatch is planned but won't land anytime soon.

thakkarV commented 3 months ago

If you would like to work on a PR, that would be lovely! @ccecka CC

thakkarV commented 3 months ago

I suspect much of this can be accomplished by just renaming UniversalCopy to be the new DefaultCopy so that it never goes through automatic LDGSTS route again

ZelboK commented 3 months ago

Sure, I'll push up a PR today. I'll see if just replacing the defaultcopy to be universalcopy is enough.

github-actions[bot] commented 2 months ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.