Open jeromeku opened 2 months ago
What is your question?

I'm looking to define a GEMM that does the following (in pseudocode):

That is, the epilogue should a) compute the column-wise 2-norm of `D` and b) store `F` to global memory, with no need to store `D`. (The 2-norm is the `sqrt` of the `sum of squares` along `axis=1`.)

What's the most appropriate epilogue type for this pattern, specifically for `Ampere`? `EVT` would fit this well. Are there examples of this that are NOT for `stream-k`? I always get compilation errors when trying to instantiate an EVT device GEMM for `Ampere` (see #1459).
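
For reference, the computation being asked about can be sketched as plain host-side C++. This is not CUTLASS code and not an epilogue implementation. It is only a minimal reference for checking results, and it rests on assumptions: `D = A * B` with row-major `A` (M x K) and `B` (K x N), and "along `axis=1`" read as reducing over the columns of `D`, giving one value of `F` per row. The function name `gemm_row_norm_reference` is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of CUTLASS).
// Assumes D = A * B, with A (M x K) and B (K x N) stored row-major.
// Computes F[i] = sqrt(sum_j D[i][j]^2), i.e. the 2-norm along axis=1,
// without ever materializing D.
std::vector<float> gemm_row_norm_reference(const std::vector<float>& A,
                                           const std::vector<float>& B,
                                           std::size_t M, std::size_t N,
                                           std::size_t K) {
  std::vector<float> F(M, 0.0f);
  for (std::size_t i = 0; i < M; ++i) {
    float sum_sq = 0.0f;  // running sum of squares for row i of D
    for (std::size_t j = 0; j < N; ++j) {
      float d = 0.0f;  // one element D[i][j], kept only in a local
      for (std::size_t k = 0; k < K; ++k) {
        d += A[i * K + k] * B[k * N + j];
      }
      sum_sq += d * d;
    }
    F[i] = std::sqrt(sum_sq);
  }
  return F;
}
```

If the intended reduction is instead over the rows of `D` (one value per column), the `i` and `j` loops would swap roles and `F` would have `N` entries. In a tiled GPU kernel the sum of squares for one row generally spans several threadblock tiles, so a fused epilogue would need some cross-tile reduction before the final `sqrt`.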