codeplaysoftware / cutlass-fork

CUDA Templates for Linear Algebra Subroutines

Define GmemTiledCopyA/B as TiledCopy in CollectiveMma<IntelPVC,... #153

Closed by joeatodd 1 week ago

joeatodd commented 1 week ago

This matches what the EVT test suite expects, and seems to be consistent with the sm80 implementations.

This change would also require updating the pvc_gemm*cpp examples.

joeatodd commented 1 week ago

Wrong approach!