codeplaysoftware / cutlass-fork

CUDA Templates for Linear Algebra Subroutines

SM80 Collective Builder #139

Open AD2605 opened 2 months ago

AD2605 commented 2 months ago

A collective builder API for Ampere GEMM, covering different data types. Only supports LinearCombination as the epilogue at the moment.
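
For context on the epilogue restriction: a linear-combination epilogue conventionally computes `D = alpha * accumulator + beta * C` element-wise on the GEMM result. The snippet below is a minimal host-side sketch of that arithmetic only, assuming `float` data and flat storage. The helper name `linear_combination_reference` is hypothetical and is not part of this PR or of the CUTLASS API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not a CUTLASS API): applies a
// linear-combination epilogue element-wise to a GEMM accumulator.
//   d[i] = alpha * acc[i] + beta * c[i]
std::vector<float> linear_combination_reference(float alpha,
                                                const std::vector<float>& acc,
                                                float beta,
                                                const std::vector<float>& c) {
    // The accumulator and the source matrix C must have the same extent.
    assert(acc.size() == c.size());

    std::vector<float> d(acc.size());
    for (std::size_t i = 0; i < acc.size(); ++i) {
        d[i] = alpha * acc[i] + beta * c[i];
    }
    return d;
}
```

A reference like this can be handy for checking a builder-generated kernel's output on small problem sizes, although the comparison tolerance would depend on the data types involved.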