ROCm / composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
https://rocm.docs.amd.com/projects/composable_kernel/en/latest/

Add instances for grouped conv fwd 3d with ConvScale for bf8@fp8->fp8 #1369

Closed: geyyer closed this 2 months ago

geyyer commented 3 months ago
andriy-ca commented 2 months ago

Everything looks good! I have no suggestions.