ROCm / composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
https://rocm.docs.amd.com/projects/composable_kernel/en/latest/
Other
251 stars 102 forks

Add functional support of AB group scale #1345

Open zjing14 opened 2 weeks ago

zjing14 commented 2 weeks ago
jianyuh commented 2 weeks ago

CC @jwfromm