ROCm / composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
https://rocm.docs.amd.com/projects/composable_kernel/en/latest/
Other
251 stars, 102 forks

layernorm2d forward #1339

Closed by rocking5566 1 week ago

rocking5566 commented 2 weeks ago

Implements layernorm2d forward using tile.