ROCm / composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
https://rocm.docs.amd.com/projects/composable_kernel/en/latest/

Fix the optional ckProfiler grouped_gemm arguments. #1368

Closed illsilin closed 3 months ago

illsilin commented 3 months ago

The optional arguments for ckProfiler grouped_gemm were read from argument indices that were off by one, which caused a runtime error. With the corrected indices, ckProfiler grouped_gemm runs correctly.
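To illustrate the kind of bug described, here is a minimal sketch of index-based optional argument parsing. It is not the ckProfiler source: `kNumRequired`, `OptionalArgs`, `parse_optional`, the field names, and the default values are all hypothetical, and the direction of the offset in the actual fix is an assumption.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical layout (not the real ckProfiler command line):
//   argv[0]                  program name
//   argv[1 .. kNumRequired]  required arguments
//   argv[kNumRequired + 1..] optional arguments
constexpr int kNumRequired = 3; // assumed count, for illustration only

struct OptionalArgs
{
    int n_warmup = 1;  // hypothetical default
    int n_iter   = 10; // hypothetical default
};

// Reads up to two optional integers starting at argv[first_optional].
// Arguments that are not present keep their defaults.
OptionalArgs parse_optional(int argc, const char* const* argv, int first_optional)
{
    assert(first_optional >= 1); // argv[0] is never an optional argument
    OptionalArgs out;
    if(argc > first_optional)
        out.n_warmup = std::atoi(argv[first_optional]);
    if(argc > first_optional + 1)
        out.n_iter = std::atoi(argv[first_optional + 1]);
    return out;
}
```

With the command line `prog a b c 5 50`, calling `parse_optional(6, argv, kNumRequired + 1)` reads `5` and `50` as intended. Passing `kNumRequired + 2` instead shifts every read by one slot, so `50` lands in `n_warmup` and `n_iter` silently keeps its default. In a real profiler a shifted value like this can reach a later stage as an invalid setting and fail at run time.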