Closed copybara-service[bot] closed 5 days ago
[XLA:GPU] Enable fusing of fp8 matmuls through Triton.
Move cuBLAS fp8 GEMM rewriter after Triton GemmFusion, so that the Triton path has a chance to trigger.
Don't normalize fp8 types to fp16 for dot instruction.
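
To illustrate why the ordering matters, here is a toy model of the pass pipeline. It is a sketch under stated assumptions, not XLA code: the `Dot` class, the `run` helper, and the three pass functions are hypothetical stand-ins named after the passes mentioned above, and "the first pass that matches claims the dot" is a simplification of the real rewriters' behavior.

```python
from dataclasses import dataclass


@dataclass
class Dot:
    """Hypothetical stand-in for a dot instruction (not an XLA type)."""
    dtype: str            # operand element type, e.g. "f8e4m3fn"
    handled_by: str = ""  # backend that claimed the dot; empty if unclaimed


def is_unclaimed_fp8(dot):
    return not dot.handled_by and dot.dtype.startswith("f8")


def triton_gemm_fusion(dot):
    # Assumption: the toy Triton pass accepts every fp8 dot it sees.
    if is_unclaimed_fp8(dot):
        dot.handled_by = "triton"


def cublas_fp8_rewriter(dot):
    # Assumption: the toy cuBLAS rewriter also accepts every fp8 dot.
    if is_unclaimed_fp8(dot):
        dot.handled_by = "cublas"


def normalize_fp8_to_fp16(dot):
    # Models the normalization this change stops applying to dots.
    if is_unclaimed_fp8(dot):
        dot.dtype = "f16"


def run(passes, dtype="f8e4m3fn"):
    """Apply the passes in order to a fresh fp8 dot and return it."""
    dot = Dot(dtype)
    for p in passes:
        p(dot)
    return dot


old_order = run([cublas_fp8_rewriter, triton_gemm_fusion])
new_order = run([triton_gemm_fusion, cublas_fp8_rewriter])
normalized = run([normalize_fp8_to_fp16, triton_gemm_fusion, cublas_fp8_rewriter])

print(old_order.handled_by)   # cuBLAS claims the dot before Triton runs
print(new_order.handled_by)   # Triton gets the first chance
print(normalized.dtype, repr(normalized.handled_by))  # no fp8 dot is left to fuse
```

In this model, running the cuBLAS rewriter first leaves nothing for Triton to fuse, and normalizing to fp16 beforehand removes the fp8 dot entirely, which is why both changes are needed together.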