Closed. htyu closed this pull request 2 weeks ago.
Summary: Adds the FP8 blockwise Triton kernel. The CUTLASS counterpart is not quite ready yet.
Differential Revision: D58615475
This pull request was exported from Phabricator. Differential Revision: D58615475
This pull request has been merged in pytorch/benchmark@fc298af7b9a9894643a077a989fe80c63ea5770b.