pytorch / benchmark

TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
BSD 3-Clause "New" or "Revised" License

Add FP8 blockwise triton kernel #2304

Closed htyu closed 2 weeks ago

htyu commented 2 weeks ago

Summary: Add the FP8 blockwise Triton kernel. The CUTLASS counterpart is not ready yet.

Differential Revision: D58615475

facebook-github-bot commented 2 weeks ago

This pull request was exported from Phabricator. Differential Revision: D58615475

facebook-github-bot commented 2 weeks ago

This pull request has been merged in pytorch/benchmark@fc298af7b9a9894643a077a989fe80c63ea5770b.