Closed: Aya-ZIbra closed this pull request 3 weeks ago.
| Name | Link |
|---|---|
| Latest commit | 0f4a11fbb9ff01617a54136a9a3745eb2b6541c2 |
| Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66c1c56e2325cc00083ec2dd |
| Deploy Preview | https://deploy-preview-3009--pytorch-fbgemm-docs.netlify.app |
This pull request was exported from Phabricator. Differential Revision: D60747287
This pull request has been merged in pytorch/FBGEMM@6bd8cd7d1df4a001e476f38fa5e81104e77d39a6.
Summary: This diff adds support for the Triton split-K attention kernel. It includes: 1/ prefill_varseq_attn and decode_attn, 2/ a dequantize kernel, and 3/ fused quantization in the RoPE functions.
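The core idea behind a split-K decode kernel (item 1) is that each split of the K/V cache computes a partial attention output together with its log-sum-exp, and the partials are then combined into the exact softmax result. The sketch below is a hypothetical NumPy illustration of that reduction, not the FBGEMM/Triton implementation; the function names `attn_reference` and `attn_splitk` are invented for this example.

```python
import numpy as np

def attn_reference(q, K, V):
    """Single-query attention: q (d,), K and V (n, d)."""
    s = K @ q / np.sqrt(q.shape[0])
    p = np.exp(s - s.max())
    p /= p.sum()
    return p @ V

def attn_splitk(q, K, V, num_splits=4):
    """Same result as attn_reference, computed over K/V splits.

    Each split yields a partial output and a log-sum-exp (LSE);
    the partials are combined with softmax weights over the LSEs,
    which is the reduction a split-K decode kernel performs.
    """
    d = q.shape[0]
    outs, lses = [], []
    for Ks, Vs in zip(np.array_split(K, num_splits),
                      np.array_split(V, num_splits)):
        s = Ks @ q / np.sqrt(d)
        m = s.max()
        p = np.exp(s - m)
        denom = p.sum()
        outs.append((p @ Vs) / denom)        # partial attention output
        lses.append(m + np.log(denom))       # log-sum-exp of this split
    lses = np.array(lses)
    w = np.exp(lses - lses.max())
    w /= w.sum()                             # softmax over split LSEs
    return np.einsum('s,sd->d', w, np.stack(outs))
```

Splitting the reduction this way lets independent thread blocks work on disjoint chunks of a long KV cache, which is what makes decode-time attention parallelize well for a single query.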
TODO: Dequantize + paged kv cache
Differential Revision: D60747287
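Fusing quantization into the RoPE functions (item 3 of the summary) means the rotated keys/values are quantized in the same pass that applies the rotary embedding, instead of writing fp values out and quantizing separately. The NumPy sketch below is a hypothetical illustration of that pattern with per-row int8 absmax quantization; the names `rope_quant_int8` and `dequant` are assumptions, not FBGEMM APIs.

```python
import numpy as np

def rope_quant_int8(x, positions, base=10000.0):
    """Apply rotary position embedding to x (n, d), then quantize each
    rotated row to int8 with a per-row absmax scale (fused in one pass)."""
    n, d = x.shape
    half = d // 2
    inv_freq = 1.0 / base ** (np.arange(half) / half)
    ang = positions[:, None] * inv_freq[None, :]
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, :half], x[:, half:]
    # rotate-half RoPE formulation
    rot = np.concatenate([x1 * cos - x2 * sin,
                          x1 * sin + x2 * cos], axis=1)
    # fused int8 quantization: per-row absmax scale
    scale = np.abs(rot).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
    q = np.clip(np.round(rot / scale), -128, 127).astype(np.int8)
    return q, scale

def dequant(q, scale):
    """Recover approximate fp32 values from int8 codes and scales."""
    return q.astype(np.float32) * scale
```

Because RoPE is a pure rotation, row norms are preserved up to quantization error, which gives a quick sanity check on the fused path.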