pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

Marlin Mixed Input Kernel Productionization #3008

Closed jwfromm closed 3 weeks ago

jwfromm commented 3 weeks ago

Summary: This diff gives our Marlin BF16 x I4 kernels a substantial facelift. The improvements include:

One downside of this work is that we have diverged somewhat from vLLM, so it may be harder to stay in sync going forward. However, I think the benefits of the improvements in this diff outweigh the potential sync costs.

Differential Revision: D61408771

netlify[bot] commented 3 weeks ago

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
Latest commit d5db96230b0db325d72e7a6f89ef22c1055bc159
Latest deploy log https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66c3d09495815f0008d169be
Deploy Preview https://deploy-preview-3008--pytorch-fbgemm-docs.netlify.app

facebook-github-bot commented 3 weeks ago

This pull request was exported from Phabricator. Differential Revision: D61408771

facebook-github-bot commented 3 weeks ago

This pull request has been merged in pytorch/FBGEMM@162cc69b133797b213664e41c9923d96593d1fc3.