FBGEMM CK FP8 Optimization for BS > 1

pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

Closed jwfromm closed 2 months ago

jwfromm commented 2 months ago

Summary: This diff improves CK FP8 kernel optimization for batch sizes greater than 1 on Llama shapes.

Reviewed By: jianyuh, mxz297

Differential Revision: D60680651

facebook-github-bot commented 2 months ago

This pull request was exported from Phabricator. Differential Revision: D60680651

netlify[bot] commented 2 months ago

Name	Link
Latest commit	3dc989385e7f30a1c16179939b79f0f593e6f083
Latest deploy log	https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66b398f2543ae30008e03fb1
Deploy Preview	https://deploy-preview-2940--pytorch-fbgemm-docs.netlify.app

facebook-github-bot commented 2 months ago

This pull request has been merged in pytorch/FBGEMM@0ebb3aeab41c5b304cac45436e893d42147ad81e.