Enable in-place output tensor for FP8 Rowwise Kernels

pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

Enable in-place output tensor for FP8 Rowwise Kernels #2784

Status: Closed (jwfromm closed this 2 weeks ago)

jwfromm commented 3 weeks ago

Summary: In some cases it is beneficial to pass in an externally allocated output tensor rather than have our kernels allocate one internally. This diff adds an optional output tensor argument to the FP8 rowwise kernel; when the argument is omitted, the kernel allocates its output as before.

Differential Revision: D59023976
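
The optional-output pattern described in the summary can be sketched in plain Python. This is only an illustration under stated assumptions, not the FBGEMM kernel or its API: the function name `rowwise_scaled_matmul`, its parameter names, and the use of nested lists in place of FP8 tensors are all hypothetical. It shows the two paths a caller can take: let the function allocate the result, or pass a preallocated buffer that is filled in place.

```python
from typing import List, Optional

Matrix = List[List[float]]


def rowwise_scaled_matmul(
    xq: Matrix,
    wq: Matrix,
    x_scale: List[float],
    w_scale: List[float],
    output: Optional[Matrix] = None,
) -> Matrix:
    """Compute (xq @ wq^T) with one scale per row of xq and per row of wq.

    Hypothetical stand-in for a rowwise-scaled GEMM: xq is M x K, wq is N x K,
    x_scale has M entries, w_scale has N entries, and the result is M x N.
    """
    m, n = len(xq), len(wq)
    if output is None:
        # Default path: allocate the result internally.
        output = [[0.0] * n for _ in range(m)]
    elif len(output) != m or any(len(row) != n for row in output):
        # A caller-supplied buffer must already have the right shape.
        raise ValueError(f"output must be {m} x {n}")

    for i in range(m):
        for j in range(n):
            acc = sum(a * b for a, b in zip(xq[i], wq[j]))
            # Apply the per-row scales of both operands to the accumulator.
            output[i][j] = acc * x_scale[i] * w_scale[j]
    return output


# Usage: reuse a preallocated buffer instead of allocating a new result.
xq = [[1.0, 2.0], [3.0, 4.0]]
wq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x_scale = [0.5, 2.0]
w_scale = [1.0, 1.0, 0.5]
buf = [[0.0] * 3 for _ in range(2)]
result = rowwise_scaled_matmul(xq, wq, x_scale, w_scale, output=buf)
```

Returning the same object that was passed in lets callers use either path with identical code, and the shape check fails loudly if the buffer does not match the result size.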

netlify[bot] commented 3 weeks ago

Deploy Preview for pytorch-fbgemm-docs ready!

Name	Link
Latest commit	a644cea232fca39e2097aec38741614f63c75c7d
Latest deploy log	https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/667b9b25a46aaa000860edd6
Deploy Preview	https://deploy-preview-2784--pytorch-fbgemm-docs.netlify.app

facebook-github-bot commented 3 weeks ago

This pull request was exported from Phabricator. Differential Revision: D59023976

facebook-github-bot commented 2 weeks ago

This pull request has been merged in pytorch/FBGEMM@c07d9c5e7ab27695ecfcd378fad98bc43f9ceeea.
