pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

Replace Triton persistent row-wise kernels with non-persistent #2789

Closed jiawenliu64 closed 3 months ago

jiawenliu64 commented 3 months ago

Summary: The new Triton persistent row-wise kernels have numeric issues on trunk: the similarity value they report is about three orders of magnitude larger than expected, which indicates a numeric error in the kernel output.

This Diff replaces the Triton persistent row-wise kernels with the non-persistent ones to unblock Triton row-wise numeric evaluation on trunk.
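
For readers unfamiliar with the distinction, here is a minimal sketch of how the two launch styles usually differ in how output tiles are assigned to programs. It is a plain-Python illustration of the general scheduling idea, not the FBGEMM or Triton kernel code; the function names, block sizes, and `num_programs` (typically chosen near the number of SMs) are all hypothetical.

```python
from math import ceil


def non_persistent_schedule(m, n, block_m, block_n):
    """Non-persistent style: one program per output tile.

    The program id maps directly to a single (tile_row, tile_col) pair.
    """
    tiles_m, tiles_n = ceil(m / block_m), ceil(n / block_n)
    total = tiles_m * tiles_n
    return {pid: [(pid // tiles_n, pid % tiles_n)] for pid in range(total)}


def persistent_schedule(m, n, block_m, block_n, num_programs):
    """Persistent style: a fixed number of programs, each looping over tiles.

    Program `pid` handles tiles pid, pid + num_programs, pid + 2 * num_programs, ...
    """
    tiles_m, tiles_n = ceil(m / block_m), ceil(n / block_n)
    total = tiles_m * tiles_n
    return {
        pid: [(t // tiles_n, t % tiles_n) for t in range(pid, total, num_programs)]
        for pid in range(num_programs)
    }


def covered_tiles(schedule):
    """Flatten a schedule into the sorted list of tiles it computes."""
    return sorted(tile for tiles in schedule.values() for tile in tiles)


if __name__ == "__main__":
    simple = non_persistent_schedule(512, 384, 128, 128)
    looped = persistent_schedule(512, 384, 128, 128, num_programs=5)
    # Both styles must compute every output tile exactly once.
    print(covered_tiles(simple) == covered_tiles(looped))
```

Both schedules cover the same tiles; the persistent variant just adds a per-program loop and the extra index arithmetic that comes with it, which the non-persistent variant does not need.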

Before this Diff:

cutlass_rowwise sim: 22.875.
triton_rowwise sim: 32256.000.

After this Diff:

cutlass_rowwise sim: 22.875.
triton_rowwise sim: 23.500.
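
The "three orders of magnitude" claim can be checked directly from the numbers above. The sketch below only does arithmetic on the reported `sim` values; the helper name and the `max_orders` threshold are hypothetical and are not part of the FBGEMM benchmark.

```python
import math


def orders_of_magnitude_apart(candidate_sim, baseline_sim):
    """Return log10 of the ratio between two reported sim values."""
    return math.log10(candidate_sim / baseline_sim)


def looks_numerically_broken(candidate_sim, baseline_sim, max_orders=1.0):
    """Flag a kernel whose sim is more than `max_orders` decades above the baseline.

    `max_orders` is an illustrative threshold, not a value taken from the PR.
    """
    return orders_of_magnitude_apart(candidate_sim, baseline_sim) > max_orders


if __name__ == "__main__":
    cutlass = 22.875
    # Before the change: roughly three decades above the CUTLASS value.
    print(orders_of_magnitude_apart(32256.000, cutlass))
    # After the change: the two kernels report nearly the same value.
    print(orders_of_magnitude_apart(23.500, cutlass))
```

With the reported values, the pre-change Triton result is flagged and the post-change result is not.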

Reviewed By: jianyuh

Differential Revision: D59076826

netlify[bot] commented 3 months ago

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
Latest commit 40b6b7f2928dad4e395cce94a18dd98ab0f657d9
Latest deploy log https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/667c9b7abe836c00082ee27e
Deploy Preview https://deploy-preview-2789--pytorch-fbgemm-docs.netlify.app

facebook-github-bot commented 3 months ago

This pull request was exported from Phabricator. Differential Revision: D59076826

facebook-github-bot commented 3 months ago

This pull request has been merged in pytorch/FBGEMM@7adfaa834cd43bd2b593b19003c8ccf36670644e.