Closed jwfromm closed 3 weeks ago
Name | Link |
---|---|
Latest commit | b83cab05c576d549266640925a4cb3414ea25ec0 |
Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66c3695df5d85f0008ebf5e8 |
Deploy Preview | https://deploy-preview-3010--pytorch-fbgemm-docs.netlify.app |
This pull request was exported from Phabricator. Differential Revision: D61447274
This pull request has been merged in pytorch/FBGEMM@a973b754bf74fb19a3156687ad53cfc4b6f2bb54.
Summary: Refactor MX4 quantization to support per-row padding as efficiently as possible. The kernel now loads 2D blocks so that each thread has enough work to do. With this approach, we should be able to pad each row at effectively no cost.
Differential Revision: D61447274
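For readers unfamiliar with why per-row padding matters, here is a minimal pure-Python sketch of the layout idea only. It is not the kernel from this PR: the helper names (`padded_row_length`, `pad_rows`, `group_shared_exponents`), the group size of 32, and the floor-of-log2 exponent rule are all assumptions made for illustration. It shows that once each row is padded to a multiple of the group size, no quantization group spans two rows.

```python
import math


def padded_row_length(num_cols, group_size=32):
    """Smallest multiple of group_size that is >= num_cols."""
    return -(-num_cols // group_size) * group_size


def pad_rows(matrix, group_size=32):
    """Zero-pad every row so its length is a multiple of group_size."""
    width = padded_row_length(len(matrix[0]), group_size)
    return [list(row) + [0.0] * (width - len(row)) for row in matrix]


def group_shared_exponents(row, group_size=32):
    """One power-of-two exponent per group, taken from the group's peak magnitude.

    Simplified rule (an assumption): floor(log2(max |x|)), or 0 for an all-zero group.
    """
    exponents = []
    for start in range(0, len(row), group_size):
        peak = max(abs(v) for v in row[start:start + group_size])
        exponents.append(math.floor(math.log2(peak)) if peak > 0 else 0)
    return exponents


# Two rows of 40 columns with very different magnitudes.
matrix = [[1.0] * 40, [8.0] * 40]
padded = pad_rows(matrix)

# Each padded row is 64 wide, so it splits into exactly two groups of 32
# and every group holds values from a single row.
per_row_exponents = [group_shared_exponents(row) for row in padded]

# Without padding, flattening the 80 values puts 8 values from row 0 and
# 24 values from row 1 into the same group, so they share one exponent.
flat = [v for row in matrix for v in row]
mixed_group = flat[32:64]
```

In this sketch the padding costs extra zeros per row; the summary's claim is that loading 2D blocks lets the real kernel absorb that cost.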