Add Stochastic downcasting to MX4 Quantization

pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

Add Stochastic downcasting to MX4 Quantization #2899

Closed jwfromm closed 3 months ago

jwfromm commented 3 months ago

Summary: This diff adds stochastic noise injection to MX4 quantization, enabled through the new stochastic_casting argument. This is an experimental feature and is off by default. Based on experiments so far, enabling it may worsen results and increase runtime.

I did my best to optimize the random noise generation by using efficient Philox-based generation, but it is still fairly costly: it brings the runtime from ~3000ms to ~3500ms.
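
For readers unfamiliar with the idea, here is a minimal sketch of stochastic downcasting in plain Python. It is not the FBGEMM kernel: it assumes the 4-bit elements use an E2M1-style grid of magnitudes, ignores the shared per-group exponent, and uses Python's random module in place of Philox. The names FP4_GRID, round_nearest and round_stochastic are hypothetical and exist only for this illustration.

```python
import random
from bisect import bisect_right

# Assumed grid of non-negative magnitudes for a 4-bit E2M1-style element.
# Illustration only; the real kernel also applies a shared per-group exponent.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def round_nearest(x: float) -> float:
    """Deterministic downcast: pick the closest grid value (ties go down)."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP4_GRID[-1])  # saturate at the largest magnitude
    return sign * min(FP4_GRID, key=lambda g: abs(g - mag))


def round_stochastic(x: float, rng: random.Random) -> float:
    """Stochastic downcast: round up with probability proportional to the
    distance from the lower neighbor, so the result is unbiased on average."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP4_GRID[-1])
    hi_idx = bisect_right(FP4_GRID, mag)
    if hi_idx == len(FP4_GRID):  # already at the top of the grid
        return sign * FP4_GRID[-1]
    lo, hi = FP4_GRID[hi_idx - 1], FP4_GRID[hi_idx]
    p_up = (mag - lo) / (hi - lo)
    return sign * (hi if rng.random() < p_up else lo)


if __name__ == "__main__":
    rng = random.Random(0)
    x = 1.2
    draws = [round_stochastic(x, rng) for _ in range(20000)]
    print("nearest:", round_nearest(x))
    print("stochastic mean:", sum(draws) / len(draws))
```

Each individual stochastic result is noisier than round-to-nearest, but the average over many values tracks the input instead of being pulled toward one grid point. The extra random draw per element is also where the added runtime comes from.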

Differential Revision: D59835478

facebook-github-bot commented 3 months ago

This pull request was exported from Phabricator. Differential Revision: D59835478

netlify[bot] commented 3 months ago

Deploy Preview for pytorch-fbgemm-docs ready!

Name	Link
Latest commit	be3d4891630d64cb23b907e46420bd54e6ea91c3
Latest deploy log	https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66a2b45cb308eb00081c2518
Deploy Preview	https://deploy-preview-2899--pytorch-fbgemm-docs.netlify.app

facebook-github-bot commented 3 months ago

This pull request has been merged in pytorch/FBGEMM@53cd5c84e18cf43464e70f40a96389c090ff5c1f.
