pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

Add Stochastic downcasting to MX4 Quantization #2899

Closed jwfromm closed 3 months ago

jwfromm commented 3 months ago

Summary: This diff adds stochastic noise injection during mx4 quantization, enabled through the new stochastic_casting argument. This is an experimental feature that defaults to being off. Experimentally, it may worsen results and increase runtime.

I did my best to optimize the random noise generation by using Philox as efficiently as I could, but it is still fairly costly: it brings the runtime from ~3000ms to ~3500ms.
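For readers unfamiliar with the idea, here is a minimal sketch of stochastic rounding by noise injection. It is not the kernel from this diff: it uses Python's `random` module rather than Philox, rounds to a uniform grid rather than the mx4 format, and the names `stochastic_round`, `nearest_round`, and `step` are made up for illustration. Adding uniform noise in [0, 1) before flooring rounds a value up with probability equal to its fractional remainder, so the result is unbiased on average, whereas round-to-nearest always lands on the same grid point.

```python
import math
import random


def stochastic_round(x: float, step: float, rng: random.Random) -> float:
    """Round x to a multiple of step using injected uniform noise (illustrative only)."""
    scaled = x / step
    # Noise in [0, 1) pushes the value past the next grid point with
    # probability equal to its fractional part, then floor truncates.
    noise = rng.random()
    return math.floor(scaled + noise) * step


def nearest_round(x: float, step: float) -> float:
    """Deterministic round-to-nearest on the same grid, for comparison."""
    return round(x / step) * step


rng = random.Random(0)  # fixed seed so the sketch is reproducible
x, step = 0.3, 1.0
samples = [stochastic_round(x, step, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)

# Each sample is 0.0 or 1.0, but the average stays close to x,
# while nearest_round(x, step) is always 0.0.
print(sorted(set(samples)), mean, nearest_round(x, step))
```

The extra cost mentioned above comes from the noise itself: every quantized element needs fresh random bits, which is work that deterministic rounding does not do.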

Differential Revision: D59835478

facebook-github-bot commented 3 months ago

This pull request was exported from Phabricator. Differential Revision: D59835478

netlify[bot] commented 3 months ago

Deploy Preview for pytorch-fbgemm-docs ready!

Latest commit: be3d4891630d64cb23b907e46420bd54e6ea91c3
Latest deploy log: https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66a2b45cb308eb00081c2518
Deploy Preview: https://deploy-preview-2899--pytorch-fbgemm-docs.netlify.app

facebook-github-bot commented 3 months ago

This pull request has been merged in pytorch/FBGEMM@53cd5c84e18cf43464e70f40a96389c090ff5c1f.