pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

Fix CK Profiler Build and Tune Small CK FP8 Shapes #3017

Closed: jwfromm closed this 3 weeks ago

jwfromm commented 3 weeks ago

Summary: A recent bump to CK broke the profiler build, but excluding the problematic targets resolves the issue.

I also snuck in two improvements to the CK shape dispatch. The more significant one doubles performance for the shape [64, 1280, 8192], which may be impactful for Llama70B.
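For readers unfamiliar with shape dispatch, here is a minimal sketch of the general idea, not the actual FBGEMM or CK code. It assumes a lookup table keyed on the problem shape with a heuristic fallback. The kernel names, the `pick_kernel` helper, the `m <= 64` threshold, and the reading of the shape triple as (M, N, K) are all hypothetical, since the PR description does not state them.

```python
from typing import Dict, Tuple

# Assumption: the triple is read as (M, N, K); the PR text does not say.
Shape = Tuple[int, int, int]

# Hypothetical table of shapes that have a hand-tuned kernel.
# The kernel names are placeholders, not real CK instances.
TUNED_KERNELS: Dict[Shape, str] = {
    (64, 1280, 8192): "fp8_kernel_tuned_small_m",
}


def pick_kernel(m: int, n: int, k: int) -> str:
    """Return the name of the kernel to launch for an (m, n, k) problem."""
    # Exact-match lookup first: tuned shapes get their dedicated kernel.
    tuned = TUNED_KERNELS.get((m, n, k))
    if tuned is not None:
        return tuned
    # Fallback heuristic (illustrative threshold only).
    if m <= 64:
        return "fp8_kernel_generic_small_m"
    return "fp8_kernel_generic_large"


if __name__ == "__main__":
    print(pick_kernel(64, 1280, 8192))
    print(pick_kernel(4096, 4096, 4096))
```

Under this scheme, tuning a shape amounts to adding or changing one table entry, which would explain how a small dispatch change can have a large effect on a single shape.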

Differential Revision: D61558684

netlify[bot] commented 3 weeks ago

Deploy Preview for pytorch-fbgemm-docs ready!

Name | Link
Latest commit | 375208f71e6dbcf84993365190edaac2120682aa
Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66c4fccab168d10008ff2258
Deploy Preview | https://deploy-preview-3017--pytorch-fbgemm-docs.netlify.app

facebook-github-bot commented 3 weeks ago

This pull request was exported from Phabricator. Differential Revision: D61558684

facebook-github-bot commented 3 weeks ago

This pull request has been merged in pytorch/FBGEMM@1c8ae9d0bae521b547e0f08c561db6541e35a52c.