facebookresearch / fairscale

PyTorch extensions for high performance and large scale training.

Llama4 FP8 Training Debug - fairscale #1183

Open jiecaoyu opened 5 months ago

jiecaoyu commented 5 months ago

What does this PR do?

Fixes # (issue).

Before submitting

PR review

Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.