Closed vkuzo closed 8 hours ago
Note: Links to docs will display an error until the docs builds have been completed.
As of commit 1ade9c854da5242c05b6e24ce08892c8d5303f4e with merge base 2843388de0ba5ae5af8891ad000178e1e57e731e:
* [Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job](https://hud.pytorch.org/pr/pytorch/ao/1341#33493705866) ([gh](https://github.com/pytorch/ao/actions/runs/12015505771/job/33493705866)) `RuntimeError: Command docker exec -t 760beda19f43769fee08feb3dc5aba4a686564f8f9d57497d9f2985a9c720bcf /exec failed with exit code 2`
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Summary:
This PR moves the setting of the `is_amax_initialized` flag on `Float8Linear` to the `sync_float8_amax_and_scale_history` function.

There are two reasons for this:

* The `sync_float8_amax_and_scale_history` function is already called outside of the main model forward/backward and is already required to be called at every iteration. It does not need to know about AC, and it seems like a great place to stash logic which isn't easily compilable, such as this init code.

After this PR, the `enable_amax_init` and `enable_pre_and_post_forward` config options are no-ops. In a future PR we should add a deprecation warning, and eventually remove these options.

Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
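
Illustrative sketch (not the code in this PR): the toy below shows the general shape of the change described in the summary, where the forward pass only records the observed amax and a separate per-iteration sync function updates the history and scale and sets `is_amax_initialized`. `ToyFloat8Linear`, `toy_sync_amax_and_scale_history`, the `history_len` field, and the simplified max-over-history scale formula are all hypothetical stand-ins; the real `Float8Linear` and `sync_float8_amax_and_scale_history` operate on tensors and may differ in every detail.

```python
from dataclasses import dataclass, field
from typing import List

# Largest finite value representable in float8 e4m3fn.
FP8_E4M3_MAX = 448.0


@dataclass
class ToyFloat8Linear:
    """Hypothetical stand-in for a delayed-scaling float8 linear layer."""

    history_len: int = 4
    amax: float = 0.0
    amax_history: List[float] = field(default_factory=list)
    scale: float = 1.0
    is_amax_initialized: bool = False

    def forward(self, x: List[float]) -> List[float]:
        # The forward pass only records the observed amax and applies the
        # current scale; it has no initialization branch.
        self.amax = max(abs(v) for v in x)
        return [v * self.scale for v in x]


def toy_sync_amax_and_scale_history(layers: List[ToyFloat8Linear]) -> None:
    """Hypothetical per-iteration sync, called outside forward/backward."""
    for layer in layers:
        # Push the latest amax to the front of a bounded history.
        layer.amax_history = ([layer.amax] + layer.amax_history)[: layer.history_len]
        # Simplified scale: map the largest recent amax onto the fp8 range.
        layer.scale = FP8_E4M3_MAX / max(max(layer.amax_history), 1e-12)
        # The init flag is set here rather than inside forward.
        layer.is_amax_initialized = True


layer = ToyFloat8Linear()
layer.forward([0.5, -2.0, 1.0])
assert not layer.is_amax_initialized  # forward alone does not set the flag
toy_sync_amax_and_scale_history([layer])
assert layer.is_amax_initialized
```

In this toy version, the init logic lives entirely in the sync function, so the forward path has no data-dependent branch on the flag.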