ktangsali commented 8 months ago

Modulus Pull Request

Description

Gradient aggregation was not functioning correctly because it computed the losses on the same batch at every aggregation step, rather than on a different batch per step as gradient aggregation requires. This PR adds a fix that enables gradient aggregation for cases that do not use CUDA Graphs.
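To illustrate the difference, here is a minimal pure-Python sketch. It is not the Modulus Sym implementation: the toy model `w * x`, the analytic gradient, and the names `mse_grad`, `aggregated_grad`, and `buggy_aggregated_grad` are all hypothetical and exist only for this example. It shows that accumulating gradients over distinct micro-batches reproduces the full-batch gradient, while reusing one micro-batch for every step does not.

```python
from typing import Iterator, List, Tuple

Batch = List[Tuple[float, float]]  # list of (x, y) samples


def mse_grad(w: float, batch: Batch) -> float:
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def aggregated_grad(w: float, batches: Iterator[Batch], agg_steps: int) -> float:
    """Accumulate gradients, drawing a new batch at every aggregation step."""
    total = 0.0
    for _ in range(agg_steps):
        batch = next(batches)  # fresh batch per step
        total += mse_grad(w, batch) / agg_steps
    return total


def buggy_aggregated_grad(w: float, batches: Iterator[Batch], agg_steps: int) -> float:
    """Accumulate gradients, but reuse the first batch at every step."""
    batch = next(batches)  # drawn once, then reused (the behavior described above)
    total = 0.0
    for _ in range(agg_steps):
        total += mse_grad(w, batch) / agg_steps
    return total


# Toy data: four samples split into two micro-batches of two samples each.
data: Batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
micro_batches = [data[:2], data[2:]]
w = 0.0

full = mse_grad(w, data)                                   # full-batch gradient
good = aggregated_grad(w, iter(micro_batches), 2)          # matches the full batch
bad = buggy_aggregated_grad(w, iter(micro_batches), 2)     # only sees the first micro-batch
print(full, good, bad)
```

With equal-sized micro-batches, dividing each step's gradient by the number of aggregation steps makes the accumulated result equal to the gradient of the combined batch. In the buggy variant the result is just the gradient of the single reused micro-batch, so aggregation adds nothing.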

Closes #51

Annular ring case now works as expected with Gradient aggregation.

Plot legend: Pink: baseline, Blue: 0.1x batch size, Red: 0.1x batch size + 10 gradient aggregation steps.

Checklist

[x] I am familiar with the Contributing Guidelines.
[x] New or existing tests cover these changes.
[x] The documentation is up to date with these changes.
[x] The CHANGELOG.md is up to date with these changes.
[x] An issue is linked to this pull request.

Dependencies

ktangsali commented 8 months ago

/blossom-ci

NVIDIA / modulus-sym

Add a fix for gradient aggregation #82
