Modulus Pull Request
Description
Gradient aggregation was not functioning correctly because it computed the loss on the same batch at every aggregation step, rather than on a different batch for each step, as gradient aggregation requires. This PR adds a fix that enables the use of gradient aggregation for cases that do not use CUDA Graphs.
Closes #51
The annular ring case now works as expected with gradient aggregation.
Pink: Baseline, Blue: 0.1x batch size, Red: 0.1x batch size + 10 gradient aggregation steps.
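To illustrate the fix described above, here is a minimal, framework-free sketch of gradient aggregation (this is not the Modulus API; the function names and the toy loss are illustrative assumptions). The key point is that each aggregation step must fetch a fresh batch before computing its gradient contribution; the bug was equivalent to fetching one batch and reusing it for every step.

```python
def grad(w, batch):
    """Gradient of a toy squared-error loss, mean((w - x)^2), over a batch."""
    return sum(2.0 * (w - x) for x in batch) / len(batch)

def aggregated_grad(w, next_batch, num_steps):
    """Accumulate gradients over num_steps aggregation steps.

    The fix: call next_batch() inside the loop so each step sees a new
    batch. The buggy behavior corresponded to calling next_batch() once
    before the loop and reusing that single batch for every step.
    """
    total = 0.0
    for _ in range(num_steps):
        batch = next_batch()   # fresh batch per aggregation step
        total += grad(w, batch)
    return total / num_steps   # average over steps
```

With equally sized batches drawn from the same data, averaging over N aggregation steps of 0.1x batch size reproduces the gradient of a single step at the full batch size, which is why the red curve (0.1x batch size + 10 aggregation steps) tracks the pink baseline above.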
Checklist
Dependencies