Closed ffuuugor closed 1 year ago
@ffuuugor has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@ffuuugor has updated the pull request. You must reimport the pull request before landing.
Background
Poisson sampling can sometimes produce an empty input batch, especially when the sampling rate (and therefore the expected batch size) is small. This is not out of the ordinary and should be handled accordingly: the gradients (signal) should be set to 0, and noise should still be added.
We previously attempted to support this behaviour, but it wasn't fully covered by tests and broke over time. As a result, we currently have a DataLoader that can produce zero-sized batches, a GradSampleModule that only partially supports them, and a DPOptimizer that doesn't support them at all.
This PR addresses Issue #522 (thanks @xichens for reporting)
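To make the intended behaviour in the Background concrete, here is a minimal standard-library sketch of a single DP step on an empty batch. It is an illustration only, not the Opacus implementation: `prob_empty_batch`, `poisson_sample` and `noisy_grad_sum` are hypothetical helper names, gradients are plain Python lists, and the sketch assumes sum reduction with noise standard deviation `noise_multiplier * max_norm`.

```python
import random


def prob_empty_batch(n_examples, sample_rate):
    """Probability that Poisson sampling selects no example at all."""
    return (1.0 - sample_rate) ** n_examples


def poisson_sample(n_examples, sample_rate, rng):
    """Include each example index independently with probability sample_rate."""
    return [i for i in range(n_examples) if rng.random() < sample_rate]


def noisy_grad_sum(per_sample_grads, dim, max_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum the results, then add Gaussian noise.

    For an empty batch the clipped sum stays at zero (no signal),
    but the noise is still added, so the output is pure noise.
    """
    total = [0.0] * dim
    for grad in per_sample_grads:
        norm = sum(x * x for x in grad) ** 0.5
        scale = min(1.0, max_norm / (norm + 1e-6))
        for j in range(dim):
            total[j] += grad[j] * scale
    std = noise_multiplier * max_norm
    return [t + rng.gauss(0.0, std) for t in total]


rng = random.Random(0)
# A sampling rate of 0 forces an empty batch, which keeps the example deterministic.
batch = poisson_sample(n_examples=100, sample_rate=0.0, rng=rng)
grads = [[1.0, 2.0, 3.0] for _ in batch]  # no per-sample gradients for an empty batch
update = noisy_grad_sum(grads, dim=3, max_norm=1.0, noise_multiplier=1.0, rng=rng)
```

With 100 examples and a sampling rate of 0.01, `prob_empty_batch(100, 0.01)` is roughly 0.37, so empty batches are far from a corner case at small sampling rates.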
Improvements
This diff fixes the following: