Closed ffuuugor closed 1 year ago
@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@ffuuugor has updated the pull request. You must reimport the pull request before landing.
@ffuuugor has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@ffuuugor merged this pull request in pytorch/opacus@e8bc932c1e6a6a1f8a4b1b00f0f7ef134b5edf4d.
This PR is a collection of smaller fixes that will save us some deprecation issues in the future.

1. Updating to PyTorch 2.0

Key files: `grad_sample/functorch.py`, `requirements.txt`

`functorch` has been a part of core PyTorch since 1.13. Now they're going a step further and changing the API, while deprecating the old one. There's a guide on how to migrate. TL;DR: `make_functional` will no longer be part of the API, with `torch.func.functional_call()` being a (non drop-in) replacement. The key difference for us is that `make_functional()` creates a fresh copy of the module, while `functional_call()` uses the existing module. As a matter of fact, we need the fresh copy (otherwise all the hooks start firing and you enter nested madness), so I've copy-pasted a gist from the official guide on how to get a full replacement for `make_functional`.
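A minimal sketch of the replacement pattern, assuming PyTorch ≥ 2.0 (the module and tensor here are illustrative stand-ins, not the actual Opacus code):

```python
import copy

import torch
import torch.nn as nn
from torch.func import functional_call

# Hypothetical stand-in for a module that carries hooks we must not trigger.
model = nn.Linear(4, 2)

# make_functional() handed back a fresh, stateless copy of the module.
# functional_call() reuses whatever module you pass it, so to keep the
# "fresh copy" behaviour (no hooks firing on the original) we call it on
# a deep copy, with the parameters supplied explicitly.
fresh_copy = copy.deepcopy(model)
params = {name: p.detach() for name, p in model.named_parameters()}

x = torch.randn(3, 4)
out = functional_call(fresh_copy, params, (x,))  # same result as model(x)
```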
2. New mechanism for gradient accumulation detection

Key files: `privacy_engine.py`, `grad_sample_module.py`

As reported on the forum, clients are still getting the "non-full backward hook" warning even when using `grad_sample_mode="ew"`. Naturally, the `functorch` and `hooks` modes rely on backward hooks and can't be migrated to full hooks because reasons. However, `ew` doesn't rely on hooks, so it's unclear why the message should appear. The reason, however, is simple: if the client is using Poisson sampling, we add an extra check to prohibit gradient accumulation (two Poisson batches combined is not a Poisson batch), and we do that by means of backward hooks.
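As a rough illustration of that kind of hook-based check (this is a sketch, not Opacus's actual implementation; the module and message are made up):

```python
import torch
import torch.nn as nn

model = nn.Linear(2, 1)
state = {"backward_seen": False}

def forbid_grad_accumulation(module, grad_input, grad_output):
    # Fires once per backward pass. A second backward without resetting
    # the flag means gradients are being accumulated across batches,
    # which Poisson sampling forbids.
    if state["backward_seen"]:
        raise ValueError("Poisson sampling is incompatible with gradient accumulation")
    state["backward_seen"] = True

model.register_full_backward_hook(forbid_grad_accumulation)

x = torch.randn(4, 2, requires_grad=True)
model(x).sum().backward()  # first backward pass: fine
```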
~~In this case, the backward hook serves a simple purpose and there shouldn't be any problems with migrating to the new method; however, that involved changing the checking method, because `register_backward_hook` is called after hooks on submodules, but `register_full_backward_hook` is called before.~~

The strikethrough solution didn't work, because hook execution order is weird for complex graphs, e.g. for GANs. For example, if your forward call looks like this:
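A hypothetical forward of that shape, with illustrative `Generator`/`Discriminator` names (plain Python, structure only):

```python
class GAN:
    # Illustrative structure only: a top-level module whose forward first
    # runs the generator submodule, then feeds its output into the
    # discriminator submodule.
    def __init__(self, generator, discriminator):
        self.generator = generator
        self.discriminator = discriminator

    def forward(self, noise):
        fake = self.generator(noise)       # first submodule call
        return self.discriminator(fake)    # second submodule call
```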
then the top-level module hook will precede submodule hooks for `Generator`, but not for `Discriminator`.
As such, I've realised that gradient accumulation is not even supported in `ExpandedWeights`, so we don't have to worry about that. And the other two modes are both hooks-based, so we can just check the accumulation in the existing backward hook, no need for an extra hook. Deleted some code, profit.

3. Refactoring `wrap_collate_with_empty` to please pickle

Now here are two facts I didn't know before:
1) You can't pickle a nested function, e.g. you can't do the following:
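A minimal reproduction, together with a module-level callable in the spirit of the refactor (names here are illustrative, not the actual Opacus code):

```python
import pickle

def make_collate(empty_value):
    # Nested function: defined inside another function, so pickle cannot
    # locate it by qualified name and refuses to serialize it.
    def collate(batch):
        return batch if batch else empty_value
    return collate

class CollateWithEmpty:
    # Module-level class: instances pickle fine, because the class itself
    # is importable by name.
    def __init__(self, empty_value):
        self.empty_value = empty_value

    def __call__(self, batch):
        return batch if batch else self.empty_value

try:
    pickle.dumps(make_collate(0))
    nested_is_picklable = True
except (AttributeError, pickle.PicklingError):
    nested_is_picklable = False
```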
2) Whether or not `multiprocessing` uses pickle is Python- and platform-dependent.

This affects our tests when we test `DataLoader` with multiple workers; as such, some of our data loader tests broke. I'm not sure how common the issue is, but it's safer to just refactor `wrap_collate_with_empty` to avoid nested functions.

4. Fix benchmark tests
We don't really run `benchmarks/tests` on a regular basis, and some of them have been broken since we upgraded to PyTorch 1.13 (`API_CUTOFF_VERSION` doesn't exist anymore).

5. Fix flake8 config

The flake8 config no longer supports inline comments, fix is due.
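For context, flake8 (since 5.0) stopped stripping inline comments when parsing its config file, so entries like the first form below get misparsed and have to be split (option values here are just examples):

```ini
; Before: inline comment on the option line, now misparsed by flake8
[flake8]
extend-ignore = E203  # conflicts with black

; After: comment moved to its own line
[flake8]
# E203 conflicts with black
extend-ignore = E203
```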