cc @mickypaganini, who might be interested in this.
One workaround I can imagine is removing the pruning reparametrizations before the deepcopy and then adding them back right after.
This makes sense... Reparametrizing the network necessarily results in the creation of new tensors expressed in terms of the leaf parameters. Now, I don't know exactly why deepcopy in tensor.py explicitly checks for leaves only, and whether that check can be relaxed.
Yes, a workaround would be to call prune.remove before the deepcopy (though that might not be suitable if the masks or the original parameters are still needed), or to copy first and then prune.
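For concreteness, here's a minimal sketch of both options on a toy model (the model and layer names are just for illustration, not from the original report):

```python
import copy
import torch.nn as nn
import torch.nn.utils.prune as prune

# Option 1: make the pruning permanent, then copy.
model = nn.Sequential(nn.Linear(4, 2))
prune.l1_unstructured(model[0], name="weight", amount=0.5)
# prune.remove folds weight_orig * weight_mask back into a plain leaf
# parameter called "weight", so deepcopy works again (but the mask and
# the original, unpruned values are discarded).
prune.remove(model[0], "weight")
model_copy = copy.deepcopy(model)

# Option 2: copy first, then prune, so deepcopy only ever sees leaf parameters.
model2 = nn.Sequential(nn.Linear(4, 2))
model2_copy = copy.deepcopy(model2)
prune.l1_unstructured(model2[0], name="weight", amount=0.5)
```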
@mickypaganini What do you mean by reparametrizing? deepcopy shouldn't affect the parameters that currently exist, correct? So while the prune function is parameter-dependent, I don't see why it couldn't also be carried over.
I have no problem implementing that workaround, but from a software engineering POV it seems to me like a "hack" more than anything.
Pruning is a reparametrization. Pruning affects the parameters by rewriting them in terms of other tensors. deepcopy doesn't affect the parameters that exist; however, it is designed to work only on leaf tensors, not on derived tensors. That's the first thing it checks, and if that check fails, it returns the error you see. I'm sure there's a good reason why PyTorch only allows deepcopy of leaf parameters, but it means that networks reparametrized in this way cannot be copied automatically.
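A small sketch of what the reparametrization looks like (illustrative, not from the report):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(4, 2)
prune.random_unstructured(layer, name="weight", amount=0.5)

# The original tensor is kept as the leaf parameter "weight_orig" and a
# "weight_mask" buffer is added; "weight" itself is now a derived
# (non-leaf) tensor recomputed as weight_orig * weight_mask.
print(sorted(n for n, _ in layer.named_parameters()))  # ['bias', 'weight_orig']
print(sorted(n for n, _ in layer.named_buffers()))     # ['weight_mask']
print(layer.weight.is_leaf)                            # False
```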
Thank you for that explanation. Given that this has been looked at and a solution proposed, I will close the issue.
🐛 Bug
Given a subclass of nn.Module whose layers live inside an nn.Sequential(): if we register a pruning method on a subset of those parameters, copy.deepcopy fails.
To Reproduce
Steps to reproduce the behavior:
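The original snippet isn't included here, but a minimal sketch along these lines (module structure simplified, names hypothetical) reproduces the failure:

```python
import copy
import torch.nn as nn
import torch.nn.utils.prune as prune

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

    def forward(self, x):
        return self.layers(x)

net = Net()
# Register pruning on a subset of the parameters.
prune.l1_unstructured(net.layers[0], name="weight", amount=0.3)

# On the affected versions this raises a RuntimeError, because "weight"
# is now a non-leaf tensor and tensor deepcopy only supports leaf tensors.
copy.deepcopy(net)
```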
Below is a stack trace:
Expected behavior
Does not crash.
Environment
cc @albanD @mruberry @jbschlosser