pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

copy.deepcopy() breaks when pruning is set on sequential #37322

Closed Ge0rges closed 4 years ago

Ge0rges commented 4 years ago

🐛 Bug

Given a subclass of nn.Module that holds an nn.Sequential of submodules: if we register a pruning method on a subset of those submodules' parameters, copy.deepcopy fails.

To Reproduce

Steps to reproduce the behavior:

  1. Subclass nn.Module
  2. Create an attribute and set it to nn.Sequential(*module_list)
  3. Register a pruning method on the parameters of the modules within the Sequential
import copy

import torch.nn as nn
import torch.nn.utils.prune as prune

class tester(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder (encoder_layers is defined elsewhere)
        self.tester = nn.Sequential(*encoder_layers)

        # Collect the Linear submodules whose weights should be pruned
        parameters_to_prune = []
        for module in self.tester:
            if isinstance(module, nn.Linear):
                parameters_to_prune.append((module, "weight"))

        # Add pruning (pruning_threshold is defined elsewhere)
        prune.global_unstructured(
            parameters_to_prune,
            pruning_method=prune.L1Unstructured,
            amount=pruning_threshold,
        )

def main():
    model = tester()
    model_copy = copy.deepcopy(model)  # raises RuntimeError

Below is a stack trace:

  File "/home/gio/Documents/KNet/src/main_scripts/den_trainer.py", line 76, in train_tasks
    model_copy = copy.deepcopy(self.model).to(self.device) if with_den else None
  File "/usr/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.7/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.7/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.7/copy.py", line 306, in _reconstruct
    value = deepcopy(value, memo)
  File "/usr/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.7/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.7/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.7/copy.py", line 306, in _reconstruct
    value = deepcopy(value, memo)
  File "/usr/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.7/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.7/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.7/copy.py", line 161, in deepcopy
    y = copier(memo)
  File "/home/gio/Documents/KNet/venv/lib/python3.7/site-packages/torch/tensor.py", line 44, in __deepcopy__
    raise RuntimeError("Only Tensors created explicitly by the user "
RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment

Expected behavior

copy.deepcopy(model) should succeed without crashing.

Environment

cc @albanD @mruberry @jbschlosser

albanD commented 4 years ago

cc @mickypaganini might be interested in this.

Ge0rges commented 4 years ago

One workaround I can imagine is removing the pruning reparametrization before the deepcopy and then adding it back right after.

mickypaganini commented 4 years ago

This makes sense... Reparametrizing the network necessarily results in the creation of new parameters expressed in terms of the leaf params. Now, I don't know exactly why `__deepcopy__` in tensor.py explicitly checks for leaves only, and whether that check can be relaxed.

Yes, the workaround would be to call prune.remove before the deepcopy (but that might not be suitable if the masks or the original parameters are still needed), or to copy first and then prune.
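A minimal sketch of the prune.remove workaround described above. The small Sequential here is a hypothetical stand-in for the original encoder, not the issue author's actual model:

```python
import copy

import torch.nn as nn
import torch.nn.utils.prune as prune

# Stand-in model with two prunable Linear layers
model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))
prune.global_unstructured(
    [(model[0], "weight"), (model[2], "weight")],
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)

# Make the pruning permanent: prune.remove folds weight_orig * weight_mask
# back into a plain leaf Parameter named "weight" and discards the mask,
# so this only suits cases where the mask is no longer needed.
for module in model:
    if isinstance(module, nn.Linear):
        prune.remove(module, "weight")

model_copy = copy.deepcopy(model)  # succeeds: weights are leaf tensors again
```

The other option mentioned above, deepcopying the unpruned model first and pruning each copy afterwards, keeps the masks available on both models.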

Ge0rges commented 4 years ago

@mickypaganini What do you mean by reparametrizing? deepcopy shouldn't affect the parameters that currently exist, correct? So while the prune function is parameter-dependent, I don't see why it couldn't also be carried over.

I have no problem implementing that workaround, but from a software engineering point of view it seems to me like a "hack" more than anything.

mickypaganini commented 4 years ago

Pruning is a reparametrization. Pruning affects the parameters by rewriting them in terms of other tensors. Deepcopy doesn't affect the parameters that exist. However, it is designed to work only on leaf tensors, and not on derived tensors. That's the first thing it checks, and if that fails, it returns the error you see. I'm sure there's a good reason why pytorch only allows deepcopy of leaf params, but that makes it such that networks reparametrized in this way cannot be copied automatically.
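The reparametrization described above can be observed directly. A sketch, assuming any recent torch version: after pruning, the module's `weight` attribute is no longer a leaf Parameter but a tensor derived from `weight_orig` and the `weight_mask` buffer, which is exactly what Tensor.`__deepcopy__` rejects:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

lin = nn.Linear(3, 3)
print(lin.weight.is_leaf)  # True: a plain leaf Parameter

prune.l1_unstructured(lin, "weight", amount=0.5)
# Pruning moved the data into "weight_orig" (a leaf Parameter),
# registered a "weight_mask" buffer, and recomputed "weight" as
# weight_orig * weight_mask -- a derived, non-leaf tensor.
print(lin.weight.is_leaf)  # False: this is what trips the deepcopy check
print(sorted(name for name, _ in lin.named_parameters()))
# ['bias', 'weight_orig']
```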

Ge0rges commented 4 years ago

Thank you for that explanation. Given that this has been looked at and a solution proposed, I will close the issue.