microsoft / nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
https://nni.readthedocs.io
MIT License

AssertionError: The number of the output should be one after the Tuple unpacked manually #5568

Open gkrisp98 opened 1 year ago

gkrisp98 commented 1 year ago

Describe the issue:

Environment: VS Code

Hi, I am trying to prune a face detector. I used the following code:

config_list = [{
    # target 20% sparsity on every Conv2d layer
    'sparsity' : 0.2,
    'op_types' : ['Conv2d'],
}, {
    # exclude the detection head layers (loc/conf) from pruning
    'exclude' : True,
    'op_names' : ['loc.0', 'loc.1', 'loc.2', 'loc.3', 'loc.4', 'loc.5',
                  'conf.0', 'conf.1', 'conf.2', 'conf.3', 'conf.4', 'conf.5'
                  ]
}]

from nni.algorithms.compression.v2.pytorch.pruning import L1NormPruner
pruner = L1NormPruner(model, config_list)
_, masks = pruner.compress()

import matplotlib.pyplot as plt

for _, mask in masks.items():
    # pull out each layer's weight mask as a NumPy array
    mask = mask['weight'].detach().cpu().numpy()

But when I try to run ModelSpeedup with this code:

pruner._unwrap_model()

from nni.compression.pytorch.speedup import ModelSpeedup
#model.eval()
ModelSpeedup(model, torch.rand(1,3,28,28), masks).speedup_model()

I get the following error:

/m2/user/Projects/EXTD_Pytorch-master2/layers/functions/prior_box.py:51: TracerWarning: torch.Tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  output = torch.Tensor(mean).view(-1, 4)
/m2/user/Projects/EXTD_Pytorch-master2/layers/functions/prior_box.py:51: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  output = torch.Tensor(mean).view(-1, 4)
/m2/user/Projects/EXTD_Pytorch-master2/EXTD_64.py:186: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  self.priors = Variable(self.priorbox.forward(), volatile=True)
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
/home/user/anaconda3/envs/gpu/lib/python3.9/site-packages/torch/jit/_trace.py:992: TracerWarning: Output nr 1. of the traced function does not match the corresponding output of the Python function. Detailed error:
Tensor-likes are not close!

Mismatched elements: 160 / 60000 (0.3%)
Greatest absolute difference: 11.968140602111816 at index (0, 1, 2, 4) (up to 1e-05 allowed)
Greatest relative difference: 1.0 at index (0, 1, 0, 0) (up to 1e-05 allowed)
  _check_trace(
[2023-05-16 15:41:01] start to speedup the model
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
zero
[2023-05-16 15:41:02] infer module masks...
[2023-05-16 15:41:02] Update mask for base.0.0
[2023-05-16 15:41:02] Update mask for base.0.1
[2023-05-16 15:41:02] Update mask for base.0.2
[2023-05-16 15:41:02] Update mask for base.1.conv.0
[2023-05-16 15:41:02] Update mask for base.1.conv.1
[2023-05-16 15:41:02] Update mask for base.1.conv.2
[2023-05-16 15:41:02] Update mask for base.1.conv.3
...
[2023-05-16 15:41:03] Update mask for .aten::view.299
[2023-05-16 15:41:03] WARNING: throw some args away when calling the function "view"
[2023-05-16 15:41:03] WARNING: throw some args away when calling the function "view"
[2023-05-16 15:41:03] Update mask for .aten::max.267
Output is truncated.
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[9], line 5
      3 from nni.compression.pytorch.speedup import ModelSpeedup
      4 #model.eval()
----> 5 ModelSpeedup(model, torch.rand(1,3,28,28), masks).speedup_model()

File ~/anaconda3/envs/gpu/lib/python3.9/site-packages/nni/compression/pytorch/speedup/compressor.py:546, in ModelSpeedup.speedup_model(self)
    543 fix_mask_conflict(self.masks, self.bound_model, self.dummy_input)
    545 _logger.info("infer module masks...")
--> 546 self.infer_modules_masks()
    547 _logger.info('resolve the mask conflict')
    549 # load the original stat dict before replace the model

File ~/anaconda3/envs/gpu/lib/python3.9/site-packages/nni/compression/pytorch/speedup/compressor.py:383, in ModelSpeedup.infer_modules_masks(self)
    381 curnode = visit_queue.get()
    382 # forward mask inference for curnode
--> 383 self.update_direct_sparsity(curnode)
    384 successors = self.torch_graph.find_successors(curnode.unique_name)
    385 for successor in successors:

File ~/anaconda3/envs/gpu/lib/python3.9/site-packages/nni/compression/pytorch/speedup/compressor.py:257, in ModelSpeedup.update_direct_sparsity(self, node)
    252 _auto_infer.input_debugname = input_debugname
    253 # update the mask tensor and the internal output of the submodules
    254 # after manually unpack the tuple/list of tensors, the number of the outputs
...
    258     node.outputs) == 1, 'The number of the output should be one after the Tuple unpacked manually'
    260 out_debugname = node.outputs[0]
    261 # update the output mask into self.masks

AssertionError: The number of the output should be one after the Tuple unpacked manually

Even though I get this error, the model seems to be pruned. What is this issue? How does it affect the pruning of the model, and is there any way to overcome it? Thanks in advance.
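
For reference, this is the quick check I use to see how much of each layer actually gets masked (a minimal sketch, assuming each entry in masks holds a 'weight' tensor, as in the loop above):

# masks comes from pruner.compress() above
total_elems, zero_elems = 0, 0
for name, mask in masks.items():
    weight_mask = mask['weight']
    total_elems += weight_mask.numel()
    zero_elems += int((weight_mask == 0).sum().item())
print(f'overall weight-mask sparsity: {zero_elems / total_elems:.2%}')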

J-shang commented 1 year ago

Hello @gkrisp98, it seems the tuple unpacking goes wrong. We have released a new speedup version; you could give it a try:

pip install nni==3.0rc1

from nni.compression.pytorch.speedup.v2 import ModelSpeedup
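
For example (a minimal sketch, reusing the model, masks, and dummy input from your snippet above; not a verified fix for this particular network):

import torch
from nni.compression.pytorch.speedup.v2 import ModelSpeedup

model.eval()  # put the model in eval mode before tracing
ModelSpeedup(model, torch.rand(1, 3, 28, 28), masks).speedup_model()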
gkrisp98 commented 1 year ago

I tried to use speedup.v2 and I got this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-11-3f9ced178c5c> in <cell line: 5>()
      3 from nni.compression.pytorch.speedup.v2 import ModelSpeedup
      4 model.eval()
----> 5 ModelSpeedup(model, torch.rand(1,3,28,28), masks).speedup_model()

8 frames
/usr/local/lib/python3.10/dist-packages/nni/compression/pytorch/speedup/v2/model_speedup.py in __init__(self, model, dummy_input, masks_or_file, map_location, batch_dim, batch_size, customized_mask_updaters, customized_replacers, graph_module, garbage_collect_values, logger)
     98         self.dummy_input = _normalize_input(dummy_input)
     99         self.bound_model = model
--> 100         self.graph_module = graph_module if isinstance(graph_module, GraphModule) else concrete_trace(model, self.dummy_input)
    101 
    102         super().__init__(self.graph_module, garbage_collect_values)

/usr/local/lib/python3.10/dist-packages/nni/common/concrete_trace_utils/concrete_tracer.py in concrete_trace(root, concrete_args, use_operator_patch, operator_patch_backlist, forward_function_name, check_args, autowrap_leaf_function, autowrap_leaf_class, leaf_module, fake_middle_class, dce)
   1473     tracer = ConcreteTracer()
   1474 
-> 1475     graph = tracer.trace(root,
   1476         autowrap_leaf_function = autowrap_leaf_function,
   1477         autowrap_leaf_class = autowrap_leaf_class,

/usr/local/lib/python3.10/dist-packages/nni/common/concrete_trace_utils/concrete_tracer.py in trace(self, root, autowrap_modules, autowrap_leaf_function, autowrap_leaf_class, leaf_module, fake_middle_class, concrete_args, use_operator_patch, operator_patch_backlist, forward_function_name)
    984                 with OperatorPatcherContext(self, use_operator_patch, operator_patch_backlist):
    985                     self.create_node('output', 'output',
--> 986                                     (self.create_arg(OperatorPatcherContext.patch_run(fn, *args, *more_args, **kwargs)),),
    987                                     {}, type_expr=fn.__annotations__.get('return', None))
    988         finally:

/usr/local/lib/python3.10/dist-packages/nni/common/concrete_trace_utils/operator_patcher.py in patch_run(func, *args, **kwargs)
    287         with OperatorPatcherContext.ctx_tracer.do_temp_disable(True, True, True):
    288             new_func = OperatorPatcherContext.ctx_patcher.patch_inner(func)
--> 289         return new_func(*args, **kwargs)

/content/drive/MyDrive/EXTD_Pytorch-master2/EXTD_64.py in new_func(self, x)
    185         self.priorbox = PriorBox(size, features_maps, cfg)
    186         with torch.no_grad():
--> 187           self.priors = self.priorbox.forward()
    188 
    189         loc = torch.cat([o.view(o.size(0), -1) for o in loc], 1)

/usr/local/lib/python3.10/dist-packages/nni/common/concrete_trace_utils/operator_patcher.py in patch_run(func, *args, **kwargs)
    287         with OperatorPatcherContext.ctx_tracer.do_temp_disable(True, True, True):
    288             new_func = OperatorPatcherContext.ctx_patcher.patch_inner(func)
--> 289         return new_func(*args, **kwargs)

/content/drive/MyDrive/EXTD_Pytorch-master2/layers/functions/prior_box.py in new_func(self)
     49                 mean += [cx, cy, s_kw, s_kh]
     50 
---> 51         output = torch.Tensor(mean).view(-1, 4)
     52         if self.clip:
     53             output.clamp_(max=1, min=0)

/usr/local/lib/python3.10/dist-packages/nni/common/concrete_trace_utils/operator_patcher.py in patch_run(func, *args, **kwargs)
    287         with OperatorPatcherContext.ctx_tracer.do_temp_disable(True, True, True):
    288             new_func = OperatorPatcherContext.ctx_patcher.patch_inner(func)
--> 289         return new_func(*args, **kwargs)

/usr/local/lib/python3.10/dist-packages/nni/common/concrete_trace_utils/concrete_proxy.py in __len__(self)
    134         if insts[cur].opcode == self.op_call_ex:
    135             # in executing func(..., *proxy)
--> 136             return _orig_len(self.value)
    137         elif insts[cur].opcode == self.op_tuple_unpack_call:
    138             # in executing func(*..., *proxy)

TypeError: object of type 'float' has no len()