microsoft / nni

An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
https://nni.readthedocs.io
MIT License
13.97k stars, 1.81k forks

assert len(args) >= len(self.undetermined) AssertionError #5185

Open Alex-Songs opened 1 year ago

Alex-Songs commented 1 year ago

lib/python3.8/site-packages/nni/compression/pytorch/speedup/jit_translate.py", line 226, in __call__
    assert len(args) >= len(self.undetermined)
AssertionError

Alex-Songs commented 1 year ago

model is:
(subsampling): Conv2dSubsampling(
  (conv): Sequential(
    (subsampling/pad0): ConstantPad2d(padding=(0, 0, 2, 0), value=0)
    (subsampling/conv0): Conv2d(1, 32, kernel_size=(3, 3), stride=(2, 1))
    (subsampling/relu0): ReLU()
    (subsampling/pad1): ConstantPad2d(padding=(0, 0, 2, 0), value=0)
    (subsampling/conv1): Conv2d(32, 32, kernel_size=(3, 3), stride=(3, 1))
    (subsampling/relu1): ReLU()
  )
  (affine): Linear(in_features=1152, out_features=1024, bias=True)
  (norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)

Alex-Songs commented 1 year ago

[2022-10-26 16:26:17] start to speedup the model
no multi-dimension masks found.
[2022-10-26 16:26:20] infer module masks...
[2022-10-26 16:26:20] Update mask for subsampling.aten::unsqueeze.197
[2022-10-26 16:26:20] Update mask for subsampling.aten::permute.198
[2022-10-26 16:26:20] Update mask for pad0.aten::constant_pad_nd.208
[2022-10-26 16:26:20] Update mask for conv0.aten::_convolution.209
Traceback (most recent call last):
    assert len(args) >= len(self.undetermined)
AssertionError

Louis-J commented 1 year ago

There does seem to be a problem. When ModelSpeedup runs inference, it should operate at the leaf-module level (e.g. 'pad0', 'conv0'), not at the op level.

I can't reproduce it with a simple model that uses Conv2d, so I need your help. Please share a minimal script that reproduces the error, along with your NNI version.
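
A report like the one requested here is easier to act on when it lists exact package versions. The following is a minimal sketch that uses only the standard library; the helper name `collect_versions` is hypothetical, and the distribution names `nni` and `torch` are assumed to match what is installed in the environment.

```python
import platform
from importlib import metadata


def collect_versions(packages=("nni", "torch")):
    """Return a dict mapping each name to its installed version, or None if absent."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output next to the reproduction script should spare maintainers a round of follow-up questions.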

sunlinlin-aragon commented 1 year ago

I have the same problem with efficientnet-b0. Requesting help.

Louis-J commented 1 year ago

> I have the same problem with efficientnet-b0. Requesting help.

Can you share runnable, reproducible code? I can't reproduce it.

Hap-Zhang commented 1 year ago

@Louis-J Hi, I have the same problem when running this demo. Could you take a look? https://github.com/microsoft/nni/blob/master/examples/model_compress/pruning/legacy/speedup/speedup_yolov3.py

Hap-Zhang commented 1 year ago

@Louis-J I ran this demo in Docker (docker pull msranni/nni); the NNI version is 2.9.

briancpark commented 1 year ago

I have the same issue with densenet121 from torchvision's pretrained models.

Louis-J commented 1 year ago

Reproduced. We are currently refactoring ModelSpeedup.

Hap-Zhang commented 1 year ago

Hi @Louis-J, could you let me know if there is any update?

Louis-J commented 1 year ago

> Hi @Louis-J, could you let me know if there is any update?

Sorry, I was infected with COVID last week. I can't reproduce it with speedup_yolov3.py on CPU.

Blakey-Gavin commented 1 year ago

I also ran into the same problem, as shown below.
nni version: 2.10
python version: 3.8.13
pytorch version: 3.8.0 + cu111
model: From the results so far, the choice of model makes no difference. With either mobilenet or densenet I always get this error, which originates in torchlibrosa/stft.py.

inputs: ['3443'] is empty [screenshot]

Debug information: [screenshots]

Description: "ola_window" is a tensor created with "register_buffer", in ISTFT.init_overlap_add_window() (torchlibrosa, stft.py): https://github.com/qiuqiangkong/torchlibrosa/blob/master/torchlibrosa/stft.py

I hope you can help me solve this problem. Thanks!

shiuang commented 1 year ago

Here is a simple case that triggers the same error. I hope it helps you debug.

import torch
import torch.nn as nn
from nni.compression.pytorch.pruning import L1NormPruner
from nni.compression.pytorch.speedup import ModelSpeedup

class ConvLayer(nn.Sequential):
    def __init__(self, in_channels, out_channels, kernel=3, stride=1, dropout=0.1, return_feature=False):
        super().__init__()
        self.add_module('conv', nn.Conv2d(in_channels, out_channels, kernel_size=kernel,
                                          stride=stride, padding=kernel//2, bias = False))
        self.add_module('norm', nn.BatchNorm2d(out_channels))
        self.add_module('relu', nn.ReLU(inplace=True))

        self.return_feature = return_feature

    def forward(self, x):
        for i, m in enumerate(self.modules()):
            if i == 0:
                continue
            x = m.forward(x)
            if self.return_feature and i == 1:
                feature = x
        if not self.return_feature:
            return x
        else:
            return x, feature

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv = ConvLayer(3, 16)

    def forward(self, x):
        return self.conv(x)

model = Model()
model.eval()

config_list = [
    {
        'sparsity_per_layer': 0.7,
        'op_types': ['Conv2d']
    }
]
pruner = L1NormPruner(model, config_list)
_, masks = pruner.compress()
pruner._unwrap_model()
ModelSpeedup(model, torch.rand(1, 3, 224, 224), masks).speedup_model()

xbbkok commented 1 year ago

I also ran into this problem:

module.feature_extractor.conv1.conv1_1 sparsity : 0.5
module.feature_extractor.conv1.conv1_2 sparsity : 0.5
module.feature_extractor.conv1.conv1_3 sparsity : 0.5
module.feature_extractor.layer1.0.conv1 sparsity : 0.5
module.feature_extractor.layer1.0.conv2 sparsity : 0.5
module.feature_extractor.layer1.0.conv3 sparsity : 0.5
module.feature_extractor.layer1.0.downsample.0 sparsity : 0.5
module.feature_extractor.layer1.1.conv1 sparsity : 0.5
module.feature_extractor.layer1.1.conv2 sparsity : 0.5
module.feature_extractor.layer1.1.conv3 sparsity : 0.5
module.feature_extractor.layer1.2.conv1 sparsity : 0.5
module.feature_extractor.layer1.2.conv2 sparsity : 0.5
module.feature_extractor.layer1.2.conv3 sparsity : 0.5
module.feature_extractor.layer2.0.conv1 sparsity : 0.5
module.feature_extractor.layer2.0.conv2 sparsity : 0.5
module.feature_extractor.layer2.0.conv3 sparsity : 0.5
module.feature_extractor.layer2.0.downsample.0 sparsity : 0.5
module.feature_extractor.layer2.1.conv1 sparsity : 0.5
module.feature_extractor.layer2.1.conv2 sparsity : 0.5
module.feature_extractor.layer2.1.conv3 sparsity : 0.5
module.feature_extractor.layer2.2.conv1 sparsity : 0.5
module.feature_extractor.layer2.2.conv2 sparsity : 0.5
module.feature_extractor.layer2.2.conv3 sparsity : 0.5
module.feature_extractor.layer2.3.conv1 sparsity : 0.5
module.feature_extractor.layer2.3.conv2 sparsity : 0.5
module.feature_extractor.layer2.3.conv3 sparsity : 0.5
module.feature_extractor.layer3.0.conv1 sparsity : 0.5
module.feature_extractor.layer3.0.conv2 sparsity : 0.5
module.feature_extractor.layer3.0.conv3 sparsity : 0.5
module.feature_extractor.layer3.0.downsample.0 sparsity : 0.5
module.feature_extractor.layer3.1.conv1 sparsity : 0.5
module.feature_extractor.layer3.1.conv2 sparsity : 0.5
module.feature_extractor.layer3.1.conv3 sparsity : 0.5
module.feature_extractor.layer3.2.conv1 sparsity : 0.5
module.feature_extractor.layer3.2.conv2 sparsity : 0.5
module.feature_extractor.layer3.2.conv3 sparsity : 0.5
module.feature_extractor.layer3.3.conv1 sparsity : 0.5
module.feature_extractor.layer3.3.conv2 sparsity : 0.5
module.feature_extractor.layer3.3.conv3 sparsity : 0.5
module.feature_extractor.layer3.4.conv1 sparsity : 0.5
module.feature_extractor.layer3.4.conv2 sparsity : 0.5
module.feature_extractor.layer3.4.conv3 sparsity : 0.5
module.feature_extractor.layer3.5.conv1 sparsity : 0.5
module.feature_extractor.layer3.5.conv2 sparsity : 0.5
module.feature_extractor.layer3.5.conv3 sparsity : 0.5
module.feature_extractor.layer3.6.conv1 sparsity : 0.5
module.feature_extractor.layer3.6.conv2 sparsity : 0.5
module.feature_extractor.layer3.6.conv3 sparsity : 0.5
module.feature_extractor.layer3.7.conv1 sparsity : 0.5
module.feature_extractor.layer3.7.conv2 sparsity : 0.5
module.feature_extractor.layer3.7.conv3 sparsity : 0.5
module.feature_extractor.layer3.8.conv1 sparsity : 0.5
module.feature_extractor.layer3.8.conv2 sparsity : 0.5
module.feature_extractor.layer3.8.conv3 sparsity : 0.5
module.feature_extractor.layer3.9.conv1 sparsity : 0.5
module.feature_extractor.layer3.9.conv2 sparsity : 0.5
module.feature_extractor.layer3.9.conv3 sparsity : 0.5
module.feature_extractor.layer3.10.conv1 sparsity : 0.5
module.feature_extractor.layer3.10.conv2 sparsity : 0.5
module.feature_extractor.layer3.10.conv3 sparsity : 0.5
module.feature_extractor.layer3.11.conv1 sparsity : 0.5
module.feature_extractor.layer3.11.conv2 sparsity : 0.5
module.feature_extractor.layer3.11.conv3 sparsity : 0.5
module.feature_extractor.layer3.12.conv1 sparsity : 0.5
module.feature_extractor.layer3.12.conv2 sparsity : 0.5
module.feature_extractor.layer3.12.conv3 sparsity : 0.5
module.feature_extractor.layer3.13.conv1 sparsity : 0.5
module.feature_extractor.layer3.13.conv2 sparsity : 0.5
module.feature_extractor.layer3.13.conv3 sparsity : 0.5
module.feature_extractor.layer3.14.conv1 sparsity : 0.5
module.feature_extractor.layer3.14.conv2 sparsity : 0.5
module.feature_extractor.layer3.14.conv3 sparsity : 0.5
module.feature_extractor.layer3.15.conv1 sparsity : 0.5
module.feature_extractor.layer3.15.conv2 sparsity : 0.5
module.feature_extractor.layer3.15.conv3 sparsity : 0.5
module.feature_extractor.layer3.16.conv1 sparsity : 0.5
module.feature_extractor.layer3.16.conv2 sparsity : 0.5
module.feature_extractor.layer3.16.conv3 sparsity : 0.5
module.feature_extractor.layer3.17.conv1 sparsity : 0.5
module.feature_extractor.layer3.17.conv2 sparsity : 0.5
module.feature_extractor.layer3.17.conv3 sparsity : 0.5
module.feature_extractor.layer3.18.conv1 sparsity : 0.5
module.feature_extractor.layer3.18.conv2 sparsity : 0.5
module.feature_extractor.layer3.18.conv3 sparsity : 0.5
module.feature_extractor.layer3.19.conv1 sparsity : 0.5
module.feature_extractor.layer3.19.conv2 sparsity : 0.5
module.feature_extractor.layer3.19.conv3 sparsity : 0.5
module.feature_extractor.layer3.20.conv1 sparsity : 0.5
module.feature_extractor.layer3.20.conv2 sparsity : 0.5
module.feature_extractor.layer3.20.conv3 sparsity : 0.5
module.feature_extractor.layer3.21.conv1 sparsity : 0.5
module.feature_extractor.layer3.21.conv2 sparsity : 0.5
module.feature_extractor.layer3.21.conv3 sparsity : 0.5
module.feature_extractor.layer3.22.conv1 sparsity : 0.5
module.feature_extractor.layer3.22.conv2 sparsity : 0.5
module.feature_extractor.layer3.22.conv3 sparsity : 0.5
module.feature_extractor.layer4.0.conv1 sparsity : 0.5
module.feature_extractor.layer4.0.conv2 sparsity : 0.5
module.feature_extractor.layer4.0.conv3 sparsity : 0.5
module.feature_extractor.layer4.0.downsample.0 sparsity : 0.5
module.feature_extractor.layer4.1.conv1 sparsity : 0.5
module.feature_extractor.layer4.1.conv2 sparsity : 0.5
module.feature_extractor.layer4.1.conv3 sparsity : 0.5
module.feature_extractor.layer4.2.conv1 sparsity : 0.5
module.feature_extractor.layer4.2.conv2 sparsity : 0.5
module.feature_extractor.layer4.2.conv3 sparsity : 0.5
module.aspp_module.encoder.global_fc sparsity : 0.5
module.aspp_module.encoder.conv1 sparsity : 0.5
module.aspp_module.aspp1.0 sparsity : 0.5
module.aspp_module.aspp1.2 sparsity : 0.5
module.aspp_module.aspp2.0 sparsity : 0.5
module.aspp_module.aspp2.2 sparsity : 0.5
module.aspp_module.aspp3.0 sparsity : 0.5
module.aspp_module.aspp3.2 sparsity : 0.5
module.aspp_module.aspp4.0 sparsity : 0.5
module.aspp_module.aspp4.2 sparsity : 0.5
module.aspp_module.concat_process.0 sparsity : 0.5
module.aspp_module.concat_process.2 sparsity : 0.5
[2023-02-20 15:49:30] start to speedup the model
no multi-dimension masks found.
[2023-02-20 15:49:33] infer module masks...
[2023-02-20 15:49:33] Update mask for .prim::TupleUnpack.343
[2023-02-20 15:49:33] Update mask for module.feature_extractor.conv1.conv1_1.aten::mul.344
args: ()
self.undetermined: [0, 1]
Traceback (most recent call last):
  File "/home/usr/code/DL/python/depth_estimation/dorn/mydemos/model_compression/nni_demo_dorn01.py", line 321, in <module>
    ModelSpeedup(model, torch.rand([1, 3, 257, 353]).to("cuda:0"), masks).speedup_model()
  File "/home/nest/anaconda3/envs/usrpy39_nni/lib/python3.9/site-packages/nni/compression/pytorch/speedup/compressor.py", line 546, in speedup_model
    self.infer_modules_masks()
  File "/home/nest/anaconda3/envs/usrpy39_nni/lib/python3.9/site-packages/nni/compression/pytorch/speedup/compressor.py", line 383, in infer_modules_masks
    self.update_direct_sparsity(curnode)
  File "/home/nest/anaconda3/envs/usrpy39_nni/lib/python3.9/site-packages/nni/compression/pytorch/speedup/compressor.py", line 237, in update_direct_sparsity
    _auto_infer = AutoMaskInference(
  File "/home/nest/anaconda3/envs/usrpy39_nni/lib/python3.9/site-packages/nni/compression/pytorch/speedup/infer_mask.py", line 80, in __init__
    self.output = self.module(*dummy_input)
  File "/home/nest/anaconda3/envs/usrpy39_nni/lib/python3.9/site-packages/nni/compression/pytorch/speedup/jit_translate.py", line 229, in __call__
    assert len(args) >= len(self.undetermined)
AssertionError
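
The debug output in this log (`args: ()` and `self.undetermined: [0, 1]`) suggests that the translated op was called with no positional inputs while two argument slots were still unfilled. The following is a simplified, hypothetical illustration of that kind of check, not NNI's actual implementation; the class `PartialCall` and its fields are invented for this sketch.

```python
class PartialCall:
    """Toy stand-in for a callable whose positional arguments are partly fixed.

    Hypothetical: this is not NNI code. `undetermined` holds the positions
    that must be supplied at call time.
    """

    def __init__(self, func, positional, undetermined):
        self.func = func
        self.positional = list(positional)      # None marks an unfilled slot
        self.undetermined = list(undetermined)  # indices of unfilled slots

    def __call__(self, *args):
        # Same shape of check as the failing assertion in the traceback.
        assert len(args) >= len(self.undetermined)
        filled = list(self.positional)
        for slot, value in zip(self.undetermined, args):
            filled[slot] = value
        return self.func(*filled)


mul = PartialCall(lambda a, b: a * b, [None, None], undetermined=[0, 1])
print(mul(3, 4))  # both slots supplied

try:
    mul()  # no inputs for two unfilled slots, as in the log above
except AssertionError:
    print("AssertionError: 0 args for 2 undetermined slots")
```

Under this reading, the assertion fires because the inputs of the `aten::mul` node were never passed in, not because of the pruning masks themselves.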

ankitknitj commented 1 year ago

Hi, I'm facing the same problem while speeding up the ConvNeXt model from torchvision. Any updates on this bug?

TheSeriousProgrammer commented 1 year ago

I was trying to write custom pruning sensitivity analysis code and got the same error; scheduled pruners such as LotteryTicketPruner work, though. Thanks for the awesome framework!

andreianicolau commented 1 year ago

Same issue for the ConvNeXt model. I tried not only v2.10 but also v2.9 and v3.0rc1, and all of them have the same issue.

Update mask for features.1.0.block.6
[2023-08-14 15:03:50] Update mask for features.1.0.aten::mul.268
Traceback (most recent call last):
  File [...]
  File "/anaconda/envs/sparse-env/lib/python3.8/site-packages/nni/compression/pytorch/speedup/jit_translate.py", line 227, in __call__
    assert len(args) >= len(self.undetermined)
AssertionError

We even tried versions prior to 2.9 and we were able to go further; however, we encountered another error indicating that LayerNorm2d is not supported (it was not supported to replace the module with type: LayerNorm2d).