VainF / Torch-Pruning

[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
https://arxiv.org/abs/2301.12900
MIT License

Error when using the Group pruning method #181

Open BearCooike opened 1 year ago

BearCooike commented 1 year ago

When creating the pruner with the following code, the error below occurs. Both the L2 and BNScaler methods run through without problems.

```python
pruner = tp.pruner.GroupNormPruner(
    self.model,
    self.input,                                     # dummy input used for dependency analysis
    importance=tp.importance.GroupNormImportance(),  # importance criterion
    iterative_steps=1,                               # iterative pruning; 1 means prune in a single step
    ch_sparsity=self.sparsity,
    ignored_layers=self.ignored_layers,
)
```

(screenshot of the error traceback, not transcribed)
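For context, GroupNormImportance scores each prunable channel by aggregating the L2 norms of its associated weights across every layer coupled in the same dependency group. A stdlib-only sketch of that idea (plain nested lists stand in for weight tensors; all names here are illustrative, not Torch-Pruning internals):

```python
import math

def channel_l2(weight, ch):
    """L2 norm of one output channel's weights; weight is [out_ch][fan_in]."""
    return math.sqrt(sum(w * w for w in weight[ch]))

def group_importance(group_weights, n_channels):
    """Aggregate per-channel L2 norms across every layer coupled in one
    dependency group, mirroring the idea behind GroupNormImportance."""
    return [sum(channel_l2(w, ch) for w in group_weights)
            for ch in range(n_channels)]

def channels_to_prune(scores, sparsity):
    """Indices of the lowest-scoring channels, pruning round(sparsity * n)."""
    n_prune = round(sparsity * len(scores))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:n_prune])

if __name__ == "__main__":
    # Two coupled layers with 4 output channels each (e.g. a conv and the
    # conv whose input channels depend on it).
    layer_a = [[0.1, 0.1], [2.0, 2.0], [0.0, 0.1], [1.0, 1.0]]
    layer_b = [[0.2, 0.0], [1.5, 1.5], [0.1, 0.0], [0.9, 1.1]]
    scores = group_importance([layer_a, layer_b], 4)
    print(channels_to_prune(scores, 0.5))  # prints [0, 2]
```

Channels 0 and 2 have the smallest aggregated norms across the group, so at 50% sparsity they are the ones selected for removal.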

VainF commented 1 year ago

Hi, could you post the printed output of the network structure?

BearCooike commented 1 year ago

It's the YOLOv8 structure:

```
YoloV8( (focus): Conv( (conv): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (bn): None (act): SiLU() ) (conv1): Conv( (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (bn): None (act): SiLU() ) (csp1): C2f( (cv1): Conv( (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(96, 64, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (m): ModuleList( (0): BottleneckV8( (cv1): Conv( (conv): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) ) ) (conv2): Conv( (conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (bn): None (act): SiLU() ) (csp2): C2f( (cv1): Conv( (conv): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (m): ModuleList( (0): BottleneckV8( (cv1): Conv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) (1): BottleneckV8( (cv1): Conv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) ) ) (conv3): Conv( (conv): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (bn): None (act): SiLU() ) (csp3): C2f( (cv1): Conv( (conv): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (m): ModuleList( (0): BottleneckV8( (cv1): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) (1): BottleneckV8( (cv1): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) ) ) (conv4): Conv( (conv): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (bn): None (act): SiLU() ) (csp4): C2f( (cv1): Conv( (conv): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(768, 512, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (m): ModuleList( (0): BottleneckV8( (cv1): Conv( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) ) ) (sppf): SPPF( (cv1): Conv( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (m): MaxPool2d(kernel_size=5, stride=1, padding=2, dilation=1, ceil_mode=False) ) (up1): Upsample(scale_factor=2.0, mode=nearest) (csp5): C2f( (cv1): Conv( (conv): Conv2d(768, 256, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(384, 256, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (m): ModuleList( (0): BottleneckV8( (cv1): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) ) ) (up2): Upsample(scale_factor=2.0, mode=nearest) (csp6): C2f( (cv1): Conv( (conv): Conv2d(384, 128, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(192, 128, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (m): ModuleList( (0): BottleneckV8( (cv1): Conv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) ) ) (conv7): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (bn): None (act): SiLU() ) (csp7): C2f( (cv1): Conv( (conv): Conv2d(384, 256, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(384, 256, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (m): ModuleList( (0): BottleneckV8( (cv1): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) ) ) (conv8): Conv( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (bn): None (act): SiLU() ) (csp8): C2f( (cv1): Conv( (conv): Conv2d(768, 512, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(768, 512, kernel_size=(1, 1), stride=(1, 1)) (bn): None (act): SiLU() ) (m): ModuleList( (0): BottleneckV8( (cv1): Conv( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (cv2): Conv( (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) ) ) ) (detect): DetectV8( (cv2): ModuleList( (0): Sequential( (0): Conv( (conv): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (1): Conv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1)) ) (1): Sequential( (0): Conv( (conv): Conv2d(256, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (1): Conv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1)) ) (2): Sequential( (0): Conv( (conv): Conv2d(512, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (1): Conv( (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1)) ) ) (cv3): ModuleList( (0): Sequential( (0): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (1): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (2): Conv2d(128, 80, kernel_size=(1, 1), stride=(1, 1)) ) (1): Sequential( (0): Conv( (conv): Conv2d(256, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (1): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (2): Conv2d(128, 80, kernel_size=(1, 1), stride=(1, 1)) ) (2): Sequential( (0): Conv( (conv): Conv2d(512, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (1): Conv( (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (bn): None (act): SiLU() ) (2): Conv2d(128, 80, kernel_size=(1, 1), stride=(1, 1)) ) ) (dfl): DFL( (conv): Conv2d(16, 1, kernel_size=(1, 1), stride=(1, 1), bias=False) ) ) )
```

Creison-Maique commented 1 year ago

@BearCooike I am having the exact same problem. Were you able to solve it?

BearCooike commented 1 year ago

> @BearCooike I am having the exact same problem. Were you able to solve it?

Sorry, I couldn't.

VainF commented 1 year ago

Hi all, this issue has been fixed. The YOLOv8 example now uses GroupNormImportance by default: https://github.com/VainF/Torch-Pruning/blob/1aaf4298bc4b9432ad387029a293303ef1c6428a/examples/yolov8/yolov8_pruning.py#L326
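For reference, detection heads are commonly kept out of pruning via `ignored_layers`, since the final 1x1 convs must preserve their class/box output dimensions. A stdlib-only sketch of collecting such layers by walking a module tree; the toy `Module` class and `collect_ignored` helper are illustrative stand-ins for `nn.Module` traversal, not Torch-Pruning or Ultralytics API:

```python
class Module:
    """Minimal stand-in for an nn.Module with named children."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def walk(self):
        """Yield this module and all descendants, like nn.Module.modules()."""
        yield self
        for c in self.children:
            yield from c.walk()

def collect_ignored(root, head_names=("detect",)):
    """Gather the detection head and everything under it, the same pattern
    used when building an ignored_layers list for a pruner."""
    ignored = []
    for m in root.walk():
        if m.name in head_names:
            ignored.extend(m.walk())
    return ignored

if __name__ == "__main__":
    model = Module("model", [
        Module("backbone", [Module("conv1"), Module("csp1")]),
        Module("detect", [Module("cv2"), Module("cv3"), Module("dfl")]),
    ])
    print([m.name for m in collect_ignored(model)])
    # prints ['detect', 'cv2', 'cv3', 'dfl']
```

In real code the equivalent step is an `isinstance` check against the head class while iterating `model.modules()`, appending matches to the `ignored_layers` argument of the pruner.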