VainF / Torch-Pruning

[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
https://arxiv.org/abs/2301.12900
MIT License
2.61k stars 322 forks

Pruning yolov8 failed #147

Closed Hyunseok-Kim0 closed 1 year ago

Hyunseok-Kim0 commented 1 year ago

Hello, I am trying to apply filter pruning to a yolov8 model. I saw there is sample code for yolov7 at https://github.com/VainF/Torch-Pruning/blob/master/benchmarks/prunability/yolov7_train_pruned.py. Since yolov8 has a structure very similar to yolov7, I thought it would be possible to prune it with minimal modification. However, pruning failed due to a strange problem near a Concat layer. I ran the code below from the yolov8 root to prune the model.

import torch

from ultralytics import YOLO
import torch_pruning as tp

from ultralytics.nn.modules import Detect

def prune():
    # load trained yolov8x model
    model = YOLO('yolov8x.pt')

    for name, param in model.model.named_parameters():
        param.requires_grad = True

    # pruning
    model.model.eval()
    example_inputs = torch.randn(1, 3, 640, 640).to(model.device)
    imp = tp.importance.MagnitudeImportance(p=2)  # L2 norm pruning

    ignored_layers = []
    unwrapped_parameters = []

    modules_list = list(model.model.modules())
    for i, m in enumerate(modules_list):
        if isinstance(m, (Detect,)):
            ignored_layers.append(m)

    iterative_steps = 1  # progressive pruning
    pruner = tp.pruner.MagnitudePruner(
        model.model,
        example_inputs,
        importance=imp,
        iterative_steps=iterative_steps,
        ch_sparsity=0.5,  # remove 50% channels
        ignored_layers=ignored_layers,
        unwrapped_parameters=unwrapped_parameters
    )
    base_macs, base_nparams = tp.utils.count_ops_and_params(model.model, example_inputs)
    pruner.step()

    pruned_macs, pruned_nparams = tp.utils.count_ops_and_params(pruner.model, example_inputs)
    print(model.model)
    print("Before Pruning: MACs=%f G, #Params=%f G" % (base_macs / 1e9, base_nparams / 1e9))
    print("After Pruning: MACs=%f G, #Params=%f G" % (pruned_macs / 1e9, pruned_nparams / 1e9))

    # fine-tuning, TBD

if __name__ == "__main__":
    prune()

The following stack trace is printed when pruning fails.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch_pruning/importance.py", line 88, in __call__
    w = layer.weight.data[idxs]
IndexError: index 640 is out of bounds for dimension 0 with size 640

The layer in the error message is a BatchNorm layer whose layer.weight.data has shape (640,). However, idxs has shape (1280,) and contains out-of-range values. Other layers around Concat show similar errors, which means idxs is longer, or holds larger values, than the layer's weight length. I tried to figure out why this happens but am stuck right now. I guess there is a problem in graph construction for yolov8, in something like _ConcatIndexMapping. It would be nice if you could help or give some advice on solving this problem.
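
To make the symptom concrete, here is a minimal pure-Python sketch of the translation that has to happen at a Concat layer: indices chosen on the concatenated output must be converted to local indices for each input branch. This is not Torch-Pruning's actual _ConcatIndexMapping; the function name split_concat_indices and the 640 + 640 channel layout are assumptions made only for illustration.

```python
from typing import List


def split_concat_indices(idxs: List[int], branch_channels: List[int]) -> List[List[int]]:
    """Map channel indices on a concat output back to per-branch local indices.

    Hypothetical helper for illustration; not part of Torch-Pruning.
    """
    # offsets[b] is the first output channel that belongs to branch b
    offsets = [0]
    for channels in branch_channels:
        offsets.append(offsets[-1] + channels)

    per_branch: List[List[int]] = [[] for _ in branch_channels]
    for idx in idxs:
        if not 0 <= idx < offsets[-1]:
            raise IndexError(f"index {idx} is outside the concat output of size {offsets[-1]}")
        for b in range(len(branch_channels)):
            if offsets[b] <= idx < offsets[b + 1]:
                # subtract the branch offset to get an index valid for that branch
                per_branch[b].append(idx - offsets[b])
                break
    return per_branch


# Assumed layout: two 640-channel branches concatenated into 1280 channels
branch_channels = [640, 640]
idxs = [3, 639, 640, 1279]
local = split_concat_indices(idxs, branch_channels)
print(local)  # [[3, 639], [0, 639]]
```

If the untranslated index 640 were applied directly to one 640-channel branch, it would produce the same kind of IndexError as in the stack trace above.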

chbw818 commented 3 months ago

@chbw818 could you post the entire updated code if possible? It would be helpful for many people.

Thanks! I have uploaded the updated code to https://github.com/chbw818/yolov8-prune-using-torch-pruning-

Prime-Rogue commented 3 months ago

Hello. When I use your pruning code with the official weights yolov8n.pt, it runs normally. However, using the best.pt I trained myself results in an error. The error message is as follows:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat in method wrapper_CUDA_addmv_)

Why does this error occur just from changing the weights?

chbw818 commented 3 months ago

I think my code will work if you correctly modify loss.py. Could you please give me more information about the error? It seems that you forgot to move a tensor or weight to CUDA.

Prime-Rogue commented 3 months ago

Thank you for your code. I was using it incorrectly, and the problem has now been resolved.

niranyingluofen commented 3 months ago

Hey Guys! I think I found the bug. When a concat module & a split module are directly connected, the index mapping system fails to compute correct idxs. I'm going to rewrite the concat & split tracing. Really thanks for this issue!

Has this issue been fixed?

yoloyash commented 2 months ago

Can someone share their before and after results of model size, performance drop, inference time?

jenyaMi commented 2 months ago

Hey @chbw818, thanks a lot for sharing your updated yolov8 pruning code. Could you please tell me which torch, ultralytics, and Python versions you used? I am trying to prune yolov8 and found your updated code for that. It seems to work for everyone else, but I am encountering this error: AttributeError: Can't get attribute '__main__' on <module 'builtins' (built-in)>

Here is the full traceback:

AttributeError                            Traceback (most recent call last)

in
     16 args = parser.parse_args()
     17
---> 18 prune(args)

in prune(args)
    432 pruning_cfg['name'] = f"step_{i}_finetune"
    433 pruning_cfg['batch'] = batch_size # restore batch size
--> 434 model.train_v2(pruning=True, **pruning_cfg)
    435
    436 # post fine-tuning validation

in train_v2(self, pruning, **kwargs)
    337
    338 self.trainer.hub_session = self.session # attach optional HUB session
--> 339 self.trainer.train()
    340 # Update model and cfg after training
    341 if RANK in (-1, 0):

/workspace/data/notebooks/belt_detection/ultralytics/ultralytics/engine/trainer.py in train(self)
    202
    203 else:
--> 204 self._do_train(world_size)
    205
    206 def _setup_scheduler(self):

/workspace/data/notebooks/belt_detection/ultralytics/ultralytics/engine/trainer.py in _do_train(self, world_size)
    467 f"{(time.time() - self.train_time_start) / 3600:.3f} hours."
    468 )
--> 469 self.final_eval()
    470 if self.args.plots:
    471 self.plot_metrics()

in final_eval_v2(self)
    272 for f in self.last, self.best:
    273 if f.exists():
--> 274 strip_optimizer_v2(f) # strip optimizers
    275 if f is self.best:
    276 LOGGER.info(f'\nValidating {f}...')

in strip_optimizer_v2(f, s)
    284 Disabled half precision saving. originated from ultralytics/yolo/utils/torch_utils.py
    285 """
--> 286 x = torch.load(f, map_location=torch.device('cpu'))
    287 args = {**DEFAULT_CFG_DICT, **x['train_args']} # combine model args with default args, preferring model args
    288 if x.get('ema'):

/opt/conda/lib/python3.8/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    605 opened_file.seek(orig_position)
    606 return torch.jit.load(opened_file)
--> 607 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
    608 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
    609

/opt/conda/lib/python3.8/site-packages/torch/serialization.py in _load(zip_file, map_location, pickle_module, pickle_file, **pickle_load_args)
    880 unpickler = UnpicklerWrapper(data_file, **pickle_load_args)
    881 unpickler.persistent_load = persistent_load
--> 882 result = unpickler.load()
    883
    884 torch._utils._validate_loaded_sparse_tensors()

/opt/conda/lib/python3.8/site-packages/torch/serialization.py in find_class(self, mod_name, name)
    873 def find_class(self, mod_name, name):
    874 mod_name = load_module_mapping.get(mod_name, mod_name)
--> 875 return super().find_class(mod_name, name)
    876
    877 # Load the data (which may in turn use `persistent_load` to load tensors)

AttributeError: Can't get attribute '__main__' on <module 'builtins' (built-in)>

Does anyone know what could be the problem?

Alejandro-Casanova commented 2 months ago

Hello, when I ran the code at https://github.com/VainF/Torch-Pruning/blob/master/examples/yolov8/yolov8_pruning.py I got the following error:

Traceback (most recent call last):
  File "d:/ultralytics-main/prune_v8.py", line 17, in <module>
    from ultralytics.engine.model import TASK_MAP
ImportError: cannot import name 'TASK_MAP' from 'ultralytics.engine.model' (d:\ultralytics-main\ultralytics\engine\model.py)

It seems to be caused by the version of the yolov8 code, but I trained my model with the newest version of yolov8. I wonder how to solve it.

The problem has been solved, and the same issue can be found among the other issues. First, replace line 250 in yolov8_pruning.py with this line:

self.trainer = self.task_map[self.task]['trainer'](overrides=overrides, _callbacks=self.callbacks)

Next, fix the loss function in ultralytics/ultralytics/utils/loss.py like this:

def bbox_decode(self, anchor_points, pred_dist):
    """Decode predicted object bounding box coordinates from anchor points and distribution."""
    if self.use_dfl:
        b, a, c = pred_dist.shape  # batch, anchors, channels
        mydevice = torch.device('cuda:0')
        self.proj = self.proj.to(mydevice)
        pred_dist = pred_dist.view(b, a, 4, c // 4).softmax(3).matmul(self.proj.type(pred_dist.dtype))
        # pred_dist = pred_dist.view(b, a, c // 4, 4).transpose(2,3).softmax(3).matmul(self.proj.type(pred_dist.dtype))
        # pred_dist = (pred_dist.view(b, a, c // 4, 4).softmax(2) * self.proj.type(pred_dist.dtype).view(1, 1, -1, 1)).sum(2)
    return dist2bbox(pred_dist, anchor_points, xywh=False)

Does anyone know why the loss function has to be modified? Maybe somewhere in the yolov8_pruning script the parameter "proj" is sent back to cpu accidentally?
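
Whatever the root cause turns out to be, the workaround above hardcodes cuda:0, which would break on CPU-only or multi-GPU setups. A device-agnostic variant might look like the sketch below, which moves proj to whatever device pred_dist is on. The standalone function decode_distribution and the value reg_max = 16 are hypothetical and chosen only for illustration; this is not the ultralytics implementation, and it does not explain where proj ends up on the CPU.

```python
import torch


def decode_distribution(pred_dist: torch.Tensor, proj: torch.Tensor) -> torch.Tensor:
    """Turn per-side distance distributions into expected distances.

    Hypothetical standalone version of the DFL decoding step.
    """
    b, a, c = pred_dist.shape  # batch, anchors, channels
    # Follow the input's device and dtype instead of hardcoding 'cuda:0'
    proj = proj.to(device=pred_dist.device, dtype=pred_dist.dtype)
    return pred_dist.view(b, a, 4, c // 4).softmax(3).matmul(proj)


reg_max = 16  # assumed number of distribution bins
proj = torch.arange(reg_max, dtype=torch.float)  # deliberately created on the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
pred_dist = torch.randn(2, 8, 4 * reg_max, device=device)

out = decode_distribution(pred_dist, proj)
print(out.shape, out.device)
```

Because the softmax weights sum to 1, each decoded value is a weighted average of the bin indices 0 to reg_max - 1, so the output has shape (batch, anchors, 4) and lives on the same device as pred_dist.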