Closed Hyunseok-Kim0 closed 1 year ago
Could you please try tp.importance.RandomImportance? I'm not sure if this is caused by DepGraph or the importance module.
Error occurred from the C2f module in yolov8 when tp.importance.RandomImportance is used.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch_pruning/utils/op_counter.py", line 26, in count_ops_and_params
_ = flops_model(example_inputs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
File "/workspace/Projects/ultralytics/ultralytics/nn/tasks.py", line 203, in forward
return self._forward_once(x, profile, visualize) # single-scale inference, train
File "/workspace/Projects/ultralytics/ultralytics/nn/tasks.py", line 58, in _forward_once
x = m(x) # run
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/Projects/ultralytics/ultralytics/nn/modules.py", line 193, in forward
y.extend(m(y[-1]) for m in self.m)
File "/workspace/Projects/ultralytics/ultralytics/nn/modules.py", line 193, in <genexpr>
y.extend(m(y[-1]) for m in self.m)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/Projects/ultralytics/ultralytics/nn/modules.py", line 131, in forward
return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/Projects/ultralytics/ultralytics/nn/modules.py", line 34, in forward
return self.act(self.bn(self.conv(x)))
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [160, 320, 3, 3], expected input[1, 160, 32, 32] to have 320 channels, but got 160 channels instead
Thank you. I will try it!
I was able to execute pruner.step() using commit 0d7a99b after I modified the C2f module, with tp.importance.MagnitudeImportance. However, the most recent version did not work.
Here is the error message of most recent version (commit 69902e8)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/workspace/Projects/Torch-Pruning/torch_pruning/importance.py", line 80, in __call__
local_norm = local_norm[idxs]
IndexError: index 2880 is out of bounds for dimension 0 with size 1600
Here is the successful output (commit 0d7a99b + modified C2f module)
Pruning appears to work properly. The model's mAP decreased from 0.414 to 0.378 with ch_sparsity 0.01.
@Hyunseok-Kim0 Hello, how did you modify the C2f module to make it work? In that case, are you still able to load the pretrained weights?
@VainF I encountered the same problem with the C2f module. It seems to me that it does not prune the Conv1 layer in the C2f module. See https://user-images.githubusercontent.com/27466624/222874205-3873bdac-7135-4ecc-8ab2-ca18b8e13fdf.jpg. When I group the convs of the C2f module together and exclude these groups from pruning, it works.
Here is how I group these convs, where bottleneck_index is [2, 4, 6, 8, 12, 15, 18, 21]:
def get_model_groups(model, bottleneck_index):
    return [
        [
            f"model.{i}.m.{n}.cv2.conv"
            for n in range(len(model.module[i].m if hasattr(model, "module") else model[i].m))
        ]
        + [f"model.{i}.cv1.conv"]
        for i in bottleneck_index
    ]
Thank you! Maybe I just introduced new bugs in the latest commit. Will fix it.
BTW, @Hyunseok-Kim0 could you please share your solution with other guys? I think it would be very helpful!
Here is the modified C2f module. I found this in https://github.com/tianyic/only_train_once/issues/5.
class C2f_v2(nn.Module):
    # CSP Bottleneck with 2 convolutions
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv0 = Conv(c1, self.c, 1, 1)
        self.cv1 = Conv(c1, self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))

    def forward(self, x):
        # y = list(self.cv1(x).chunk(2, 1))
        y = [self.cv0(x), self.cv1(x)]
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))
@ducanhluu Here is the code for migrating pretrained C2f weight I used.
def infer_shortcut(bottleneck):
    c1 = bottleneck.cv1.conv.in_channels
    c2 = bottleneck.cv2.conv.out_channels
    return c1 == c2 and hasattr(bottleneck, 'add') and bottleneck.add


def transfer_weights(c2f, c2f_v2):
    c2f_v2.cv2 = c2f.cv2
    c2f_v2.m = c2f.m

    state_dict = c2f.state_dict()
    state_dict_v2 = c2f_v2.state_dict()

    # Transfer cv1 weights from C2f to cv0 and cv1 in C2f_v2
    old_weight = state_dict['cv1.conv.weight']
    half_channels = old_weight.shape[0] // 2
    state_dict_v2['cv0.conv.weight'] = old_weight[:half_channels]
    state_dict_v2['cv1.conv.weight'] = old_weight[half_channels:]

    # Transfer cv1 batchnorm weights and buffers from C2f to cv0 and cv1 in C2f_v2
    for bn_key in ['weight', 'bias', 'running_mean', 'running_var']:
        old_bn = state_dict[f'cv1.bn.{bn_key}']
        state_dict_v2[f'cv0.bn.{bn_key}'] = old_bn[:half_channels]
        state_dict_v2[f'cv1.bn.{bn_key}'] = old_bn[half_channels:]

    # Transfer remaining weights and buffers
    for key in state_dict:
        if not key.startswith('cv1.'):
            state_dict_v2[key] = state_dict[key]

    # Transfer all non-method attributes
    for attr_name in dir(c2f):
        attr_value = getattr(c2f, attr_name)
        if not callable(attr_value) and '_' not in attr_name:
            setattr(c2f_v2, attr_name, attr_value)

    c2f_v2.load_state_dict(state_dict_v2)


def replace_c2f_with_c2f_v2(module):
    for name, child_module in module.named_children():
        if isinstance(child_module, C2f):
            # Replace C2f with C2f_v2 while preserving its parameters
            shortcut = infer_shortcut(child_module.m[0])
            c2f_v2 = C2f_v2(child_module.cv1.conv.in_channels, child_module.cv2.conv.out_channels,
                            n=len(child_module.m), shortcut=shortcut,
                            g=child_module.m[0].cv2.conv.groups,
                            e=child_module.c / child_module.cv2.conv.out_channels)
            transfer_weights(child_module, c2f_v2)
            setattr(module, name, c2f_v2)
        else:
            replace_c2f_with_c2f_v2(child_module)
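The weight migration above relies on the fact that one 1x1 convolution followed by chunk(2, 1) gives the same result as two 1x1 convolutions that each hold half of the output filters. Below is a minimal pure-Python sketch of that equivalence at a single pixel, where a 1x1 convolution reduces to a matrix-vector product. The helper name conv1x1 and the toy weights are illustrative assumptions, not part of ultralytics or Torch-Pruning.

```python
def conv1x1(weight, x):
    """Apply a 1x1 convolution at one pixel: weight is [out][in], x is [in]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


# Toy weights (made up): 4 output channels, 3 input channels.
weight = [
    [1.0, 0.0, 2.0],
    [0.5, 1.0, 0.0],
    [0.0, 3.0, 1.0],
    [2.0, 2.0, 2.0],
]
x = [1.0, 2.0, 3.0]

# Original C2f style: one conv, then split the output channels in half.
full = conv1x1(weight, x)
half = len(weight) // 2
chunk_a, chunk_b = full[:half], full[half:]

# C2f_v2 style: two convs, each holding half of the output filters.
out_cv0 = conv1x1(weight[:half], x)
out_cv1 = conv1x1(weight[half:], x)

print(chunk_a == out_cv0, chunk_b == out_cv1)
```

The same slicing applies to the per-channel BatchNorm parameters, which is why transfer_weights splits them at half_channels as well.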
@Hyunseok-Kim0 thank you for your sharing. With your patch, I can now prune the entire yolov8 as well.
There was a mistake in my previous message. The problem was actually at the split layer. Explicitly splitting the convs beforehand, as the patch does, resolves the issue.
Hey Guys! I think I found the bug. When a concat module & a split module are directly connected, the index mapping system fails to compute correct idxs. I'm going to rewrite the concat & split tracing. Really thanks for this issue!
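To illustrate the kind of index bookkeeping involved, here is a small pure-Python sketch that maps channel indices through a concat feeding directly into a split. It is not Torch-Pruning's implementation; the function names and channel counts are made up for illustration.

```python
from itertools import accumulate


def offsets_from(channels):
    """Cumulative channel offsets, e.g. [16, 16, 32] -> [0, 16, 32, 64]."""
    return [0] + list(accumulate(channels))


def to_local(global_idxs, channels):
    """Map indices on a concatenated or split axis to (part, local index) pairs."""
    offsets = offsets_from(channels)
    mapped = []
    for idx in global_idxs:
        for part, (start, end) in enumerate(zip(offsets, offsets[1:])):
            if start <= idx < end:
                mapped.append((part, idx - start))
                break
        else:
            raise IndexError(f"index {idx} is outside 0..{offsets[-1] - 1}")
    return mapped


# Hypothetical layer: concat of [16, 16, 32] channels, immediately split into [32, 32].
pruned = [3, 20, 40, 63]
from_concat = to_local(pruned, [16, 16, 32])  # which concat input each channel came from
to_split = to_local(pruned, [32, 32])         # which split output each channel goes to
print(from_concat)
print(to_split)
```

The point is that the same global index belongs to different (part, local index) pairs on the two sides, so both mappings have to be applied consistently when the two ops are adjacent.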
The result is different when I use the c2f_v2 module instead of c2f, without pruning. I just ran the following code:
model = YOLO('runs/detect/train/weights/best.pt')
for name, param in model.model.named_parameters():
    param.requires_grad = True
replace_c2f_with_c2f_v2(model.model)
success = model.export(format="onnx", imgsz=(192,640), simplify=True)
The ONNX result is different from the torch result. The precision is very low.
Do I need to re-train the yolov8 model using c2f_v2 instead of c2f?
There may be some issues with the weight copy. I will check it and get back to you.
Hi @xiaofulee, I found that the official YOLOv8 relies on the following function to set up BN.eps and BN.momentum. By default, BN.eps=1e-5, which is incompatible with the official YOLO weights (BN.eps=0.001). Could you please try the latest commit?
def initialize_weights(model):
    # Initialize model weights to random values
    for m in model.modules():
        t = type(m)
        if t is nn.Conv2d:
            pass  # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
        elif t is nn.BatchNorm2d:
            m.eps = 1e-3
            m.momentum = 0.03
        elif t in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU]:
            m.inplace = True
The updated pipeline:
for name, param in model.model.named_parameters():
    param.requires_grad = True
replace_c2f_with_c2f_v2(model.model)
initialize_weights(model.model) # set BN.eps, momentum, ReLU.inplace
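To see why the eps mismatch matters, recall that inference-mode batch normalization computes y = (x - mean) / sqrt(var + eps) * weight + bias. The rough pure-Python sketch below uses made-up statistics (none of the numbers come from a real YOLOv8 checkpoint) and suggests that the gap between the two eps values grows as a channel's running variance gets smaller.

```python
import math


def batchnorm_eval(x, mean, var, weight, bias, eps):
    """Inference-mode batch norm for a single value."""
    return (x - mean) / math.sqrt(var + eps) * weight + bias


# Illustrative values only.
x, mean, weight, bias = 1.0, 0.0, 1.0, 0.0
for var in (1.0, 1e-2, 1e-4):
    y_trained = batchnorm_eval(x, mean, var, weight, bias, eps=1e-3)  # eps the YOLO weights expect
    y_default = batchnorm_eval(x, mean, var, weight, bias, eps=1e-5)  # nn.BatchNorm2d default
    print(f"var={var:g}: eps=1e-3 -> {y_trained:.3f}, eps=1e-5 -> {y_default:.3f}")
```

With a running variance near 1 the two outputs are almost identical, so the discrepancy is driven by channels whose variance is comparable to or smaller than eps.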
I updated the code, but it made no difference.
The mAP of the baseline is 79.1%.
When ch_sparsity is set to 0.1, the mAP is just 26.2%. The AP of class "face" is 0%.
Here is the test code:
It requires post training.
Does the unpruned model work properly after module replacing?
Yes. It works.
I will look into how to do post-training. Thank you.
I'm not a yolo expert. But this line may be helpful for post-training:
pruned_macs, pruned_nparams = tp.utils.count_ops_and_params(pruner.model, example_inputs)
print(model.model)
print("Before Pruning: MACs=%f G, #Params=%f M" % (base_macs / 1e9, base_nparams / 1e6))
print("After Pruning: MACs=%f G, #Params=%f M" % (pruned_macs / 1e9, pruned_nparams / 1e6))
# post-training
model.train(data='coco128.yaml', epochs=100, imgsz=640)
Reference: https://docs.ultralytics.com/modes/train/
Please replace the coco128 toy set with a full coco dataset and use a smaller learning rate (original_lr x 0.1) for post-training.
Thanks a lot. It's great work.
Is there any progress on fixing the bug? I still see an error message with the most recent version of the code.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-bc2d9c71db3c> in <cell line: 2>()
1 # output with current version
----> 2 prune()
4 frames
/usr/local/lib/python3.9/dist-packages/torch_pruning/dependency.py in _update_concat_index_mapping(self, cat_node)
935 offsets = [0]
936 for ch in chs:
--> 937 offsets.append(offsets[-1] + ch)
938 cat_node.module.offsets = offsets
939
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
Here is a gist that reproduces the bug with the current version: https://colab.research.google.com/gist/Hyunseok-Kim0/92905bfa852f9c151c35c2ec9167595e/yolov8-pruning.ipynb
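The TypeError in the traceback comes from adding an int to None while accumulating channel offsets. The sketch below reproduces that accumulation in plain Python and shows one possible defensive check. concat_offsets is a hypothetical helper, not the function in torch_pruning/dependency.py, and the check is only an assumption about how such a case could be surfaced earlier.

```python
def concat_offsets(chs):
    """Accumulate channel counts into offsets, rejecting unknown (None) entries."""
    missing = [i for i, ch in enumerate(chs) if ch is None]
    if missing:
        raise ValueError(f"channel count unknown for concat inputs {missing}")
    offsets = [0]
    for ch in chs:
        offsets.append(offsets[-1] + ch)
    return offsets


print(concat_offsets([64, 64, 128]))  # [0, 64, 128, 256]

try:
    concat_offsets([64, None, 128])
except ValueError as err:
    print(err)
```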
Hi @Hyunseok-Kim0, this error was fixed. No error with your example. Thank you!
Hi @VainF @Hyunseok-Kim0 @ducanhluu @xiaofulee. Based on your information, I tried to prune yolov8-n with the latest code [d3562c0]. I found that the MACs and params are reduced, and after post-training the mAP also recovers. But when I deploy the pruned model, the inference time is the same as the original model's. Are there other operations I need to do? Thanks.
@Ghustwb In my case the inference and training time are reduced together after pruning. Can you show your pruned model loading code and inference code?
@Hyunseok-Kim0, I just ran yolov8_pruning.py with a modified yaml and size.
I tested 3 ways of deploying: PyTorch model, ONNX model, and TensorRT model.
For all of them, the inference time is the same before and after pruning.
And just now, I found a strange thing: the model structure after pruning is actually the same as the model structure before pruning.
Is there any bug in the file yolov8_pruning.py?
model.train(data='coco128.yaml', epochs=100, imgsz=640)
Does this operation in yolov8 reconstruct all params based on the yaml?
@Ghustwb When you load the pruned yolov8 model, try to load the pt file directly, like model = YOLO('pruned.pt'), without a yaml file, or update model in the yaml file. You had better check how the yolov8 model calls the trainer and loads the model.
I modified the yolov8 source so that it does not load a new model when a pruned model is given. If you do not want to modify the yolov8 source code, I think you have to save the pruned model and load it again.
I am pruning YOLOv8 according to the tutorial, but I am encountering an issue which might be caused by replacing a module. How can I resolve this?
@Hyunseok-Kim0 Thanks for your information, I will try it again. Also, could you open a PR for yolov8_pruning.py?
I see the corresponding part has already been changed in the sample code. However, I think some parts of yolov8 also need to be changed together for better performance. I will try to make a PR tomorrow with those changes.
Thank you @Hyunseok-Kim0, I am waiting for your PR. BTW, can you share your performance results after yolov8 pruning, such as mAP or inference time?
@Hyunseok-Kim0 Please share the AP comparison before and after pruning. Thanks!
I encountered an error in the code during execution, stating that the tensors are not on the same device. @Hyunseok-Kim0 can you explain it to me? Thank you.
Hi, I would like to ask how to run inference with the pruned YOLOv8 on the test dataset. Has the structure changed?
I wrote the following in the YOLOv8 script main.py:
model=YOLO("./runs/detect/step_15_finetune/weights/best.pt")
model.val(data='./ultralytics/datasets/custom.yaml')
But the error is reported as:
File "/opt/anaconda3/envs/yolo/lib/python3.8/site-packages/torch/serialization.py", line 1039, in find_class
return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'C2f_v2' on <module '__main__' from 'main.py'>
Can you help me?
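The AttributeError arises because torch.load unpickles the checkpoint, and pickle stores only a class's module path and name, not its code. The class therefore has to be defined where the loader looks for it (here `__main__`, since the traceback shows the lookup happening on main.py). The stand-alone sketch below reproduces this behaviour with the standard library only. The module name pruning_script and the stub class are assumptions for illustration, not the real C2f_v2.

```python
import pickle
import sys
import types

# Stand-in for the script that defined C2f_v2 when the checkpoint was saved.
module = types.ModuleType("pruning_script")
sys.modules["pruning_script"] = module


class C2f_v2:
    """Stub that only shares its name with the real module class."""

    def __init__(self, channels):
        self.channels = channels


C2f_v2.__module__ = "pruning_script"
C2f_v2.__qualname__ = "C2f_v2"
module.C2f_v2 = C2f_v2

blob = pickle.dumps(C2f_v2(16))  # records "pruning_script" + "C2f_v2", not the class body

# Loading where the class is not defined fails with an AttributeError.
del module.C2f_v2
try:
    pickle.loads(blob)
    load_error = None
except AttributeError as err:
    load_error = err
print("without definition:", load_error)

# Once the class is available under the same module path, loading works.
module.C2f_v2 = C2f_v2
restored = pickle.loads(blob)
print("with definition:", restored.channels)
```

Under that reading, defining or importing C2f_v2 in the script that loads the checkpoint, before calling YOLO(...), should let the lookup succeed.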
You can validate inside this script itself: https://github.com/VainF/Torch-Pruning/blob/master/benchmarks/prunability/yolov8_pruning.py. Just remove all the code under def prune() and use the following:
def prune(args):
    # load trained yolov8 model
    model = YOLO(args.model)
    results = model.val(data='./ultralytics/datasets/custom.yaml')
First of all, thank you for the great work. I appreciate it. I would like to prune the yolov8n-seg model. I followed the steps mentioned above in this issue and started training. It runs without any error, but my concern is that the model size was around 6 MB without pruning, and now it is around 32 MB. I don't know why. Am I making a mistake, or does this method not work for yolov8 nano segmentation models? Thank you in advance for your answer!
Hi @apanand14. Pruning alters the model's structure, making the original definition in your .py file incompatible. To handle this, we save the entire model using torch.save(model, PATH) to a ".pth" file. This might increase the file size, but it doesn't affect the actual model size during training or inference. To check the real size, you can export the model to ONNX.
@VainF Thank you for your response. Yes, after exporting to ONNX the model size remains almost the same. One more thing: I will export to NCNN later. I hope that it will work without any modifications, or is there anything I should consider before exporting to NCNN? Thank you.
If the original model can be exported to ONNX and NCNN without any issues, the same pipeline should also be applicable to the pruned model.
Hi, I'm facing an error during pruning itself. My code and the error are below. Please let me know if I'm making a mistake. Thank you in advance.
from ultralytics import YOLO
import gc
import torch
import torch.nn as nn
from ultralytics.nn.modules import replace_c2f_with_c2f_v2, initialize_weights
import torch_pruning as tp


def run():
    model = YOLO('yolov8n-seg.pt')
    for name, param in model.model.named_parameters():
        param.requires_grad = True
    replace_c2f_with_c2f_v2(model.model)
    initialize_weights(model.model)  # set BN.eps, momentum, ReLU.inplace
    example_inputs = torch.randn(1, 3, 640, 640)
    imp = tp.importance.MagnitudeImportance(p=2)  # L2 norm pruning
    ignored_layers = []
    unwrapped_parameters = []
    iterative_steps = 1  # progressive pruning
    pruner = tp.pruner.MagnitudePruner(
        model.model,
        example_inputs,
        importance=imp,
        iterative_steps=iterative_steps,
        ch_sparsity=0.5,  # remove 50% channels
        ignored_layers=ignored_layers,
        unwrapped_parameters=unwrapped_parameters
    )
    pruner.step()
    base_macs, base_nparams = tp.utils.count_ops_and_params(model.model, example_inputs)
    pruned_macs, pruned_nparams = tp.utils.count_ops_and_params(pruner.model, example_inputs)
    print(model.model)
    print("Before Pruning: MACs=%f G, #Params=%f M" % (base_macs / 1e9, base_nparams / 1e6))
    print("After Pruning: MACs=%f G, #Params=%f M" % (pruned_macs / 1e9, pruned_nparams / 1e6))
    # post-training
    model.train(data='custom_data.yaml', epochs=100, batch=8, workers=4, optimizer='Adam', lr0=0.001)


if __name__ == '__main__':
    run()
Error:
Traceback (most recent call last):
File "C:\yolov8\train.py", line 63, in <module>
run()
File "C:\yolov8\train.py", line 42, in run
base_macs, base_nparams = tp.utils.count_ops_and_params(model.model, example_inputs)
File "C:\Users\anaconda3\envs\yolov8\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "C:\Users\anaconda3\envs\yolov8\lib\site-packages\torch_pruning\utils\op_counter.py", line 28, in count_ops_and_params
_ = flops_model(example_inputs)
File "C:\Users\anaconda3\envs\yolov8\lib\site-packages\torch\nn\modules\module.py", line 1212, in _call_impl
result = forward_call(*input, **kwargs)
File "C:\Users\anaconda3\envs\yolov8\lib\site-packages\ultralytics\nn\tasks.py", line 178, in forward
return self._forward_once(x, profile, visualize) # single-scale inference, train
File "C:\Users\anaconda3\envs\yolov8\lib\site-packages\ultralytics\nn\tasks.py", line 57, in _forward_once
x = m(x) # run
File "C:\Users\anaconda3\envs\yolov8\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\anaconda3\envs\yolov8\lib\site-packages\ultralytics\nn\modules.py", line 531, in forward
x = self.detect(self, x)
File "C:\Users\anaconda3\envs\yolov8\lib\site-packages\ultralytics\nn\modules.py", line 499, in forward
box, cls = torch.cat([xi.view(shape[0], self.no, -1) for xi in x], 2).split((self.reg_max * 4, self.nc), 1)
File "C:\Users\anaconda3\envs\yolov8\lib\site-packages\ultralytics\nn\modules.py", line 499, in <listcomp>
box, cls = torch.cat([xi.view(shape[0], self.no, -1) for xi in x], 2).split((self.reg_max * 4, self.nc), 1)
RuntimeError: shape '[1, 144, -1]' is invalid for input of size 716800
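I encountered an error in the code during execution, stating that the tensors are not on the same device. @Hyunseok-Kim0 can you explain it to me? Thank you.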
That's the problem. How did you solve it?
Hey Guys! I think I found the bug. When a concat module & a split module are directly connected, the index mapping system fails to compute correct idxs. I'm going to rewrite the concat & split tracing. Really thanks for this issue!
This bug still exists. https://github.com/VainF/Torch-Pruning/issues/219
Hello, have you solved it yet?
import argparse
import math
import os
from copy import deepcopy
from datetime import datetime
from pathlib import Path
from typing import List, Union

import numpy as np
import torch
import torch.nn as nn
from matplotlib import pyplot as plt
from ultralytics import YOLO, __version__
from ultralytics.nn.modules import Detect, C2f, Conv, Bottleneck
from ultralytics.nn.tasks import attempt_load_one_weight
from ultralytics.engine.model import Model
from ultralytics.engine.trainer import BaseTrainer
from ultralytics.utils import yaml_load, LOGGER, RANK, DEFAULT_CFG_DICT, DEFAULT_CFG_KEYS
from ultralytics.utils.checks import check_yaml
from ultralytics.utils.torch_utils import initialize_weights, de_parallel

import torch_pruning as tp


def save_pruning_performance_graph(x, y1, y2, y3):
    """
    Parameters
    ----------
    x : List
        Parameter numbers of all pruning steps
    y1 : List
        mAPs after fine-tuning of all pruning steps
    y2 : List
        MACs of all pruning steps
    y3 : List
        mAPs after pruning (not fine-tuned) of all pruning steps

    Returns
    -------

    """
    try:
        plt.style.use("ggplot")
    except Exception:
        pass

    x, y1, y2, y3 = np.array(x), np.array(y1), np.array(y2), np.array(y3)
    y2_ratio = y2 / y2[0]

    # create the figure and the axis object
    fig, ax = plt.subplots(figsize=(8, 6))

    # plot the pruned mAP and recovered mAP
    ax.set_xlabel('Pruning Ratio')
    ax.set_ylabel('mAP')
    ax.plot(x, y1, label='recovered mAP')
    ax.scatter(x, y1)
    ax.plot(x, y3, color='tab:gray', label='pruned mAP')
    ax.scatter(x, y3, color='tab:gray')

    # create a second axis that shares the same x-axis
    ax2 = ax.twinx()

    # plot the second set of data
    ax2.set_ylabel('MACs')
    ax2.plot(x, y2_ratio, color='tab:orange', label='MACs')
    ax2.scatter(x, y2_ratio, color='tab:orange')

    # add a legend
    lines, labels = ax.get_legend_handles_labels()
    lines2, labels2 = ax2.get_legend_handles_labels()
    ax2.legend(lines + lines2, labels + labels2, loc='best')

    ax.set_xlim(105, -5)
    ax.set_ylim(0, max(y1) + 0.05)
    ax2.set_ylim(0.05, 1.05)

    # calculate the highest and lowest points for each set of data
    max_y1_idx = np.argmax(y1)
    min_y1_idx = np.argmin(y1)
    max_y2_idx = np.argmax(y2)
    min_y2_idx = np.argmin(y2)
    max_y1 = y1[max_y1_idx]
    min_y1 = y1[min_y1_idx]
    max_y2 = y2_ratio[max_y2_idx]
    min_y2 = y2_ratio[min_y2_idx]

    # add text for the highest and lowest values near the points
    ax.text(x[max_y1_idx], max_y1 - 0.05, f'max mAP = {max_y1:.2f}', fontsize=10)
    ax.text(x[min_y1_idx], min_y1 + 0.02, f'min mAP = {min_y1:.2f}', fontsize=10)
    ax2.text(x[max_y2_idx], max_y2 - 0.05, f'max MACs = {max_y2 * y2[0] / 1e9:.2f}G', fontsize=10)
    ax2.text(x[min_y2_idx], min_y2 + 0.02, f'min MACs = {min_y2 * y2[0] / 1e9:.2f}G', fontsize=10)

    plt.title('Comparison of mAP and MACs with Pruning Ratio')
    plt.savefig('pruning_perf_change.png')


def infer_shortcut(bottleneck):
    c1 = bottleneck.cv1.conv.in_channels
    c2 = bottleneck.cv2.conv.out_channels
    return c1 == c2 and hasattr(bottleneck, 'add') and bottleneck.add


class C2f_v2(nn.Module):
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv0 = Conv(c1, self.c, 1, 1)
        self.cv1 = Conv(c1, self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))

    def forward(self, x):
        # y = list(self.cv1(x).chunk(2, 1))
        y = [self.cv0(x), self.cv1(x)]
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))


def transfer_weights(c2f, c2f_v2):
    c2f_v2.cv2 = c2f.cv2
    c2f_v2.m = c2f.m

    state_dict = c2f.state_dict()
    state_dict_v2 = c2f_v2.state_dict()

    # Transfer cv1 weights from C2f to cv0 and cv1 in C2f_v2
    old_weight = state_dict['cv1.conv.weight']
    half_channels = old_weight.shape[0] // 2
    state_dict_v2['cv0.conv.weight'] = old_weight[:half_channels]
    state_dict_v2['cv1.conv.weight'] = old_weight[half_channels:]

    # Transfer cv1 batchnorm weights and buffers from C2f to cv0 and cv1 in C2f_v2
    for bn_key in ['weight', 'bias', 'running_mean', 'running_var']:
        old_bn = state_dict[f'cv1.bn.{bn_key}']
        state_dict_v2[f'cv0.bn.{bn_key}'] = old_bn[:half_channels]
        state_dict_v2[f'cv1.bn.{bn_key}'] = old_bn[half_channels:]

    # Transfer remaining weights and buffers
    for key in state_dict:
        if not key.startswith('cv1.'):
            state_dict_v2[key] = state_dict[key]

    # Transfer all non-method attributes
    for attr_name in dir(c2f):
        attr_value = getattr(c2f, attr_name)
        if not callable(attr_value) and '_' not in attr_name:
            setattr(c2f_v2, attr_name, attr_value)

    c2f_v2.load_state_dict(state_dict_v2)


def replace_c2f_with_c2f_v2(module):
    for name, child_module in module.named_children():
        if isinstance(child_module, C2f):
            shortcut = infer_shortcut(child_module.m[0])
            c2f_v2 = C2f_v2(child_module.cv1.conv.in_channels, child_module.cv2.conv.out_channels,
                            n=len(child_module.m), shortcut=shortcut,
                            g=child_module.m[0].cv2.conv.groups,
                            e=child_module.c / child_module.cv2.conv.out_channels)
            transfer_weights(child_module, c2f_v2)
            setattr(module, name, c2f_v2)
        else:
            replace_c2f_with_c2f_v2(child_module)


def save_model_v2(self: BaseTrainer):
    """
    Disabled half precision saving. originated from ultralytics/yolo/engine/trainer.py
    """
    ckpt = {
        'epoch': self.epoch,
        'best_fitness': self.best_fitness,
        'model': deepcopy(de_parallel(self.model)),
        'ema': deepcopy(self.ema.ema),
        'updates': self.ema.updates,
        'optimizer': self.optimizer.state_dict(),
        'train_args': vars(self.args),  # save as dict
        'date': datetime.now().isoformat(),
        'version': __version__}

    # Save last, best and delete
    torch.save(ckpt, self.last)
    if self.best_fitness == self.fitness:
        torch.save(ckpt, self.best)
    if (self.epoch > 0) and (self.save_period > 0) and (self.epoch % self.save_period == 0):
        torch.save(ckpt, self.wdir / f'epoch{self.epoch}.pt')
    del ckpt


def final_eval_v2(self: BaseTrainer):
    """
    originated from ultralytics/yolo/engine/trainer.py
    """
    for f in self.last, self.best:
        if f.exists():
            strip_optimizer_v2(f)  # strip optimizers
            if f is self.best:
                LOGGER.info(f'\nValidating {f}...')
                self.metrics = self.validator(model=f)
                self.metrics.pop('fitness', None)
                self.run_callbacks('on_fit_epoch_end')


def strip_optimizer_v2(f: Union[str, Path] = 'best.pt', s: str = '') -> None:
    """
    Disabled half precision saving. originated from ultralytics/yolo/utils/torch_utils.py
    """
    x = torch.load(f, map_location=torch.device('cpu'))
    args = {**DEFAULT_CFG_DICT, **x['train_args']}  # combine model args with default args, preferring model args
    if x.get('ema'):
        x['model'] = x['ema']  # replace model with ema
    for k in 'optimizer', 'ema', 'updates':  # keys
        x[k] = None
    for p in x['model'].parameters():
        p.requires_grad = False
    x['train_args'] = {k: v for k, v in args.items() if k in DEFAULT_CFG_KEYS}  # strip non-default keys
    torch.save(x, s or f)
    mb = os.path.getsize(s or f) / 1E6  # filesize
    LOGGER.info(f"Optimizer stripped from {f},{f' saved as {s},' if s else ''} {mb:.1f}MB")


def train_v2(self: YOLO, pruning=False, **kwargs):
    """
    Disabled loading new model when pruning flag is set. originated from ultralytics/yolo/engine/model.py
    """
    self._check_is_pytorch_model()
    if self.session:  # Ultralytics HUB session
        if any(kwargs):
            LOGGER.warning('WARNING ⚠️ using HUB training arguments, ignoring local training arguments.')
        kwargs = self.session.train_args
    overrides = self.overrides.copy()
    overrides.update(kwargs)
    if kwargs.get('cfg'):
        LOGGER.info(f"cfg file passed. Overriding default params with {kwargs['cfg']}.")
        overrides = yaml_load(check_yaml(kwargs['cfg']))
    overrides['mode'] = 'train'
    if not overrides.get('data'):
        raise AttributeError("Dataset required but missing, i.e. pass 'data=coco128.yaml'")
    if overrides.get('resume'):
        overrides['resume'] = self.ckpt_path

    self.task = overrides.get('task') or self.task
    self.trainer = Model.task_map[self.task][1](overrides=overrides, _callbacks=self.callbacks)

    if not pruning:
        if not overrides.get('resume'):  # manually set model only if not resuming
            self.trainer.model = self.trainer.get_model(weights=self.model if self.ckpt else None, cfg=self.model.yaml)
            self.model = self.trainer.model
    else:
        # pruning mode
        self.trainer.pruning = True
        self.trainer.model = self.model

        # replace some functions to disable half precision saving
        self.trainer.save_model = save_model_v2.__get__(self.trainer)
        self.trainer.final_eval = final_eval_v2.__get__(self.trainer)

    self.trainer.hub_session = self.session  # attach optional HUB session
    self.trainer.train()
    # Update model and cfg after training
    if RANK in (-1, 0):
        self.model, _ = attempt_load_one_weight(str(self.trainer.best))
        self.overrides = self.model.args
        self.metrics = getattr(self.trainer.validator, 'metrics', None)
def prune(args):
    model = YOLO(args.model)
    model.__setattr__("train_v2", train_v2.__get__(model))
    pruning_cfg = yaml_load(check_yaml(args.cfg))
    batch_size = pruning_cfg['batch']

    # use coco128 dataset for 10 epochs fine-tuning each pruning iteration step
    # this part is only for sample code, number of epochs should be included in config file
    # pruning_cfg['data'] = "coco128.yaml"
    # pruning_cfg['epochs'] = 10

    model.model.train()
    replace_c2f_with_c2f_v2(model.model)
    initialize_weights(model.model)  # set BN.eps, momentum, ReLU.inplace

    for name, param in model.model.named_parameters():
        param.requires_grad = True

    example_inputs = torch.randn(1, 3, pruning_cfg["imgsz"], pruning_cfg["imgsz"]).to(model.device)
    macs_list, nparams_list, map_list, pruned_map_list = [], [], [], []
    base_macs, base_nparams = tp.utils.count_ops_and_params(model.model, example_inputs)

    # do validation before pruning model
    pruning_cfg['name'] = f"baseline_val"
    pruning_cfg['batch'] = 1
    validation_model = deepcopy(model)
    metric = validation_model.val(**pruning_cfg)
    init_map = metric.box.map
    macs_list.append(base_macs)
    nparams_list.append(100)
    map_list.append(init_map)
    pruned_map_list.append(init_map)
    print(f"Before Pruning: MACs={base_macs / 1e9: .5f} G, #Params={base_nparams / 1e6: .5f} M, mAP={init_map: .5f}")

    # prune same ratio of filter based on initial size
    pruning_ratio = 1 - math.pow((1 - args.target_prune_rate), 1 / args.iterative_steps)

    for i in range(args.iterative_steps):
        model.model.train()
        for name, param in model.model.named_parameters():
            param.requires_grad = True

        ignored_layers = []
        unwrapped_parameters = []
        for m in model.model.modules():
            if isinstance(m, (Detect,)):
                ignored_layers.append(m)

        example_inputs = example_inputs.to(model.device)
        pruner = tp.pruner.GroupNormPruner(
            model.model,
            example_inputs,
            importance=tp.importance.GroupNormImportance(),  # L2 norm pruning,
            iterative_steps=1,
            pruning_ratio=pruning_ratio,
            ignored_layers=ignored_layers,
            unwrapped_parameters=unwrapped_parameters
        )

        # Test regularization
        # output = model.model(example_inputs)
        # (output[0].sum() + sum([o.sum() for o in output[1]])).backward()
        # pruner.regularize(model.model)

        pruner.step()

        # pre fine-tuning validation
        pruning_cfg['name'] = f"step_{i}_pre_val"
        pruning_cfg['batch'] = 1
        validation_model.model = deepcopy(model.model)
        metric = validation_model.val(**pruning_cfg)
        pruned_map = metric.box.map
        pruned_macs, pruned_nparams = tp.utils.count_ops_and_params(pruner.model, example_inputs.to(model.device))
        current_speed_up = float(macs_list[0]) / pruned_macs
        print(f"After pruning iter {i + 1}: MACs={pruned_macs / 1e9} G, #Params={pruned_nparams / 1e6} M, "
              f"mAP={pruned_map}, speed up={current_speed_up}")

        # fine-tuning
        for name, param in model.model.named_parameters():
            param.requires_grad = True
        pruning_cfg['name'] = f"step_{i}_finetune"
        pruning_cfg['batch'] = batch_size  # restore batch size
        model.train_v2(pruning=True, **pruning_cfg)

        # post fine-tuning validation
        pruning_cfg['name'] = f"step_{i}_post_val"
        pruning_cfg['batch'] = 1
        validation_model = YOLO(model.trainer.best)
        metric = validation_model.val(**pruning_cfg)
        current_map = metric.box.map
        print(f"After fine tuning mAP={current_map}")

        macs_list.append(pruned_macs)
        nparams_list.append(pruned_nparams / base_nparams * 100)
        pruned_map_list.append(pruned_map)
        map_list.append(current_map)

        # remove pruner after single iteration
        del pruner

        save_pruning_performance_graph(nparams_list, map_list, macs_list, pruned_map_list)

        if init_map - current_map > args.max_map_drop:
            print("Pruning early stop")
            break

    model.export(format='onnx')
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', default='last.pt', help='Pretrained pruning target model file')
    # raw string, so the backslashes in the Windows path are not read as escape sequences
    parser.add_argument('--cfg', default=r'F:\PycharmProjects\yolov8\ultralytics\cfg\default.yaml',
                        help='Pruning config file.'
                             ' This file should have same format with ultralytics/yolo/cfg/default.yaml')
    parser.add_argument('--iterative-steps', default=16, type=int, help='Total pruning iteration step')
    parser.add_argument('--target-prune-rate', default=0.5, type=float, help='Target pruning rate')
    parser.add_argument('--max-map-drop', default=0.2, type=float, help='Allowed maximum map drop after fine-tuning')
    args = parser.parse_args()
    prune(args)
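The per-step `pruning_ratio` in `prune` is chosen so that applying it `iterative_steps` times compounds to `target_prune_rate`. Here is a small standalone check of that arithmetic with the script's default values (0.5 over 16 steps). The helper name `per_step_ratio` is made up for illustration, and the check assumes each step removes the same fraction of whatever is left:

```python
import math


def per_step_ratio(target_prune_rate: float, iterative_steps: int) -> float:
    """Fraction to prune at each step so that the steps compound to the target."""
    return 1 - math.pow(1 - target_prune_rate, 1 / iterative_steps)


ratio = per_step_ratio(0.5, 16)

# Fraction left after applying the same ratio 16 times in a row.
remaining = (1 - ratio) ** 16

print(f"per-step ratio: {ratio:.4f}")
print(f"remaining after 16 steps: {remaining:.4f}")
```

In a real model the channel counts are integers, so the fraction actually removed at each step is rounded and the final rate only approximates the target.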
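I used the pruning script from the yolov8 example, modified for my own weights and model structure, but it fails with IndexError: index 768 is out of bounds for dimension 0 with size 384
What could be the cause of this?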
I used the official code, but the pruning does not work. Is that because of the coco128 dataset and only 10 epochs?
before prune: image 1/1 D:\ObjectDection\Yolov8\ultralytics-8.0.132\bus.jpg: 640x480 (no detections), 17.9ms
after prune: image 1/1 D:\ObjectDection\Yolov8\ultralytics-8.0.132\bus.jpg: 640x480 (no detections), 36.8ms
May I ask why the inference time roughly doubled after pruning (17.9 ms to 36.8 ms)? Thanks!
Supplementary: with a custom dataset and a pruning rate of 0.5, the inference time increased by 0.4 ms.
@Ghustwb https://github.com/VainF/Torch-Pruning/issues/147#issuecomment-1521406684
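Single-image timings like the before/after numbers above can be noisy, since the first calls often include one-off setup cost. A minimal sketch of a steadier comparison, with warm-up calls and averaging over several runs, is shown below. `mean_latency_ms` and `dummy_inference` are hypothetical names, and the dummy workload merely stands in for a model call on a fixed input:

```python
import time
from typing import Callable


def mean_latency_ms(run_once: Callable[[], object], warmup: int = 10, runs: int = 50) -> float:
    """Average wall-clock time of run_once() in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        run_once()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed / runs * 1000.0


# Stand-in workload; replace with a call that runs the model on a fixed input.
def dummy_inference():
    return sum(i * i for i in range(10_000))


print(f"{mean_latency_ms(dummy_inference):.3f} ms per call")
```

On a GPU, kernels run asynchronously, so the timed call would also need to wait for the device (for example with `torch.cuda.synchronize()`) before the clock is read.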
Hey guys! I think I found the bug. When a concat module and a split module are directly connected, the index mapping system fails to compute the correct `idxs`. I'm going to rewrite the concat & split tracing. Thanks a lot for this issue!
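To illustrate the kind of index bookkeeping involved, here is a toy sketch. It is not Torch-Pruning's actual implementation, and the function name and channel sizes are hypothetical. When several tensors are concatenated along the channel dimension, an index into the concatenated tensor has to be mapped back to a local index in the right input. Passing the global index through unchanged gives errors of the form reported in this thread, where an index is larger than the layer it is applied to:

```python
from bisect import bisect_right
from itertools import accumulate


def map_concat_index(global_idx: int, sizes: list[int]) -> tuple[int, int]:
    """Map a channel index of a concatenated tensor to (input number, local index)."""
    offsets = list(accumulate(sizes))  # running end positions, e.g. [384, 768, 1152]
    if not 0 <= global_idx < offsets[-1]:
        raise IndexError(f"index {global_idx} out of range for {offsets[-1]} channels")
    branch = bisect_right(offsets, global_idx)  # which input the index falls into
    start = offsets[branch - 1] if branch > 0 else 0
    return branch, global_idx - start


sizes = [384, 384, 384]  # hypothetical channel counts of three concatenated inputs
print(map_concat_index(768, sizes))  # (2, 0)
print(map_concat_index(383, sizes))  # (0, 383)
```

A split directly after the concat needs the inverse bookkeeping, which is where the two mappings have to agree on offsets.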
I tried to prune yolov8 with the original C2f module, but it still raises an IndexError.
@apanand14 Hi, I'm running into the same error. How did you solve it?
The pruning script is outdated. It only works up to the version mentioned in the README. It would be great if someone could share an updated pruning script.
Hello, when I ran the code https://github.com/VainF/Torch-Pruning/blob/master/examples/yolov8/yolov8_pruning.py
I got the following error:
Traceback (most recent call last):
File "d:/ultralytics-main/prune_v8.py", line 17, in <module>
from ultralytics.engine.model import TASK_MAP
ImportError: cannot import name 'TASK_MAP' from 'ultralytics.engine.model' (d:\ultralytics-main\ultralytics\engine\model.py)
It seems to be caused by the version of the yolov8 code.
But I have trained a model with the newest version of yolov8, so I wonder how to solve it.
The problem has been solved, and the same problem is reported in other issues.
First, replace line 250 in yolov8_pruning.py with this line:
self.trainer = self.task_map[self.task]['trainer'](overrides=overrides, _callbacks=self.callbacks)
Next, fix the loss function in ultralytics/ultralytics/utils/loss.py like this:
`def bbox_decode(self, anchor_points, pred_dist):
    """Decode predicted object bounding box coordinates from anchor points and distribution."""
    if self.use_dfl:
        b, a, c = pred_dist.shape  # batch, anchors, channels
        mydevice = torch.device('cuda:0')
        self.proj = self.proj.to(mydevice)
        pred_dist = pred_dist.view(b, a, 4, c // 4).softmax(3).matmul(self.proj.type(pred_dist.dtype))
        # pred_dist = pred_dist.view(b, a, c // 4, 4).transpose(2,3).softmax(3).matmul(self.proj.type(pred_dist.dtype))
        # pred_dist = (pred_dist.view(b, a, c // 4, 4).softmax(2) * self.proj.type(pred_dist.dtype).view(1, 1, -1, 1)).sum(2)
    return dist2bbox(pred_dist, anchor_points, xywh=False)`
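The first step above swaps a module-level `TASK_MAP` lookup for the per-model `task_map` attribute. A script that has to run against both layouts could pick whichever one exists. The sketch below uses stand-in classes rather than real ultralytics objects: `resolve_trainer_cls`, `FakeModel`, and `FakeTrainer` are hypothetical names, and the positional layout of the legacy table is an assumption that should be checked against the installed version.

```python
class FakeTrainer:
    """Stand-in for a trainer class; not a real ultralytics type."""


class FakeModel:
    """Stand-in for a model object exposing `task` and, optionally, `task_map`."""

    def __init__(self, task, task_map=None):
        self.task = task
        if task_map is not None:
            self.task_map = task_map


def resolve_trainer_cls(model, legacy_task_map=None):
    """Return the trainer class from whichever task map is available."""
    task_map = getattr(model, "task_map", None)
    if task_map is not None:
        # Newer layout, as used in the fix above: a dict keyed by role.
        return task_map[model.task]["trainer"]
    if legacy_task_map is not None:
        # Assumption: the older table is a list with the trainer at position 1.
        return legacy_task_map[model.task][1]
    raise RuntimeError("no task map found on the model or passed in")


new_style = FakeModel("detect", {"detect": {"trainer": FakeTrainer}})
old_style = FakeModel("detect")
legacy = {"detect": [None, FakeTrainer, None, None]}

print(resolve_trainer_cls(new_style) is FakeTrainer)          # True
print(resolve_trainer_cls(old_style, legacy) is FakeTrainer)  # True
```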
@chbw818 Could you post the entire updated code if possible? It would be helpful for many people.
Hello, I am trying to apply filter pruning to a yolov8 model. I saw there is sample code for yolov7 in https://github.com/VainF/Torch-Pruning/blob/master/benchmarks/prunability/yolov7_train_pruned.py. Since yolov8 has a very similar structure to yolov7, I thought it would be possible to prune it with minimal modification. However, the pruning failed due to a strange problem near the Concat layer. I used the code below, run from the yolov8 root, to prune the model.
The following message is the stack trace from the failed pruning.
The layer in the error message is a batchnorm layer whose `layer.weight.data` is a tensor of shape (640,). However, `idxs` has shape (1280,) and contains out-of-range index values. Other layers around the concat show similar errors, which means `idxs` has a much larger shape or larger values than the layer's weight length. I tried to figure out why this happens, but I am stuck right now. I guess there is a problem in graph construction for yolov8, in something like `_ConcatIndexMapping`. It would be nice if you could help or give some advice on solving this problem.