talebolano / yolov3-network-slimming

An implementation of network slimming pruning for yolov3

Pruning problem #4

Open lucheng07082221 opened 5 years ago

lucheng07082221 commented 5 years ago

Hello:

load network done! load weightsfile done!

Pre-processing...
layer index: 4 total channel: 32 remaining channel: 32
layer index: 8 total channel: 64 remaining channel: 64
layer index: 12 total channel: 32 remaining channel: 31
layer index: 16 total channel: 64 remaining channel: 64
layer index: 22 total channel: 128 remaining channel: 128
layer index: 26 total channel: 64 remaining channel: 53
layer index: 30 total channel: 128 remaining channel: 128
layer index: 36 total channel: 64 remaining channel: 63
layer index: 40 total channel: 128 remaining channel: 128
layer index: 46 total channel: 256 remaining channel: 256
layer index: 50 total channel: 128 remaining channel: 86
layer index: 54 total channel: 256 remaining channel: 256
layer index: 60 total channel: 128 remaining channel: 125
layer index: 64 total channel: 256 remaining channel: 256
layer index: 70 total channel: 128 remaining channel: 126
layer index: 74 total channel: 256 remaining channel: 256
layer index: 80 total channel: 128 remaining channel: 128
layer index: 84 total channel: 256 remaining channel: 256
layer index: 90 total channel: 128 remaining channel: 128
layer index: 94 total channel: 256 remaining channel: 256
layer index: 100 total channel: 128 remaining channel: 126
layer index: 104 total channel: 256 remaining channel: 256
layer index: 110 total channel: 128 remaining channel: 120
layer index: 114 total channel: 256 remaining channel: 256
layer index: 120 total channel: 128 remaining channel: 125
layer index: 124 total channel: 256 remaining channel: 256
layer index: 130 total channel: 512 remaining channel: 512
layer index: 134 total channel: 256 remaining channel: 256
layer index: 138 total channel: 512 remaining channel: 512
layer index: 144 total channel: 256 remaining channel: 249
layer index: 148 total channel: 512 remaining channel: 512
layer index: 154 total channel: 256 remaining channel: 244
layer index: 158 total channel: 512 remaining channel: 512
layer index: 164 total channel: 256 remaining channel: 239
layer index: 168 total channel: 512 remaining channel: 512
layer index: 174 total channel: 256 remaining channel: 249
layer index: 178 total channel: 512 remaining channel: 512
layer index: 184 total channel: 256 remaining channel: 240
layer index: 188 total channel: 512 remaining channel: 512
layer index: 194 total channel: 256 remaining channel: 256
layer index: 198 total channel: 512 remaining channel: 512
layer index: 204 total channel: 256 remaining channel: 235
layer index: 208 total channel: 512 remaining channel: 512
layer index: 214 total channel: 1024 remaining channel: 1024
layer index: 218 total channel: 512 remaining channel: 464
layer index: 222 total channel: 1024 remaining channel: 1024
layer index: 228 total channel: 512 remaining channel: 466
layer index: 232 total channel: 1024 remaining channel: 1024
layer index: 238 total channel: 512 remaining channel: 473
layer index: 242 total channel: 1024 remaining channel: 1024
layer index: 248 total channel: 512 remaining channel: 465
layer index: 252 total channel: 1024 remaining channel: 1024
layer index: 258 total channel: 512 remaining channel: 512
layer index: 262 total channel: 1024 remaining channel: 1024
layer index: 266 total channel: 512 remaining channel: 512
layer index: 270 total channel: 1024 remaining channel: 1024
layer index: 274 total channel: 512 remaining channel: 512
layer index: 278 total channel: 1024 remaining channel: 1024
layer index: 291 total channel: 256 remaining channel: 1
layer index: 299 total channel: 256 remaining channel: 0
layer index: 303 total channel: 512 remaining channel: 0
layer index: 307 total channel: 256 remaining channel: 0
layer index: 311 total channel: 512 remaining channel: 0
layer index: 315 total channel: 256 remaining channel: 0
layer index: 319 total channel: 512 remaining channel: 22
layer index: 332 total channel: 128 remaining channel: 7
layer index: 340 total channel: 128 remaining channel: 1
layer index: 344 total channel: 256 remaining channel: 1
layer index: 348 total channel: 128 remaining channel: 3
layer index: 352 total channel: 256 remaining channel: 1
layer index: 356 total channel: 128 remaining channel: 1
layer index: 360 total channel: 256 remaining channel: 7
Pre-processing Successful!

save pruned cfg file in prune_yolov3_20.cfg
Traceback (most recent call last):
  File "prune.py", line 91, in <module>
    newmodel = Darknet(prunecfg)
  File "/home2/lc/yolov3-network-slimming/yolomodel.py", line 324, in __init__
    self.net_info, self.module_list = create_modules(self.blocks)
  File "/home2/lc/yolov3-network-slimming/yolomodel.py", line 232, in create_modules
    conv = nn.Conv2d(prev_filters, filters, kernel_size, stride, pad, bias=bias)
  File "/home/user1/.local/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 297, in __init__
    False, _pair(0), groups, bias)
  File "/home/user1/.local/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 38, in __init__
    self.reset_parameters()
  File "/home/user1/.local/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 44, in reset_parameters
    stdv = 1. / math.sqrt(n)
ZeroDivisionError: float division by zero

Why do some layers end up with 0 channels after pruning?
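The last frame of the traceback divides by `math.sqrt(n)`. A minimal sketch of why that fails, assuming `n` is the fan-in (input channels times kernel area); only the final line appears in the traceback, so the function name `init_bound` and the way `n` is built here are assumptions for illustration:

```python
import math


def init_bound(in_channels, kernel_size):
    """Hypothetical stand-in for the failing initialisation step."""
    n = in_channels
    for k in kernel_size:
        n *= k  # assumed: fan-in = in_channels * kernel height * kernel width
    return 1. / math.sqrt(n)  # same expression as the last traceback frame


# A layer whose input was pruned to 0 channels gives n == 0.
try:
    init_bound(0, (3, 3))
    failed = False
except ZeroDivisionError:
    failed = True

# A layer with at least one input channel initialises normally.
bound = init_bound(256, (1, 1))
```

With this reading, any layer that follows a layer with 0 remaining channels (such as index 299 in the log) cannot be constructed from the pruned cfg.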

talebolano commented 5 years ago

Is this after sparsity training? Could too much have been pruned?

lucheng07082221 commented 5 years ago

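@talebolano The sparsity rate is the default 0.0001, the pruning ratio is the default 0.3, and the model is yolov3. There should be a handling step here: when a layer is left with 0 channels after pruning, the largest channel in that layer should be kept, right?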
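The handling step proposed above can be sketched as follows. This is a minimal illustration, not the repository's `prune.py`: it assumes a single network-wide threshold taken from the sorted absolute scaling factors, and the function names and numbers are made up for the example.

```python
def global_threshold(layers, prune_ratio):
    """Value at the prune_ratio position of all |factors|, sorted network-wide."""
    all_factors = sorted(abs(g) for layer in layers for g in layer)
    index = min(int(len(all_factors) * prune_ratio), len(all_factors) - 1)
    return all_factors[index]


def channel_mask(layer, threshold):
    """Keep channels with |factor| above the threshold, but never zero of them."""
    mask = [1 if abs(g) > threshold else 0 for g in layer]
    if sum(mask) == 0:
        # Fallback: keep the single channel with the largest |factor|.
        best = max(range(len(layer)), key=lambda i: abs(layer[i]))
        mask[best] = 1
    return mask


# Hypothetical scaling factors: two healthy layers and one near-zero layer.
layers = [
    [0.9, 1.2, 0.8, 1.1],
    [0.7, 1.5, 0.6, 0.95],
    [0.001, 0.003, 0.002, 0.0005],
]
threshold = global_threshold(layers, prune_ratio=0.3)
masks = [channel_mask(layer, threshold) for layer in layers]
remaining = [sum(m) for m in masks]
```

Because the threshold is global, a layer whose factors are all small can lose every channel while other layers lose none, which matches the pattern in the log. The fallback keeps the cfg constructible, though a layer reduced to one channel may still hurt accuracy.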

talebolano commented 5 years ago

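Yes, it should be kept. I see that only your last few layers were pruned and the other layers barely changed. What dataset did you use, and what is the sparsity level?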

lucheng07082221 commented 5 years ago

@talebolano This is the sparsity level from my training: 0~20%: 0.793500, 20~40%: 0.875185, 40~60%: 1.030651, 60~80%: 1.364402, 80~100%: 6.337093
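The thread does not say how these figures were computed. One plausible reading, offered only as an assumption, is the mean absolute scaling factor inside each 20% band of the sorted values. A sketch of such a report, with a hypothetical function name and made-up input:

```python
def band_means(factors, bands=5):
    """Mean |factor| in each equal-sized band of the sorted values."""
    values = sorted(abs(f) for f in factors)
    n = len(values)
    report = []
    for b in range(bands):
        chunk = values[b * n // bands:(b + 1) * n // bands]
        mean = sum(chunk) / len(chunk) if chunk else 0.0
        report.append((100 * b // bands, 100 * (b + 1) // bands, mean))
    return report


# Made-up scaling factors, only to show the output format.
factors = [0.1, -0.2, 0.3, 0.4, 0.5, -0.6, 0.7, 0.8, 0.9, 1.0]
report = band_means(factors)
for low, high, mean in report:
    print("{}~{}%: {:.6f}".format(low, high, mean))
```

Under this reading, a lowest band near 0.79 would mean that even the smallest fifth of the factors is far from zero, so little has actually been sparsified.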

talebolano commented 5 years ago

Going by the original paper, pruning is done only after the values have been sparsified down to about 0.01.

lucheng07082221 commented 5 years ago

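@talebolano Have you tried training yolov2 on datasets where, because of the complexity of the data itself, L1 regularization may not produce much sparsity, so that pruning is essentially impossible, i.e. every channel in every layer is useful?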

talebolano commented 5 years ago

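Yes, this is the problem that troubles me too. I have tried replacing the penalty term with KL divergence, gradient clipping, and ISTA, and none of them really worked. Have you read the paper Rethinking the Value of Network Pruning? As that paper says, maybe the sparsification step could be approached in a different way.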
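For reference, the difference between a plain L1 subgradient step and an ISTA step on the scaling factors can be sketched as below. This is a generic illustration with made-up values and hypothetical function names, not the code that was tried in this repository:

```python
def sign(x):
    return (x > 0) - (x < 0)


def subgradient_step(factors, grads, lr, sparsity):
    """Plain L1 penalty: add sparsity * sign(factor) to the gradient."""
    return [g - lr * (d + sparsity * sign(g)) for g, d in zip(factors, grads)]


def soft_threshold(value, amount):
    """Shrink value toward zero by amount, clamping at exactly zero."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def ista_step(factors, grads, lr, sparsity):
    """ISTA: gradient step on the loss, then soft-threshold by lr * sparsity."""
    return [soft_threshold(g - lr * d, lr * sparsity)
            for g, d in zip(factors, grads)]


# Made-up factors with zero loss gradient, to isolate the penalty's effect.
factors = [0.5, -0.02, 0.001]
grads = [0.0, 0.0, 0.0]
sub = subgradient_step(factors, grads, lr=0.1, sparsity=0.1)
ista = ista_step(factors, grads, lr=0.1, sparsity=0.1)
```

The subgradient step pushes the small factor past zero and leaves it oscillating around it, while the ISTA step sets it to exactly zero. Neither helps if the loss gradient keeps pulling the factors away from zero, which is the situation described above.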

lucheng07082221 commented 5 years ago

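@talebolano I haven't studied it carefully yet. I think that for pruning to work well, the precondition is that the network's capacity is sufficiently large relative to the dataset. It was not right for the authors to test on cifar10, since a very small network is enough to reach high accuracy on cifar10.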

EtheneXiang commented 5 years ago

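May I ask how this pruning is implemented, and how channels are actually discarded? My understanding of pruning is that a mask of the same size is applied to the weights, with 1 only at the active positions and 0 everywhere else. But what the author does here is true pruning: after pruning there really are fewer channels, and the model really is smaller.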
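The two approaches in the question can be contrasted with a small sketch. It treats a 1x1 convolution as a plain `[out_channels][in_channels]` matrix of nested lists; the function names and values are hypothetical, and the thread does not show how `prune.py` copies weights, so this is a generic illustration only:

```python
def apply_mask(weights, out_mask):
    """Masking: zero the pruned output channels; the shape stays the same."""
    return [[w * m for w in row] for row, m in zip(weights, out_mask)]


def remove_channels(weights, out_mask, in_mask):
    """Physical pruning: drop pruned output rows and pruned input columns."""
    return [
        [w for w, keep_in in zip(row, in_mask) if keep_in]
        for row, keep_out in zip(weights, out_mask)
        if keep_out
    ]


# Layer A: 3 output channels, 2 input channels; channel 1 is pruned.
weights_a = [[1.0, 2.0], [0.1, 0.2], [3.0, 4.0]]
mask_a = [1, 0, 1]

masked_a = apply_mask(weights_a, mask_a)                 # still 3 x 2
pruned_a = remove_channels(weights_a, mask_a, [1, 1])    # now 2 x 2

# Layer B consumes A's output, so A's mask selects B's input columns.
weights_b = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
pruned_b = remove_channels(weights_b, [1, 1], mask_a)    # 2 x 3 becomes 2 x 2
```

Masking leaves the parameter count unchanged, whereas removing rows and columns shrinks both the pruned layer and the layer that reads from it, which is why the saved model gets smaller.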