AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Application #3732

Open keko950 opened 5 years ago

keko950 commented 5 years ago

https://arxiv.org/abs/1907.11093

AlexeyAB commented 5 years ago

paper: https://arxiv.org/abs/1907.11093v1

source code: https://github.com/PengyiZhang/SlimYOLOv3

[metrics screenshot from the paper]

[procedure screenshot from the paper]

LukeAI commented 5 years ago

Would these pruned weights work with this darknet repo?

AlexeyAB commented 5 years ago

@LukeAI It gives me an error.

So I asked about this: https://github.com/PengyiZhang/SlimYOLOv3/issues/18


PS

I just recompiled Darknet from scratch and it works. So this repository supports pruned models from https://github.com/PengyiZhang/SlimYOLOv3

for example: ./darknet detector test data/drone.data prune_0.5_0.5_0.7.cfg prune_0.5_0.5_0.7_final.weights

LukeAI commented 5 years ago

wow, this looks like it's probably the best-value inference-time boost since CUDNN_HALF.

AlexeyAB commented 5 years ago

There is just no comparison on common datasets like MS COCO and OpenImages.

LukeAI commented 5 years ago

I note that they include a cfg, yolov3-spp3.cfg, which achieves a higher AP than yolov3-spp.cfg. Maybe it would be good to add it to this repo's cfgs?

keko950 commented 5 years ago

I note that they include a cfg, yolov3-spp3.cfg, which achieves a higher AP than yolov3-spp.cfg. Maybe it would be good to add it to this repo's cfgs?

Did you test that cfg?

anhuipl2010 commented 5 years ago

The author says you just need to add updateBN() in train(); can you add it to your darknet C++ project? Otherwise, people would need to build it in a PyTorch environment. Maybe his code cannot run; some people have found problems with it. @AlexeyAB
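To be concrete, as far as I understand it, updateBN() just adds an L1 subgradient on the BN scale factors during training. A rough C sketch of that idea, with placeholder array names (not darknet's actual fields):

```c
#include <stddef.h>

/* Sketch of an updateBN()-style L1 sparsity penalty on BN scale factors.
 * gamma[i]      - BN scale (gamma) of channel i                 (placeholder name)
 * gamma_grad[i] - dLoss/dgamma of channel i, later used as gamma -= lr * gamma_grad
 * lambda        - sparsity strength                                               */
void add_bn_l1_penalty(const float *gamma, float *gamma_grad, size_t n, float lambda)
{
    for (size_t i = 0; i < n; ++i) {
        float s = (float)((gamma[i] > 0.f) - (gamma[i] < 0.f)); /* sign(gamma[i]) */
        gamma_grad[i] += lambda * s;    /* d/dgamma of lambda * |gamma| */
    }
}
```

With the usual update gamma -= lr * gamma_grad, this pushes the scales of unimportant channels toward zero, which is what the later BN-threshold pruning relies on.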

gmayday1997 commented 5 years ago

hi, I have added a BN-pruning algorithm in prune.cpp.

  ./darknet prune ./cfg/yolov3.cfg ./cfg/yolov3.weights -rate 0.3

The pruned cfg/weights are saved as ./cfg/yolov3_prune.cfg / ./cfg/yolov3_prune.weights

But some bugs need to be fixed, because something goes wrong in retraining. I will solve it in the next two days.

AlexeyAB commented 5 years ago

@gmayday1997

When you have fixed all the bugs, you can make a Pull Request to this repository.

anhuipl2010 commented 5 years ago

@gmayday1997 thanks. you're handsome.

WongKinYiu commented 5 years ago

It cannot detect objects after pruning (for both yolov3 and yolov3-tiny); some models get filters=0 after pruning.

AlexeyAB commented 5 years ago

@WongKinYiu Do you mean https://github.com/PengyiZhang/SlimYOLOv3 or https://github.com/AlexeyAB/darknet/issues/3732#issuecomment-520365823 ?

WongKinYiu commented 5 years ago

@AlexeyAB I mean https://github.com/AlexeyAB/darknet/issues/3732#issuecomment-520365823. I think maybe sparse training is needed for pruning the model.

LukeAI commented 5 years ago

Has anybody managed to get any results using https://github.com/PengyiZhang/SlimYOLOv3 ?

WongKinYiu commented 5 years ago

@LukeAI the results are good using the pruned model provided by https://github.com/PengyiZhang/SlimYOLOv3

[predictions image]

gmayday1997 commented 5 years ago

Did you get the same good accuracy after pruning? Can you describe the order in which the commands are run: training, pruning, and fine-tuning? @AlexeyAB I have fixed some bugs, and fine-tuning goes well now. [training chart]

| yolov3 | FLOPS | mAP (coco_val5k @0.5) |
| --- | --- | --- |
| before pruning | 65 | 54.65 |
| pruned w/o fine-tuning | 36.3 | 0.2 |
| pruned w/ fine-tuning (2k iter) | 36.3 | 34.6 |
| pruned w/ fine-tuning (9k iter) | 36.3 | 41.3 |
| pruned w/ fine-tuning (12k iter) | 36.3 | 43.4 |
| pruned w/ fine-tuning (28k iter) | 36.3 | 45.2 |
| pruned w/ fine-tuning (update later) | | |

[example predictions: before pruning]
[example predictions: pruned w/ fine-tuning, 9k iter]
[example predictions: pruned w/ fine-tuning, 28k iter]

| yolov3-tiny | FLOPS | mAP (custom data @0.5) |
| --- | --- | --- |
| before pruning | 7.1 | 75.4 |
| pruned w/o fine-tuning | 5 | 13.5 |
| pruned w/ fine-tuning (2k iter) | 5 | 68.1 |

In addition, I added some tricks proposed in "Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers". [screenshot]
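The main trick there, as I understand it, is an ISTA-style (soft-thresholding / proximal) update on the BN scales instead of a plain L1 subgradient. A rough sketch with placeholder names, not their exact code:

```c
#include <stddef.h>

/* ISTA-style proximal step on BN scale factors (rough sketch, placeholder names).
 * After the ordinary gradient step, each scale is soft-thresholded by lr*lambda,
 * so weak channels are driven exactly to zero instead of merely shrunk.          */
void ista_step_bn_scales(float *gamma, const float *grad, size_t n, float lr, float lambda)
{
    float t = lr * lambda;                      /* soft-threshold                   */
    for (size_t i = 0; i < n; ++i) {
        float g = gamma[i] - lr * grad[i];      /* plain SGD step                   */
        if (g > t)       gamma[i] = g - t;
        else if (g < -t) gamma[i] = g + t;
        else             gamma[i] = 0.f;        /* exact zero -> prunable channel   */
    }
}
```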

WongKinYiu commented 5 years ago

@gmayday1997 Hello, do you implement overall_ratio & perlayer_ratio? (Is -rate in your code the overall_ratio?) And do you have a plan to implement sparse training?

gmayday1997 commented 5 years ago

@WongKinYiu hi, -rate is a global threshold. Local-threshold pruning and a sparse-training implementation are planned.
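Roughly, the global threshold can be computed like this (a simplified sketch with placeholder names, not the exact prune.cpp code): collect the absolute BN scales from every prunable layer, sort them, and cut at the requested fraction.

```c
#include <stdlib.h>
#include <math.h>

static int cmp_float(const void *a, const void *b)
{
    float fa = *(const float *)a, fb = *(const float *)b;
    return (fa > fb) - (fa < fb);
}

/* Global-threshold selection: prune the `rate` fraction of channels with the
 * smallest |gamma| across the whole network. all_gammas / n_total are placeholders
 * for however the BN scales of every prunable layer are gathered.                 */
float global_prune_threshold(const float *all_gammas, size_t n_total, float rate)
{
    float *mag = (float *)malloc(n_total * sizeof(float));
    for (size_t i = 0; i < n_total; ++i) mag[i] = fabsf(all_gammas[i]);
    qsort(mag, n_total, sizeof(float), cmp_float);

    size_t k = (size_t)(rate * (float)n_total); /* index of the cut-off              */
    if (k >= n_total) k = n_total - 1;
    float thresh = mag[k];                      /* channels with |gamma| < thresh go */
    free(mag);
    return thresh;
}
```

A per-layer (local) threshold would compute the same percentile inside each layer separately, which also helps avoid the filters=0 case mentioned above.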

WongKinYiu commented 5 years ago

@gmayday1997 thanks.

AlexeyAB commented 5 years ago

@gmayday1997 Thanks!

We will see whether you get the same good results as https://github.com/PengyiZhang/SlimYOLOv3, with only a ~10% drop in accuracy (23.9% / 26.4% mAP).

isgursoy commented 5 years ago

watching

dexception commented 5 years ago

+1

OpenAI2022 commented 5 years ago

I have a question: this channel-pruning method cuts off a certain percentage of the channels in the YOLO model, which may leave some (or most) layers with a kernel count that is not a power of 2. For example, an original layer has 256 kernels; after pruning, 111 kernels are left, which is not a power of 2. Will this hurt inference performance? Should we set the kernel count of each layer to a power of 2 so that cuDNN gets the best speed?
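If it does matter, one possible workaround (just a heuristic I am considering, not something the pruning code above does) would be to round each layer's surviving filter count up to a multiple of 8, since cuDNN / Tensor Core kernels tend to prefer such channel counts; the actual impact depends on the hardware and cuDNN version.

```c
/* Round a pruned filter count up to the next multiple of `multiple` (e.g. 8),
 * keeping at least one multiple so no layer ends up with zero filters.
 * Just a heuristic for cuDNN-friendly channel counts, not part of SlimYOLOv3.   */
int round_up_filters(int pruned_filters, int multiple)
{
    if (pruned_filters <= 0) return multiple;
    return ((pruned_filters + multiple - 1) / multiple) * multiple;
}
/* e.g. round_up_filters(111, 8) == 112 */
```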

Hwijune commented 5 years ago

this is my test result.

origin: 96.56 mAP, 36 ms
prune 95: 94.98 mAP, 10 ms
prune 50: 95.66 mAP, 23 ms

OpenAI2022 commented 5 years ago

this is my test result. origin: 96.56 mAP, 36 ms; prune 95: 94.98 mAP, 10 ms; prune 50: 95.66 mAP, 23 ms

The mAP is so high. Can you provide some information about your training data, like how many pics you use for training and testing, and how many classes you have?

Hwijune commented 5 years ago

this is my test result. origin: 96.56 mAP, 36 ms; prune 95: 94.98 mAP, 10 ms; prune 50: 95.66 mAP, 23 ms

The mAP is so high. Can you provide some information about your training data, like how many pics you use for training and testing, and how many classes you have?

Using 10,000 pics for training and 500 for validation; I used a small dataset to test.

AlexeyAB commented 5 years ago

@hwijune

Did you use https://github.com/PengyiZhang/SlimYOLOv3 or https://github.com/gmayday1997/darknet.CG/blob/master/src/prune.cpp for pruning?

AlexeyAB commented 5 years ago

@gmayday1997 Hi, What final results did you get?

Hwijune commented 5 years ago

https://github.com/PengyiZhang/SlimYOLOv3

@AlexeyAB I used the https://github.com/PengyiZhang/SlimYOLOv3 repository, yolov3 ktian08-hyp branch.

Another question, do you plan to support image rotation augmentation? I want to apply it to detection.

OpenAI2022 commented 5 years ago

On my dataset, I got poor results. My training dataset consists of two classes, half positive samples and half negative samples, about 20k pics (10k positive, 10k negative). The objects I need to detect are very rare in the real world, like cancer patients among the general population, which means I need to focus on both accuracy and recall. The bigger the cutting ratio I set, the worse the result I get. I think my experiment verifies this paper: https://arxiv.org/abs/1810.05270. If your dataset is simple, maybe this method will work. When I cut off 50% of the channels in my yolov3 model, I got a 20% decrease in both accuracy and recall. The training loss is around 0.5, but before pruning it was 0.2. If anyone has the same situation, we can discuss it. Hope this helps you guys.

gmayday1997 commented 5 years ago

hi @AlexeyAB
I made some model-pruning experiments on COCO. Here are the results.

| yolov3 | FLOPS | mAP (coco_val5k @0.5) |
| --- | --- | --- |
| before pruning | 65 | 54.65 |
| pruned @prune_rate=0.3 | 36.3 | 46.7 |
| pruned @prune_rate=0.3 (random pruning) | 36.3 | 48 |
| pruned @prune_rate=0.3 (prune top large BN values) | 36.3 | 45.2 |
| pruned @prune_rate=0.5 (random pruning) | 15 | 43 |

Based on my experiments, I found BN-pruning is really one answer, but not the only one: random pruning, or even pruning the channels with the largest BN values, can also achieve good performance. It is very interesting.

AlexeyAB commented 5 years ago

@gmayday1997 Thanks. As I see, the drop in accuracy is quite noticeable.

LukeAI commented 5 years ago

@gmayday1997 that is substantial - is that after sparsity training? or just after pruning?

gmayday1997 commented 5 years ago

@AlexeyAB @LukeAI emm, yes, but it depends. I found a small accuracy drop on my custom dataset (120k trainval, 17k test, 2 classes). Training on the COCO dataset is really slow and I have only one GPU, so the fine-tuning iterations are all fewer than 80k. Maybe I need to fine-tune for more iterations. In addition, retaining small BN-scale values and pruning the large ones can also achieve comparable accuracy.

LukeAI commented 5 years ago

This paper suggests that a trained, pruned, and fine-tuned model does not perform any better than a model using the same pruned cfg with random initial weights, trained from scratch. This doesn't mean that there is no value in pruning: it seems like it might be an effective way to automatically discover more efficient network architectures for a particular dataset. But it does suggest pruning the same model on various datasets and comparing the results: what kind of thing tends to be preserved and what tends to be pruned away? It could be a nice way to discover a more efficient general-purpose architecture.

AlexeyAB commented 5 years ago

@LukeAI Conclusion: just reduce the number of filters in the middle layers and train the model from scratch

Hwijune commented 5 years ago

I proceeded in the order normal training -> sparsity training (100 epochs) -> pruning -> fine-tuning, as written in the paper.

gmayday1997 commented 5 years ago

@AlexeyAB some updates(training with more iterations)

| yolov3 | volume (MB) | FLOPS | mAP (coco_val5k @0.5) | fine-tuning iters | parameters |
| --- | --- | --- | --- | --- | --- |
| before pruning | 246 | 65 | 54.65 | 500k | 1x |
| pruned @prune_rate=0.3 | 122 | 36.3 | 48.1 | 80k | 0.5x |
| pruned @prune_rate=0.5 | 60.5 | 16 | 49.2 | 160k | 0.25x |
| pruned @prune_rate=0.7 | 31 | 7 | in processing | | 0.125x |

LukeAI commented 5 years ago

@gmayday1997 thanks for sharing your results! Would you be able to add a column with "inference time" or FPS? What is the "w" unit for no. of iterations?

gmayday1997 commented 5 years ago

@LukeAI Yes, I will test those metrics tomorrow. 'w' was just a clerical error; it is an abbreviation for ten thousand in Chinese. I fixed it, thank you.

gmayday1997 commented 5 years ago

Hi @LukeAI, here are the results of the FPS test.

| yolov3 | volume (MB) | FLOPS | FPS (352x288) | FPS (960x540) | FPS (1960x1080) |
| --- | --- | --- | --- | --- | --- |
| before pruning | 246 | 65 | 60 | 57 | 53 |
| pruned @prune_rate=0.3 | 122 | 36.3 | 82 | 78 | 76 |
| pruned @prune_rate=0.5 | 60.5 | 16 | 107 | 105 | 97 |

varghesealex90 commented 5 years ago

@gmayday1997, the pipeline used is:

1) Train full model

2) Prune

3) Fine-tune pruned model

I see that the drop in accuracy is minimal when looking at the gain in FPS.

Can you please report the performance (mAP) after step 2, i.e. before fine-tuning? Using the code provided by the SlimYOLOv3 authors, I get an mAP of 0% with the prune.weights before fine-tuning. Is this normal?

gmayday1997 commented 5 years ago

@varghesealex90 yeah, it is normal that the pruned model gets low accuracy before fine-tuning. Based on my experiments, I found that the algorithm described below is helpful for preserving accuracy.

[algorithm screenshot]

varghesealex90 commented 5 years ago

@gmayday1997 thanks for the clarification. Is the above technique implemented in https://github.com/gmayday1997/darknet.CG ?

gmayday1997 commented 5 years ago

@varghesealex90 yes, here are the implementations. https://github.com/gmayday1997/darknet.CG/blob/945137080809e721f42883cbd1f7f4f7718d28f6/src/prune.cpp#L568

dexception commented 5 years ago

Tested Yolov3-Tiny fine-tuning after pruning with:
0.7: average loss is always nan
0.5: working fine
0.3: working fine

Hwijune commented 5 years ago

@gmayday1997 Is there a difference without sparsity training?

sctrueew commented 4 years ago

@gmayday1997 Hi,

I have added prune.cpp to this repo and tested it on my model with -rate 0.3, but it doesn't recognize anything. I'm using TinyV3.

WongKinYiu commented 4 years ago

You have to retrain the model after pruning.