VainF / Torch-Pruning

[CVPR 2023] DepGraph: Towards Any Structural Pruning
https://arxiv.org/abs/2301.12900
MIT License

Post pruning Fine-tuning on COCO train dataset #227

Open chinya07 opened 1 year ago

chinya07 commented 1 year ago

Hello, I want to fine-tune the yolov8n model on the COCO train dataset after pruning. I tried keeping only train2017.cache in the dataset folder, but training always points to the validation subset. Is there any workaround for this? Thanks in advance.

[Screenshot from 2023-08-01 omitted]
AymenBOUGUERRA commented 1 year ago

Hello @chinya07, can you please provide more context or detail your workflow?

Pruning conv layers (which are heavily used in YOLO architectures) will completely destroy the feature maps at any serious sparsity ratio. Here is a table summarizing the experiments I ran relating sparsity ratio to inference speed in TensorRT:

[Table image: sparsity ratio vs. TensorRT inference speed]
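For context, magnitude-based filter pruning simply ranks each conv filter by the L2 norm of its weights and removes the weakest ones. This is a minimal pure-Python sketch of the criterion, not the Torch-Pruning API; the filter lists stand in for the flattened `[out_ch, in_ch, kH, kW]` weight tensor:

```python
import math

def filter_l2_norms(conv_weights):
    """Per-filter L2 norms for a conv layer.

    conv_weights: list of filters, each given as a flat list of floats
    (in practice, each filter slice of the weight tensor flattened).
    """
    return [math.sqrt(sum(w * w for w in filt)) for filt in conv_weights]

def filters_to_prune(conv_weights, sparsity):
    """Indices of the lowest-norm filters to remove at the given sparsity."""
    norms = filter_l2_norms(conv_weights)
    n_prune = int(len(norms) * sparsity)
    ranked = sorted(range(len(norms)), key=lambda i: norms[i])
    return sorted(ranked[:n_prune])

weights = [[3.0, 4.0], [0.1, 0.1], [1.0, 0.0], [0.2, 0.2]]
print(filters_to_prune(weights, 0.5))  # [1, 3] — the two weakest filters
```

The destruction Aymen describes follows directly: at high sparsity even filters with substantial norms get removed, taking their feature maps with them.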

Furthermore, it seems that the pruning ratio must be of the form (1 - (1/2^n)) with 0 < n < 5 in order to get a speedup in TensorRT, and using such aggressive pruning ratios will require you not just to fine-tune the model but to retrain it from scratch, as the feature maps are utterly destroyed. (It could be interesting to try max-pooling the conv2d layers for the YOLO models instead of the L2 magnitude-norm criterion, @VainF.)
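The (1 - (1/2^n)) rule has a simple interpretation: each step halves the remaining channels, so channel counts stay at GPU-friendly powers of two. A quick check of the allowed ratios and what they leave of a 64-channel layer:

```python
def sparsity_steps(max_n=4):
    # Ratios 1 - 1/2^n for n = 1..max_n (i.e. 0 < n < 5)
    return [1 - 1 / 2 ** n for n in range(1, max_n + 1)]

def remaining_channels(channels, ratio):
    # Channels left in a layer after pruning at the given sparsity ratio
    return round(channels * (1 - ratio))

for r in sparsity_steps():
    print(r, remaining_channels(64, r))
# 0.5 32
# 0.75 16
# 0.875 8
# 0.9375 4
```

This also makes it clear why such ratios are so destructive: the mildest allowed step already removes half of every layer.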

Having said all of this, I do not understand why you would want to remove the validation set: it is meant to be used alongside the training set to evaluate results at the end of each epoch.

Don't hesitate to ask if you have any questions.

chinya07 commented 1 year ago

Hi @AymenBOUGUERRA,

Thank you very much for this valuable information. I am currently implementing filter pruning on the YOLOv8 nano model, aiming for faster inference on a mobile device. In my initial experiments, I am using L2-norm filter pruning. I was concerned that the same COCO validation dataset was being used both for fine-tuning the pruned model and for validating it afterward, but I realized I had misinterpreted this: the model is actually fine-tuned on the COCO train dataset only. Could you please guide me on how to use a custom dataset for fine-tuning and/or validating the pruned model? Thank you!

AymenBOUGUERRA commented 1 year ago

Hello @chinya07

You can disregard the table above, as it is specific to inference on Nvidia GPUs with TensorRT. For mobile applications, the inference speedup should be roughly proportional to the fraction of pruned parameters.

To use a custom dataset with the YOLO models, the best and fastest approach in my opinion is to convert your dataset from whatever format it is in to the YOLO (Darknet) format. I haven't tried this repo, but it looks like it can do the job: https://github.com/HumanSignal/labelImg/tree/master. There may be a better way depending on the initial format of your dataset.
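For reference, the YOLO (Darknet) label format is one `.txt` file per image, with one line per object: `class x_center y_center width height`, all coordinates normalized to [0, 1] by the image size. A small conversion sketch (the box coordinates and image size here are made-up example values):

```python
def to_yolo_line(cls, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space box (x1, y1, x2, y2) to a Darknet label line."""
    xc = (x1 + x2) / 2 / img_w   # normalized box center x
    yc = (y1 + y2) / 2 / img_h   # normalized box center y
    w = (x2 - x1) / img_w        # normalized box width
    h = (y2 - y1) / img_h        # normalized box height
    return f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# Hypothetical box at (100, 200)-(300, 400) in a 640x640 image, class 0
print(to_yolo_line(0, 100, 200, 300, 400, 640, 640))
# 0 0.312500 0.468750 0.312500 0.312500
```

The dataset YAML for training then just needs `train:` and `val:` paths pointing at the image folders, plus the class `names:` list.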

To further increase your inference speed, I suggest you also take a look at int8 quantization, which gave amazing results in my particular case (Nvidia GPUs using QAT). Here is a well-reputed and simplified repo for doing quantization on CPU backends: https://github.com/intel/neural-compressor/blob/master/docs/source/quantization.md#get-started (note that PTQ should work on all types of devices, while QAT can rely on specific hardware).
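To see what int8 quantization is doing numerically, here is a minimal sketch of asymmetric affine quantization, the scheme PTQ tools typically apply per tensor or per channel. This illustrates the arithmetic only; it is not the neural-compressor API:

```python
def quantize_int8(values):
    """Asymmetric affine int8 quantization of a flat list of floats."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0          # map the float range onto 256 levels
    zero_point = round(-lo / scale) - 128   # int8 value that represents float 0
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(qi - zero_point) * scale for qi in q]

vals = [-1.0, 0.0, 0.5, 1.0]
q, scale, zp = quantize_int8(vals)
approx = dequantize(q, scale, zp)  # close to vals, within one scale step
```

The round trip loses at most about half a scale step per value, which is why calibrating the (min, max) range on representative data matters so much for PTQ accuracy.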

Don't hesitate to ask if you have any questions.

aidevmin commented 12 months ago


@AymenBOUGUERRA What method did you use? There are Group-L1, Group-BN, Group-GReg, etc.

AymenBOUGUERRA commented 12 months ago

@aidevmin

I can't recall which method I used, but I am pretty sure it was the default one in the yolov7 pruning script. The method should have no impact on the speed gains, though, only on the model's accuracy.