dnth / yolov5-deepsparse-blogpost

By the end of this post, you will learn how to: train a SOTA YOLOv5 model on your own data; sparsify the model using SparseML quantization-aware training, sparse transfer learning, and one-shot quantization; and export the sparsified model and run it with the DeepSparse engine at insane speeds. P/S: The end result: YOLOv5 on a CPU at 180+ FPS.
https://dicksonneoh.com/portfolio/supercharging_yolov5_180_fps_cpu/

Why is there no significant change in model size after pruning and quantization? #17

Open Pass-O-Guava opened 1 year ago

Pass-O-Guava commented 1 year ago

Why is there no significant change in model size after pruning and quantization?

For the yolov5n model, I ran the training and export Python scripts and got best.pt at 7.4 MB and best.onnx at 2 MB, while the original yolov5n.pt is 4.1 MB.

Why did the .pt model grow from 4.1 MB to 7.4 MB, and why did converting that 7.4 MB .pt model to .onnx shrink it to 2 MB?

Looking forward to an explanation of why this happens. Thanks!
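
One way to start diagnosing the size difference is to look at what the best.pt checkpoint actually stores: a YOLOv5-style .pt file typically holds more than the bare weights (for example an EMA copy of the model, leftover training state, or QAT observer buffers), and magnitude-pruned weights are still saved as dense tensors full of zeros, so pruning alone does not shrink the file; the INT8 ONNX export, by contrast, stores weights at one byte each. The sketch below is not from this repo; it assumes the upstream YOLOv5 checkpoint layout (a pickled dict with keys like 'model', 'ema', 'optimizer') and should be run from inside the yolov5 working directory so the pickled model classes resolve.

```python
import torch

# Load the checkpoint on CPU. Run from inside the yolov5 repo so torch.load
# can resolve the pickled model classes.
ckpt = torch.load("best.pt", map_location="cpu")

# The checkpoint is assumed to be a dict; typical upstream YOLOv5 keys are
# 'model', 'ema', 'optimizer', 'epoch', etc. Any extra entries (optimizer
# state, QAT observer buffers) inflate the .pt file beyond the bare weights.
print("top-level keys:", list(ckpt.keys()))

# Prefer the EMA weights if present; fall back to the raw model entry.
model = ckpt.get("ema") or ckpt.get("model")
state = model.state_dict() if hasattr(model, "state_dict") else model

# Sum the raw storage of the parameter tensors themselves.
total_bytes = sum(t.numel() * t.element_size() for t in state.values())
print(f"parameter storage: {total_bytes / 1e6:.1f} MB")
```

Running this on both the original yolov5n.pt and the sparsified best.pt should show how much of the 7.4 MB is the weight tensors themselves versus extra training state carried along in the checkpoint.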