The parameters and computation decreased, but FPS was worse than the baseline model

VainF / Torch-Pruning

[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs

https://arxiv.org/abs/2301.12900

MIT License

2.44k stars, 308 forks

The parameters and computation decreased, but FPS was worse than the baseline model #382

Open TTFF322 opened 1 month ago

TTFF322 commented 1 month ago

Hello, I have recently been applying your code to YOLOv8. Drawing on problems I ran into earlier and on the files in the examples/yolov8 folder, I managed to reduce both the number of parameters and the amount of computation. However, I ran into a problem: when I tested the pruned model, its FPS was much lower than the baseline model's, even though the parameters and computation had decreased.

janthmueller commented 1 month ago

Is this the case on both CPU and CUDA? If it only happens on CUDA, you may want to check NVIDIA's optimization guide:

https://docs.nvidia.com/deeplearning/performance/index.html#optimizing-performance

The slowdown may come from performance issues related to tensor shapes.
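One way to check the tensor-shape idea is to look for channel counts that pruning left at awkward sizes. The sketch below is only an illustration: the layer names and widths are hypothetical, and the multiple of 8 is an assumption based on NVIDIA's general guidance for Tensor Core friendly dimensions, so the right value may differ for a given GPU and data type.

```python
def misaligned_channels(channels_by_layer, multiple=8):
    """Return the layers whose channel count is not divisible by `multiple`."""
    return {name: c for name, c in channels_by_layer.items() if c % multiple != 0}


def round_up(channels, multiple=8):
    """Round a channel count up to the nearest multiple of `multiple`."""
    return ((channels + multiple - 1) // multiple) * multiple


# Hypothetical layer widths after pruning (not taken from the issue).
pruned = {"conv1": 29, "conv2": 64, "conv3": 117}

bad = misaligned_channels(pruned)
suggested = {name: round_up(c) for name, c in bad.items()}

print("misaligned:", bad)
print("rounded up:", suggested)
```

If many layers show up as misaligned, it may be worth pruning to aligned widths instead. Recent Torch-Pruning pruners accept a `round_to` argument for this purpose; confirm that it is available in the installed version before relying on it.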

TTFF322 commented 1 month ago

I had only tested it on NVIDIA GPUs before. After seeing your reply, I tested it on CPU, and there the FPS did increase. So it may indeed be a performance issue related to tensor shapes. Thank you.
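The thread does not show how FPS was measured, so it may help to rule out measurement effects when comparing CPU and GPU numbers. Below is a minimal, framework-agnostic sketch, assuming the model call is wrapped in a zero-argument function. `measure_fps`, `run_once`, and `sync` are hypothetical names for illustration. With PyTorch on CUDA, `sync` would typically be `torch.cuda.synchronize`, since GPU kernels run asynchronously and unsynchronized timings can be misleading.

```python
import time


def measure_fps(run_once, n_warmup=10, n_iters=50, sync=None):
    """Time `run_once` over `n_iters` calls after `n_warmup` untimed calls.

    `sync` is an optional callable that blocks until pending work finishes
    (for example torch.cuda.synchronize when timing on a GPU).
    """
    # Warm-up calls are excluded so one-time setup costs do not skew the result.
    for _ in range(n_warmup):
        run_once()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(n_iters):
        run_once()
    if sync is not None:
        sync()
    # Guard against a zero interval on very fast workloads.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return n_iters / elapsed


# Stand-in workload; replace with a real inference call such as
# lambda: model(example_input).
def dummy_inference():
    sum(i * i for i in range(10_000))


fps = measure_fps(dummy_inference)
print(f"{fps:.1f} iterations/s")
```

Running the baseline and the pruned model through the same function, with the same input shape and batch size, should make the two FPS figures easier to compare.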