Arnav0400 / ViT-Slim

Official code for our CVPR'22 paper “Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space”
MIT License

Is it possible to deploy the pruned model for inference speedup ? #4

Closed: reversal67 closed this issue 2 years ago

reversal67 commented 2 years ago

Dear author: Since only a FLOPs reduction is reported in the paper, I am curious whether the pruned model can be converted to ONNX for an inference speedup on real hardware. Thanks.

Arnav0400 commented 2 years ago

Hey, you can find the code for obtaining a pruned model in the attachment. The branch is `throughput`: ViT-Slim-throughput.zip