ELS-RD / transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
https://els-rd.github.io/transformer-deploy/
Apache License 2.0
1.64k stars, 150 forks

ViT serving #162

Open VoVoR opened 1 year ago

VoVoR commented 1 year ago

Have you guys tried using your code to set up a Triton server with the TensorRT (TRT) backend for Vision Transformer (ViT) models? I tried it myself and got lower throughput than bare PyTorch inference.