huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

OWL-ViT vision foundation model deployment on edge devices - need SDPA support for OWL-ViT model optimization #28103

Open solomonmanuelraj opened 9 months ago

solomonmanuelraj commented 9 months ago

Feature request

Hi Team, I am working with the OWL-ViT base model, which is around 611 MB (https://huggingface.co/google/owlvit-base-patch16). I want to optimize this model and deploy it on an edge device for object detection.

I learned from the group that torch.nn.functional.scaled_dot_product_attention can be used for model optimization.

I would appreciate your feedback on how best to reduce the memory footprint so that the model can be deployed on an edge device.

Looking forward to your response.

With thanks
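As a rough illustration of what SDPA support means here: the fused kernel computes the same attention as the manual ("eager") implementation, but can dispatch to flash/memory-efficient backends that avoid materializing the full attention matrix. A minimal sketch with toy tensor shapes (the shapes are illustrative, not OWL-ViT's actual dimensions):

```python
import torch
import torch.nn.functional as F

# Toy shapes standing in for attention inputs (illustrative sizes only).
batch, heads, seq, dim = 2, 8, 16, 64
q = torch.randn(batch, heads, seq, dim)
k = torch.randn(batch, heads, seq, dim)
v = torch.randn(batch, heads, seq, dim)

# Manual ("eager") attention, as implemented in many HF vision models:
# materializes the full (seq x seq) attention-weight matrix.
scale = dim ** -0.5
weights = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
eager_out = weights @ v

# Fused SDPA: same math, but PyTorch can pick a memory-efficient kernel.
sdpa_out = F.scaled_dot_product_attention(q, k, v)

assert torch.allclose(eager_out, sdpa_out, atol=1e-5)
```

The memory savings come mainly at larger sequence lengths, where the fused backends never build the full attention matrix.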

Motivation

Being able to deploy the model on edge devices would make it usable in many more applications.

Your contribution

I would like to hear your feedback.

NielsRogge commented 9 months ago

Hi,

For this one I'd recommend taking a look at the Optimum library, which provides utilities for ONNX export and further optimizations such as pruning and quantization.

Quantization will probably give you the largest reduction in model size.

github-actions[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

NielsRogge commented 8 months ago

Since various models have seen SDPA addition (see e.g. #28133), one could add it to OWL-ViT as well.
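The pattern those PRs follow can be sketched as: subclass the eager attention module and override `forward` to call `F.scaled_dot_product_attention`, so the weights and projections are shared and only the kernel changes. The class and attribute names below are illustrative, not the actual OWL-ViT code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyViTAttention(nn.Module):
    """Minimal eager multi-head self-attention (names are illustrative)."""
    def __init__(self, embed_dim=64, num_heads=4):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)
        self.out_proj = nn.Linear(embed_dim, embed_dim)

    def _shape(self, x):
        # (batch, seq, embed) -> (batch, heads, seq, head_dim)
        b, s, _ = x.shape
        return x.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)

    def forward(self, hidden_states):
        q = self._shape(self.q_proj(hidden_states))
        k = self._shape(self.k_proj(hidden_states))
        v = self._shape(self.v_proj(hidden_states))
        weights = torch.softmax(
            q @ k.transpose(-2, -1) * self.head_dim ** -0.5, dim=-1)
        out = (weights @ v).transpose(1, 2).flatten(2)
        return self.out_proj(out)

class ToyViTSdpaAttention(ToyViTAttention):
    """Same projections, but the fused SDPA kernel does the attention."""
    def forward(self, hidden_states):
        q = self._shape(self.q_proj(hidden_states))
        k = self._shape(self.k_proj(hidden_states))
        v = self._shape(self.v_proj(hidden_states))
        out = F.scaled_dot_product_attention(q, k, v)
        return self.out_proj(out.transpose(1, 2).flatten(2))

# The SDPA variant loads the eager weights unchanged and matches its output.
eager, sdpa = ToyViTAttention(), ToyViTSdpaAttention()
sdpa.load_state_dict(eager.state_dict())
x = torch.randn(2, 10, 64)
assert torch.allclose(eager(x), sdpa(x), atol=1e-5)
```

In transformers the SDPA class is then registered so that `attn_implementation="sdpa"` selects it; the eager path is kept for cases SDPA does not cover (e.g. when attention weights must be returned).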

nileshkokane01 commented 8 months ago

@NielsRogge ,

Taking inspiration from Mistral as well as Llama, can I add SDPA support to OWL-ViT along similar lines?

Let me know.