Open solomonmanuelraj opened 9 months ago
Hi,
For this one I'd recommend taking a look at the Optimum library which provides utilities for ONNX export and further optimization like pruning/quantization.
You can probably reduce the model size the most through quantization.
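To make the size argument concrete, here is a minimal, stdlib-only sketch of 8-bit affine (symmetric, per-tensor) quantization, which is the core idea behind the dynamic quantization that Optimum / ONNX Runtime apply to Linear weights. Going from float32 (4 bytes per weight) to int8 (1 byte) cuts weight storage roughly 4x, so a ~611 MB float32 checkpoint would land in the ~150-160 MB range. The function names here are illustrative, not part of any library API.

```python
import math

def quantize_int8(weights):
    """Map floats to int8 with a single per-tensor scale (symmetric scheme)."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0  # int8 range is [-128, 127]; use symmetric [-127, 127]
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats; error is bounded by ~scale/2 per weight."""
    return [v * scale for v in q]

w = [0.5, -1.2, 0.03, 0.9]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Each reconstructed weight is within one quantization step of the original,
# while each value now needs 1 byte instead of 4.
```

Real toolchains do this per channel and also quantize activations at runtime, but the 4x storage saving comes from exactly this int8 representation.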
Since various models have had SDPA support added (see e.g. #28133), it could be added to OWL-ViT as well.
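For reference, here is the math that an SDPA integration fuses into one optimized kernel: softmax(QK^T / sqrt(d)) V. The pure-Python reference below (illustrative, stdlib-only, 2-D lists of shape seq_len x d) computes the same result that `torch.nn.functional.scaled_dot_product_attention` produces with fused, memory-efficient kernels; the speed and memory win of the PR would come from the kernel, not from different math.

```python
import math

def sdpa_reference(Q, K, V):
    """Reference scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(Q[0])
    # Scaled attention scores, one row per query.
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
               for k in K] for q in Q]
    out = []
    for row in scores:
        m = max(row)                      # subtract max for numerical stability
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        probs = [e / z for e in exps]     # softmax over keys
        # Weighted sum of value vectors.
        out.append([sum(p * v[j] for p, v in zip(probs, V))
                    for j in range(len(V[0]))])
    return out
```

Adding SDPA to OWL-ViT would mean routing its attention layers through the fused op instead of the explicit matmul + softmax + matmul sequence above.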
Feature request
Hi Team, I am working with the OWL-ViT base model, which is around 611 MB (https://huggingface.co/google/owlvit-base-patch16). I want to optimize this model and deploy it on an edge device for object detection.
I have come to learn from the group that torch.scaled_dot_product_attention can be used for model optimization.
I would appreciate your feedback on how best to reduce the memory footprint so that the model can be deployed on an edge device.
Waiting for your response.
With thanks
Motivation
It will help deploy these models on edge devices, so that more applications can use them.
Your contribution
I would like to hear your feedback.