roboflow / inference

A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
https://inference.roboflow.com

bumping owlv2 version and putting cache size in env #813

Closed: isaacrob-roboflow closed this 2 weeks ago

isaacrob-roboflow commented 2 weeks ago

Description

Pulls the OWLv2 cache size out into an environment variable to make load testing easier: we no longer need >1000 unique images to run a sane test. Also bumps the OWLv2 model size to large (which will ideally get deployed, assuming the load testing works out).

I am not sure where the environment variables get piped in. I found that I had to set the default in the OWLv2 constructor in order for the pytest to use the larger version. If anyone has thoughts on that, let me know.
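A minimal sketch of the pattern described above: reading the cache size from an environment variable and using it as the constructor default, so tests that build the model directly still pick up the override. The variable name, default value, and class shape here are illustrative assumptions, not the PR's actual identifiers.

```python
import os

# Assumed env var name and default for illustration; the PR's actual
# names may differ.
OWLV2_IMAGE_CACHE_SIZE = int(os.getenv("OWLV2_IMAGE_CACHE_SIZE", "1000"))

class OwlV2:
    def __init__(self, cache_size: int = OWLV2_IMAGE_CACHE_SIZE):
        # Defaulting in the constructor (rather than only in a central
        # env/config module) means a pytest that instantiates OwlV2
        # directly still respects the environment override.
        self.cache_size = cache_size
```

The trade-off is that the default is frozen at import time, so the env var must be set before the module is first imported.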

Also, I am setting a max cache size of 100 models. If people start building VERY large models, on the order of 10,000 prompts, this could become a problem.
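A capped model cache like the one mentioned above could be sketched as a simple LRU, where the oldest entry is evicted once the cap is exceeded. This is a hypothetical illustration of the bound, not the repository's actual cache implementation.

```python
from collections import OrderedDict

MAX_MODEL_CACHE_SIZE = 100  # assumed cap, per the PR description

class ModelCache:
    """Bounded cache that evicts the least-recently-used model."""

    def __init__(self, max_size: int = MAX_MODEL_CACHE_SIZE):
        self.max_size = max_size
        self._cache: OrderedDict = OrderedDict()

    def get(self, key):
        if key in self._cache:
            # Mark as most recently used.
            self._cache.move_to_end(key)
            return self._cache[key]
        return None

    def put(self, key, model):
        self._cache[key] = model
        self._cache.move_to_end(key)
        if len(self._cache) > self.max_size:
            # Drop the least-recently-used entry.
            self._cache.popitem(last=False)
```

With a cap of 100, a model whose embeddings span ~10,000 prompts would dominate memory long before the entry count matters, which is the concern raised above.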

Type of change


How has this change been tested, please provide a testcase or example of how you tested the change?

Passed integration tests in both base and large form.

Any specific deployment considerations

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs