## Description

Pulls the OWLv2 cache size out into an environment variable to make load testing easier: we no longer need >1000 unique images to run a sane test. Also bumps the OWLv2 model size to large, which ideally will get deployed assuming the load testing works out.
I am not sure where the environment variables get piped in; I found that I had to set the default in the OWLv2 constructor for pytest to use the larger version. If anyone has thoughts on that, let me know.
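For reference, the pattern I landed on looks roughly like this. The env var name, default value, and class shape here are illustrative stand-ins, not necessarily the exact names in the diff:

```python
import os

# NOTE: "OWLV2_IMAGE_CACHE_SIZE" and the default of 1000 are illustrative
# stand-ins for this sketch, not necessarily the names used in the diff.
DEFAULT_IMAGE_CACHE_SIZE = 1000


class OWLv2:
    def __init__(self, image_cache_size=None):
        # Reading the env var inside the constructor (rather than at module
        # import time) is what let pytest pick up an override set in the
        # test environment.
        if image_cache_size is None:
            image_cache_size = int(
                os.environ.get("OWLV2_IMAGE_CACHE_SIZE", DEFAULT_IMAGE_CACHE_SIZE)
            )
        self.image_cache_size = image_cache_size
```

Resolving the value at construction time sidesteps the "where do env vars get piped in" question, since the constructor is guaranteed to run after the environment is set.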
I am also setting a max cache size of 100 for models. If people start building very large models, on the order of 10000 prompts, this could become a problem.
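To make the failure mode concrete: the cap behaves like any LRU cache with a hard size limit, so once more than 100 models are in active rotation, the least recently used one gets evicted and must be rebuilt on its next request. The cap of 100 is the value this PR sets; the cache class below is just a minimal sketch, not the actual implementation:

```python
from collections import OrderedDict

MAX_MODEL_CACHE_SIZE = 100  # the cap this PR sets; the class below is illustrative


class LRUModelCache:
    """Minimal LRU sketch: beyond max_size entries, the least recently
    used model is evicted and would need to be rebuilt on next use."""

    def __init__(self, max_size=MAX_MODEL_CACHE_SIZE):
        self.max_size = max_size
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        self._store[key] = value
        self._store.move_to_end(key)
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict the LRU entry
```

With more than 100 models cycling through, every request could miss and trigger a rebuild, which is the scenario worth watching for users with ~10000-prompt models.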
## Type of change
Please delete options that are not relevant.
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] This change requires a documentation update
## How has this change been tested? Please provide a testcase or example of how you tested the change.
Passed integration tests in both base and large form.
## Any specific deployment considerations
For example, documentation changes, usability, usage/costs, secrets, etc.
## Docs