## Description

Pulls the OWLv2 cache size out into an environment variable to make load testing easier: we no longer need >1000 unique images to run a sane test. Also bumps the OWLv2 model size to large, which ideally will get deployed assuming the load testing works out.
I am not sure where the environment variables get piped in; I found that I had to set the default in the OWLv2 constructor for pytest to use the larger version. If anyone has thoughts on that, let me know.
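For reference, the pattern I landed on looks roughly like this. The env var name, default value, and class shape here are illustrative stand-ins, not necessarily the exact names in the diff:

```python
import os

# NOTE: "OWLV2_IMAGE_CACHE_SIZE" and the default of 1000 are illustrative
# stand-ins for this sketch, not necessarily the names used in the diff.
DEFAULT_IMAGE_CACHE_SIZE = 1000


class OWLv2:
    def __init__(self, image_cache_size=None):
        # Reading the env var inside the constructor (rather than at module
        # import time) is what let pytest pick up an override set in the
        # test environment.
        if image_cache_size is None:
            image_cache_size = int(
                os.environ.get("OWLV2_IMAGE_CACHE_SIZE", DEFAULT_IMAGE_CACHE_SIZE)
            )
        self.image_cache_size = image_cache_size
```

Resolving the value at construction time sidesteps the "where do env vars get piped in" question, since the constructor is guaranteed to run after the environment is set.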
I am also setting a max cache size of 100 for models. If people start building very large models, on the order of 10000 prompts, this could become a problem.
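To make the failure mode concrete: the cap behaves like any LRU cache with a hard size limit, so once more than 100 models are in active rotation, the least recently used one gets evicted and must be rebuilt on its next request. The cap of 100 is the value this PR sets; the cache class below is just a minimal sketch, not the actual implementation:

```python
from collections import OrderedDict

MAX_MODEL_CACHE_SIZE = 100  # the cap this PR sets; the class below is illustrative


class LRUModelCache:
    """Minimal LRU sketch: beyond max_size entries, the least recently
    used model is evicted and would need to be rebuilt on next use."""

    def __init__(self, max_size=MAX_MODEL_CACHE_SIZE):
        self.max_size = max_size
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        self._store[key] = value
        self._store.move_to_end(key)
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict the LRU entry
```

With more than 100 models cycling through, every request could miss and trigger a rebuild, which is the scenario worth watching for users with ~10000-prompt models.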
## Type of change
Please delete options that are not relevant.
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] This change requires a documentation update
## How has this change been tested? Please provide a testcase or example of how you tested the change.
Passed integration tests in both base and large form.
## Any specific deployment considerations
For example, documentation changes, usability, usage/costs, secrets, etc.
## Docs