marqo-ai / marqo

Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
https://www.marqo.ai/
Apache License 2.0

[ENHANCEMENT] Better internal batching for images inference #345

Open · jn2clark opened 1 year ago

jn2clark commented 1 year ago

Is your feature request related to a problem? Please describe. Currently, batching is effectively performed over text-based fields (the internal splitting creates batches), but for images this is not the case. This means there is likely up to a 2x throughput improvement to be gained by better batching of images for inference.

Describe the solution you'd like Batching of images for inference (see the sketch below).
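As a rough illustration, here is a minimal sketch of what batched image inference could look like, assuming a CLIP-style encoder that exposes an `encode_image` method (as in open_clip). `model`, `preprocess`, and `encode_images_batched` are hypothetical stand-ins, not Marqo's actual internals:

```python
import torch

def encode_images_batched(model, preprocess, images, batch_size=32):
    """Encode a list of PIL images in fixed-size batches.

    Stacking preprocessed images into one (B, C, H, W) tensor means a
    single forward pass per batch instead of one pass per image.
    """
    embeddings = []
    for i in range(0, len(images), batch_size):
        chunk = images[i : i + batch_size]
        # Stack per-image tensors into a single batch tensor.
        batch = torch.stack([preprocess(img) for img in chunk])
        with torch.no_grad():
            embeddings.append(model.encode_image(batch))
    return torch.cat(embeddings)
```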

Describe alternatives you've considered Batching at the model level via an inference server. However, earlier tests showed that although batching at the model level worked reasonably well, the overall throughput was worse.

jn2clark commented 1 year ago

@pandu-k Thinking a bit more, lazy calls to inference might be the best way to go without changing much code: defer the calls, then run inference in batches for the respective image and text fields.
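For what it's worth, a hedged sketch of that lazy-call idea: record pending vectorise requests as fields are processed, then flush them in per-modality batches so each modality gets one batched inference call. `vectorise_text` and `vectorise_image` are hypothetical batch encoders here, not Marqo's real API:

```python
from collections import defaultdict

class LazyVectoriser:
    def __init__(self, vectorise_text, vectorise_image):
        self._encoders = {"text": vectorise_text, "image": vectorise_image}
        self._pending = defaultdict(list)  # modality -> [(key, content)]

    def request(self, key, content, modality):
        # No inference yet; just remember what needs encoding.
        self._pending[modality].append((key, content))

    def flush(self):
        # One batched inference call per modality, then map the
        # resulting vectors back to the keys that requested them.
        results = {}
        for modality, items in self._pending.items():
            keys, contents = zip(*items)
            vectors = self._encoders[modality](list(contents))
            results.update(zip(keys, vectors))
        self._pending.clear()
        return results
```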