We want to implement batch processing for the image classifier and thus increase throughput.
Decouple the web server from the AI model evaluator so that the web server can scale as request volume grows, or handle requests asynchronously.
Rewrite the image submission processing so that it waits a specified number of seconds to collect multiple requests into a queue shared across threads, then processes them together. Make sure it takes advantage of the model runtime's batch input submission capabilities.
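One possible shape for the batching step is sketched below in Python, using only the standard library. The names `MicroBatcher`, `predict_batch`, `max_batch_size`, and `max_wait_s` are hypothetical and not taken from any existing code; `predict_batch` stands in for whatever batch call the model runtime actually exposes. Web-server threads call `submit()` and get a `Future` back immediately, while a single worker thread waits up to `max_wait_s` seconds to gather requests and then evaluates them in one call.

```python
import queue
import threading
import time
from concurrent.futures import Future


class MicroBatcher:
    """Gathers submissions for up to max_wait_s seconds, then runs them as one batch."""

    def __init__(self, predict_batch, max_batch_size=8, max_wait_s=0.05):
        # predict_batch is an assumed callable: list of images -> list of results,
        # one result per image, in the same order.
        self._predict_batch = predict_batch
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._queue = queue.Queue()  # thread-safe; shared by server threads and the worker
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, image):
        """Called from web-server threads; returns a Future without blocking."""
        future = Future()
        self._queue.put((image, future))
        return future

    def _run(self):
        while True:
            # Block until the first request arrives, then start the wait window.
            batch = [self._queue.get()]
            deadline = time.monotonic() + self._max_wait_s
            while len(batch) < self._max_batch_size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self._queue.get(timeout=remaining))
                except queue.Empty:
                    break

            images = [image for image, _ in batch]
            try:
                results = self._predict_batch(images)
            except Exception as exc:
                # Report the failure to every caller waiting on this batch.
                for _, future in batch:
                    future.set_exception(exc)
            else:
                for (_, future), result in zip(batch, results):
                    future.set_result(result)


if __name__ == "__main__":
    # Stand-in classifier for illustration only; a real one would call the model runtime.
    def fake_classifier(images):
        return [f"class-of-{image}" for image in images]

    batcher = MicroBatcher(fake_classifier, max_batch_size=4, max_wait_s=0.1)
    futures = [batcher.submit(name) for name in ["a.png", "b.png", "c.png"]]
    print([f.result(timeout=5) for f in futures])
```

Because callers only hold a `Future`, the web server can either block on `result()` or hand the future to an async framework, which keeps it decoupled from the evaluator. The wait window trades a small amount of added latency for larger batches, so `max_wait_s` and `max_batch_size` would need tuning against real traffic.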