Closed: billpku closed this issue 4 years ago
@billpku This is exactly what service-streamer does:

> request A, request B, request C, they stack to a batch, send to GPU and predict with the help of 'service-streamer', then get the predict results and respond to each request.

Requests are automatically stacked into batches by service-streamer. You only need to provide the batch_predict function.
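A minimal sketch of that conversion (the model here is a hypothetical stand-in — the point is only the change in signature from one input to a list of inputs):

```python
# Hypothetical single-sample predict function -- stands in for your model.
def predict(text):
    return text.upper()

# service-streamer instead wants a function that takes a LIST of inputs
# and returns a LIST of outputs in the same order.  With a real model you
# would run one forward pass over the whole batch; the loop is a stand-in.
def batch_predict(texts):
    return [text.upper() for text in texts]
```

You then wrap it and query it with single-item lists inside your Flask view, e.g. `streamer = ThreadedStreamer(batch_predict, batch_size=64, max_latency=0.1)` and `streamer.predict([x])[0]`; the `batch_size`/`max_latency` arguments follow the README's example.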
Here is another example for you: https://github.com/ShannonAI/service-streamer/wiki/Vision-Recognition-Service-with-Flask-and-service-streamer. It shows how to change your predict function into a batch_predict function and then use service-streamer.
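To illustrate what happens under the hood, here is a toy, hand-rolled version of the batching mechanism described above — not service-streamer's actual code, just stdlib threads and a queue showing how concurrent single-item calls get stacked into one batch call and each caller gets back its own result:

```python
import threading
import queue

class MiniStreamer:
    """Toy illustration of the batching idea behind service-streamer:
    individual predict() calls are queued, a worker thread drains the
    queue into a batch, runs batch_predict once on the whole batch, and
    hands each caller its own result."""

    def __init__(self, batch_predict, batch_size=8, max_latency=0.05):
        self._batch_predict = batch_predict
        self._batch_size = batch_size
        self._max_latency = max_latency
        self._queue = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def predict(self, item):
        # Enqueue the item and block until the worker fills in the result.
        done = threading.Event()
        holder = {}
        self._queue.put((item, done, holder))
        done.wait()
        return holder["result"]

    def _worker(self):
        while True:
            # Block for the first item, then drain more items until the
            # batch is full or max_latency passes with the queue empty.
            tasks = [self._queue.get()]
            while len(tasks) < self._batch_size:
                try:
                    tasks.append(self._queue.get(timeout=self._max_latency))
                except queue.Empty:
                    break
            outputs = self._batch_predict([t[0] for t in tasks])
            # Scatter the batch outputs back to the waiting callers.
            for (_, done, holder), output in zip(tasks, outputs):
                holder["result"] = output
                done.set()

# Three concurrent "HTTP requests" A, B, C, each calling predict() once.
def batch_predict(batch):
    return [x * 2 for x in batch]

streamer = MiniStreamer(batch_predict)
results = {}

def handle(name, value):
    results[name] = streamer.predict(value)

threads = [threading.Thread(target=handle, args=(n, v))
           for n, v in [("A", 1), ("B", 2), ("C", 3)]]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Each request gets exactly its own result: A -> 2, B -> 4, C -> 6,
# even though batch_predict may have run on all three items at once.
```

service-streamer's real `Streamer`/`ThreadedStreamer` classes do this for you (across processes and GPUs in the `Streamer` case), so in a Flask app each request handler just calls `streamer.predict([input])[0]` and the batching is invisible.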
Hi,
Thank you for your awesome tools.
Following the README, I set up a Flask server with service-streamer. But I can only get one prediction result per request with the `Streamer` wrapper (which enables multi-GPU support, according to the project introduction). It seems that if I want to take full advantage of service-streamer, I need to batch the requests first and then predict with service-streamer.
So, are there any tips for batching the HTTP requests, sending them to the GPU for batch prediction, and then sending the right prediction result back to each request?
To make it clearer, here is an example: request A, request B, and request C are stacked into a batch, sent to the GPU, and predicted with the help of service-streamer; then each request receives its own prediction result. How can I get this done?
Thank you.