Open pribadihcr opened 6 years ago
I think you are asking whether the server can handle multiple concurrent requests. Assuming you are referring to the Go servers, yes, they can. A server generally handles two simultaneous requests faster than two sequential requests, although there is a limit that depends on the complexity of the model and the backend used. If the CPU or GPU is fully loaded, simultaneous requests could be slower than sending the requests sequentially. Also, keep in mind that it is almost always better to batch requests, especially for the GPU, so sending a single request with multiple rows is usually faster than sending multiple requests with a single row each.
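To make the difference between batching and concurrent single-row requests concrete, here is a rough client-side sketch. It is not this project's API: the `{"rows": [...]}` JSON shape and the names `build_batched_payload`, `build_single_payloads`, `send_all`, and `fake_send` are all made up for illustration, and `fake_send` just stands in for a real HTTP call so the snippet runs without a server.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Row = Sequence[float]


def build_batched_payload(rows: Sequence[Row]) -> str:
    """Pack all rows into ONE request body (hypothetical JSON schema)."""
    return json.dumps({"rows": [list(r) for r in rows]})


def build_single_payloads(rows: Sequence[Row]) -> List[str]:
    """Build one request body per row, for comparison with batching."""
    return [json.dumps({"rows": [list(r)]}) for r in rows]


def send_all(payloads: Sequence[str],
             send: Callable[[str], int],
             max_workers: int = 4) -> List[int]:
    """Send payloads concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, payloads))


def fake_send(payload: str) -> int:
    """Stand-in for an HTTP POST: returns how many rows the server would see."""
    return len(json.loads(payload)["rows"])


if __name__ == "__main__":
    rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    # One request carrying three rows (the batched case).
    print(send_all([build_batched_payload(rows)], fake_send))
    # Three concurrent requests carrying one row each.
    print(send_all(build_single_payloads(rows), fake_send))
```

With a real server you would replace `fake_send` with an actual request function and time both variants yourself; which one wins depends on the model, the backend, and how loaded the CPU or GPU already is.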
Hi, can this serving handle such a problem?