oracle / graphpipe

Machine Learning Model Deployment Made Simple
https://oracle.github.io/graphpipe

Handle multiple user requests at nearly the same time #7

Open pribadihcr opened 6 years ago

pribadihcr commented 6 years ago

Hi, can this server handle such a problem?

vishvananda commented 6 years ago

I think you are asking whether the server can handle multiple concurrent requests. Assuming you are referring to the Go servers, yes, they can. The server generally handles two simultaneous requests faster than two sequential ones, although there is a limit that depends on the complexity of the model and the backend used. If the CPU or GPU is fully loaded, simultaneous requests could be slower than sending the same requests sequentially. Also, keep in mind that it is almost always better to batch requests, especially on the GPU, so sending a single request with multiple rows is usually faster than sending multiple requests with a single row each.
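
To make the three options concrete, here is a minimal Python sketch of sequential, concurrent, and batched sending. It is only an illustration: `fake_send` is a hypothetical stand-in for a real client call (it takes a list of rows and returns one prediction per row), and the helper names are made up for this example. It does not talk to a server and says nothing about actual timings.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_send(rows):
    """Hypothetical stand-in for one request to the model server.

    Takes a list of rows and returns one "prediction" per row.
    Replace this with your real client call.
    """
    return [sum(row) for row in rows]


def send_sequential(send, rows):
    # One request per row; each request waits for the previous one.
    return [send([row])[0] for row in rows]


def send_concurrent(send, rows, workers=4):
    # One request per row, issued from several threads at once.
    # pool.map returns results in the same order as the input rows.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda row: send([row]), rows)
        return [result[0] for result in results]


def send_batched(send, rows):
    # A single request that carries every row.
    return send(rows)


rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sequential = send_sequential(fake_send, rows)
concurrent = send_concurrent(fake_send, rows)
batched = send_batched(fake_send, rows)
```

All three helpers return the same predictions in the same order; they differ only in how many requests reach the server and whether those requests overlap. With a real model, the batched form is the one to try first, and it is worth measuring all three against your own backend.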