Open patrickjane opened 3 years ago

First of all, thanks a ton for this awesome piece of software, much appreciated!

I would just like to ask a quick question: are the `classify` and `detect` functions reentrant? E.g. is it safe to call them from a webserver's request handlers, and thus potentially have multiple classifications running at the same time?

Note: I'm currently not using the docker image, just a plain python install with my own webserver wrapper.
You can have multiple classifications running in different threads/processes with no problem. BUT, the ideal way to efficiently run multiple images/predictions at the same time is batch inference. https://github.com/notAI-tech/fastDeploy (with which the docker images are built) is optimized for this.
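For illustration, here is a minimal sketch of fanning per-request classifications out across worker threads. The `NudeClassifier` class, its `classify(path)` method, and the file paths are assumptions based on the NudeNet-style API; adjust to whatever classifier object you actually use:

```python
# Minimal sketch: concurrent per-request classification.
# Assumes a NudeNet-style API (NudeClassifier().classify(path));
# class name, method and file paths are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor

from nudenet import NudeClassifier

classifier = NudeClassifier()  # load the model once, share it across threads

def classify_upload(path):
    # Each worker thread issues its own prediction; per the reply
    # above, running these concurrently is safe.
    return classifier.classify(path)

uploads = ["upload_1.jpg", "upload_2.jpg", "upload_3.jpg"]
with ThreadPoolExecutor(max_workers=4) as pool:
    for path, result in zip(uploads, pool.map(classify_upload, uploads)):
        print(path, result)
```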
Okay cool.
Batching is not an option, since I want to classify user uploads, so each upload must be classified on request.
fastDeploy does support micro-batching of user requests via /sync. If 2 requests come in at the same time, fastDeploy batch-predicts them and returns each response in the same HTTP call.
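As a rough sketch of what that looks like from the client side, the snippet below fires two requests at (nearly) the same time so fastDeploy can micro-batch them into one model call. The host, port, and the `{"data": [...]}` payload shape are assumptions; see the API docs linked below for the exact schema:

```python
# Sketch: two simultaneous /sync calls that fastDeploy can micro-batch.
# Host, port and payload shape are illustrative assumptions; consult the
# fastDeploy API docs for the exact request schema.
import threading
import requests

def call_sync(text):
    resp = requests.post("http://localhost:8080/sync", json={"data": [text]})
    print(resp.json())

# Both requests arrive together, get batched into a single prediction,
# and each HTTP call still receives its own response.
t1 = threading.Thread(target=call_sync, args=("first request",))
t2 = threading.Thread(target=call_sync, args=("second request",))
t1.start(); t2.start()
t1.join(); t2.join()
```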
For example, take a look at https://fastdeploy.notai.tech/docs/benchmarks.html#deepsegment_en
8192 requests run one by one take 79.06 seconds. If 128 concurrent users run the same 8192 requests (each user sending theirs one by one), it takes only 7.05 seconds, roughly a 10x performance increase on /sync because of batching.
Take a look at https://fastdeploy.notai.tech/api#request-type-file
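Since the uploads in question are images, a file-type request might look like the sketch below. The base64-in-`"data"` shape, host, port, and file name are assumptions; the linked docs describe the actual file request format:

```python
# Illustrative only: sending a user-uploaded image to /sync as a
# file-type request. The base64 encoding in "data", plus host, port and
# file name, are assumptions; see the linked API docs for the real schema.
import base64
import requests

with open("user_upload.jpg", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post("http://localhost:8080/sync", json={"data": [encoded]})
print(resp.json())
```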
Sounds great, will investigate.