Closed: dev-svk-flbs closed this issue 1 year ago.
Hi,
I would suggest using Celery to compute the long-running task in the background, with long polling in the frontend. The workflow:

1. The frontend sends the prediction request to the API.
2. The API creates a Celery task, which runs the inference in a background worker, and immediately returns the task id.
3. The frontend polls the API with the task id until the result is ready.
I have a feedback form running with https://deploymachinelearning.com and long-running tasks with Celery was the most frequently requested topic. I will write a course about this topic, but it will be paid. I'm planning to write it this year.
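A rough sketch of what that workflow could look like with Django REST Framework and Celery (not code from this repo; the `run_prediction` task, the `load_model()` helper, and the URL layout are assumptions, and a configured broker such as Redis or RabbitMQ is assumed):

```python
# tasks.py
from celery import shared_task

@shared_task
def run_prediction(input_data):
    # Heavy inference (~1 minute) runs in a Celery worker process,
    # so the web process handling HTTP requests is never blocked.
    model = load_model()  # assumed helper that loads your trained model
    return model.predict(input_data)


# views.py
from celery.result import AsyncResult
from rest_framework.views import APIView
from rest_framework.response import Response

from .tasks import run_prediction


class PredictView(APIView):
    def post(self, request):
        # Enqueue the task and return its id immediately (202 Accepted).
        task = run_prediction.delay(request.data)
        return Response({"task_id": task.id}, status=202)


class PredictResultView(APIView):
    def get(self, request, task_id):
        # The frontend polls this endpoint with the task id until the result is ready.
        result = AsyncResult(task_id)
        if result.ready():
            return Response({"status": "done", "prediction": result.get()})
        return Response({"status": "pending"})
```

The frontend then repeats the result request every few seconds (or holds it open as long polling) until `status` is `done`.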
Hi,
Thank you so much for making this repo available. I have a question about how to deploy this kind of API framework for ML inference that might take very long (say, 1 minute).
Typically, when the predict endpoint is hit by a request and the inference model is light, it runs fast. But in my case there is no way to reduce the inference time below 1 minute. The whole endpoint gets locked for that time and won't be able to accept any further inference requests. This is a very common situation with ML inferencing, so I am wondering whether there is something easy to handle it.
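To illustrate, here is a minimal stand-in for the kind of synchronous predict view I mean (`time.sleep(60)` is just a placeholder for the slow model call):

```python
import time

from rest_framework.views import APIView
from rest_framework.response import Response


class PredictView(APIView):
    def post(self, request):
        # Placeholder for model.predict(), which takes ~1 minute; while it runs,
        # this worker process cannot serve any other request.
        time.sleep(60)
        return Response({"prediction": 0.0})
```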
I did some reading on DRF async, but I'm not sure if that's the right approach.
Can you please provide some directions?