Closed — dimeldo closed this issue 6 years ago
Hi! For running the model in production you could use gunicorn - https://github.com/lukalabs/cakechat#gunicorn-http-server
You can set multiple workers for gunicorn to process multiple requests - http://docs.gunicorn.org/en/stable/settings.html#workers
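As a minimal sketch, gunicorn's settings can be put in a Python config file. The `(2 x cores) + 1` worker count is the starting point suggested in the gunicorn docs; the port, timeout, and filename here are just assumptions to adapt to your setup:

```python
# gunicorn_conf.py - example gunicorn configuration (a sketch, not the
# project's official config; adjust values to your deployment).
import multiprocessing

# Worker count suggested by the gunicorn docs as a starting point:
# (2 x number of CPU cores) + 1.
workers = multiprocessing.cpu_count() * 2 + 1

# Address and port to bind to (assumption - pick what fits your setup).
bind = "0.0.0.0:8080"

# Model inference can be slow, so a generous timeout helps avoid
# killed workers (the value here is an assumption).
timeout = 600
```

You would then start the server with something like `gunicorn -c gunicorn_conf.py cakechat_server:app` (the `cakechat_server:app` module path is an assumption; use the entry point from the repo's README).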
You can find more info about serving a Flask application in production here: https://www.digitalocean.com/community/tutorials/how-to-serve-flask-applications-with-gunicorn-and-nginx-on-ubuntu-18-04 and https://www.pyimagesearch.com/2018/02/05/deep-learning-production-keras-redis-flask-apache/
Great, thank you!
Hello there! Thanks for the valuable and well-rounded project.
I see that you're using a Flask app to serve model predictions via a simple REST API. Could you please write a guide on how to deploy a trained model in production so that it can handle multiple requests at once? And maybe even scale the number of machines with the number of requests? Including setting up the VMs, environment, etc.?
The community is really lacking guides like that, so this would be very helpful.