nlmatics / nlm-ingestor

This repo provides the server-side code that the llmsherpa API connects to. It includes parsers for various file formats.
https://www.nlmatics.com
Apache License 2.0

Recommendation for production server #3

Open jpbalarini opened 6 months ago

jpbalarini commented 6 months ago

The documentation says that the provided server is good for a development environment. Do you have any examples or suggestions on how to run this in a production environment?

Related to the previous question: I saw that the provided server processes requests one after another. We have to index a lot of documents and plan to do that in parallel. Do you have any recommendations for serving ~100 concurrent requests?

Thanks!

ansukla commented 6 months ago

The ingestor is very fast unless you are using OCR. You can create a Kubernetes cluster and run many instances of the ingestor to scale out.
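
A minimal sketch of that approach with kubectl, as an illustration only; the image name and container port below are assumptions, so check the repo's README for the published image and the port the container actually listens on:

    # Run several ingestor replicas (image name and port are assumptions)
    kubectl create deployment nlm-ingestor \
      --image=ghcr.io/nlmatics/nlm-ingestor:latest \
      --replicas=5 --port=5001

    # Put the replicas behind a single cluster-internal endpoint
    kubectl expose deployment nlm-ingestor --port=80 --target-port=5001

    # Scale out as the indexing backlog grows
    kubectl scale deployment nlm-ingestor --replicas=10

How many replicas you need for ~100 concurrent requests depends mostly on whether OCR is enabled, per the comment above.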

ianschmitz commented 5 months ago

More would still need to be done on the nlm-ingestor side though, @ansukla, correct? I ask because I see the following warning from Flask when starting the container:

WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.

My assumption was that the nlm-ingestor Docker image could be deployed to a k8s cluster, ECS, a VM, or whatever, but that the container itself would be in a "production" state; it would then be up to the consumer to figure out how to load balance, secure it if necessary, etc.

ansukla commented 5 months ago

Yes, the recommended approach is to deploy this behind nginx or a cloud gateway/firewall. If you have the bandwidth and would like to front this with nginx or something similar, I'd be happy to accept that PR (something I have been planning to do but haven't had the time for).

ianschmitz commented 5 months ago

A simpler alternative to fronting this with nginx (which is an awesome tech) is a plain gunicorn (https://gunicorn.org/#quickstart) or similar WSGI server, which is in Flask's list of recommended production servers (https://flask.palletsprojects.com/en/3.0.x/deploying/#self-hosted-options).

On the surface it seems like it would involve adding gunicorn as another dependency, and then running that in the ./run.sh file in production?
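
A rough sketch of that change, assuming the Flask app object is importable as app from nlm_ingestor.ingestion_daemon; the actual module path, app object name, and port should be checked against ./run.sh before using this:

    # Add gunicorn as a dependency (e.g. in requirements or the Dockerfile)
    pip install gunicorn

    # In run.sh, replace the Flask dev-server invocation with gunicorn.
    # --workers controls how many requests are handled in parallel; a generous
    # --timeout helps with large PDFs or OCR-heavy documents.
    gunicorn --bind 0.0.0.0:5001 --workers 4 --timeout 300 \
      "nlm_ingestor.ingestion_daemon:app"

Besides silencing the dev-server warning, the worker count also gives a per-instance answer to the earlier question about serving concurrent requests.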

ansukla commented 5 months ago

That’s a great idea - the gunicorn option should work.

jzribi3 commented 5 months ago

Has anyone already deployed nlm-ingestor on Azure? Briefly, which resources did you use? Thank you!

ianschmitz commented 5 months ago

@jzribi3 it's a Docker image, so any Azure service capable of deploying a container should be sufficient: https://azure.microsoft.com/en-ca/products/category/containers
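
As one illustration, a single instance on Azure Container Instances could look roughly like the following; the resource group, image name, port, and sizing are placeholders, and Container Apps or AKS would be the more natural fit if you need several replicas behind one endpoint:

    # Hypothetical single-instance deployment on Azure Container Instances
    az group create --name nlm-rg --location eastus

    az container create \
      --resource-group nlm-rg \
      --name nlm-ingestor \
      --image ghcr.io/nlmatics/nlm-ingestor:latest \
      --ports 5001 \
      --cpu 2 --memory 4 \
      --ip-address Public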