jpbalarini opened 6 months ago
Yes, I think a good idea would be to front the service with nginx or something similar. I believe @kiran-nlmatics faced this on our public server and solved it. It happens due to connection exhaustion in Flask. This will need some work and time to push to the repo, but in the meantime you can create your own installation setup with nginx and a gunicorn backend. Also adding @kiran-nlmatics to the thread.
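For anyone who wants to try this before it lands in the repo, here is a minimal sketch of fronting the service with nginx. The upstream port, the gunicorn module path, and all tuning values are assumptions based on a default nlm-ingestor setup, not tested or endorsed numbers:

```nginx
# /etc/nginx/conf.d/nlm-ingestor.conf -- hypothetical setup
upstream nlm_ingestor {
    # gunicorn backend, e.g. started with something like:
    #   gunicorn -w 4 -b 127.0.0.1:5001 --timeout 300 <your_wsgi_module>:app
    # (port and module path are assumptions; check your installation)
    server 127.0.0.1:5001;
    keepalive 32;          # reuse upstream connections instead of exhausting them
}

server {
    listen 80;
    client_max_body_size 100M;   # PDFs can be large
    proxy_read_timeout 300s;     # parsing can take a while; avoid premature resets

    location / {
        proxy_pass http://nlm_ingestor;
        proxy_http_version 1.1;
        proxy_set_header Connection "";   # required for upstream keepalive
    }
}
```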
I observed a similar issue with our Azure Marketplace offering earlier: randomly, connections from the client would drop while a PDF was being parsed, and the gunicorn backend would get stuck in a spawning loop because of how NAT-ing happens in those VMs. As @ansukla suggested, if you put multiple server instances behind a load balancer that can properly signal the backend (gunicorn or a similar server), the issue should not occur.
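If you do run gunicorn behind a load balancer, timeout and keepalive tuning matters for long PDF parses, since a short worker timeout makes gunicorn kill and respawn workers mid-request, which looks like a connection drop to the client. A hypothetical `gunicorn.conf.py` sketch (every value here is an assumption to tune for your hardware, not a recommendation from the maintainers):

```python
# gunicorn.conf.py -- hypothetical tuning sketch, values are assumptions
import multiprocessing

# The classic workers formula is only a starting point for CPU-bound parsing
workers = multiprocessing.cpu_count() * 2 + 1

# Long-running PDF parses: a short timeout makes gunicorn kill workers
# mid-request, which the client sees as a dropped connection
timeout = 300
graceful_timeout = 30

# Keep connections from the load balancer alive between requests
keepalive = 5
```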
In our FREE Server (a K8S cluster with an appropriate load balancer), we never faced this specific connection reset issue.
Thanks @kiran-nlmatics. Can you share the specs of that K8S cluster? Like CPUs, Memory that you were using for the free endpoint? It might be useful to our use case.
I'm running the nlm-ingestor on a pipeline where I'm processing thousands of documents in total (~100 in parallel). I created multiple nlm-ingestor services behind a load balancer to distribute the load. But even with a lot of services, I randomly get this error from llmsherpa:
And this is the stack trace:
So what is happening is that for some reason the nlm-ingestor service drops the connection (maybe there are too many?) and llmsherpa doesn't get a proper `response_json` with a `return_dict` value. Have you encountered this issue? Any idea on how to properly debug what could be happening? Thanks!
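Until the connection-exhaustion issue is fixed server-side, one client-side workaround is to retry dropped requests with backoff. A minimal sketch: the wrapper itself is generic stdlib code, and the llmsherpa call site at the bottom is only a hypothetical example of where you might apply it in a pipeline like the one described above:

```python
import time
import random

def call_with_retries(fn, retries=5, base_delay=1.0, max_delay=30.0):
    """Call fn(), retrying on any exception with jittered exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the original error
            # jitter the delay to avoid retry storms from ~100 parallel workers
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay * random.uniform(0.5, 1.0))

# Hypothetical call site -- adjust to your own pipeline:
# from llmsherpa.readers import LayoutPDFReader
# reader = LayoutPDFReader("http://<load-balancer>/api/parseDocument?renderFormat=all")
# doc = call_with_retries(lambda: reader.read_pdf("some.pdf"))
```

This doesn't fix the root cause, but it keeps a batch job from failing outright on a transient connection reset.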