Open matthieu-perso opened 2 years ago
Hello @MatthieuMoullecDev !
Thank you for the interest in Grobid and the issue.
You can use the Grobid Python client, which is very well tested and has been able to scale to 12M PDFs. If you do not manage server availability yourself (503 responses), you will certainly get these timeouts, but the Python client manages them for you.
The main adaptation to avoid timeouts is then in the server settings. You can have a look at the FAQ entry on the topic here. From your description, I think two important aspects are the amount of RAM and the number of threads. The thread settings in the client and in the Grobid server need to be aligned with the number of threads actually available on the server.
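For anyone not using the Python client, here is a minimal sketch of the general idea of handling 503 responses: wait and resend instead of treating "server busy" as a failure. This is not the client's actual implementation. The names `call_with_503_retry` and `grobid_fulltext`, the retry count, and the wait time are all made up for illustration, and the URL assumes a local Grobid server on its default port 8070.

```python
import time


def call_with_503_retry(send, max_retries=10, wait_seconds=5.0, sleep=time.sleep):
    """Call send() until it returns something other than HTTP 503.

    send is any zero-argument callable returning an object with a
    status_code attribute (for example a requests.Response).
    Hypothetical helper, not part of the Grobid Python client.
    """
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code != 503:
            return response
        # 503 means all server threads are busy: back off, then retry.
        if attempt < max_retries:
            sleep(wait_seconds)
    # Retries exhausted: hand the last 503 back to the caller.
    return response


def grobid_fulltext(pdf_path, base_url="http://localhost:8070"):
    """Send one PDF to the full-text endpoint, retrying on 503.

    Assumes the third-party requests package is installed and that
    base_url points at a running Grobid server.
    """
    import requests  # imported lazily so the helper above has no dependency

    def send():
        with open(pdf_path, "rb") as pdf:
            return requests.post(
                f"{base_url}/api/processFulltextDocument",
                files={"input": pdf},
                timeout=120,
            )

    return call_with_503_retry(send)
```

The file is reopened on every attempt so that each retry uploads the full PDF. A fixed wait is the simplest choice here; how long to wait and how many times to retry depend on how many threads the server really has.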
Hey Patrice,
Thanks for your quick and helpful reply!
I saw the Python client but was struggling with an error I managed to debug (write-up here). I will have a go with it.
Thanks for the link to the production FAQs, will follow these guidelines and go from there.
Configuration
Problem
Why would the service time out so fast? Are there any workarounds if I want all requests to be completed?
Code (for the local instance; the cloud one is identical except for the URL and token)