Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0

Max retries exceeded. Unstructured API is stuck. #3169

Closed Neel-132 closed 3 weeks ago

Neel-132 commented 3 weeks ago

Unstructured API is stuck and it just logs "Starting new HTTPS Connection..."

Screenshots: Screenshot 2024-06-10 183711 (image not included)

Environment Info
PyMuPDF==1.24.4
unstructured[all-docs]==0.14.0
nltk==3.8.1
pandas==2.2.2
llama-index==0.10.38
llama-parse==0.4.3
PyPDF2==3.0.1
fastapi==0.111.0
minio==7.2.7
uvicorn==0.29.0
psutil==5.9.8
boto3==1.34.109
botocore==1.34.109

The same error occurred some time back as well, but I fixed it by setting split_pdf_page = True. Right now it's stuck again.

awalker4 commented 3 weeks ago

Hi there - is this for the paid API, the free API, or a self-hosted deployment? We're experiencing some network issues on the paid API right now, and we're working to resolve them ASAP.

Neel-132 commented 3 weeks ago

Hey this is the paid API. Okay

awalker4 commented 3 weeks ago

Sounds like we're back! Let me know if this is working. We're very close to a new release of the API which improves the architecture, so we should have better stability going forward.

Neel-132 commented 3 weeks ago

Hey, it is working. Also, about this "max retries exceeded" error: is it a bug caused by the network issues, or is there a way to specify the number of retries?

awalker4 commented 3 weeks ago

Yep! Take a look at the retries documentation here. In your screenshot, the error is "no healthy upstream", which was due to our network issues just now. You can adjust the retry config to change how long the client tries before giving up.
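
For anyone wondering what "how long the client tries before giving up" means in practice, here is a minimal, library-independent sketch of exponential backoff with a total time budget. It is not the Unstructured client's own code or API: `BackoffPolicy`, `call_with_backoff`, and all field names and default values are hypothetical, chosen only for illustration. The real option names are in the retries documentation mentioned above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


@dataclass
class BackoffPolicy:
    """Hypothetical retry settings (illustrative names and defaults)."""
    initial_interval: float = 0.5    # seconds to wait before the first retry
    max_interval: float = 60.0       # upper bound on any single wait
    exponent: float = 1.5            # growth factor applied after each retry
    max_elapsed_time: float = 300.0  # total budget before giving up


def call_with_backoff(
    fn: Callable[[], T],
    policy: BackoffPolicy,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fn(), retrying on the given exceptions until the budget runs out."""
    start = clock()
    delay = policy.initial_interval
    while True:
        try:
            return fn()
        except retry_on:
            # Give up (re-raise) if waiting again would exceed the budget.
            if clock() - start + delay > policy.max_elapsed_time:
                raise
            sleep(delay)
            # Grow the wait, but never beyond max_interval.
            delay = min(delay * policy.exponent, policy.max_interval)
```

Raising `max_elapsed_time` makes a client like this ride out a longer outage; lowering it makes the call fail fast instead of appearing stuck.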