huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0
1.82k stars, 470 forks

Fix ReadTimeout when downloading file #2323

Closed: Wauplin closed this 2 weeks ago

Wauplin commented 3 weeks ago

Related to an internal Slack thread (private link). cc @amyeroberts

Some tests started to fail in transformers. Might be related to an infra update or simply to a higher load in the CI. Failures are usually due to a ReadTimeout when downloading a file. There is already a retry mechanism when streaming the data from the server but not when making the first call. This PR fixes this by adding the same retry mechanism on the first call as well.
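To illustrate the idea, here is a minimal sketch of retrying the first call with exponential backoff. This is not the code from this PR: `call_with_retry`, its parameters, and `flaky_first_call` are hypothetical names, and the built-in `TimeoutError` stands in for the HTTP client's read-timeout exception so the example runs without network access.

```python
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    *,
    max_retries: int = 5,
    base_wait_time: float = 1.0,
    max_wait_time: float = 8.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying with exponential backoff on the given exceptions.

    Hypothetical helper for illustration only; not part of huggingface_hub.
    """
    wait_time = base_wait_time
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            # Out of retries: re-raise the last timeout to the caller.
            if attempt == max_retries:
                raise
            sleep(wait_time)
            # Double the wait before the next attempt, capped at max_wait_time.
            wait_time = min(wait_time * 2, max_wait_time)
    raise AssertionError("unreachable")


# Usage: simulate a first call that times out twice, then succeeds.
attempts = {"count": 0}


def flaky_first_call() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("simulated read timeout")
    return "response headers"


waits: List[float] = []  # record the waits instead of actually sleeping
result = call_with_retry(flaky_first_call, sleep=waits.append)
print(result, waits)
```

In a real client, the wrapped callable would be the initial request that opens the download, and `retry_on` would list the client's own timeout exception types.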

Expected result: no more random failures in the CI.

(Note: the code is quite duplicated, but I did not find a way to factor it out nicely. So let's say it's fine and better for readability.)

HuggingFaceDocBuilderDev commented 3 weeks ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Wauplin commented 3 weeks ago

The infra/Hub team is looking into it (see Slack thread). It should now be fixed server-side. @amyeroberts can you let me know if the problem persists in the transformers CI? In the meantime, I'll convert this PR to draft and drop it if the fix is confirmed.

amyeroberts commented 2 weeks ago

@Wauplin It seems that we can close this. I haven't seen any read timeouts in the 1.5 days since the infra team deployed the fix. Thanks for responding to this so quickly!

Wauplin commented 2 weeks ago

Great news! Thanks for following up on this. I'm closing it, but feel free to reach out if it ever happens again :)