huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0
2.03k stars 531 forks source link

`check_disk_space` fails when the target directory does not exist #1690

Closed carmocca closed 1 year ago

carmocca commented 1 year ago

Describe the bug

See: https://github.com/llm-efficiency-challenge/neurips_llm_efficiency_challenge/issues/26, https://github.com/Lightning-AI/lit-gpt/issues/537 and https://github.com/llm-efficiency-challenge/neurips_llm_efficiency_challenge/issues/26

This was implemented in the latest release, particularly: https://github.com/huggingface/huggingface_hub/pull/1590/files#diff-765507f4602931b08aa20d0177611d5da7d5c7805882b0a79c81b5ed72c9472aR975

cc: @martinbrose as the PR author

Reproduction

Sorry, I haven't created a repro script, but it should be quite simple to reproduce. If you have trouble with it, I could try it myself.

Logs

See linked issues

System info

https://github.com/huggingface/huggingface_hub/releases/tag/v0.17.0
martinbrose commented 1 year ago

Thanks @carmocca for raising this. I'll have a look in the next couple of days.

martinbrose commented 1 year ago

@Wauplin, my PR might solve this.

Not sure if it introduces additional problems with blob_path for example?

Wauplin commented 1 year ago

Thanks for reporting it @carmocca ! Looks like a quite critical issue. We will fix it and make a hot-fix release for it. Thanks @martinbrose for having looked into it. I've commented your PR. I think we should aim for the most fault-tolerant implementation as possible as this warning is meant to be a quality of life improvement, so not a critical component. :)

Wauplin commented 1 year ago

This issue has been fixed by @martinbrose's PR in https://github.com/huggingface/huggingface_hub/pull/1692. The PR is now merged and I've made a hot-fix release for it: releases/tag/v0.17.3. check_disk_space is now far more robust and will ignore any error (preference for not breaking the download).