Closed OlivierKessler01 closed 4 months ago
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@Wauplin just ran Ruff, sorry 'bout this.
Hi @Wauplin thank you for the contribution. I just added the documentation block, have a look :)
And, merged!
Implement a new
repo-files
command to manipulate files (deletion...)Fixes https://github.com/huggingface/huggingface_hub/issues/2235
Opiniated choice made regarding globs matching
Crux
I've encoutered an issue when implementing the folder matching as per this issue specification :
To implement the pattern lexer I reused
huggingface_hub.hf_api._prepare_folder_deletions()
, which is already used in theupload
command to lexe/tokenize allow and disallow lists.The function is a wrapper around
fnmatch.fnmatch()
(viahuggingface_hub.utils._paths.filter_repo_objects()
). It matches files against unix shell-style file matching patterns.Unix shell-style automatons respond to this case in an rather unconvenient way :
Solution implemented
It adds "*" at the end of the patterns that end with "/". Given git untracks any folder that is empty, deleting all files in the folder effectively deletes the folder. Did it in the new
huggingface_hub.hf_api.delete_files_r()
function, because updatinghuggingface_hub.utils._paths.filter_repo_objects()
at a lower-level would alter the behavior of various functions like theupload
command and the commit scheduler in a non-BC compatible way.What do you think about it ?
All this has been documented in the function doc.
What could have been done instead
Remove support for this
huggingface-cli repo-files <repo_id> delete folder/
Natively support this through already existing logic, and let Git untrack the folder.
huggingface-cli repo-files <repo_id> delete folder/*
Tests added (duration that might disrupt DevX and project output)
Remarks
Feel free to tell me if you'd like something different here :)