HR / github-clone

:octocat: ⬇️ ⠀git clone repo subdirectories
https://git.io/ghclone
Apache License 2.0

Issue with cloning subdirectories which contain large number of files #9

Open Sobiru opened 2 years ago

Sobiru commented 2 years ago

GitHub truncates the listing for subdirectories that contain many files, such as https://github.com/iamcal/emoji-data/tree/master/img-apple-64 . As you can see, the page says: "Sorry, we had to truncate this directory to 1,000 files. 2,524 entries were omitted from the list."

When you ghclone such a subdirectory, it clones only the first 1,000 files rather than all of them. Is there a workaround?

Thank you.

HR commented 2 years ago

Thanks for reporting this @Sobiru

According to the GitHub API docs, the repo directory contents API returns at most 1,000 files per directory. For larger directories, the docs recommend using the Git Trees API instead.

Here's the request for that: https://api.github.com/repos/iamcal/emoji-data/git/trees/d779b828b2ee049e35f69e2b6edf71173504f364 . It returns all of the files, but it takes at least one more request to find the tree_sha for the given path/directory (i.e. img-apple-64).
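
To make the two-step lookup concrete, here is a rough Python sketch of one way it could work; it is not ghclone's actual code. The helper names (`make_github_fetcher`, `find_tree_sha`, `list_files`) are made up for illustration. It assumes the Git Trees endpoint accepts a branch name as well as a SHA for the root tree, and that each entry in the response's `tree` array carries `path`, `type` and `sha` fields.

```python
import json
import urllib.request

API = "https://api.github.com"


def make_github_fetcher(owner, repo):
    """Return a function that fetches a single (non-recursive) tree by SHA or ref."""
    def fetch_tree(tree_sha):
        url = f"{API}/repos/{owner}/{repo}/git/trees/{tree_sha}"
        req = urllib.request.Request(
            url, headers={"Accept": "application/vnd.github+json"}
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return fetch_tree


def find_tree_sha(fetch_tree, ref, path):
    """Walk `path` one component at a time and return the SHA of its tree.

    Costs one request per path component; this is the extra lookup
    needed before the directory's own tree can be requested.
    """
    sha = ref
    for part in path.strip("/").split("/"):
        if not part:
            continue
        entries = fetch_tree(sha)["tree"]
        match = next(
            (e for e in entries if e["path"] == part and e["type"] == "tree"),
            None,
        )
        if match is None:
            raise FileNotFoundError(f"no directory {part!r} under tree {sha}")
        sha = match["sha"]
    return sha


def list_files(fetch_tree, ref, path):
    """Return the names of all files directly inside `path`."""
    tree = fetch_tree(find_tree_sha(fetch_tree, ref, path))
    if tree.get("truncated"):
        # The API sets this flag when it could not return every entry.
        raise RuntimeError("tree listing was truncated by the API")
    return [e["path"] for e in tree["tree"] if e["type"] == "blob"]
```

Against the real API this would be called as `list_files(make_github_fetcher("iamcal", "emoji-data"), "master", "img-apple-64")`. Unauthenticated requests are rate limited, so a real implementation would probably want to send a token. Passing the fetcher in as an argument keeps the path-walking logic testable without network access.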

I'll look into implementing this when I have time, but you're welcome to add it yourself and open a PR! 😄