Naunter / BT_BlockLists

Transmission block list
The Unlicense

archive in repo #7

Closed whoizit closed 2 years ago

whoizit commented 2 years ago

Your repository already weighs 4 GB. You should not commit the binary archive to the repository; there is a Releases section for that. This repository does it that way: https://github.com/sayomelu/transmission-blocklist

Naunter commented 2 years ago

No problem, I will fix it

antonagestam commented 2 years ago

@whoizit The repository you linked to also commits the archive; it's just not on the main branch. Cloning that repository will pull those objects too unless you use --single-branch.
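For illustration, here is a minimal sketch of a single-branch clone and a way to check how much object data ended up on disk. The helper names clone_single_branch and object_store_kib are hypothetical (they are not git commands); the underlying calls are git clone --single-branch --branch and git count-objects -v.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# clone_single_branch URL BRANCH DEST
# Clone only BRANCH, so objects reachable solely from other branches
# are not requested from the remote.
clone_single_branch() {
    git clone --quiet --single-branch --branch "$2" "$1" "$3"
}

# object_store_kib REPO_DIR
# Print the on-disk size (KiB) of loose plus packed objects,
# as reported by `git count-objects -v`.
object_store_kib() {
    git -C "$1" count-objects -v |
        awk '$1 == "size:" || $1 == "size-pack:" { total += $2 } END { print total + 0 }'
}

# Example (repository URL taken from this thread; needs network access):
# clone_single_branch https://github.com/Naunter/BT_BlockLists.git master BT_BlockLists
# object_store_kib BT_BlockLists
```

The on-disk figure is only an approximation of the transfer size that git clone prints, so treat it as a rough comparison between clones rather than an exact number.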

whoizit commented 2 years ago

Your repo is still oversized (image). Their repo is 7.28 MiB (image). I cloned it to test, and this is absolutely right.

Naunter commented 2 years ago

@whoizit That is because when you git clone the repo, you download all of my branches: not only my master branch but also the old main branch. The main branch was discarded, but it still contains all the old files and history.

Also, as @antonagestam said, using a release branch only moves the files to another branch, and a full clone still downloads that whole branch, just like my main branch. The way to solve it is to upload the binary files to Releases (remember, this is not a branch; it is a feature called GitHub Releases. See here for details: https://docs.github.com/en/repositories/releasing-projects-on-github/about-releases)

Also, if you really need to keep files in the repo, remember to clean the binary files and other large files out of the commit history. Otherwise they will take up a lot of space, as they do in my main branch.

Search for git filter-branch for more details on how to remove large files from a repo's history.
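As a rough sketch of that cleanup, the snippet below first lists the largest blobs anywhere in history and then rewrites all branches to drop one path. The function names largest_blobs and purge_path are hypothetical; the git calls they wrap (rev-list --objects, cat-file --batch-check, filter-branch --index-filter) are standard. It assumes the path contains no single quotes and that you run it inside a clean working tree.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only. Run inside the repository.

# largest_blobs [N]
# Print "SIZE PATH" for the N largest blobs reachable from any ref
# (default 10), largest first. SIZE is the uncompressed size in bytes.
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        awk '$1 == "blob" { sub(/^[^ ]+ [^ ]+ /, ""); print }' |
        sort -rn |
        head -n "${1:-10}"
}

# purge_path PATH
# Rewrite every branch so that PATH never existed, dropping commits
# that become empty. This changes commit hashes.
purge_path() {
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm --cached --ignore-unmatch --quiet -- '$1'" \
        --prune-empty -- --all
}

# Example:
# largest_blobs 5
# purge_path path/to/archive.zip   # hypothetical path
```

Because the rewrite changes commit hashes, the rewritten branches have to be force-pushed, and the old objects stay in the local repository (under refs/original and the reflog) until those are removed and garbage-collected.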

Naunter commented 2 years ago

I don't know where you got these nice analysis screenshots; I can't find that feature on GitHub. The results I get are less than 1 GB.

The master branch size: image

The git push size: image


whoizit commented 2 years ago

https://github.com/Shywim/github-repo-size

whoizit commented 2 years ago

~/tmp > git clone https://github.com/Naunter/BT_BlockLists
Cloning into 'BT_BlockLists'...
remote: Enumerating objects: 1018, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 1018 (delta 1), reused 8 (delta 1), pack-reused 1009
Receiving objects: 100% (1018/1018), 1.89 GiB | 4.64 MiB/s, done.
Resolving deltas: 100% (345/345), done.

~/tmp > git clone https://github.com/sayomelu/transmission-blocklist
Cloning into 'transmission-blocklist'...
remote: Enumerating objects: 15, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 15 (delta 0), reused 1 (delta 0), pack-reused 12
Receiving objects: 100% (15/15), 7.28 MiB | 2.89 MiB/s, done.
Resolving deltas: 100% (1/1), done.

Naunter commented 2 years ago

As I said above, you are checking the size of all my branches under the BT_BlockLists repository, not just the master branch.

bstivers commented 2 years ago

I fail to see the problem...

This is the caveat of using multiple branches. I do quite a bit of data science work that ends up storing a bunch of CSVs a whole lot larger than 600,000 lines. I've learned to use the --single-branch flag, among other tricks such as Git Large File Storage (LFS), for data versioning.

This is a non-issue IMHO.

git clone https://github.com/Naunter/BT_BlockLists.git -b master --single-branch
Cloning into 'BT_BlockLists'...
remote: Enumerating objects: 12, done.
remote: Total 12 (delta 0), reused 0 (delta 0), pack-reused 12
Unpacking objects: 100% (12/12), 15.55 MiB | 1.37 MiB/s, done.

Naunter commented 2 years ago

Thanks @bstivers, that is a simple and clear explanation.

Git has so many tricks and commands to discover; I'm still just a git noob.

I took a few screenshots for comparison:

This is the main branch, which is the cause of the large size. It is 1.87 GiB. image

This is the master branch, the up-to-date branch we are using. It is only 15.55 MiB. image

This is the full repo, which contains both the main and master branches. It is 1.89 GiB. image
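The figures add up: 15.55 MiB is about 0.015 GiB, and 1.87 GiB + 0.015 GiB is about 1.89 GiB. A similar per-branch breakdown can be approximated locally without screenshots. The sketch below sums the on-disk size of every object reachable from one ref; reachable_bytes is a hypothetical helper name, and the result only approximates the transfer size that git clone reports.

```shell
#!/bin/sh
# Hypothetical helper for illustration only. Run inside a full clone.

# reachable_bytes REF
# Sum the on-disk size (bytes) of all commits, trees and blobs
# reachable from REF.
reachable_bytes() {
    git rev-list --objects "$1" |
        cut -d' ' -f1 |
        git cat-file --batch-check='%(objectsize:disk)' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Example (branch names from this thread):
# reachable_bytes origin/main
# reachable_bytes origin/master
```

Objects shared between two branches are counted once per branch here, so the per-branch numbers can add up to more than the size of the whole repository.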

I think this issue is solved, so I will close it.

After this, I will rename the main branch to "old-branch".
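One way to do that rename from the command line is sketched below. rename_remote_branch is a hypothetical helper name; it assumes a local branch with the old name exists, that the remote is called origin unless stated otherwise, and that the old branch is not the remote's default branch.

```shell
#!/bin/sh
# Hypothetical helper for illustration only. Run inside the repository.

# rename_remote_branch OLD NEW [REMOTE]
# Rename the local branch, publish it under the new name, then
# delete the old name on the remote.
rename_remote_branch() {
    remote="${3:-origin}"
    git branch -m "$1" "$2" &&
        git push --quiet "$remote" "$2" &&
        git push --quiet "$remote" --delete "$1"
}

# Example (branch names from this thread):
# rename_remote_branch main old-branch
```

A rename on its own does not make a full clone smaller, because the large objects are still reachable from the renamed branch; only --single-branch clones or a history cleanup avoid downloading them.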

@whoizit, if you or anyone else still has questions, feel free to leave a comment or open a new issue.