Open chiukapoor opened 6 months ago
Yes, we are aware of this issue. The size has grown over the years. From time to time we have removed unwanted/unused files. Here is the output of git count-objects:
git count-objects -vH
count: 272
size: 22.38 MiB
in-pack: 8981
packs: 2
size-pack: 1.44 GiB
prune-packable: 0
garbage: 0
size-garbage: 0 bytes
This number is substantially smaller than 6G.
I also looked at the vendor folders in platform-operator and helm-pod. They are about 40 MB each. So they are also not contributing a whole lot to the size.
Maybe the branches are counting towards the size.
I am running 'git gc' now. Let's see if that helps.
I think it is the branches. In .git/objects/pack, there is a .pack file which is 6.6G in size. It holds all the history of the repository over the years.
We have 130+ branches. The only active branches right now are "develop" and "master". My workflow is to work on develop and then do a PR to master.
So, Option 1: We can delete all the other branches. This can reduce the repo size.
Option 2: Find all the files that are no longer in the master branch and then purge them from other branches. This involves work:
Figure out files that are not present on the master https://stackoverflow.com/questions/28284890/in-git-how-can-i-list-all-files-that-exist-in-branch-a-that-do-not-exist-in-bra
Purge the files https://stackoverflow.com/questions/11050265/remove-large-pack-file-created-by-git
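The first step of Option 2 could be sketched as follows. This is only an illustration: the function name files_missing_from_master is made up here, and it assumes the branches exist locally under the names given.

```shell
# List paths that exist on branch $1 but not on the base branch
# (second argument, defaulting to master).
files_missing_from_master() {
  branch=$1
  base=${2:-master}
  # "git diff BASE BRANCH" reports a path with status A (added) when it is
  # present in BRANCH and absent from BASE. --no-renames keeps renamed files
  # from being reported as renames instead of additions.
  git diff --no-renames --name-only --diff-filter=A "$base" "$branch"
}
```

For example, `files_missing_from_master develop` prints one path per line. Running it for each of the stale branches would give the candidate list for purging.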
Option 3: Punt this issue for later with the acknowledgment that we will have to fix this eventually. In the documentation, explicitly mention shallow cloning the repository. With shallow clone, the size of the repository is 518M.
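For the documentation, a shallow clone could be shown roughly like this. The function name and its arguments are placeholders; substitute the repository's actual clone URL.

```shell
# Clone only the most recent commit instead of the full history.
shallow_clone() {
  repo_url=$1   # placeholder: the repository's clone URL
  target_dir=$2 # directory to clone into
  # --depth 1 truncates history to the tip commit; it also implies
  # --single-branch, so only the default branch is fetched.
  git clone --depth 1 "$repo_url" "$target_dir"
}
```

If the full history is needed later, `git fetch --unshallow` inside the clone retrieves the rest.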
Any other option:?
Option 1 is simplest. Ideally, it would be great to know how much of a delta deleting a particular branch will achieve. Maybe I can try to delete a branch, then re-clone, and see how much it reduces the size of the cloned repo.
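A rough estimate is possible without deleting and re-cloning. The sketch below sums the blobs reachable only from one local branch. The function name is made up for illustration. It reports uncompressed blob sizes, so the real saving in the pack will differ, and it ignores tags and remote-tracking refs, which can also keep objects alive.

```shell
# Sum the uncompressed size of blobs reachable from branch $1
# and from no other local branch.
branch_unique_blob_bytes() {
  branch=$1
  # Every other local branch, prefixed with ^ so rev-list excludes
  # whatever those branches can reach.
  others=$(git for-each-ref --format='^%(refname)' refs/heads |
    grep -vxF "^refs/heads/$branch" || true)
  # $others is intentionally unquoted: one argument per excluded ref.
  git rev-list --objects "refs/heads/$branch" $others |
    cut -d' ' -f1 |
    git cat-file --batch-check='%(objecttype) %(objectsize)' |
    awk '$1 == "blob" { sum += $2 } END { print sum + 0 }'
}
```

Calling `branch_unique_blob_bytes some-old-branch` prints a byte count. A branch that prints 0 holds no file contents of its own, so deleting it would not shrink the pack.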
Option 3 does not rock the boat right now.
Thoughts?
I believe we should go with Option 1, keeping only the active branches develop and master. We can make the master branch protected so that everyone needs to commit to develop first and later raise a PR for master.
This will also keep our master branch stable, while the develop branch carries changes that may be breaking.
Along with this, as I have suggested on Slack, we may move independent modules such as operator-analysis out of this repo so that they can have their own independent releases and development cycles.
@chiukapoor I have cleaned up the branches. There are now only 12 branches remaining (including master and develop). The 10 remaining branches (except master and develop) cannot be deleted without further careful analysis. We can do that later.
The size of the repository with the above curl command still shows 6.6G. The .pack file in the .git folder is the cause of the size. Can you look into ways to reduce the size of this file?
Upon researching Git packing, I discovered that the .pack file encompasses both the objects and history of a Git repository.
To identify large files in the Git history, I utilized the following script found on Stack Overflow:
git rev-list --objects --all | \
git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | \
sed -n 's/^blob //p' | \
grep -vF --file=<(git ls-tree -r HEAD | awk '{print $3}') | \
awk '$2 >= 20*2^20' | \
sort --numeric-sort --key=2 | \
cut -c 1-12,41- | \
awk '{ sum += $2; print } END { printf "Total size: %.2f GB\n", sum/(2^30) }'
This script detects blob objects (which represent file contents) larger than 20 MiB across the entire Git history, excluding those currently in the HEAD. It then sorts and presents these files alongside their sizes, concluding with the total size of all the identified large files in the repository's history.
Here are the identified large files:
6cc959a84865 23MiB operator-discovery-helper/operator-discovery-helper
c29d3304c15d 24MiB operator-discovery-helper/operator-discovery-helper
1727a8ac456d 31MiB mutating-webhook-helper/mutating-webhook-helper
9e0c44142215 31MiB mutating-webhook-helper/mutating-webhook-helper
0ab2439526ea 31MiB mutating-webhook-helper/mutating-webhook-helper
519a93cd8f5a 32MiB platform-operator/helm-pod/helm-pod
d7171c80b080 32MiB platform-operator/helm-pod/helm-pod
d96d24be37c9 32MiB platform-operator/helm-pod/helm-pod
4dd2a62aff94 32MiB platform-operator/helm-pod/helm-pod
53286600b665 32MiB platform-operator/helm-pod/helm-pod
83df361707c2 32MiB platform-operator/helm-pod/helm-pod
e8d4c3b52a2f 32MiB platform-operator/helm-pod/helm-pod
1a6b53a50b59 32MiB platform-operator/helm-pod/helm-pod
e419a1e6c69f 32MiB platform-operator/helm-pod/helm-pod
3198ebd4d6d0 32MiB platform-operator/helm-pod/helm-pod
fca54fb09b63 32MiB platform-operator/helm-pod/helm-pod
285fb0983cdf 32MiB platform-operator/helm-pod/helm-pod
20f6b63cdc3c 32MiB platform-operator/helm-pod/helm-pod
cfe95149943e 32MiB platform-operator/helm-pod/helm-pod
27474c111651 32MiB platform-operator/helm-pod/helm-pod
d98771e16a20 32MiB platform-operator/helm-pod/helm-pod
4e5f8e52cf8e 32MiB platform-operator/helm-pod/helm-pod
1b5baf82e863 32MiB platform-operator/helm-pod/helm-pod
0fa8a4b16c1e 32MiB platform-operator/helm-pod/helm-pod
ec8cb44104c9 34MiB platform-operator/platform-operator
504a75825f40 34MiB platform-operator/artifacts/deployment/platform-operator
ba91c6b5f158 34MiB mutating-webhook-helper/mutating-webhook-helper
264a5d68ca65 34MiB platform-operator/platform-operator
2e41fcd8ffe4 34MiB platform-operator/platform-operator
b9a282fa6b46 34MiB platform-operator/platform-operator
ee0a8bbfaf2f 34MiB platform-operator/platform-operator
d02ca5030ea7 38MiB platform-operator/artifacts/deployment/platform-operator-april13
11fa89099370 38MiB deploy/kubectl
60f8ef967e94 39MiB operator-manager/artifacts/deployment/operator-manager
b1c07bd0da9d 39MiB mutating-webhook-helper/mutating-webhook-helper
e0dc2eec405d 39MiB platform-operator/artifacts/deployment/platform-operator
123d05935d23 40MiB platform-operator/helm-pod/helm
9cc647276bc1 40MiB platform-operator/helm-pod/helm
05553b898c78 40MiB platform-operator/artifacts/deployment/platform-operator
0e940ba669d7 41MiB platform-operator/helm-pod/kubectl
e8b37151032c 42MiB kubeplus-kubectl-plugins.tar.gz
9a04045298b4 42MiB kubeplus-kubectl-plugins.tar.gz
3351f0df45e5 42MiB kubeplus-kubectl-plugins.tar.gz
8d6f531437cb 42MiB kubeplus-kubectl-plugins.tar.gz
f5149c996804 42MiB kubeplus-kubectl-plugins.tar.gz
e9bb8ecae3e7 42MiB kubeplus-kubectl-plugins.tar.gz
615a761f73c5 42MiB kubeplus-kubectl-plugins.tar.gz
36ab3a933cd0 42MiB kubeplus-kubectl-plugins.tar.gz
d67dee5d1bdb 42MiB kubeplus-kubectl-plugins.tar.gz
1d91164ed3aa 42MiB kubeplus-kubectl-plugins.tar.gz
88f51fbba870 42MiB kubeplus-kubectl-plugins.tar.gz
79645595d6f7 43MiB kubeplus-kubectl-plugins.tar.gz
83001bf330a1 43MiB kubeplus-kubectl-plugins-latest.tar.gz
c02c3feaeb79 43MiB kubeplus-kubectl-plugins:latest.tar.gz
9610c5f64b2a 43MiB kubeplus-kubectl-plugins:latest.tar.gz
2a2d17bd49b8 43MiB kubeplus-kubectl-plugins-latest.tar.gz
2a2296d01486 43MiB kubeplus-kubectl-plugins.tar.gz
ca9ebf4f31d2 43MiB kubeplus-kubectl-plugins.tar.gz
ce6c2d07f604 43MiB kubeplus-kubectl-plugins-latest.tar.gz
24e829c98595 43MiB kubeplus-kubectl-plugins.tar.gz
8d1588a18332 43MiB kubeplus-kubectl-plugins.tar.gz
c365acdad501 43MiB kubeplus-kubectl-plugins.tar.gz
aa3c6a52a679 43MiB kubeplus-kubectl-plugins.tar.gz
8d0fd33348cd 43MiB kubeplus-kubectl-plugins-latest.tar.gz
7e12204ef71b 43MiB kubeplus-kubectl-plugins.tar.gz
c52b062dc6f7 43MiB kubeplus-kubectl-plugins-latest.tar.gz
8d0a1f70a0ee 43MiB kubeplus-kubectl-plugins-latest.tar.gz
094c577af392 43MiB kubeplus-kubectl-plugins-1.0.4.tar.gz
45fe1c0ebdaf 43MiB kubeplus-kubectl-plugins-1.0.3.tar.gz
458343d1d2ea 43MiB kubeplus-kubectl-plugins-latest.tar.gz
920ddf1b2f6e 43MiB kubeplus-kubectl-plugins-latest.tar.gz
3802ef6e3e63 43MiB kubeplus-kubectl-plugins-latest.tar.gz
ef8127701a0b 43MiB kubeplus-kubectl-plugins-latest.tar.gz
7eaaa2dfcaeb 43MiB kubeplus-kubectl-plugins.tar.gz
288471760cb5 43MiB kubeplus-kubectl-plugins.tar.gz
00ae97ed2bc2 43MiB kubeplus-kubectl-plugins.tar.gz
51dbc3d85ef1 43MiB kubeplus-kubectl-plugins.tar.gz
19e2c22af684 43MiB kubeplus-kubectl-plugins.tar.gz
4e408030e0e5 43MiB kubeplus-kubectl-plugins.tar.gz
51530a192a56 43MiB kubeplus-kubectl-plugins.tar.gz
c541036a7b74 43MiB kubeplus-kubectl-plugins.tar.gz
d279c88b08e0 43MiB kubeplus-kubectl-plugins.tar.gz
6e9d8f09c568 43MiB kubeplus-kubectl-plugins.tar.gz
7b8f3cf2d21b 43MiB kubeplus-kubectl-plugins.tar.gz
e8aaf4174282 43MiB kubeplus-kubectl-plugins.tar.gz
90c9430fc48f 43MiB kubeplus-kubectl-plugins.tar.gz
52529f9b1dcd 43MiB kubeplus-kubectl-plugins.tar.gz
c26ee398448a 43MiB kubeplus-kubectl-plugins.tar.gz
91bac79332c9 43MiB kubeplus-kubectl-plugins.tar.gz
fab110955c03 43MiB kubeplus-kubectl-plugins.tar.gz
9da488f6283b 43MiB kubeplus-kubectl-plugins.tar.gz
c1d46eb577f1 43MiB kubeplus-kubectl-plugins.tar.gz
724a4a4b6908 43MiB kubeplus-kubectl-plugins.tar.gz
fee22d952629 43MiB kubeplus-kubectl-plugins.tar.gz
bc93c331e907 43MiB kubeplus-kubectl-plugins.tar.gz
6b3c68e17943 43MiB kubeplus-kubectl-plugins.tar.gz
8de011f7db4e 43MiB kubeplus-kubectl-plugins.tar.gz
f86d617fd294 43MiB kubeplus-kubectl-plugins.tar.gz
f87526e06940 43MiB kubeplus-kubectl-plugins.tar.gz
9a6b019364d3 43MiB kubeplus-kubectl-plugins.tar.gz
c269be008951 43MiB kubeplus-kubectl-plugins.tar.gz
1507a72e3c33 43MiB kubeplus-kubectl-plugins.tar.gz
b7bea14a2edd 43MiB kubeplus-kubectl-plugins.tar.gz
bec01d901703 43MiB kubeplus-kubectl-plugins.tar.gz
0687db132456 43MiB kubeplus-kubectl-plugins.tar.gz
a65055f67793 43MiB kubeplus-kubectl-plugins.tar.gz
324fd4b57624 43MiB kubeplus-kubectl-plugins.tar.gz
c9b97ec83bba 43MiB kubeplus-kubectl-plugins.tar.gz
29be035a8756 43MiB kubeplus-kubectl-plugins.tar.gz
f4a342d2bb9a 43MiB kubeplus-kubectl-plugins.tar.gz
c23a7bc4f0b8 43MiB kubeplus-kubectl-plugins.tar.gz
98422aa8582d 43MiB kubeplus-kubectl-plugins.tar.gz
14d66637f425 43MiB kubeplus-kubectl-plugins.tar.gz
00d42daa755e 43MiB kubeplus-kubectl-plugins.tar.gz
b0325ff9cde0 43MiB kubeplus-kubectl-plugins.tar.gz
f23279a2b41e 43MiB kubeplus-kubectl-plugins.tar.gz
8fd838ff657e 43MiB kubeplus-kubectl-plugins.tar.gz
1a52874ed110 43MiB kubeplus-kubectl-plugins.tar.gz
d0ff22ce5bf9 43MiB kubeplus-kubectl-plugins.tar.gz
657bcd16613e 43MiB kubeplus-kubectl-plugins.tar.gz
0f5070b1803d 43MiB kubeplus-kubectl-plugins.tar.gz
2866cc9fdfd1 43MiB kubeplus-kubectl-plugins.tar.gz
b6ea2ae7ff33 43MiB kubeplus-kubectl-plugins.tar.gz
c4ca62a7caf3 43MiB kubeplus-kubectl-plugins.tar.gz
d4a9590ff3ab 44MiB platform-operator/helm-pod/helm-pod
f41354015c4d 44MiB platform-operator/helm-pod/helm-pod
ff838549a309 46MiB platform-operator/helm-pod/helm-pod
4a7a7f108001 46MiB platform-operator/helm-pod/helm-pod
5a683d172125 46MiB platform-operator/helm-pod/helm-pod
2e3cd5fa4589 46MiB platform-operator/helm-pod/helm-pod
21cffd7ac942 46MiB platform-operator/helm-pod/helm-pod
e799f05e656c 46MiB platform-operator/helm-pod/helm-pod
840ac9e13d66 46MiB plugins/kubediscovery-linux
07fffa8f7f92 46MiB plugins/kubediscovery-linux
dc06b09c883b 46MiB plugins/kubediscovery-linux
c865295defc5 46MiB plugins/kubediscovery-linux
cadf9c1a3eea 46MiB plugins/kubediscovery-linux
fcc470ff901b 48MiB platform-operator/artifacts/deployment/platform-operator
142ea25a47e4 48MiB deploy/helm
3c030a07d685 49MiB kubeplus-kubectl-plugins-1.0.0.tar.gz
4a78acbb93e4 49MiB kubeplus-kubectl-plugins-1.0.0.tar.gz
e36ea2b943cd 49MiB kubeplus-kubectl-plugins-1.0.0.tar.gz
1a8a285709bd 49MiB kubeplus-kubectl-plugins-1.0.0.tar.gz
8a5c0e5377a5 49MiB kubeplus-kubectl-plugins-1.0.0.tar.gz
818043e7858b 49MiB kubeplus-kubectl-plugins-1.0.2.tar.gz
9e5da1a47fdc 49MiB kubeplus-kubectl-plugins-1.0.0.tar.gz
1e99509de801 49MiB kubeplus-kubectl-plugins-1.0.1.tar.gz
3cd2bb94262a 49MiB kubeplus-kubectl-plugins-1.0.0.tar.gz
32f4ed9a6702 52MiB plugins/kubediscovery-macos
4aa98b45d6f2 52MiB plugins/kubediscovery-macos
92a90f3e9890 52MiB plugins/kubediscovery-macos
fc9a4637555e 52MiB plugins/kubediscovery-macos
e1c1c590b0aa 52MiB plugins/kubediscovery-macos
f42f9e45c5c1 52MiB plugins/kubediscovery-macos
2f968ce58d4c 52MiB plugins/kubediscovery-macos
a7552dc214b7 52MiB plugins/kubediscovery-macos
2edc8400ea3f 52MiB plugins/kubediscovery-macos
916aee293d23 52MiB plugins/kubediscovery-macos
bda4807b9296 52MiB plugins/kubediscovery-macos
2c091c1c3467 52MiB plugins/kubediscovery-macos
9af3216b3b8f 52MiB plugins/kubediscovery-macos
df3ba3066b9c 52MiB plugins/kubediscovery-macos
400ce62b304f 52MiB plugins/kubediscovery-macos
501bd028bf1f 52MiB plugins/kubediscovery-macos
6d064b83ee70 52MiB plugins/kubediscovery-macos
7dd447a08514 52MiB plugins/kubediscovery-macos
94e6e404f21d 52MiB plugins/kubediscovery-macos
2a0148ccded1 52MiB plugins/kubediscovery-macos
427b622953fb 52MiB plugins/kubediscovery-macos
a97d3bab4ef4 52MiB plugins/kubediscovery-macos
ce82c7ab107a 52MiB plugins/kubediscovery-macos
0dff922718dd 52MiB plugins/kubediscovery-macos
7abf91977364 52MiB plugins/kubediscovery-macos
bb751c1e454f 52MiB plugins/kubediscovery-macos
14b589ddd8c8 52MiB plugins/kubediscovery-macos
154b79f3b9e6 52MiB plugins/kubediscovery-macos
160bef440451 52MiB plugins/kubediscovery-macos
ee2c34cbf13f 52MiB plugins/kubediscovery-macos
3ab9733f4f0a 52MiB plugins/kubediscovery-macos
af01e219cfb5 52MiB plugins/kubediscovery-macos
808acb829512 52MiB plugins/kubediscovery-macos
7f6c0c68e82e 53MiB plugins/kubediscovery-linux
dceae80f154a 53MiB plugins/kubediscovery-linux
075b6a7eddfb 53MiB plugins/kubediscovery-linux
6bfcf37b8aac 53MiB plugins/kubediscovery-linux
864d55c4d895 53MiB plugins/kubediscovery-linux
56492e80f906 53MiB plugins/kubediscovery-linux
b7156bf52c10 53MiB plugins/kubediscovery-linux
4f3411a47197 53MiB plugins/kubediscovery-linux
032198a48b71 53MiB plugins/kubediscovery-linux
17cef560d252 53MiB plugins/kubediscovery-linux
55327469b936 53MiB plugins/kubediscovery-linux
c4cc3c64f6c4 53MiB plugins/kubediscovery-linux
b65a3750cd14 53MiB plugins/kubediscovery-linux
56f71541c9cf 53MiB plugins/kubediscovery-linux
15fa0e81f12d 53MiB plugins/kubediscovery-linux
ee3422023df1 53MiB plugins/kubediscovery-linux
9733c5bcb0a7 53MiB plugins/kubediscovery-linux
03302e21fdd7 53MiB plugins/kubediscovery-linux
3a7f7ec33ae5 53MiB plugins/kubediscovery-linux
6700b886ee78 53MiB plugins/kubediscovery-linux
e3add604ebf0 53MiB plugins/kubediscovery-linux
3f2955de655e 53MiB plugins/kubediscovery-linux
ba72f32c7520 53MiB plugins/kubediscovery-linux
858552299f49 53MiB plugins/kubediscovery-linux
22583aef0554 53MiB plugins/kubediscovery-linux
f1fbffdc1f1c 53MiB plugins/kubediscovery-linux
a924826f21c6 53MiB plugins/kubediscovery-linux
01e524733bf7 53MiB plugins/kubediscovery-linux
7bec1d24796f 53MiB plugins/kubediscovery-linux
044a3e5d2a05 53MiB plugins/kubediscovery-linux
35103b334abc 53MiB plugins/kubediscovery-linux
589b59ead533 53MiB plugins/kubediscovery-linux
7c46f3fe5191 53MiB plugins/kubediscovery-linux
31e89dccd03c 53MiB kubeplus-saas-manager-control-center.tar.gz
1374f5e51ea9 53MiB platform-operator/platform-operator
a75c3dc42be1 53MiB platform-operator/artifacts/deployment/platform-operator
e538a26d2135 53MiB kubeplus-kubectl-plugins.tar.gz
0580caf232a3 53MiB kubeplus-kubectl-plugins.tar.gz
a2a519ce93ce 53MiB kubeplus-kubectl-plugins.tar.gz
0b94db198af8 53MiB kubeplus-kubectl-plugins.tar.gz
9dfaecda3f0f 54MiB platform-operator/artifacts/deployment/platform-operator
c7a607202b2e 54MiB kubeplus-kubectl-plugins.tar.gz
7d0f8ca5b076 57MiB operator-deployer/artifacts/deployment/operator-deployer
9c7557c3efde 60MiB plugins/kubediscovery-macos
ec12c03b6f43 60MiB plugins/kubediscovery-macos
7790521df6e2 60MiB plugins/kubediscovery-macos
44b6f05f08d0 60MiB plugins/kubediscovery-macos
331e52927f7e 60MiB plugins/kubediscovery-macos
27dfff50acc7 60MiB plugins/kubediscovery-macos
51325de9a645 60MiB plugins/kubediscovery-macos
10a32b460c5e 60MiB plugins/kubediscovery-macos
3c1037693132 60MiB plugins/kubediscovery-macos
4112834a36d3 60MiB plugins/kubediscovery-macos
87b520927c2c 60MiB plugins/kubediscovery-macos
b7b09d906aa3 60MiB plugins/kubediscovery-macos
f848a1661011 60MiB plugins/kubediscovery-macos
12b684726ec7 60MiB plugins/kubediscovery-macos
09b4f8bef003 61MiB plugins/kubediscovery-linux
465e8875ce9a 61MiB plugins/kubediscovery-linux
9343f0ef3a5f 61MiB plugins/kubediscovery-linux
46445693320a 61MiB plugins/kubediscovery-linux
f740ec6e5417 61MiB plugins/kubediscovery-linux
5e3534f358b4 61MiB plugins/kubediscovery-linux
76e8a9df775d 61MiB plugins/kubediscovery-linux
13451705dc2c 61MiB plugins/kubediscovery-linux
1cd15dc10b22 61MiB plugins/kubediscovery-linux
f4beb877f893 61MiB plugins/kubediscovery-linux
380eac8ccd04 61MiB plugins/kubediscovery-linux
c6a08cf7d19e 61MiB plugins/kubediscovery-linux
eb4fe34d38a8 61MiB plugins/kubediscovery-linux
842bdeb6684b 61MiB plugins/kubediscovery-linux
013c419d1c1f 62MiB kubeplus-kubectl-plugins.tar.gz
3d85f5affb3a 62MiB kubeplus-kubectl-plugins.tar.gz
854a9ea40711 64MiB kubeplus-kubectl-plugins.tar.gz
06ea3a0b774d 64MiB kubeplus-kubectl-plugins.tar.gz
654eefc253c5 64MiB kubeplus-kubectl-plugins.tar.gz
6f3b47f1e4e4 65MiB kubeplus-kubectl-plugins.tar.gz
589ec8fb7189 65MiB kubeplus-kubectl-plugins.tar.gz
76732838cee0 65MiB kubeplus-kubectl-plugins.tar.gz
To address this issue, the outdated and unwanted objects such as old binaries will be removed using BFG, as suggested on Stack Overflow. (PS: I have tested this locally and the .pack file size is down to less than 300 MiB.)
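The exact BFG invocation used for the local test is not recorded in this thread. A typical run, sketched here under assumptions, looks like the following. The jar path, the mirror clone directory and the function name are all placeholders, and the 20M default only mirrors the cutoff used in the script above.

```shell
# Sketch only: arguments are placeholders, not paths from this thread.
purge_large_blobs() {
  bfg_jar=$1      # path to a downloaded BFG jar (hypothetical location)
  mirror_dir=$2   # bare clone made with: git clone --mirror <repo-url> <dir>
  limit=${3:-20M} # blobs larger than this are stripped from history

  [ -f "$bfg_jar" ] || { echo "bfg jar not found: $bfg_jar" >&2; return 1; }
  [ -d "$mirror_dir" ] || { echo "mirror clone not found: $mirror_dir" >&2; return 1; }

  # Rewrite history, dropping blobs above the limit. By default BFG does
  # not touch the contents of the latest commit.
  java -jar "$bfg_jar" --strip-blobs-bigger-than "$limit" "$mirror_dir" || return 1

  # The stripped blobs stay in the pack until unreachable objects are pruned.
  git -C "$mirror_dir" reflog expire --expire=now --all &&
    git -C "$mirror_dir" gc --prune=now --aggressive
}
```

After a rewrite like this, the result has to be pushed back with rewritten refs, and existing clones need to be re-cloned rather than pulled, since the old history would otherwise be reintroduced.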
To prevent this issue in the future, it's important to refrain from uploading binary files to the git repository. Instead, GitHub's release and tags feature and automated CI can be utilized. https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository
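One way to enforce this in automated CI is a size check on tracked files. This is a minimal sketch, not an existing check in this repository: the function name and the idea of failing the build on any output are assumptions.

```shell
# Print "<size-in-bytes> <path>" for every file in HEAD larger than $1 bytes.
large_files_in_head() {
  limit_bytes=$1
  # "git ls-tree -r -l" prints "<mode> <type> <sha> <size>" then a tab and the path.
  git ls-tree -r -l HEAD |
    awk -F'\t' -v limit="$limit_bytes" '{
      n = split($1, meta, " ")          # meta[n] is the size column
      if (meta[n] + 0 > limit) print meta[n], $2
    }'
}
```

A CI step could run `large_files_in_head 20971520` (20 MiB) and fail when the output is non-empty, which would catch a committed binary before it reaches master.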
@chiukapoor Great work! I will go through these steps in the next couple of days.
Looking at the files, we do want to keep the following files in the repo:
We can remove the rest of the files. For the above two files, whenever there is an update, we can follow the practice of deleting the current version and then adding the new version. That way, there will always be a single version of these files present in the repo.
Also, it looks like there is another approach to removing files from history: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository
I will experiment with both (BFG cleaner and git filter-repo) in the coming days.