Open FedeDP opened 1 year ago
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
/remove-lifecycle stale
So, after we removed the kernels branch, the git clone --mirror performed by the hook plugin is now at ~75MB:
git clone --mirror https://github.com/falcosecurity/kernel-crawler
Cloning into bare repository 'kernel-crawler.git'...
remote: Enumerating objects: 2361, done.
remote: Counting objects: 100% (571/571), done.
remote: Compressing objects: 100% (275/275), done.
remote: Total 2361 (delta 294), reused 375 (delta 229), pack-reused 1790
Receiving objects: 100% (2361/2361), 75.29 MiB | 5.02 MiB/s, done.
Resolving deltas: 100% (1383/1383), done.
Unfortunately, this is not enough for it to work.
In a kernel-crawler fork, I tried rewriting the entire main history to remove the big JSON files (early in the repo's life they were pushed to the main branch), and the repo size is now ~2MB:
git clone --mirror https://github.com/fededp/kernel-crawler
Cloning into bare repository 'kernel-crawler.git'...
remote: Enumerating objects: 1875, done.
remote: Counting objects: 100% (517/517), done.
remote: Compressing objects: 100% (357/357), done.
remote: Total 1875 (delta 193), reused 476 (delta 159), pack-reused 1358
Receiving objects: 100% (1875/1875), 2.10 MiB | 3.06 MiB/s, done.
Resolving deltas: 100% (1058/1058), done.
So, I think we'll need to force push the main branch. /cc @leogr @maxgio92 @LucaGuerra
I will need the help of an admin to do so.
Using git, it will be done like so:
git filter-branch -f --index-filter "git rm -rf --cached --ignore-unmatch kernels/" HEAD
git push origin HEAD --force-with-lease
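To see what a history rewrite like the one above should target, it can help to rank the blobs reachable from any ref by size. The following is only a minimal sketch, not the procedure used in this issue: top_blobs is a hypothetical helper name, and it assumes paths contain no spaces.

```shell
# top_blobs (hypothetical helper): read "<type> <sha> <size> <path>" lines on
# stdin and print the N largest blobs as "<size> <path>", biggest first.
# Assumption: paths contain no whitespace.
top_blobs() {
  n="${1:-10}"
  awk '$1 == "blob" { print $3, $4 }' | sort -k1,1nr | head -n "$n"
}

# Usage inside a clone (mirror or regular):
#   git rev-list --objects --all |
#     git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
#     top_blobs 5
```

Because git rev-list --all walks every ref in the clone, running this in a mirror clone also covers blobs that are only reachable from refs outside refs/heads and refs/tags.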
So, the work was done; see for example this main commit: https://github.com/falcosecurity/kernel-crawler/commit/4438279c57e9931129a16e3b124275fe62ef443f
But the end result is a bit different than expected, with the mirror clone weighing in at ~45MB:
git clone --mirror https://github.com/falcosecurity/kernel-crawler
Cloning into bare repository 'kernel-crawler.git'...
remote: Enumerating objects: 2684, done.
remote: Counting objects: 100% (2684/2684), done.
remote: Compressing objects: 100% (830/830), done.
remote: Total 2684 (delta 1718), reused 2543 (delta 1671), pack-reused 0
Receiving objects: 100% (2684/2684), 45.87 MiB | 5.06 MiB/s, done.
Resolving deltas: 100% (1718/1718), done.
Investigating.
For reference, a git clone --mirror of libs accounts for ~35MB.
There are no tags or stale branches in kernel-crawler with big JSON files; I don't really get this.
So, it seems JSON files are still referenced somewhere (of course):
git rev-list --objects --all | grep json
177bdb29d7cb55dd4505fcad7a0007bfe61856d0 kernels/aarch64/list.json
965ad4147108c5029395e898ccf461ba041c005b kernels/x86_64/list.json
7b28341c4ac53ab817104075cd62af1b2f871ce0 kernels/aarch64/list.json
7cc73e95ff22468866340e0506d64b228a53f696 kernels/x86_64/list.json
7ada52e6b02702dae8e1204b4e45ebde01514c93 kernels/aarch64/list.json
9de80130013a3020e23ead46f53a5df954d7eebe kernels/x86_64/list.json
9fe2739e52952181a2cbc054266038c0db658ae6 kernels/aarch64/list.json
bc7c078463e2516c741ab3e8a281e4a539972017 kernels/x86_64/list.json
c4fc6ccbdaf699db51513cb7cfbd53c2eac3395f kernels/aarch64/list.json
03e2a731e31f7cbbb61160a09003c76c09556c37 kernels/x86_64/list.json
0cc64ddcb7091c05cc921436c56a3db355b57027 kernels/aarch64/list.json
e16424d453886c6537521b23cc9505aa467115cb kernels/x86_64/list.json
14de7d6c9a1fbb633ed261e72ba7df0b0d00131a kernels/aarch64/list.json
941fa5ff86ac237aac7a4d9862e47f794bc6aaaf kernels/x86_64/list.json
36a173cd61841f05f1b1f98f8006219ffe20678a kernels/aarch64/list.json
97b58cc0aaf6e5c2a359d41fed9d6ddf96722f12 kernels/x86_64/list.json
79ba50511cae9941f6ec7292523603f4fb2ede93 kernels/aarch64/list.json
d7ef54205c135ca8608a631d532771b76034b24e kernels/x86_64/list.json
a972a51e83371e047ef3d3557ccae586659453b5 kernels/aarch64/list.json
614b68865cd93c50abb357d5be412fc10e54fb17 kernels/x86_64/list.json
388dbcaf19db2bc1b45d124d82ee5295d989979d kernels/aarch64/list.json
47c06f73737b0958972257903cee5af8585055b1 kernels/x86_64/list.json
2055de08b7c0a34d68f71d0c89dde61a5c7c339d kernels/aarch64/list.json
c1eac473010235d8648b058e1610ee791ad99919 kernels/x86_64/list.json
51ddcc04923a4cc6de5ff0b9bbed481f46969afb kernels/aarch64/list.json
fd4af955206c2d04cd62199fab8c23f026d284f6 kernels/x86_64/list.json
These are not commits though; for example, this shows nothing:
git log -1 --decorate 2055de08b7c0a34d68f71d0c89dde61a5c7c339d
EDIT:
LC_ALL="C" git for-each-ref --contains 2055de08b7c0a34d68f71d0c89dde61a5c7c339d
error: object 2055de08b7c0a34d68f71d0c89dde61a5c7c339d is a blob, not a commit
error: no such commit 2055de08b7c0a34d68f71d0c89dde61a5c7c339d
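git for-each-ref --contains only accepts commits, which is why the blob ID is rejected above. One possible way around that, sketched here under the assumption of a Git version that supports git log --find-object, is to first find the commits that add or remove the blob and then ask which refs contain those commits. refs_reaching_blob is a hypothetical helper name.

```shell
# refs_reaching_blob (hypothetical helper): print every ref whose history
# contains a commit that adds or removes the given blob.
# Assumption: git log supports --find-object; run from inside the clone.
refs_reaching_blob() {
  blob="$1"
  # --find-object keeps only commits where the number of occurrences of the
  # blob changes; --all walks every ref in the clone, not only branches.
  git log --all --format='%H' --find-object="$blob" |
    while read -r commit; do
      git for-each-ref --format='%(refname)' --contains "$commit"
    done | sort -u
}

# Usage:
#   refs_reaching_blob 2055de08b7c0a34d68f71d0c89dde61a5c7c339d
```

In a mirror clone the output would include any refs/pull/* entries that still reach the blob, since those refs are fetched along with branches and tags.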
So, it seems these might be related to the PRs that pushed JSON files to main: exactly 13 PRs with 2 JSON files each, so 26 entries (exactly the number seen above).
I think that we correctly cleaned up main, but we couldn't clean up GitHub's hidden PR refs (and we can't force push them either, afaik), so their blobs are still there.
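The 13 x 2 = 26 arithmetic can be checked against the git rev-list output by counting entries per path. This is a minimal sketch; json_blob_counts is a hypothetical helper name, and it assumes the "<sha> <path>" layout that git rev-list --objects prints, with no spaces in paths.

```shell
# json_blob_counts (hypothetical helper): read "<sha> <path>" lines on stdin
# and print "<count> <path>" for every path ending in .json.
json_blob_counts() {
  awk '$2 ~ /\.json$/ { n[$2]++ } END { for (p in n) print n[p], p }' | sort -k2
}

# Usage inside the mirror clone:
#   git rev-list --objects --all | json_blob_counts
```

If the hypothesis holds, each of the two list.json paths should show up 13 times.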
Using https://rtyley.github.io/bfg-repo-cleaner/ works, i.e. git rev-list does not show any JSON file anymore, but then we cannot push (not even force push) because it gives us:
git push -f
Enumerating objects: 387, done.
Counting objects: 100% (188/188), done.
Writing objects: 100% (387/387), 35.72 KiB | 35.72 MiB/s, done.
Total 387 (delta 188), reused 188 (delta 188), pack-reused 199
remote: Resolving deltas: 100% (322/322), completed with 167 local objects.
To github.com:falcosecurity/kernel-crawler.git
! [remote rejected] refs/pull/1/head -> refs/pull/1/head (deny updating a hidden ref)
! [remote rejected] refs/pull/10/head -> refs/pull/10/head (deny updating a hidden ref)
...
for each opened PR.
FYI, I think we can do nothing about synthetic references as they're read-only: https://github.com/rtyley/bfg-repo-cleaner/issues/36#issuecomment-37877829
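Even though the hidden refs cannot be updated, they are visible from the client side: git ls-remote lists refs/pull/* alongside branches and tags, and a --mirror clone fetches all of them. As a rough sketch, the refs can be tallied per namespace to see how many PR refs the mirror pulls in. ref_namespace_counts is a hypothetical helper name.

```shell
# ref_namespace_counts (hypothetical helper): read git ls-remote output
# ("<sha><TAB><refname>") on stdin and print "<namespace> <count>" for each
# second-level namespace under refs/ (heads, tags, pull, ...).
ref_namespace_counts() {
  awk '$2 ~ /^refs\// { split($2, parts, "/"); n[parts[2]]++ }
       END { for (k in n) print k, n[k] }' | sort
}

# Usage:
#   git ls-remote https://github.com/falcosecurity/kernel-crawler | ref_namespace_counts
```

Lines such as HEAD that do not start with refs/ are skipped, so only the refs a mirror clone would copy are counted.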
Describe the bug
I have to manually add the approved label to kernel-crawler PRs.
How to reproduce it
Approve any PR on kernel-crawler repo :)
Expected behaviour
The approved label should be automatically added.
With @maxgio92 we discovered that there is some issue with the git client:
I think it is because the kernel-crawler repo hosts the huge json lists for the kernels for supported architectures and we need to somehow tweak the git client config.