Open m3dwards opened 10 months ago
cc @okjodom @sr-gi, I think it's worthwhile doing a once off cleanup?
I do agree. It's not worth having an unnecessarily big repo because of files that were pushed by accident
+1 on cleanup
What do you think of an interactive rebase to drop PRs #9 and #62?
That goes over my head git-wise, but I'll be ok with doing so if possible
having a go at it
I just experimented with this on a fresh clone of the repo
An interactive rebase to remove commit `72c4f11`, then `b87a0ae` .. `1a75d06`, followed by a further rewrite to remove the associated blobs, was my starting step.
`git rebase --interactive 4086f94` to drop `b87a0ae` .. `1a75d06` and `72c4f11`
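For anyone less familiar with rebasing, here is a minimal sketch of dropping one contiguous range of commits without the interactive editor, using `git rebase --onto`. It runs in a throwaway repository, and the commit names (`keep1`, `bad1`, `bad2`, `keep2`) and the `main` branch name are hypothetical stand-ins, not the real hashes above. It covers a single range only; the interactive rebase above drops two separate sets.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the sketch never touches a real checkout.
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/main   # assumed branch name for the demo
git config user.name "Demo"
git config user.email "demo@example.com"

# Linear history: keep1, bad1, bad2, keep2 (each commit adds one file).
for name in keep1 bad1 bad2 keep2; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "$name"
done

# Hypothetical stand-ins for the real hashes in this thread.
last_good=$(git rev-parse HEAD~3)   # commit just before the unwanted range
last_bad=$(git rev-parse HEAD~1)    # last commit of the unwanted range

# Replay everything after last_bad onto last_good, dropping bad1 and bad2.
git rebase -q --onto "$last_good" "$last_bad" main

# Show the surviving commit subjects, newest first.
git log --format=%s
```

The rebase only rewrites which commits the branch points at. The dropped blobs stay in the object store until they are pruned, which is why a further rewrite or cleanup step is still needed.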
For blob cleanup, git-filter-repo from the Stack Overflow thread worked effectively. According to the SO discussion, this tool provides the same capabilities as `git filter-branch`
python3 git-filter-repo --invert-paths --path-match activity-generator --force
python3 git-filter-repo --invert-paths --path-match js --force
This results in blob set 2.blobs.after.txt
whereas before, the list of blobs was 2.blobs.before.txt
I used the original rev-list command to list blobs in repo
git rev-list --objects --all |
git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
sed -n 's/^blob //p' |
sort --numeric-sort --key=2 > file.txt
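To sanity-check the listing before running it on the real repo, here is a small sketch that runs the same pipeline in a throwaway repository with the output redirected to a file. The file names (`small.txt`, `big.bin`, `blobs.txt`) are made up for the demo, with `big.bin` standing in for an accidentally committed build.

```shell
#!/bin/sh
set -eu

# Throwaway repository with hypothetical file names.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# One small file and one larger file standing in for an accidental build.
printf 'small\n' > small.txt
head -c 100000 /dev/zero > big.bin
git add small.txt big.bin
git commit -q -m "add files"

# List every blob as: hash, size in bytes, path; sorted by size ascending.
git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |
    sort --numeric-sort --key=2 > blobs.txt

# The largest blob is the last line of the ascending sort.
tail -n 1 blobs.txt
```

Because the sort is ascending, `tail -n 10 blobs.txt` would show the ten largest blobs in a real repository.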
From here, I'm not sure how we'd push this revised history to upstream and get forks and existing clones to receive the same.
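One possible approach, sketched below under assumptions: force-push the rewritten branch with `git push --force-with-lease`, then have each existing clone fetch and hard-reset onto it. The sketch uses a local bare repository as a stand-in for upstream, and the branch name `main` and directory names are hypothetical. It assumes the branch is not protected against force pushes and that contributors have no local work they need to keep on the old history.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Bare repository standing in for upstream (hypothetical local path).
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/main

# Maintainer clone: two commits pushed before the rewrite.
git clone -q upstream.git maintainer 2>/dev/null
cd maintainer
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
echo code > code.txt;   git add code.txt;  git commit -q -m "code"
echo build > build.bin; git add build.bin; git commit -q -m "accidental build"
git push -q origin main
cd ..

# A contributor clone taken before the rewrite.
git clone -q upstream.git contributor

# Maintainer rewrites history (here: drops the last commit) and force-pushes.
cd maintainer
git reset -q --hard HEAD~1
git push -q --force-with-lease origin main
cd ..

# Contributor discards the old history and adopts the rewritten branch.
cd contributor
git fetch -q origin
git reset -q --hard origin/main
git log --format=%s
```

Forks are separate repositories, so they would not pick up the rewrite automatically; each fork owner would need to repeat the fetch and reset (and force-push their fork) themselves. Open branches based on the old history would also need rebasing onto the new one.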
I noticed while creating branches that it was taking a while, and after a quick look it seems that's because the repo is 247 MB.
I ran the following commands to list the largest blobs and it looks like some builds were accidentally committed early on:
The following is a SO comment and post that discusses techniques for removing blobs from history: https://stackoverflow.com/questions/2100907/how-to-remove-delete-a-large-file-from-commit-history-in-the-git-repository/61602985#61602985
As these files appear to have been committed and pushed in error, I would support their removal from the history.