Open rtyley opened 11 years ago
I just ran into this while trying out bfg on Semantic UI
It's a fiddly problem - you can update all the 'real' refs in your repo, but all the ones beginning 'refs/pull' are synthetic read-only refs created by GitHub - you can't update (and therefore 'clean') them if they're from outside your repository (like https://github.com/jlukic/Semantic-UI/pull/183, for instance).
So, if you're pushing your updated refs up to your repository, all the non-pull-request refs are accepted and fixed, but the Pull Request ref updates will be rejected.
I'm really not sure if there's any way the BFG can make this experience nicer... As an open question, can you think of something you would like to happen?
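One possible way to sidestep the rejected updates described above is to push only branches and tags rather than every ref. The following is a minimal sketch, not something the BFG does for you; the function name `push_heads_and_tags` and its arguments are hypothetical, and it assumes the cleaned repository lives in a local directory and that force-pushing rewritten history is what you intend.

```shell
#!/bin/sh
# Sketch (assumption: not part of the BFG): force-push only branches and tags,
# leaving GitHub's read-only refs/pull/* refs out of the push entirely.
# Usage: push_heads_and_tags <local-repo-dir> <remote-name-or-url>
push_heads_and_tags() {
    repo=$1
    remote=$2
    # Explicit refspecs replace the catch-all behaviour of 'git push --mirror',
    # so no update is ever attempted for refs/pull/*.
    git -C "$repo" push --quiet --force "$remote" \
        'refs/heads/*:refs/heads/*' \
        'refs/tags/*:refs/tags/*'
}
```

For example, `push_heads_and_tags my-repo.git origin` would update every branch and tag on `origin` while never touching the pull request refs, so nothing is rejected. The old history those pull request refs point at remains on GitHub.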
I think I might have done something terribly destructive.
I ran and pushed to remote with --strip-biggest-blobs 500 instead of --strip-blobs-bigger-than 1M
And now my previous commit history is littered with missing files https://github.com/jlukic/Semantic-UI/tree/9c2d248a1db821560aba68446d92eeef12087e3e/build/packaged/javascript
discussion on semantic issues https://github.com/jlukic/Semantic-UI/issues/220
I assumed the refs didn't push because the pull request refs were rejected. I wish that git gave better notice here.
I ended up cloning without pull refs using the guide above http://christoph.ruegg.name/blog/git-howto-mirror-a-github-repository-without-pull-refs.html
I think just pointing the issue out in the docs would be enough for other users, or maybe a separate walkthrough on how to clone while excluding pull refs.
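A walkthrough along those lines might look like the sketch below. It builds a bare repository whose fetch refspecs cover only branches and tags, so `refs/pull/*` is never fetched in the first place. The function name `mirror_without_pull_refs` is hypothetical, and this is one way of doing it under those assumptions rather than an official BFG or GitHub procedure.

```shell
#!/bin/sh
# Sketch: create a bare mirror holding branches and tags but no refs/pull/*.
# Usage: mirror_without_pull_refs <source-url-or-path> <target-dir>
mirror_without_pull_refs() {
    src=$1
    dest=$2
    git init --quiet --bare "$dest" || return 1
    git -C "$dest" remote add origin "$src" || return 1
    # Replace the default refspec with explicit branch and tag refspecs,
    # instead of the '+refs/*:refs/*' that 'git clone --mirror' configures.
    git -C "$dest" config remote.origin.fetch '+refs/heads/*:refs/heads/*' || return 1
    git -C "$dest" config --add remote.origin.fetch '+refs/tags/*:refs/tags/*' || return 1
    # Make a plain 'git push' behave like 'git push --mirror' for this remote.
    git -C "$dest" config remote.origin.mirror true || return 1
    git -C "$dest" fetch --quiet origin
}
```

Because the resulting repository contains no pull request refs, a later mirror push from it has nothing under `refs/pull/*` to be rejected.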
Sorry to hear about the problems with the update of Semantic-UI. If your intended threshold was --strip-blobs-bigger-than 1M, it looks like there would have been ~85 objects to be removed - obviously those files would still have changed to become file.REMOVED.git-id files in your history.
One thing the BFG does to protect you from unintended consequences is not alter the contents of your latest (HEAD) commit - so your current work can never be lost.
As it happens, your old history and files are pretty much not lost, precisely because of GitHub not allowing Pull Request refs to be overridden - meaning that in GitHub, at least as of Friday, refs that refer to your old history still existed, and doing a --mirror clone would still retrieve all of the history that was in the repo up to the point of those PRs being requested. If all your pull-requests have been updated, that may no longer be the case, but I took a mirror clone on Friday when the data was available, and would be happy to send it to you if you want to go through the (rather fiddly) process of restoring your history to how it was before you ran the BFG. The hassle involved with updating all of your collaborators (once again) may mean it's not really worth doing, however.
Incidentally, the usage instructions for the BFG do advise taking a back-up mirror copy of the repo before proceeding - it's always possible to recover from that if the BFG does something that you don't want.
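As a rough illustration of that back-up step, the sketch below takes a full mirror copy before any cleaning is attempted. The function name `backup_mirror` and the directory names are hypothetical; the only git feature relied on is `git clone --mirror`, which copies every ref, including pull request refs.

```shell
#!/bin/sh
# Sketch: keep an untouched mirror copy of the repo before running any cleanup.
# Usage: backup_mirror <source-url-or-path> <backup-dir>
backup_mirror() {
    src=$1
    dest=$2
    # Refuse to overwrite an existing back-up (hypothetical safety check).
    if [ -e "$dest" ]; then
        echo "backup target already exists: $dest" >&2
        return 1
    fi
    # --mirror copies all refs (branches, tags, refs/pull/*) exactly as they are.
    git clone --quiet --mirror "$src" "$dest"
}
```

Running the clean-up against a second, separate clone leaves this copy as a reference point holding the original history.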
(I was going to make this commit myself, but can't find where the code to rtyley.github.io/bfg-repo-cleaner/ is hosted)
This mistake is probably commoner than it needs to be. Why not change the first example line (which most people will probably copy-paste and run) so that it deletes only the biggest 1 file? Or better yet, --strip-blobs-bigger-than 10000M so that people see what the output looks like without accidentally rm-ing?
(I know it's apparently reversible, but it's not exactly easy to figure out how)
+1 this.
The problem manifests like this:
See also:
http://christoph.ruegg.name/blog/2013/1/26/git-howto-mirror-a-github-repository-without-pull-refs.html
Cleaning refs that are - effectively - from other people's repos might not be possible, but we could consider renaming the refs to give a "here's how you fix your pull request" ref.