Open bam80 opened 7 years ago
This does not look like a bug. You might try asking on Stack Overflow, tagged with bfg-repo-cleaner.
Yes, I realized later it was a feature. I think it should be explained better in the documentation, as it is hard to understand at first.
Your problem here is that you are treating the output of git show --stat, which shows only the changes made in that commit, as representing all of the files present in your repository tree at the point in time of the commit.
BFG protects a commit by protecting the tree pointed-to by that commit, which is all directories and files at that point-in-time, not just the changes made by that commit.
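To make the distinction concrete, here is a minimal Python sketch. It is a toy model, not Git or BFG code: each commit is assumed to be a plain dictionary mapping path to content, and the file names and the helper changed_paths are hypothetical. It contrasts what a diff-style view reports for a commit with what that commit's tree actually contains.

```python
# Toy model (an assumption for illustration): every commit stores a full
# snapshot of the repository as a dict of path -> content.

def changed_paths(parent_tree, tree):
    """Paths that differ from the parent, i.e. what a diff-style view lists."""
    paths = set(parent_tree) | set(tree)
    return sorted(p for p in paths if parent_tree.get(p) != tree.get(p))

# Three commits, oldest first. Only README changes in the last one.
history = [
    {"README": "v1", "big.bin": "blob-a"},
    {"README": "v1", "big.bin": "blob-a", "src/main.c": "int main(){}"},
    {"README": "v2", "big.bin": "blob-a", "src/main.c": "int main(){}"},
]

head_tree = history[-1]
head_changes = changed_paths(history[-2], head_tree)  # the diff view
head_files = sorted(head_tree)                        # the full tree

print("changed in HEAD:", head_changes)
print("present in HEAD:", head_files)
```

In this model the last commit changes one file, yet its tree holds three, including big.bin, which was added two commits earlier. Protecting that commit therefore protects all three.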
Please close this issue.
Yes, I understand now, but obviously it's not only my problem: people seem to run into it often. This is evidence that the documentation should be improved, or at least that a corresponding issue should be raised. Only then would I close this issue. @rtyley what do you think?
Just wanted to check if you have read https://rtyley.github.io/bfg-repo-cleaner/#protected-commits . Do you have perhaps a pull-request or edits you would recommend there?
Maybe explain more clearly what dirty files are (they are not only the files committed in the protected commit), and that those dirty files will eventually be "moved" to that last commit if it actually "contains" them. This behavior is not obvious, and it is a shock the first time you face it.
Actually dirt-files are only protected in a protected commit - but to accept that, you need to accept that a commit in Git points to a tree which is a graph of the files at the point-in-time of that commit. This is how Git works and is irrefutable. You need to dismiss the notion that a Git commit is just a diff from the previous commit.
What is somewhat surprising to new users is that BFG appears to "shift" dirt files to appear as though they were just added in the (now-rewritten) commit which protected them. This has been discussed in numerous issues but is not mentioned heavily in docs. This is caused by BFG protecting only the tree of the protected commit, therefore the dirt is removed from ancestor trees and so the resulting rewritten repo appears to re-add the dirt on the final, protected commit. It appears to be added because Git shows it in a log/diff to the previous commit, whereas in reality the tree has not changed. The change appears because the file's history in ancestor commits was removed.
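The apparent "shift" can be reproduced in a small Python sketch. This is a toy model under stated assumptions, not BFG's implementation: commits are dictionaries of path to content, dirt is identified by a hypothetical path predicate, and only the last commit is treated as protected. The names rewrite, changed_paths, and secret.key are made up for illustration.

```python
# Toy model (an assumption for illustration): a history is a list of full
# snapshots, oldest first, and the last snapshot is the protected commit.

def changed_paths(parent_tree, tree):
    """Paths that differ from the parent, i.e. what a log/diff would show."""
    paths = set(parent_tree) | set(tree)
    return sorted(p for p in paths if parent_tree.get(p) != tree.get(p))

def rewrite(history, is_dirt):
    """Strip dirt from every ancestor tree; leave the protected tree as-is."""
    *ancestors, protected = history
    cleaned = [{p: c for p, c in t.items() if not is_dirt(p)} for t in ancestors]
    return cleaned + [dict(protected)]

history = [
    {"README": "v1", "secret.key": "k"},
    {"README": "v2", "secret.key": "k"},
    {"README": "v3", "secret.key": "k"},  # protected commit
]

before = changed_paths(history[-2], history[-1])
rewritten = rewrite(history, lambda path: path == "secret.key")
after = changed_paths(rewritten[-2], rewritten[-1])

print("diff of last commit before rewrite:", before)
print("diff of last commit after rewrite: ", after)
```

The protected tree is identical before and after the rewrite. Only its parent changed, so the diff against that parent now lists secret.key as if it had just been added.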
There is an attempt to address this in #149 , which proposes to protect dirt-blobs (appearing in the tree of a protected commit) in all trees where they are found, not just the protected trees. This would at least push the appearance of the "add" of the dirt file back to the last commit in which it was modified, or if never modified, the commit where it was added (no change).
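As a rough illustration of the difference that proposal would make, the following Python sketch compares the two behaviours in a toy model. It is only one reading of the one-sentence description above, not the code in #149: commits are dictionaries of path to content, a "blob" is simply a content string, and the flag protect_blobs_everywhere and both helper names are hypothetical.

```python
# Toy model (an assumption for illustration): a history is a list of full
# snapshots, oldest first, and the last snapshot is the protected commit.

def rewrite(history, is_dirt, protect_blobs_everywhere):
    """Remove dirt from ancestor trees, optionally sparing protected blobs."""
    *ancestors, protected = history
    # Blobs (contents) found in the protected tree; used only by the variant.
    kept_blobs = set(protected.values()) if protect_blobs_everywhere else set()
    cleaned = [
        {p: c for p, c in t.items() if not is_dirt(p) or c in kept_blobs}
        for t in ancestors
    ]
    return cleaned + [dict(protected)]

def first_commit_with(history, path):
    """Index of the earliest commit whose tree contains the path."""
    return next(i for i, tree in enumerate(history) if path in tree)

history = [
    {"README": "v1", "data.bin": "old"},
    {"README": "v2", "data.bin": "new"},  # data.bin last modified here
    {"README": "v3", "data.bin": "new"},  # protected commit
]

def is_dirt(path):
    return path == "data.bin"

current = first_commit_with(rewrite(history, is_dirt, False), "data.bin")
proposed = first_commit_with(rewrite(history, is_dirt, True), "data.bin")

print("current behaviour: data.bin first appears in commit", current)
print("proposed variant:  data.bin first appears in commit", proposed)
```

In this model the old version of data.bin is removed in both cases. Under the variant, the protected version survives in commit 1 as well, so the file appears to be added where it was last modified rather than in the final commit.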
I'm confused by how such a huge number of dirty files contained in the protected commit is counted. It is even bigger than the total number of files the commit originally had!
So it says commit 32fcf8d0 contains 126 dirty files, but actually it contained 5 files in total, only one of which is dirty:
After running the command, the commit came to contain all of these dirty files from previous commits, instead of them just being deleted:
Either something is wrong here, or I have totally misunderstood how the program works.