rtyley / bfg-repo-cleaner

Removes large or troublesome blobs like git-filter-branch does, but faster. And written in Scala
https://rtyley.github.io/bfg-repo-cleaner/
GNU General Public License v3.0
11.09k stars 549 forks

How to get rid of the "Former-commit-id" #102

Open senderic opened 9 years ago

senderic commented 9 years ago

I noticed that after I ran bfg and scrubbed out sensitive data, the rewritten commit messages started to include lines like:

Former-commit-id: abcd123

And after clicking on the link to the former commit ID, I found the sensitive data was still there!

Example: https://github.com/esend7881/udacity-android-nanodegree--july2015-project1/commit/c46a009a990268419020cdba8aa00869a27f4c56 (there you can click on the former commit ID and see the old code).

How can I delete all the overwritten commits entirely?

rtyley commented 9 years ago

Can you tell me what command line params you passed to the BFG? Delete files, strip blobs, replace text, etc?

rtyley commented 9 years ago

The BFG is only supposed to add Former-commit-id: commit footers when the operation is a 'public' data removal rather than a 'private' one. By default, 'public' means removing 'big' files, i.e. removing files by size, but when you delete files by name, that's private by default. You can also pass the --private flag to force the BFG to treat the operation as private. So I'd be interested to know whether you used --delete-files or not.

senderic commented 9 years ago

To answer your first question, I basically followed the steps on your home page https://rtyley.github.io/bfg-repo-cleaner/

I don't have the terminal in front of me right now, so this is what I remember:

$ git clone --mirror git://example.com/some-big-repo.git
$ java -jar bfg.jar --strip-biggest-blobs 1 some-big-repo.git
# - Note -- using "1" because the text I am scrubbing is small (and the repo is small too)
$ cd some-big-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
$ cd ..
$ bfg --replace-text passwords.txt some-big-repo.git
# - Where `passwords.txt` contains the exposed password.
$ cd some-big-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
$ git push

This appeared to work (the output text showed it was successful), but it added the Former-commit-id footers.

Could I redo the above steps on the repo as it is now, but pass in the --private flag? Would doing that get rid of all occurrences of the password, including in the commits that the former IDs point to?

MarounMaroun commented 5 years ago

Any updates on this?

javabrett commented 5 years ago

Related: #139, #293, #140.

bloopkin commented 5 years ago

This is an old question, but here's what I used to fix that:

$ git filter-branch --msg-filter 'sed -E "s/Former-commit-id: [0-9a-f]{40}//g"' --tag-name-filter cat -- --all

It takes quite a long time to run on a large repo, however.

jasonnicholson commented 3 years ago

git filter-branch --msg-filter 'sed -E "s/Former-commit-id: [0-9a-f]{40}//g"' --tag-name-filter cat -- --all

Is there a git filter-repo equivalent?

booleanbetrayal commented 1 year ago

git filter-repo --message-callback 'return re.sub(b"Former-commit-id: [0-9a-f]{40}.*", b"", message, flags=re.MULTILINE)'
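
For anyone who wants to check what that callback does before rewriting history, here is a minimal standalone sketch of the same substitution in plain Python. The function name `strip_former_commit_id` and the sample commit message are made up for illustration; only the regular expression is taken from the command above.

```python
import re

# Same pattern as the filter-repo callback above: a "Former-commit-id:" footer
# followed by a full 40-character hex hash, plus anything else on that line.
FOOTER = re.compile(rb"Former-commit-id: [0-9a-f]{40}.*", flags=re.MULTILINE)


def strip_former_commit_id(message: bytes) -> bytes:
    """Return the commit message with every matching footer removed.

    Hypothetical helper for illustration; git filter-repo passes the commit
    message to its callback as bytes, so bytes are used here too.
    """
    return FOOTER.sub(b"", message)


# Hypothetical commit message (the hash is the one from the example URL in
# the original question, reused only as a realistic 40-character value).
sample = (
    b"Remove hard-coded password\n"
    b"\n"
    b"Former-commit-id: c46a009a990268419020cdba8aa00869a27f4c56\n"
)

cleaned = strip_former_commit_id(sample)
print(cleaned)
```

Two things to note from this sketch: the pattern only matches full 40-character hashes, so an abbreviated footer such as `Former-commit-id: abcd123` would be left alone, and the newline after the footer is not consumed, so the cleaned message can end with extra blank lines. Also, this only edits commit messages; it does not by itself delete the old commits that the footers pointed to.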