rtyley / bfg-repo-cleaner

Removes large or troublesome blobs like git-filter-branch does, but faster. And written in Scala
https://rtyley.github.io/bfg-repo-cleaner/
GNU General Public License v3.0
10.95k stars 540 forks

Replace a whole file? #301

Open prdoyle opened 5 years ago

prdoyle commented 5 years ago

I'd like to replace a file with the one that should have been used in the first place. Is that supported?

A commit in the past added a file, X. Since then, there have been many changes and branches on top. I'd like to alter that commit so that file X has different contents, and then re-apply all the same changes and branches on top.

Thanks!

javabrett commented 5 years ago

Are you able to share what the issue with the file is? Bulk, secrets, something else?

prdoyle commented 5 years ago

It's secrets, effectively. Suppose we committed the wrong file, and discovered much later. I'd like to rewrite history so it looks like we had the correct file there all along. That way, all the commits still make sense, and we don't have this weird time period where a bunch of the tests don't work because of a missing file.

prdoyle commented 5 years ago

I should clarify: the file in question has not changed since it was committed. When I said "there have been many changes", I meant changes to other files in the repo! Sorry, that was a very confusing way for me to describe the situation!

javabrett commented 5 years ago

I'm not aware of a feature that would allow a whole file to be replaced, rather than deleted, although I wouldn't expect it to be super-hard to add that. You might see if you can manage it with the regex feature but that sounds tricky for a full file.

You should note that deleting (or replacing) secrets is somewhat overrated: any true secret that has been committed to a repo accessible to people who shouldn't have it must, of course, be considered compromised and retired/replaced.

Rewriting Git history, even with BFG, is kind of a big deal for work in progress. Every commit SHA will change from the first rewritten commit onward.

If you need old builds to change and work, I would consider implementing an auto-patch process instead, which looks at where you are in the history and, if needed, supplies the correct secret file or patches the code which uses it.
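A minimal sketch of what such an auto-patch step might look like, in Python. Everything named here is an assumption for illustration: `bad_commit` is the commit that added the wrong file, `fix_commit` is the later commit that corrected it, and `good_copy` / `target` are hypothetical paths. It relies on `git merge-base --is-ancestor`, which exits with 0 when the first commit is an ancestor of the second; any other exit code (including errors) is treated as "not an ancestor" here.

```python
import shutil
import subprocess
from pathlib import Path


def is_ancestor(ancestor: str, descendant: str = "HEAD") -> bool:
    """True if `ancestor` is reachable from `descendant` (git exits with 0)."""
    result = subprocess.run(
        ["git", "merge-base", "--is-ancestor", ancestor, descendant],
        check=False,
    )
    return result.returncode == 0


def needs_patch(bad_commit: str, fix_commit: str, ancestor_check=is_ancestor) -> bool:
    """The checkout is in the 'bad window' if it contains bad_commit but not fix_commit."""
    return ancestor_check(bad_commit) and not ancestor_check(fix_commit)


def patch_if_needed(bad_commit: str, fix_commit: str, good_copy: Path, target: Path) -> bool:
    """Overwrite `target` in the working tree with `good_copy` when required."""
    if needs_patch(bad_commit, fix_commit):
        shutil.copyfile(good_copy, target)
        return True
    return False
```

A build or test script could call `patch_if_needed(...)` before running anything that reads the file. Note that this only changes the working tree; the old contents remain in history.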

prdoyle commented 5 years ago

Thanks @javabrett. I've got a project that might one day go open source, and early on I used some sample data that, while not highly sensitive or confidential, wouldn't be appropriate to open source. Later, I replaced it with some artificial sample data, and I was hoping to alter the history so this current best practice was pushed back all the way to the start. Right now, I'm not worried about commit hashes changing, though that will become more important over the coming weeks as collaboration on the project ramps up internally.

javabrett commented 5 years ago

@prdoyle Check how many branches and how many commits you are working with. If the numbers are relatively modest, you might be better off just running a rebase on each branch, stopping at the bad commit and fixing it before proceeding. It won't be super-fast, and may involve some keyboard repetition, but it will be super-simple to run.
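One rough way to get those numbers, sketched in Python on top of two standard Git commands (`git for-each-ref --contains` and `git rev-list --count`). The commit ID passed in is a placeholder, and the helper names are made up for this example. The commit count is only approximate for histories with merges, since `A..B` counts everything reachable from B but not from A.

```python
import subprocess


def git(*args: str) -> str:
    """Run a git command in the current repository and return its stdout."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout


def affected_branches(bad_commit: str, run=git) -> list[str]:
    """Local branches whose history contains the bad commit."""
    out = run(
        "for-each-ref", "--contains", bad_commit,
        "--format=%(refname:short)", "refs/heads",
    )
    return [line for line in out.splitlines() if line]


def commits_on_top(bad_commit: str, branch: str, run=git) -> int:
    """Number of commits reachable from `branch` but not from the bad commit."""
    return int(run("rev-list", "--count", f"{bad_commit}..{branch}").strip())
```

Looping over `affected_branches(commit)` and printing `commits_on_top(commit, branch)` for each gives a quick sense of how much would have to be replayed by a per-branch rebase.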

DeeDeeG commented 5 years ago

Wouldn't it be possible to make a "replacements file" for `bfg --replace-text` to make this work?

(E.g. put every line of the original file on the left, as the "to-be-replaced" string, then put every respective line of the desired file contents on the right, as the "replacement string")

(Sorry if that's not the case, since I'm new to using bfg. But that seems like the sort of thing --replace-text would be good for, in theory.)

Edit to add: example "replacements" definition file: https://gist.github.com/w0rd-driven/60779ad557d9fd86331734f01c0f69f0
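To make the idea concrete, here is a small Python sketch that builds such a replacements file by pairing same-numbered lines of the old and new file contents. It assumes the `old==>new` rule syntax used in BFG's `--replace-text` examples. The function name is made up for illustration, and the approach only works for text files with the same number of lines. As far as I understand it, each rule applies wherever the old line's text occurs, so very generic lines could also match in other files.

```python
SEPARATOR = "==>"  # separator between the old and new text in a rule


def build_replacements(old_text: str, new_text: str) -> list[str]:
    """Pair each line of the old file with the same-numbered line of the new file."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    if len(old_lines) != len(new_lines):
        raise ValueError("line-by-line pairing needs files with the same number of lines")

    rules = []
    for old, new in zip(old_lines, new_lines):
        if old == new or not old.strip():
            continue  # unchanged line, or nothing usable as a match pattern
        if SEPARATOR in old or SEPARATOR in new:
            raise ValueError(f"line contains {SEPARATOR!r} and cannot be written as a simple rule")
        rules.append(f"{old}{SEPARATOR}{new}")
    return rules
```

Writing `"\n".join(build_replacements(old, new))` to a file gives something that could be passed to `bfg --replace-text`. Lines in the old file that happen to start with a prefix such as `regex:` would need extra care, which this sketch does not handle.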

ad-m commented 4 years ago

This operation can be useful, for example, when confidential data has been placed in binary files, e.g. screenshots in documentation.

RokoMijic commented 4 years ago

Also interested in this, for the purpose of modifying the .gitignore file across all of history.

Is there a way to just add a line to a particular file for all of history with BFG?

TaxBusby commented 1 year ago

I'd love to be able to do this to replace a large file with a text file pointing to it in another storage location (e.g. replace a large file with an S3 link).

Replacing the file would allow me to keep the history intact for when the file was added/modified/deleted.