newren / git-filter-repo

Quickly rewrite git repository history (filter-branch replacement)

Perform action only on deleted artifacts #444

Open ianwilliams1 opened 1 year ago

ianwilliams1 commented 1 year ago

Love the tool, so convenient, but asking for a little more convenience...

Feature request: perform an action only on deleted artifacts: --deleted-only

I would like a command that purges all unwanted artifacts from history, but only those that have already been deleted from the current tree, e.g.:

git-filter-repo --deleted-only --invert-paths --path-regex '.*\.(class|[ejw]ar|zip|z|gz)'
or:
git-filter-repo --deleted-only --strip-blobs-bigger-than 10M

I'm sure this use case is not unusual. As the lead/admin of a large group of developers with mixed experience, I often find that a misconstructed ignore file has let unwanted artifacts get committed, bloating the repo.

The documentation reads:

Similarly, you could use --paths-from-file to delete many files. For example, you could run git filter-repo --analyze to get reports, look in one such as .git/filter-repo/analysis/path-deleted-sizes.txt and copy all the filenames into a file such as /tmp/files-i-dont-want-anymore.txt and then run

git filter-repo --invert-paths --paths-from-file /tmp/files-i-dont-want-anymore.txt

to delete them all.

But that means I must filter 'path-deleted-sizes.txt' through a regex, write the result to the /tmp file, and then run filter-repo again. I'd like the convenience of a one-shot command, with the safety net of knowing my criteria (regex, size, etc.) are applied only to files that have already been deleted.
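For reference, the manual filtering step described above can be sketched in a few lines of Python. This is only an illustration, not part of git-filter-repo: the function name `deleted_paths_matching` is hypothetical, and the sketch assumes each data row of path-deleted-sizes.txt holds four whitespace-separated fields (unpacked size, packed size, date deleted, path), with header lines skipped because they do not start with two numbers.

```python
import re


def deleted_paths_matching(report_lines, pattern):
    """Return paths from the deleted-paths report whose name matches pattern.

    Assumed row layout (not verified against every git-filter-repo version):
    unpacked size, packed size, date deleted, path.
    """
    regex = re.compile(pattern)
    selected = []
    for line in report_lines:
        # Split into at most four fields so paths containing spaces stay whole.
        fields = line.split(None, 3)
        if len(fields) < 4 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # header, format description, or blank line
        path = fields[3].rstrip("\n")
        if regex.search(path):
            selected.append(path)
    return selected


# Made-up sample rows standing in for the real report file.
sample = [
    "=== Deleted paths by reverse accumulated size ===\n",
    "Format: unpacked size, packed size, date deleted, path name(s)\n",
    "  52428800   41943040 2021-03-04 build/app.war\n",
    "      2048       1024 2020-11-30 src/Old Main.java\n",
    "   1048576     524288 2022-01-15 out/Helper.class\n",
]
print(deleted_paths_matching(sample, r"\.(class|[ejw]ar|zip|z|gz)$"))
# prints ['build/app.war', 'out/Helper.class']
```

Writing the returned paths one per line to /tmp/files-i-dont-want-anymore.txt would then feed the documented `--invert-paths --paths-from-file` command. A size threshold could be applied the same way by comparing the first field.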

Hopefully the explanation (and contrived examples) is clear.

newren commented 1 year ago

It's an interesting idea, and might make sense for someone to create a contrib script for.

It would not make sense as part of the main tool because: