rtyley / bfg-repo-cleaner

Removes large or troublesome blobs like git-filter-branch does, but faster. And written in Scala
https://rtyley.github.io/bfg-repo-cleaner/
GNU General Public License v3.0

--delete-files does not remove file from certain commits #405

Open ashetkar opened 3 years ago

ashetkar commented 3 years ago

Did not find any existing issue which addresses this case. Please point me to one, if it exists.

I used bfg-1.13.0.jar to remove an XML file from my repository.

$ java -jar ~/Desktop/desk/bfg-1.13.0.jar --delete-files to-be-removed.xml my-repo.git
Using repo : my-repo.git

Found 23198 objects to protect
Found 734 commit-pointing refs : HEAD, refs/heads/1.1.0-HF2, refs/heads/1.1.0-conflict-disk, ...

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit <commit-sha1> (protected by 'HEAD')

Cleaning
--------

Found 4153 commits
Cleaning commits:       100% (4153/4153)
Cleaning commits completed in 4,936 ms.

Updating 733 Refs
-----------------

    Ref                                       Before     After   
    -------------------------------------------------------------
    ...

Updating references:    100% (733/733)
...Ref update completed in 429 ms.

Commit Tree-Dirt History
------------------------

    Earliest                                              Latest
    |                                                          |
    DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD

    D = dirty commits (file tree fixed)
    m = modified commits (commit message or parents changed)
    . = clean commits (no changes to file tree)

                            Before     After   
    -------------------------------------------
    First modified commit | ... | ...
    Last dirty commit     | ... | ...

Deleted files
-------------

    Filename                Git id            
    ------------------------------------------
    to-be-removed.xml | ... (19.1 KB)

In total, 8629 object ids were changed. Full details are logged here:

    my-repo.git.bfg-report/2020-09-23/15-54-58

BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive

It seemed to have done its job until we found a past commit which still shows the file.

Did I miss anything here, and is there a way to fix this? Also, how can one verify that it worked as expected? Thanks!

Steps performed:

git clone --mirror git://github.com/my-repo.git
java -jar ~/bfg-1.13.0.jar --delete-files to-be-removed.xml my-repo.git
cd my-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git for-each-ref --format 'delete %(refname)' refs/pull | git update-ref --stdin
git config remote.origin.url https://github.com/my-repo.git
git push

Andrew-Kulpa commented 3 years ago

I noticed this issue too. I was removing some executables committed in a repository I was asked to migrate from another version control system. Using --delete-files did not fully remove those files from history.
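
On the question above of how to verify the result: one possible check, sketched here under assumptions rather than taken from the BFG documentation, is to list every object still reachable from any ref and look for paths ending in the deleted filename. The helper name `find_in_history` is hypothetical; it only wraps `git rev-list --all --objects`, and `my-repo.git` and `to-be-removed.xml` are the placeholder names used in this thread.

```shell
#!/bin/sh
# Sketch: print "<object-id> <path>" for every reachable tree entry whose
# basename equals the given filename. Empty output means no ref reaches it.
# find_in_history is a hypothetical helper name, not part of git or BFG.
find_in_history() {
    repo=$1
    name=$2
    git -C "$repo" rev-list --all --objects |
        awk -v name="$name" '
            NF >= 2 {
                # Everything after the first space is the path (it may contain spaces).
                path = substr($0, index($0, " ") + 1)
                n = split(path, parts, "/")
                if (parts[n] == name) print $1, path
            }'
}

# Example invocation (placeholder names from this thread):
#   find_in_history my-repo.git to-be-removed.xml
```

If this still prints a match after the BFG run, the file is reachable from at least one ref in the local mirror. The BFG output above notes that the commit protected by 'HEAD' is not altered, so a copy in that commit would be expected to show up here. This check only covers the local repository; it says nothing about what the remote still serves after the push.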