rtyley / bfg-repo-cleaner

Removes large or troublesome blobs like git-filter-branch does, but faster. And written in Scala
https://rtyley.github.io/bfg-repo-cleaner/
GNU General Public License v3.0

Log what actually happened #275

Open xeruf opened 6 years ago

xeruf commented 6 years ago

When looking into the bfg-report of a deletion, I only find some hashes and other information that is not useful to me. What about writing a file that lists all changes (for example, the paths of deleted folders), so I can review what actually happened? It already provides some helpful information for replacements, but not for deletions...

javabrett commented 6 years ago

That's odd. Have you reviewed the report file output, or only stdout/stderr?

xeruf commented 6 years ago

Both. In the report, there are only cache-stats.txt and object-id-map.old-new.txt, and I know that some stuff was deleted. But it only seems to generate somewhat helpful reports for replacements (changed-files.txt), and even that is a bit sparse.

Btw @javabrett, have you taken over development? Was there some agreement/disagreement? This repo looks a bit stale to me.

javabrett commented 6 years ago

Which version of BFG are you running? Are you certain files were deleted in your run?

You should see something like this in stdout/stderr:

Deleted files
-------------

    Filename                                       Git id        
    -------------------------------------------------------------
    some-file-path                               | e69de29b (0 B)

... and in the reports directory: deleted-files.txt

e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 some-file-path

If you aren't seeing a report, perhaps there were no file-deletions.
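
For anyone who wants to review such a report programmatically, here is a minimal Python sketch. It assumes every line of deleted-files.txt has the shape of the sample above (blob id, size in bytes, path, separated by single spaces); `DeletedFile` and `parse_deleted_files` are hypothetical helper names, not part of BFG.

```python
from typing import List, NamedTuple


class DeletedFile(NamedTuple):
    blob_id: str
    size: int
    path: str


def parse_deleted_files(text: str) -> List[DeletedFile]:
    """Parse lines assumed to look like '<blob id> <size in bytes> <path>'."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split only twice so that paths containing spaces stay intact.
        blob_id, size, path = line.split(" ", 2)
        entries.append(DeletedFile(blob_id, int(size), path))
    return entries


sample = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 some-file-path\n"
for entry in parse_deleted_files(sample):
    print(f"{entry.path}: {entry.size} B ({entry.blob_id[:8]})")
```

With the sample line from above, this prints the path, the size and the abbreviated id, mirroring the table BFG writes to stdout.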

xeruf commented 6 years ago

There were deleted folders, I am sure of that. Maybe it's because only folders were deleted?

javabrett commented 6 years ago

It would probably be good to start by stating the BFG version, the command you ran and, better still, a sample repo. Note also that BFG can currently only delete by name (of a directory or a file), not by (absolute) path, so if that is what you were expecting, that might be what is causing the confusion.
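
To illustrate the difference between name-based and path-based deletion, here is a small Python sketch. It is not BFG code: `matched_by_folder_name` is a hypothetical helper, and it assumes an exact match on the directory name at any depth of the tree.

```python
from pathlib import PurePosixPath


def matched_by_folder_name(paths, folder_name):
    """Return the paths that sit under a directory called folder_name, at any depth."""
    # parts[:-1] drops the file name, so only directory components are compared.
    return [p for p in paths if folder_name in PurePosixPath(p).parts[:-1]]


paths = ["archiv/a.txt", "docs/archiv/b.txt", "src/archiv.txt"]
print(matched_by_folder_name(paths, "archiv"))
```

Under that assumption, both `archiv/a.txt` and `docs/archiv/b.txt` are selected, because the name matches wherever the folder sits, while `src/archiv.txt` is a file and is left alone.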

xeruf commented 6 years ago

bfg version: 1.13.0

bfg --delete-folders archiv

Found 55 objects to protect
Found 2 commit-pointing refs : HEAD, refs/heads/master

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit 7b7a9ddf (protected by 'HEAD')

Cleaning
--------

Found 62 commits
Cleaning commits:       100% (62/62)
Cleaning commits completed in 409 ms.

Updating 1 Ref
--------------

        Ref                 Before     After   
        ---------------------------------------
        refs/heads/master | 7b7a9ddf | 9a373b12

Updating references:    100% (1/1)
...Ref update completed in 17 ms.

Commit Tree-Dirt History
------------------------

        Earliest                                              Latest
        |                                                          |
        DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDm

        D = dirty commits (file tree fixed)
        m = modified commits (commit message or parents changed)
        . = clean commits (no changes to file tree)

                                Before     After   
        -------------------------------------------
        First modified commit | e494a2db | 9611ccac
        Last dirty commit     | a563833d | 6105be49

In total, 124 object ids were changed. Full details are logged here:

        /home/janek/Daten/Projects/monsterutilities.git.bfg-report/2018-05-31/05-36-53

BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive

Obviously, since ids changed, there were deletions, but it doesn't show them.

javabrett commented 6 years ago

OK, --delete-folders, that makes sense. Currently the code cleans folders/trees (as opposed to files) by filter-to-retain, i.e. keep everything but what you asked to be deleted. In that way the filter doesn't even check to see if it hit any folders to delete. So you are right, this case is not currently reported.

If this were to be fixed, it would require some code to explicitly compare each original tree with its cleaned counterpart. Obviously that wouldn't be free in terms of execution time either.
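
As a rough idea of what such a comparison involves, here is a minimal Python sketch. It is not how BFG is implemented: `removed_folders` is a hypothetical helper that works on plain lists of file paths, such as those printed by `git ls-tree -r --name-only <commit>` for a commit before and after cleaning.

```python
from pathlib import PurePosixPath


def removed_folders(before_paths, after_paths):
    """Return the top-most directories present before cleaning but absent after."""

    def dirs(paths):
        found = set()
        for p in paths:
            # Every ancestor of a file path is a directory in the tree.
            found.update(str(parent) for parent in PurePosixPath(p).parents)
        found.discard(".")  # the repository root itself is not of interest
        return found

    gone = dirs(before_paths) - dirs(after_paths)
    # Report only the top-most removed directories; nested ones are implied.
    return sorted(d for d in gone if str(PurePosixPath(d).parent) not in gone)


before = ["src/Main.kt", "archiv/old/notes.txt", "docs/archiv/plan.md", "docs/readme.md"]
after = ["src/Main.kt", "docs/readme.md"]
print(removed_folders(before, after))  # ['archiv', 'docs/archiv']
```

Doing this for every rewritten commit means listing and comparing two trees per commit, which is where the extra execution time would come from.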

xeruf commented 6 years ago

Well, it seems to be all about execution time here, but sometimes I just need to clean some data out of a personal project before open-sourcing it. I don't care about time then, because the repo isn't big enough for it to matter; I just want to be sure I don't accidentally delete more than I intended. So maybe there should be flags for speed vs. extra functionality.

GMNGeoffrey commented 2 years ago

Also came here looking for this functionality. Safety seems pretty important for this sort of thing, and not knowing what you are deleting seems very unsafe.