newren / git-filter-repo

Quickly rewrite git repository history (filter-branch replacement)

How to propagate renames back through history? #524

Closed (vwkd closed this issue 3 months ago)

vwkd commented 11 months ago

I frequently find myself wanting to clean up renames from history. Currently, I'm doing this manually. I either start from the root and go forwards in time, or start from HEAD and go backwards in time. I find the previous commit that added (or renamed) a file and change the filename to that of the following rename. If all goes well, Git automatically updates any later commits that modify the file and drops the later rename.

Doing this manually on a non-trivial repo can get quite laborious. If the file was renamed multiple times, one has to repeat the steps multiple times. For single files it's doable enough with git log and git rebase. It becomes really laborious when cleaning up renames of entire folders, because one has to go back to every commit that added a file to that folder and change it. The work required essentially grows one-to-many instead of just one-to-one. More manual work also increases the chance of damaging the repo by making a mistake somewhere.

It would be great if there were a tool that could handle this automatically. It probably wouldn't be able to update other files or commit messages that reference the old filename; those would still need to be updated manually.

Could git-filter-repo support this? Are there any other tools that can do this?

Mrkvozrout commented 3 months ago

I am currently fighting with a similar use case. I have a subfolder (for a plugin) which was renamed halfway through the commit history (from plugin-A to plugin-B). Now I want to extract this subfolder into a separate repo, but I am having trouble getting the renamed subfolder's content moved into the root with its full history.

A simple --subdirectory-filter plugin-A keeps only the first half of the history.

--subdirectory-filter plugin-B keeps only the second half of the history.

Next I tried renaming the subfolder (both names) to the root manually with --path-rename plugin-A: --path-rename plugin-B: but it failed with an error (probably the protection against file collisions mentioned in the docs, although the file content should be the same).

While writing this comment I thought about it further and tried some other command variants, including --filename-callback, with the same results/errors. But finally this worked for me:

So it seems this tool can already help with renames, but I agree there might be room for an even smarter feature that consolidates file names by following renames through the history.

newren commented 3 months ago

@Mrkvozrout : glad you found a solution; it's exactly what I would have suggested.

@vwkd : Several comments. First, you could run something of the form:

    git filter-repo --path-rename A:E \
                    --path-rename B:E \
                    --path-rename C:E \
                    --path-rename D:E \
                    --path-rename F:H \
                    --path-rename G:H \
                    --path-rename I:Z \
                    --path-rename J:Y \
                    --path-rename K:W \
                    --path-rename L:W \
                    --path-rename M:W \
                    --path-rename N:W \
                    --path-rename O:X

i.e. just making a long list of all the renames you want and giving them all to filter-repo to run. So long as you don't rename on top of other files that existed (e.g. maybe there was an E in really old versions of history, unrelated to the modern one, and that old version got deleted long before the renames from A -> B -> C -> D -> E), it should be okay. If there were colliding other paths at the target name, though, things could get kind of messed up.
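If you already know the individual renames (A -> B, B -> C, and so on), a small script can collapse each chain to its final name and print the argument list for you. The following is only a rough Python sketch under that assumption. The helper names collapse_rename_chains and path_rename_args are hypothetical and not part of filter-repo; the only thing taken from filter-repo is the --path-rename OLD:NEW flag. It does not check for the colliding-path problem described above.

```python
from typing import Dict, List


def collapse_rename_chains(renames: Dict[str, str]) -> Dict[str, str]:
    """Map every historical path to the name at the end of its rename chain.

    `renames` holds single steps (old -> new), e.g. {"A": "B", "B": "C"}.
    """
    final: Dict[str, str] = {}
    for start in renames:
        seen = {start}
        current = start
        # Follow old -> new links until we reach a name that was never renamed.
        while current in renames:
            current = renames[current]
            if current in seen:
                raise ValueError(f"rename cycle involving {current!r}")
            seen.add(current)
        final[start] = current
    return final


def path_rename_args(renames: Dict[str, str]) -> List[str]:
    """Build --path-rename OLD:NEW arguments, sorted by the old name."""
    args: List[str] = []
    for old, new in sorted(collapse_rename_chains(renames).items()):
        args += ["--path-rename", f"{old}:{new}"]
    return args


if __name__ == "__main__":
    # Hypothetical history: A was renamed to B, then C, then D, then E.
    chain = {"A": "B", "B": "C", "C": "D", "D": "E"}
    print(" ".join(["git", "filter-repo"] + path_rename_args(chain)))
```

Review the printed command by hand before running it, since the script has no knowledge of what actually exists at the target paths in older commits.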

But you were probably asking about something that would help you generate this list. Running git filter-repo --analyze might help you generate the list (in the .git/filter-repo/analysis/renames.txt file), but no, I won't be making or accepting any kind of tool that automatically converts those; see the caveats in the .git/filter-repo/analysis/README.md file for renames. Further, what if one branch renames P->Q, and a different branch renames P->R? What should you rename it to in that case? What if those branches are merged together -- there's not an easy way to programmatically determine the rename resolutions within a merge commit, so what would you do in that case? In fact, the merge commit is problematic even if only one side renamed.
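To make the P->Q versus P->R problem concrete: if you collect rename pairs by hand, you can at least flag the old names that ended up with more than one target before deciding anything. This is a minimal Python sketch, assuming you have already gathered (old, new) pairs yourself; the function name find_divergent_renames is hypothetical, and nothing here reads the analysis files or resolves merge commits.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def find_divergent_renames(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return the old paths that were renamed to more than one distinct new path."""
    targets: Dict[str, Set[str]] = defaultdict(set)
    for old, new in pairs:
        targets[old].add(new)
    # Keep only the ambiguous entries; these need a human decision.
    return {old: sorted(news) for old, news in targets.items() if len(news) > 1}


if __name__ == "__main__":
    # Hypothetical input: one branch renamed P to Q, another renamed P to R.
    observed = [("P", "Q"), ("P", "R"), ("A", "B")]
    for old, news in find_divergent_renames(observed).items():
        print(f"{old} has conflicting targets: {', '.join(news)}")
```

Anything this reports still has to be settled manually, which is exactly the part that cannot be automated in general.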

So, I don't consider it feasible to create such a general purpose rename handling tool. You can create something that might work for common repositories because they don't have some of the possible weirdnesses that could exist, but you'd be signing yourself up for a never-ending stream of investigating weird corner cases if you made it public.

I'm sure that's not what you wanted to hear, but hopefully some of the pointers I gave above are helpful.