git-subhistory — Interchangeably merge in and split out subtree history

by Han <laughinghan@gmail.com>

Introduction

git-subhistory manages virtual subproject repos inside a superproject repo. Like git-subtree but unlike git-submodule, git-subhistory is stateless: you pass the subproject's subdirectory (path/to/sub/) as an argument, but the subdirectory isn't specially marked in any way. The subproject (Sub)'s files are tracked directly by the superproject (Main) repo like any other files, and no other Git tools know or care that path/to/sub/ contains a subproject.

You know how you can git log path/to/sub/ to see the history of just the stuff in path/to/sub/? Well, git-subhistory split creates actual commits of just that stuff. This separate commit history of just Sub can then be git push-ed to its own repo to be shared with other projects. If commits are added on top of that separate commit history, git-subhistory merge can merge those changes back into Main's commit history by inverting split, creating separate commits that make the same changes but inside path/to/sub/, which can then be merged in.

(git-subtree split is almost the same as git-subhistory split. git-subtree merge contrasts with git-subhistory merge in that while it also merges the changes correctly, it messes up the commit graph with duplicate commits and breaks Git's algorithms to find appropriate merge bases and determine fast-forwardness. See also "Comparison with git-subtree".)

Example

Let's say we have history like so:

                                                                                                [HEAD]
[initial commit]                                                                                [master]
o-------------------------------o-------------------------------o-------------------------------o
Add a Main thing                Add a Sub thing                 Add another Sub thing           Add another Main thing
 __________________________      __________________________      __________________________      __________________________
|                          |    |                          |    |                          |    |                          |
|  Files:                  |    |  Files:                  |    |  Files:                  |    |  Files:                  |
|  + a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |
|                          |    |  + path/to/sub/          |    |    path/to/sub/          |    |  + another-Main-thing    |
|                          |    |  +   a-Sub-thing         |    |      a-Sub-thing         |    |    path/to/sub/          |
|                          |    |                          |    |  +   another-Sub-thing   |    |      a-Sub-thing         |
|                          |    |                          |    |                          |    |      another-Sub-thing   |
|                          |    |                          |    |                          |    |                          |
|__________________________|    |__________________________|    |__________________________|    |__________________________|

We can split out the commit history of just Sub in path/to/sub/ by doing git-subhistory split path/to/sub/ -b subproj, which also creates a new branch subproj to point to it. This results in 2 disconnected histories, untouched master and sparkly new subproj:

                                                                                                [HEAD]
[initial commit]                                                                                [master]
o-------------------------------o-------------------------------o-------------------------------o
Add a Main thing                Add a Sub thing                 Add another Sub thing           Add another Main thing
 __________________________      __________________________      __________________________      __________________________
|                          |    |                          |    |                          |    |                          |
|  Files:                  |    |  Files:                  |    |  Files:                  |    |  Files:                  |
|  + a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |
|                          |    |  + path/to/sub/          |    |    path/to/sub/          |    |  + another-Main-thing    |
|                          |    |  +   a-Sub-thing         |    |      a-Sub-thing         |    |    path/to/sub/          |
|                          |    |                          |    |  +   another-Sub-thing   |    |      a-Sub-thing         |
|                          |    |                          |    |                          |    |      another-Sub-thing   |
|                          |    |                          |    |                          |    |                          |
|__________________________|    |__________________________|    |__________________________|    |__________________________|

                                [SPLIT_HEAD]
[initial commit]                [subproj]
o-------------------------------o
Add a Sub thing                 Add another Sub thing
 __________________________      __________________________
|                          |    |                          |
|  Files:                  |    |  Files:                  |
|  + a-Sub-thing           |    |    a-Sub-thing           |
|                          |    |  + another-Sub-thing     |
|                          |    |                          |
|                          |    |                          |
|__________________________|    |__________________________|
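
If it helps to see the split above as code, here is a toy model in Python. It is only a sketch of the idea, not how git-subhistory is implemented: the `Commit` class, the `oid` property, and the `commit`, `subtree` and `split` helpers are all made up for illustration, history is assumed to be linear, and file contents, authors and dates are left out.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

PREFIX = "path/to/sub/"  # the subdirectory passed to split


@dataclass(frozen=True)
class Commit:
    message: str
    files: Tuple[str, ...]              # sorted file paths (contents omitted)
    parent: Optional["Commit"] = None   # toy model: linear history only

    @property
    def oid(self) -> str:
        # Toy stand-in for a Git hash: depends only on parent, message and tree.
        parent_oid = self.parent.oid if self.parent else None
        payload = json.dumps([parent_oid, self.message, list(self.files)])
        return hashlib.sha1(payload.encode()).hexdigest()


def commit(message: str, files: Iterable[str],
           parent: Optional[Commit] = None) -> Commit:
    return Commit(message, tuple(sorted(files)), parent)


def subtree(c: Commit) -> Tuple[str, ...]:
    """Files under PREFIX, with the prefix stripped."""
    return tuple(p[len(PREFIX):] for p in c.files if p.startswith(PREFIX))


def split(c: Optional[Commit]) -> Optional[Commit]:
    """Rewrite a linear Main history into a history of just the subtree."""
    if c is None:
        return None
    split_parent = split(c.parent)
    tree = subtree(c)
    parent_tree = split_parent.files if split_parent else ()
    if tree == parent_tree:
        # This commit did not touch the subtree, so it leaves no trace in Sub.
        return split_parent
    return Commit(c.message, tree, split_parent)


# The Main history from the diagram.
c1 = commit("Add a Main thing", ["a-Main-thing"])
c2 = commit("Add a Sub thing", c1.files + (PREFIX + "a-Sub-thing",), c1)
c3 = commit("Add another Sub thing", c2.files + (PREFIX + "another-Sub-thing",), c2)
master = commit("Add another Main thing", c3.files + ("another-Main-thing",), c3)

subproj = split(master)
node = subproj
while node is not None:       # walk the split history from tip to root
    print(node.message, node.files)
    node = node.parent
```

Walking back from `subproj` visits only the two Sub commits: the two Main-only commits contribute nothing, and the paths lose their path/to/sub/ prefix, just like in the second diagram.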

Say we push subproj to a public repo for just Sub, and implausibly, our code isn't perfect and bugfixes are contributed to the public repo. Now we've git pull-ed in upstream bugfixes:

                                                                                                [sub-upstream/master]  # remote-tracking branch
[initial commit]                                                                                [subproj]
o-------------------------------o-------------------------------o-------------------------------o
Add a Sub thing                 Add another Sub thing           Fix Sub somehow                 Fix Sub further
 __________________________      __________________________      __________________________      __________________________
|                          |    |                          |    |                          |    |                          |
|  Files:                  |    |  Files:                  |    |  Files:                  |    |  Files:                  |
|  + a-Sub-thing           |    |    a-Sub-thing           |    |    a-Sub-thing           |    |    a-Sub-thing           |
|                          |    |  + another-Sub-thing     |    |    another-Sub-thing     |    |    another-Sub-thing     |
|                          |    |                          |    |  + fix-Sub-somehow       |    |    fix-Sub-somehow       |
|                          |    |                          |    |                          |    |  + fix-Sub-further       |
|                          |    |                          |    |                          |    |                          |
|__________________________|    |__________________________|    |__________________________|    |__________________________|

Here's where the magic happens: we can easily merge these changes into Main using the Add another Sub thing commit as the merge base by doing git-subhistory merge path/to/sub/ subproj, resulting in:

                                                                                                                                                                                                  [HEAD]
[initial commit]                                                                                                                                                                                  [master]
o-------------------------------o-------------------------------o-------------------------------o--------------------------------------------------------------------------------------------------o
|                               |                               |\------------------------------|--------------------------------o--------------------------------o-------------------------------/|
Add a Main thing                Add a Sub thing                 Add another Sub thing           Add another Main thing           Fix Sub somehow                  Fix Sub further                  Merge subhistory branch 'subproj' under path/to/sub/
 __________________________      __________________________      __________________________      __________________________       __________________________       __________________________       __________________________
|                          |    |                          |    |                          |    |                          |     |                          |     |                          |     |                          |
|  Files:                  |    |  Files:                  |    |  Files:                  |    |  Files:                  |     |  Files:                  |     |  Files:                  |     |  Files:                  |
|  + a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |     |    a-Main-thing          |     |    a-Main-thing          |     |    a-Main-thing          |
|                          |    |  + path/to/sub/          |    |    path/to/sub/          |    |  + another-Main-thing    |     |    path/to/sub/          |     |    path/to/sub/          |     |  < another-Main-thing    |
|                          |    |  +   a-Sub-thing         |    |      a-Sub-thing         |    |    path/to/sub/          |     |      a-Sub-thing         |     |      a-Sub-thing         |     |    path/to/sub/          |
|                          |    |                          |    |  +   another-Sub-thing   |    |      a-Sub-thing         |     |      another-Sub-thing   |     |      another-Sub-thing   |     |      a-Sub-thing         |
|                          |    |                          |    |                          |    |      another-Sub-thing   |     |  +   fix-Sub-somehow     |     |      fix-Sub-somehow     |     |      another-Sub-thing   |
|                          |    |                          |    |                          |    |                          |     |                          |     |  +   fix-Sub-further     |     |  >   fix-Sub-somehow     |
|                          |    |                          |    |                          |    |                          |     | [Note: no                |     |                          |     |  >   fix-Sub-further     |
|                          |    |                          |    |                          |    |                          |     |  another-Main-thing yet] |     |                          |     |                          |
|__________________________|    |__________________________|    |__________________________|    |__________________________|     |__________________________|     |__________________________|     |__________________________|

See that? The commits on subproj that fixed Sub after adding another-Sub-thing were assimilated into commits that fix Sub inside path/to/sub/ after adding path/to/sub/another-Sub-thing, allowing them to merge cleanly like they clearly should, no commits duplicated.

This goes the other way, too. Say we make further changes to Sub:

                                                                                                                                                                                                                                    [HEAD]
[initial commit]                                                                                                                                                                                                                    [master]
o-------------------------------o-------------------------------o-------------------------------o--------------------------------------------------------------------------------------------------o--------------------------------o
|                               |                               |\------------------------------|--------------------------------o--------------------------------o-------------------------------/|                                |
Add a Main thing                Add a Sub thing                 Add another Sub thing           Add another Main thing           Fix Sub somehow                  Fix Sub further                  Merge subhistory branch ...      Add yet another Sub thing
 __________________________      __________________________      __________________________      __________________________       __________________________       __________________________       __________________________       _____________________________
|                          |    |                          |    |                          |    |                          |     |                          |     |                          |     |                          |     |                             |
|  Files:                  |    |  Files:                  |    |  Files:                  |    |  Files:                  |     |  Files:                  |     |  Files:                  |     |  Files:                  |     |  Files:                     |
|  + a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |     |    a-Main-thing          |     |    a-Main-thing          |     |    a-Main-thing          |     |    a-Main-thing             |
|                          |    |  + path/to/sub/          |    |    path/to/sub/          |    |  + another-Main-thing    |     |    path/to/sub/          |     |    path/to/sub/          |     |  < another-Main-thing    |     |    another-Main-thing       |
|                          |    |  +   a-Sub-thing         |    |      a-Sub-thing         |    |    path/to/sub/          |     |      a-Sub-thing         |     |      a-Sub-thing         |     |    path/to/sub/          |     |    path/to/sub/             |
|                          |    |                          |    |  +   another-Sub-thing   |    |      a-Sub-thing         |     |      another-Sub-thing   |     |      another-Sub-thing   |     |      a-Sub-thing         |     |      a-Sub-thing            |
|                          |    |                          |    |                          |    |      another-Sub-thing   |     |  +   fix-Sub-somehow     |     |      fix-Sub-somehow     |     |      another-Sub-thing   |     |      another-Sub-thing      |
|                          |    |                          |    |                          |    |                          |     |                          |     |  +   fix-Sub-further     |     |  >   fix-Sub-somehow     |     |      fix-Sub-somehow        |
|                          |    |                          |    |                          |    |                          |     | [Note: no                |     |                          |     |  >   fix-Sub-further     |     |      fix-Sub-further        |
|                          |    |                          |    |                          |    |                          |     |  another-Main-thing yet] |     |                          |     |                          |     |  +   yet-another-Sub-thing  |
|__________________________|    |__________________________|    |__________________________|    |__________________________|     |__________________________|     |__________________________|     |__________________________|     |_____________________________|

And then we git-subhistory split path/to/sub/ -b subproj2:

                                                                                                [sub-upstream/master]            [SPLIT_HEAD]
[initial commit]                                                                                [subproj]                        [subproj2]
o-------------------------------o-------------------------------o-------------------------------o--------------------------------o
Add a Sub thing                 Add another Sub thing           Fix Sub somehow                 Fix Sub further                  Add yet another Sub thing
 __________________________      __________________________      __________________________      __________________________       __________________________
|                          |    |                          |    |                          |    |                          |     |                          |
|  Files:                  |    |  Files:                  |    |  Files:                  |    |  Files:                  |     |  Files:                  |
|  + a-Sub-thing           |    |    a-Sub-thing           |    |    a-Sub-thing           |    |    a-Sub-thing           |     |    a-Sub-thing           |
|                          |    |  + another-Sub-thing     |    |    another-Sub-thing     |    |    another-Sub-thing     |     |    another-Sub-thing     |
|                          |    |                          |    |  + fix-Sub-somehow       |    |    fix-Sub-somehow       |     |    fix-Sub-somehow       |
|                          |    |                          |    |                          |    |  + fix-Sub-further       |     |    fix-Sub-further       |
|                          |    |                          |    |                          |    |                          |     |  + yet-another-Sub-thing |
|__________________________|    |__________________________|    |__________________________|    |__________________________|     |__________________________|

The assimilated commits are guaranteed to map back to the exact same commits they were assimilated from, down to the hashes. That means subproj2 will be a fast-forward from subproj, and if more bugfixes have been added upstream, Fix Sub further will be the merge base.

This is a data model guarantee, requiring no local state to enforce: someone else could pull your master into their clone and split them out, and they would still map to the exact same commits they were assimilated from, and the merge bases and fast-forwards will still be exactly right.
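
To make that guarantee a bit more concrete, here is a toy model in Python of the round trip. Treat it as a sketch under loud assumptions rather than the real thing: `Commit`, `oid`, `commit`, `split` and `assimilate` are hypothetical names, history is linear, the "hash" covers only parent, message and file names (real Git hashes also cover contents, author and dates), and every Sub commit is assumed to descend from a commit that was split out of Main.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

PREFIX = "path/to/sub/"


@dataclass(frozen=True)
class Commit:
    message: str
    files: Tuple[str, ...]
    parent: Optional["Commit"] = None

    @property
    def oid(self) -> str:
        # Toy hash: a pure function of parent, message and tree.
        parent_oid = self.parent.oid if self.parent else None
        payload = json.dumps([parent_oid, self.message, list(self.files)])
        return hashlib.sha1(payload.encode()).hexdigest()


def commit(message: str, files: Iterable[str],
           parent: Optional[Commit] = None) -> Commit:
    return Commit(message, tuple(sorted(files)), parent)


def split(c: Optional[Commit], origin: Dict[str, Commit]) -> Optional[Commit]:
    """Split out the subtree history; record which Main commit each came from."""
    if c is None:
        return None
    sp = split(c.parent, origin)
    tree = tuple(p[len(PREFIX):] for p in c.files if p.startswith(PREFIX))
    if tree == (sp.files if sp else ()):
        return sp                       # commit did not touch the subtree
    s = Commit(c.message, tree, sp)
    origin.setdefault(s.oid, c)         # remember the first Main commit for s
    return s


def assimilate(s: Commit, origin: Dict[str, Commit]) -> Commit:
    """Inverse of split: map a Sub commit to a commit on Main."""
    if s.oid in origin:
        return origin[s.oid]            # came from Main: reuse the original
    base = assimilate(s.parent, origin)  # assumes some ancestor was split out
    outside = [p for p in base.files if not p.startswith(PREFIX)]
    inside = [PREFIX + p for p in s.files]
    return commit(s.message, outside + inside, base)


# Main history, then upstream fixes on top of the split-out Sub history.
c1 = commit("Add a Main thing", ["a-Main-thing"])
c2 = commit("Add a Sub thing", c1.files + (PREFIX + "a-Sub-thing",), c1)
c3 = commit("Add another Sub thing", c2.files + (PREFIX + "another-Sub-thing",), c2)
master = commit("Add another Main thing", c3.files + ("another-Main-thing",), c3)

origin: Dict[str, Commit] = {}
subproj = split(master, origin)
fix1 = commit("Fix Sub somehow", subproj.files + ("fix-Sub-somehow",), subproj)
fix2 = commit("Fix Sub further", fix1.files + ("fix-Sub-further",), fix1)

assimilated = assimilate(fix2, origin)

# A different clone starts with an empty mapping and splits again.
round_trip = split(assimilated, {})
print(round_trip.oid == fix2.oid)
```

The point of the sketch is that `origin` is not saved anywhere: it gets rebuilt from the history every time, so a second clone with an empty mapping still lands on the same toy hashes. Note also that the synthetic commits sit on top of the original Add another Sub thing commit in Main, which is why they have no another-Main-thing yet.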

Subcommands

TODO:

Notes

Comparison with git-subtree

I actually think git-subtree was halfway there: it got splitting pretty much right, but then did merging wrong and had to needlessly complicate splitting to deal with the broken merging. If you went through the example above with equivalent git-subtree commands, the history after splitting would be identical right down to the hashes, but after merging, you'd get:

[initial commit]                                                                                                                 [initial commit]                                                                                                                    [master]
o-------------------------------o-------------------------------o-------------------------------o--------------------------------|-----------------------------------------------------------------------------------------------------------------------------------o
|                               |                               |                               |                                o--------------------------------o--------------------------------o--------------------------------o-------------------------------/|
Add a Main thing                Add a Sub thing                 Add another Sub thing           Add another Main thing           Add a Sub thing                  Add another Sub thing            Fix Sub somehow                  Fix Sub further                  Merge commit 'd535a7c' under subtree path/to/sub/
 __________________________      __________________________      __________________________      __________________________       __________________________       __________________________       __________________________       __________________________       __________________________
|                          |    |                          |    |                          |    |                          |     |                          |     |                          |     |                          |     |                          |     |                          |
|  Files:                  |    |  Files:                  |    |  Files:                  |    |  Files:                  |     |  Files:                  |     |  Files:                  |     |  Files:                  |     |  Files:                  |     |  Files:                  |
|  + a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |    |    a-Main-thing          |     |  + a-Sub-thing           |     |    a-Sub-thing           |     |    a-Sub-thing           |     |    a-Sub-thing           |     |    a-Main-thing          |
|                          |    |  + path/to/sub/          |    |    path/to/sub/          |    |  + another-Main-thing    |     |                          |     |  + another-Sub-thing     |     |    another-Sub-thing     |     |    another-Sub-thing     |     |  < another-Main-thing    |
|                          |    |  +   a-Sub-thing         |    |      a-Sub-thing         |    |    path/to/sub/          |     |                          |     |                          |     |  + fix-Sub-somehow       |     |    fix-Sub-somehow       |     |    path/to/sub/          |
|                          |    |                          |    |  +   another-Sub-thing   |    |      a-Sub-thing         |     |                          |     |                          |     |                          |     |  + fix-Sub-further       |     |      a-Sub-thing         |
|                          |    |                          |    |                          |    |      another-Sub-thing   |     |                          |     |                          |     |                          |     |                          |     |      another-Sub-thing   |
|                          |    |                          |    |                          |    |                          |     |                          |     |                          |     |                          |     |                          |     |  >   fix-Sub-somehow     |
|                          |    |                          |    |                          |    |                          |     |                          |     |                          |     |                          |     |                          |     |  >   fix-Sub-further     |
|                          |    |                          |    |                          |    |                          |     |                          |     |                          |     |                          |     |                          |     |                          |
|__________________________|    |__________________________|    |__________________________|    |__________________________|     |__________________________|     |__________________________|     |__________________________|     |__________________________|     |__________________________|

Note the duplicate Add a Sub thing and Add another Sub thing commits.

What happened was, there were the originals, which add path/to/sub/a-Sub-thing and path/to/sub/another-Sub-thing to Main, right? They were split into commits which add a-Sub-thing and another-Sub-thing to Sub, just like you'd want.

But then, git-subtree merge did git merge --strategy subtree, which is a fine and dandy merge strategy, but git-merge always creates a merge commit whose parents are the commits passed to it, and in this case one is a commit to Main and one is a commit to Sub: notice how in the merge commit, the subproject files a-Sub-thing and another-Sub-thing are at different paths in the two parent commits.

Hence duplicated commits: the changes are to different paths.

By contrast, git-subhistory merge first does git-subhistory assimilate to generate synthetic commits to Main from the upstream bugfixes to Sub, so both parents of the merge commit are commits to Main. Importantly, the synthetic commits are generated on top of the original subproject commits, so instead of being duplicated, those originals are the merge base, like they should be!
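
The merge-base difference can be checked on the two commit graphs from the diagrams. The Python below is a toy illustration only: the graphs are plain dicts from a commit label to its parents, the "(Sub)" and "(assimilated)" labels are made up to tell nodes apart, and `ancestors` and `merge_bases` are hypothetical helpers, not Git's actual merge-base algorithm.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # commit label -> parent labels


def ancestors(graph: Graph, tip: str) -> Set[str]:
    """All commits reachable from tip, including tip itself."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def merge_bases(graph: Graph, a: str, b: str) -> Set[str]:
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(graph, a) & ancestors(graph, b)
    return {c for c in common
            if not any(c != d and c in ancestors(graph, d) for d in common)}


main_history: Graph = {
    "Add a Main thing": [],
    "Add a Sub thing": ["Add a Main thing"],
    "Add another Sub thing": ["Add a Sub thing"],
    "Add another Main thing": ["Add another Sub thing"],
}

# git-subhistory: the fixes are assimilated on top of the original Main commit.
subhistory: Graph = dict(main_history, **{
    "Fix Sub somehow (assimilated)": ["Add another Sub thing"],
    "Fix Sub further (assimilated)": ["Fix Sub somehow (assimilated)"],
})

# git-subtree: the other merge parent is a commit in the separate Sub history.
subtree: Graph = dict(main_history, **{
    "Add a Sub thing (Sub)": [],
    "Add another Sub thing (Sub)": ["Add a Sub thing (Sub)"],
    "Fix Sub somehow (Sub)": ["Add another Sub thing (Sub)"],
    "Fix Sub further (Sub)": ["Fix Sub somehow (Sub)"],
})

print(merge_bases(subhistory, "Add another Main thing",
                  "Fix Sub further (assimilated)"))
print(merge_bases(subtree, "Add another Main thing",
                  "Fix Sub further (Sub)"))
```

In the subhistory graph the only merge base is Add another Sub thing, while in the subtree graph the two parents share no ancestor at all, so the result is empty.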
