hraban / tomono

Multi- To Mono-repository merge
https://tomono.0brg.net
GNU Affero General Public License v3.0

After using tomono, rebase and log contents are very different. #51

Closed UmbraMalison closed 1 year ago

UmbraMalison commented 1 year ago

Hi,

This is a question. I've used the tomono script on a collection of repos that used to be a mono repo, then spent many years as individual repos, and should now become a mono repo once again.

So there are many commits that could be fixed up into one commit, and this is obvious when they are viewed in chronological order across all paths (20 commits all saying the same thing!).

git log gives me that order, and this script appears to have done the job very well. Thank you: I did get similar results myself, but without support for branches, and the simplicity of your approach is very enlightening.

But to do these fixups I need to rebase. When I do the rebase, I see the commits in the order they were applied rather than the order they were authored or committed, and I see no option in git rebase to change that. This means those 20 commits are shown entire repos/paths apart in the list, because in the rebase you see the repo/path #1 commits in chronological order, then repo/path #2 ... #n.

I wrote a script to write my own rebase todo file using git log, however, this is complicated due to merge commits, and the repo won't rebase without replaying the merges.

Do you have any ideas how I might progress? When you have used this script, did you then try to rebase your monorepo to tidy it up, and if so, did you encounter this problem, or is it unique to me?

hraban commented 1 year ago

I'm having a hard time following your example, sorry. Could you create a public repository on GitHub and set it up so it demonstrates a version of what you mean? Or perhaps a few: some small repos, a monorepo after the automatic merge using tomono, and one showing what you would want the intended effect to be?

Or draw up some sample git logs, that could help, too.

All the best

UmbraMalison commented 1 year ago

Here is my script that uses your tomono. https://github.com/UmbraMalison/mono-repo-tool/blob/main/mrt.sh

It creates a synthetic repo that models my question above (it doesn't model merge commits). When/if you run this script, after running tomono it will show the git log looking like this:

git log --oneline
c4aa824 change 3-group.
11e4da8 change 3-group.
0e3bd4d change 3-group.
260dad9 change 3-group.
a1501a0 change 3-group.
d43773c change 3-group.
fcb6d3e change 2-group.
6f92771 change 2-group.
07e5ee2 change 2-group.
fdb0db5 change 2-group.
f370c9b change 2-group.
94910ff change 2-group.
a5f5e91 change 1-group.
cbaae14 change 1-group.
72e2028 change 1-group.
43318b9 change 1-group.
bcb6251 change 1-group.
78f3df9 change 1-group.
201f72e (tag: repo6/v1) Initial Commit.
d70842f (tag: repo5/v1) Initial Commit.
4d6a140 (tag: repo4/v1) Initial Commit.
fdb689c (tag: repo3/v1) Initial Commit.
954abb7 (tag: repo2/v1) Initial Commit.
725f143 (tag: repo1/v1) Initial Commit.


Then it will run a rebase function in my script. This is the function I had started to write to identify commits to fix up when I came to this question. So it actually does nothing other than git rebase --root --interactive, but maybe it helps explain where I was heading.

The rebase todo file will look like this. The commits are now organised by their path, as a product (I believe) of the mono repo process, rather than in the chronological order we saw above.

git rebase --root --interactive
pick 87c601a0572dedd1a780d0ea7c84aeaa6b18bbd9 Root commit for monorepo branch master # empty
pick 33e7dd568a25600240534b379818dcdb797229d1 Initial Commit.
pick 31250c228de37e7b9e26bd01dc58deb7a8b0d4b8 change 1-group.
pick 890fb9740bcce6b558153318aefcbb24abf470de change 2-group.
pick 2b09988f9574b16d7e6ef33afeee38305402e59e change 3-group.
pick 9a9de1ff5612ece21bb59d85979e0684ac3d1b5d change 4-group.
pick bef0a50e3679af094ae5b11e996154841c88df48 change 5-group.
pick 103a3d7cd9d4634ffd304d6709ef5b53b4e89a04 change 6-group.
pick 2e0deca6836ad166bc14e7ee4c2248ab17467d8f change 7-repo1.
pick ba8dd8cdb49c8edd4db5342c25687ea45b3dacf1 change 13-repo1.
pick d97971559344459a692677dd86b2ebd1fc9d3aa7 change 19-group.
pick eed805c69e56dc3c84400a660b9b3038eabe2be6 change 20-group.
pick e37a492e5489da9d593c7c64452323448d4d55c5 change 21-group.
pick e213a62a87e255c965ee07170bd7dfa5879e622e change 22-group.
pick 2c5bbb81403cd101f6a86b9c5f28ed071f78590a change 23-group.
pick 31e130b18370b7c9709d09d36ff31fe58a6b762e change 24-group.
pick d07ce2e60eb391d9027be37156d86e05a33c81a0 change 25-repo1.
pick 7f46f492f2aa1644101a8d26b089759ea74e71ec change 31-repo1.
pick 4d152f4952b3436b36cbafeb2a2b9e7cd04a6d68 Initial Commit.
pick 0db1b79f28bde0f6d6c7374be166398768aac650 change 1-group.
pick 4cc2e0a9f765f81da25f6c710e8b9a9c6eedfb4a change 2-group.
pick 2e8ebe710969e92a5c199fea6d7316d7b045089e change 3-group.
pick 3859c4a15433341a9280205d0f99d24d91e054ee change 4-group.
pick 96ccd7da6b7778529a2c91fbf342f2183853c2d4 change 5-group.
pick baabf1f89581aded24561ef25772d0c294fc5d96 change 6-group.
pick 2d792e6ea8443d5b2243fd63673083bf944524c2 change 8-repo2.
pick 0c2ab6ff25090aa3826466b295600ca2ac0f8a07 change 14-repo2.
pick 025c61e0f1597ea7a37decea2b38d5d26bf4222c change 19-group.
pick 2e0e2871ea3daa362ae15d07b18837ca4675af02 change 20-group.
pick c0739eb728417c26bbee426ee843794bd94fdb72 change 21-group.
pick 09be548be03eac43bbceeafbc9133eeb44538e07 change 22-group.
pick 84133d1895790a601898966f59948a94e4dea348 change 23-group.
pick 5ebc9e4c039ee8586dda89b407cdc63d043048de change 24-group.
pick c9e83dc1974502780956f989bf7ab0500ccb8a84 change 26-repo2.
pick fd66d4c3f61ecdd1dd52cd77639cf890a8e42de3 change 32-repo2.

The rebase completes successfully, and now the git log looks like the rebase order.

b8b2176 Initial Commit.
c910524 change 31-repo1.
667b516 change 25-repo1.
2e4d769 change 24-group.
c8de3e7 change 23-group.
5ce6f04 change 22-group.
936d610 change 21-group.
50d954b change 20-group.
69f54b0 change 19-group.
871c238 change 13-repo1.
b374ddf change 7-repo1.
d6b914a change 6-group.
12b9800 change 5-group.
c383d78 change 4-group.
3f2822a change 3-group.
c05983f change 2-group.
eed65e5 change 1-group.
e2dcd93 Initial Commit.
4aab150 Root commit for monorepo branch master

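The reordering described above (a todo file in chronological order, with repeated messages folded into fixups) can be sketched roughly as follows. This is only a sketch under stated assumptions, not part of tomono or of mrt.sh: the helper names `parse_log` and `build_todo` and the short hashes are hypothetical, the input is assumed to come from something like `git log --reverse --no-merges --format='%ct %H %s'`, the history is assumed to be linear (no merge commits), and commits with the same subject are assumed to touch different paths so that moving them next to each other does not conflict.

```python
from typing import Dict, List, Tuple

Commit = Tuple[int, str, str]  # (unix timestamp, sha, subject)


def parse_log(text: str) -> List[Commit]:
    """Parse lines of the form '<unix time> <sha> <subject>'."""
    commits = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split(" ", 2)
        subject = parts[2] if len(parts) > 2 else ""
        commits.append((int(parts[0]), parts[1], subject))
    return commits


def build_todo(commits: List[Commit]) -> List[str]:
    """Order commits by time, then fold repeated subjects into fixups."""
    # sorted() is stable: commits sharing a timestamp keep their input order.
    ordered = sorted(commits, key=lambda c: c[0])
    # subject -> shas in chronological order; dicts keep insertion order,
    # so groups appear in order of their earliest commit.
    groups: Dict[str, List[str]] = {}
    for _, sha, subject in ordered:
        groups.setdefault(subject, []).append(sha)
    todo = []
    for subject, shas in groups.items():
        todo.append(f"pick {shas[0]} {subject}")
        todo.extend(f"fixup {sha} {subject}" for sha in shas[1:])
    return todo


# Placeholder data: two repos, interleaved in time (hashes are made up).
SAMPLE_LOG = """\
100 aaa1 Initial Commit.
300 aaa2 change 1-group.
200 bbb1 Initial Commit.
400 bbb2 change 1-group.
"""

print("\n".join(build_todo(parse_log(SAMPLE_LOG))))
# pick aaa1 Initial Commit.
# fixup bbb1 Initial Commit.
# pick aaa2 change 1-group.
# fixup bbb2 change 1-group.
```

The resulting lines could be written over the todo file that git rebase --root --interactive opens, for example through the GIT_SEQUENCE_EDITOR environment variable. Two caveats: `%at` may be a better sort key than `%ct` if committer dates were rewritten when the repos were split, and the empty root commit will sort by its own date, so it may need to be moved or dropped by hand.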

UmbraMalison commented 1 year ago

@hraban did I explain the question well enough? Is the problem '4aab150'?

hraban commented 1 year ago

Hi @UmbraMalison, it's a great elaboration on the issue. I'm just swamped with other work at the moment, so give me a few days to get back to this. It's on my todo list.

hraban commented 1 year ago

Hi,

I ran your script, but I end up with multiple directories and it's unclear which one is the final result. The script is quite verbose, so I can't really debug further.

Are you trying to get the same commits from your individual source repos to end up with the same commit IDs, so that the commits from the original monorepo are identical again in the final repo?

You should be able to just rely on git being deterministic for this to work: if the commit is actually identical, it should also get the same ID in the final output. The tomono script doesn't change anything about the commits, so if the commits are identical in your individual source repos, they will automatically be "deduplicated" in the final monorepo.

What might be tripping you up (and causing different IDs) is committerDate. This could have been changed when the original monorepo was split up into multiple separate repos. Check that with e.g. git log --format=fuller -1 ....

See the git graft tool for an example of this trick in action, using git rebase --committer-date-is-author-date (which, in your case, I would run on the source repos before merging).
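To illustrate why the committer date matters here: a commit ID is a hash over the commit object's content, which includes the committer line. The sketch below reproduces that hashing for a plain SHA-1 commit without extra headers (no gpgsig, no encoding) and shows that two otherwise identical commits get different IDs when only the committer date differs, and the same ID again once the committer date is set back to the author date. The helper names, identities and timestamps are made up for illustration; only the empty-tree ID is a real, well-known git constant.

```python
import hashlib
from typing import List

# ID of git's empty tree, used so the example needs no real repository.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(tree: str, parents: List[str], author: str,
              committer: str, message: str) -> str:
    """Hash a commit the way git does: sha1('commit <size>\\0' + body)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode()
    header = f"commit {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def with_author_date(author: str, committer: str) -> str:
    """Hypothetical helper: copy the author's timestamp and timezone
    onto the committer line, keeping the committer's name and email."""
    _, a_time, a_zone = author.rsplit(" ", 2)
    c_ident, _, _ = committer.rsplit(" ", 2)
    return f"{c_ident} {a_time} {a_zone}"


AUTHOR = "A U Thor <author@example.com> 1600000000 +0000"
MESSAGE = "change 1-group.\n"

# The commit as it was in the original monorepo.
original = commit_id(EMPTY_TREE, [], AUTHOR, AUTHOR, MESSAGE)

# The same commit after a split that refreshed the committer date.
later = "A U Thor <author@example.com> 1650000000 +0000"
after_split = commit_id(EMPTY_TREE, [], AUTHOR, later, MESSAGE)

# Resetting the committer date to the author date restores the ID.
restored = commit_id(EMPTY_TREE, [], AUTHOR,
                     with_author_date(AUTHOR, later), MESSAGE)

print(original == after_split)  # False
print(original == restored)     # True
```

In a real history the parent IDs feed into the hash as well, so a single changed commit also changes the ID of every commit built on top of it.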

Good luck!

hraban commented 1 year ago

I'm going to close this question pending further info, please feel free to continue the conversation and/or reopen it. 👍