sageserpent-open / kineticMerge

Merge a heavily refactored codebase and stay sane.
MIT License

Support merging of file joins with conflicts. #11

Open sageserpent-open opened 1 year ago

sageserpent-open commented 1 year ago

Suppose we start with two files, A.txt and B.txt in some base commit.

On our branch we join these two files together to get just A.txt, but with some editing going on in the resulting file prior to committing it. So two files have become one with an edit mixed in, say in the section originally from A.txt (but it could equally well be the section that used to be B.txt).

On their branch, they edit A.txt in the same location (equally, they could edit B.txt in the same location) and then commit.

We expect the join to be merged cleanly, so there should be a single file A.txt - but the merge should exhibit conflicts due to the edits made in the same source location across the two branches.

We can placate Git by writing A.txt with the necessary conflict markers in the working directory, but we know that IntelliJ uses the staging area as the input for its own merge resolution tool. What do we put in?

Investigation of an edit conflict mixed in with a simple non-divergent rename on their branch shows that Git loads the staging area for the renamed file path, but uses blobs corresponding to the original filename for stage 1 (the base commit) and stage 2 (our branch) - the blob for stage 3 under the renamed file path is the same as for stage 1.
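For anyone wanting to repeat that investigation, here is a minimal sketch of reading the staged entries back. It assumes the text comes from `git ls-files -u`, which prints one `<mode> <object> <stage>` record followed by a tab and the path for each unmerged entry. The function name `stages_by_path`, the path `Renamed.txt` and the blob ids in `SAMPLE` are placeholders made up for illustration, not output captured from a real repository; paths are assumed to be simple enough that Git does not quote them.

```python
from collections import defaultdict


def stages_by_path(ls_files_output: str) -> dict:
    """Group `git ls-files -u` style lines into {path: {stage: blob_id}}."""
    entries = defaultdict(dict)
    for line in ls_files_output.splitlines():
        if not line.strip():
            continue
        # Each line is "<mode> <object> <stage>\t<path>".
        meta, path = line.split("\t", 1)
        _mode, blob, stage = meta.split(" ")
        entries[path][int(stage)] = blob
    return dict(entries)


# Placeholder ids (not real hashes), arranged to mirror the observation above:
# the stage 3 entry carries the same blob as stage 1.
SAMPLE = (
    "100644 " + "1" * 40 + " 1\tRenamed.txt\n"
    "100644 " + "2" * 40 + " 2\tRenamed.txt\n"
    "100644 " + "1" * 40 + " 3\tRenamed.txt\n"
)

entries = stages_by_path(SAMPLE)
stage_3_matches_base = entries["Renamed.txt"][3] == entries["Renamed.txt"][1]
```

Comparing the ids per stage like this makes it easy to see which side contributed which blob under a given path.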

This makes sense, as Git is really tracking content; the filename at a given commit is a secondary concern. This is verified by making a commit that is a pure file rename - the blob is reused, so multiple file paths in different commits can share the same blob.
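The blob reuse follows from how a blob id is derived: it is a hash over a small header plus the file content, and the path plays no part in it. A minimal sketch, assuming a repository using the default SHA-1 object format; `git_blob_id` and the two dictionaries are hypothetical names used only for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 id Git gives a blob: sha1(b"blob <size>\\0" + content)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical snapshots either side of a pure rename: path -> content.
before_rename = {"A.txt": b"hello\n"}
after_rename = {"Renamed.txt": b"hello\n"}

before_id = git_blob_id(before_rename["A.txt"])
after_id = git_blob_id(after_rename["Renamed.txt"])

# The path never enters the hash, so both paths refer to the same blob.
same_blob = before_id == after_id
```

This should agree with what `git hash-object` reports for the same bytes.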

So in the example above, we can use the obvious blobs for A.txt in its original and joined form on our branch to make the stage 1 and stage 2 entries - but how about stage 3? We could do the simple thing and just use A.txt as it is on their branch, ignoring B.txt completely. That will certainly work if the conflicts lie in the source code that used to belong to A.txt in the base commit.

If, on the other hand, the conflict lies in the source code that used to belong to B.txt in the base commit, then this gets a bit messier - their edit won't show up in the stage 3 entry at all. Now if IntelliJ is looking only at the staged entries, then it will have no idea about the edits made to B.txt on their branch, and no conflict will be reported at all - it will simply see a nice clean contribution of our branch's joining of B.txt on to A.txt and will merge that without a second thought.

So ... if Git is relaxed about blobs versus filenames, it should be possible (actually, some experimentation shows it is possible, and will work with IntelliJ) to build a synthetic blob that represents the fusion of A.txt and B.txt on their branch - essentially this anticipates what the merge will do to their content. This would then allow the edit made on their branch to sit in that synthetic blob.
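To make the idea concrete, here is a rough sketch of the three staged entries for A.txt with a synthetic stage 3. Everything in it is an assumption for illustration: the file contents are invented, the synthetic blob is formed by plain concatenation of their A.txt and their B.txt (a real join may well need something smarter), and the helper names are hypothetical. It only computes what would be staged; the synthetic content would still have to be written to the object database (for example with `git hash-object -w`) before feeding the lines to `git update-index --index-info`, which accepts records of the form `<mode> <sha1> <stage>`, a tab, then the path.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 blob id as Git computes it (assumes the SHA-1 object format)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def index_info_lines(path: str, stage_blobs: dict, mode: str = "100644") -> str:
    """Format `<mode> <sha1> <stage>\\t<path>` records, one per stage."""
    return "".join(
        f"{mode} {blob} {stage}\t{path}\n"
        for stage, blob in sorted(stage_blobs.items())
    )


# Invented contents standing in for the commits discussed above.
base_a = b"alpha\n"                          # A.txt in the base commit
ours_joined = b"alpha (our edit)\nbeta\n"    # A.txt on our branch, after the join
theirs_a = b"alpha\n"                        # A.txt on their branch
theirs_b = b"beta (their edit)\n"            # B.txt on their branch

# Synthetic stage 3: anticipate the join on their side so their edit is visible.
synthetic_theirs = theirs_a + theirs_b

stages = {
    1: git_blob_id(base_a),
    2: git_blob_id(ours_joined),
    3: git_blob_id(synthetic_theirs),
}
index_info = index_info_lines("A.txt", stages)
```

With stage 3 built from A.txt alone, the edit made to B.txt on their branch would be absent; in the synthetic blob it is present for the merge tool to see.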

sageserpent-open commented 1 year ago

It's OK to use synthetic blobs for staging entries, because some editing of the content has to be done in order to finally complete the merge. Once the merge commit is made, that will be yet another blob, and the synthetic one will simply be left behind as an unreachable loose object.

In fact, running git prune will remove any such loose blob, so we aren't going to kill Git this way.

Either that, or the synthetic blob content will be accepted as the resolved merge, in which case, fine, we have a merge commit with a brand new blob that actually corresponds to 'real' code.