Closed sageserpent-open closed 8 months ago
Update: breaking out code motion across files into its own ticket: #26 .
It may be the case that divergent code motion can be accomodated, at least in some situations by allowing both moves to take place; this can also pick up edits from the other side of the move in both cases. It is then up to user to decide whether to delete either or both moves, or simply accept both.
This doesn't work with ambiguous matches because multiple edits may associate to several of the matches, but that can be left as unsupported.
In fact, having ambiguous matches with code motion can lead to a situation where a moved section is associated with multiple edits on the other side - so there are edit collisions on the same side. Stay away from this for now...
More pondering leads to this rule of thumb: if an all-sides match leads to a coincident edit in the merge with the left and right contributions to the match being moved to different locations, then we have two independent edits, each applying to the move on the opposite side.
Going further, if the two moves land in the same place and are picked up by the merge as either a coincident insertion or edit, then we have an edit collision that can be represented as an insertion / insertion conflict or an edit / edit conflict.
If the two moves land in the same place that happens to be a preservation or coincident insertion because of an ambiguous match, we don't care what happens as that's not supported yet, as long as the application doesn't crash.
Regarding edit collisions due to ambiguous matches, a quick-and-dirty approach is to store the rival edits as a set, so duplicated edits don't actually conflict. If a set turns out to have more than edit, they are simply concatenated in no particular order and a diagnostic can be issued. A deletion would form its own entry in the set too.
For now however, just store one edit or deletion for each side and issue a diagnostic if a collision occurs on the same side.
So in the implementation of the merge algebra for MergeResultDetectingMotion
, we have to build up a mapping from the moved element or elements to either an edit or a deletion - and we have to supply the moved elements from the match and ignore any left or right element passed in by the calling merge algoithm, because that favours the left side for preservations, coincident insertions and coincident deletions.
By using the left and right elements from the match, we can distinguish between edits and / or deletions associated with both sides.
Got tests for the simpler situations exercising MergeResultDetectingMotion
where there are just one edit or deletion to propagate in a move.
The more complex scenarios can wait (eg. a single edit on one side versus two distinct moves on the other side, or an edit on one side adjacent to a move destination versus a move on the other side).
The next thing to do is to wire in use of MergeResultDetectingMotion
into CodeMotionAnalysisExtension
and use the move mappings there - and write some example tests for CodeMotionAnalysisExtension
to shows moves, moved edits and moved deletions in operation...
Got a failure in CodeMotionAnalysisExtensionTest.codeMotion
, analysis follows...
BASE FILE:
package com.sageserpent.kineticmerge.core
import com.eed3si9n.expecty.Expecty
object ExpectyFlavouredAssert:
val assert: Expecty = new Expecty:
override val showLocation: Boolean = true
override val showTypes: Boolean = true
end assert
end ExpectyFlavouredAssert
LEFT FILE:
package com.sageserpent.kineticmerge.core
import com.eed3si9n.expecty.Expecty
object ExpectyFlavouredAssert:
val assert: Expecty = new Expecty:
// Swapped the next two lines around...
override val showTypes: Boolean = true
override val showLocation: Boolean = true
end assert
end ExpectyFlavouredAssert
RIGHT FILE:
package com.sageserpent.kineticmerge.core
import com.eed3si9n.expecty.Expecty
object ExpectyFlavouredAssert:
val assert: Expecty = new Expecty:
override val showLocation: Boolean = true
override val showTypes: Boolean = false // This edit should propagate.
end assert
end ExpectyFlavouredAssert
DESIRED MERGE BASED ON INTUITION:
package com.sageserpent.kineticmerge.core
import com.eed3si9n.expecty.Expecty
object ExpectyFlavouredAssert:
val assert: Expecty = new Expecty:
// Swapped the next two lines around...
override val showTypes: Boolean = false // This edit should propagate.
override val showLocation: Boolean = true
end assert
end ExpectyFlavouredAssert
Outcome seen as of Git commit SHA: 9c5dd269b1dd719b5878dedc03c823deb9933d16 ...
11:43:26.972 [main] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(*** STUNT DOUBLE ***,0,28) as it is common to all three sides.
11:43:26.974 [main] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(*** STUNT DOUBLE ***,28,7) into SectionImplementation(*** STUNT DOUBLE ***,28,11).
11:43:26.975 [main] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(*** STUNT DOUBLE ***,39,6) as it is common to all three sides.
11:43:26.975 [main] DEBUG c.s.kineticmerge.core.merge$ -- Edit conflict of SectionImplementation(*** STUNT DOUBLE ***,41,5) into SectionImplementation(*** STUNT DOUBLE ***,45,1) on the left and SectionImplementation(*** STUNT DOUBLE ***,41,12) on the right, coalescing with following left insertion of SectionImplementation(*** STUNT DOUBLE ***,46,7).
11:43:26.977 [main] DEBUG c.s.kineticmerge.core.merge$ -- Edit conflict of SectionImplementation(*** STUNT DOUBLE ***,41,5) into SectionImplementation(*** STUNT DOUBLE ***,46,7) on the left and SectionImplementation(*** STUNT DOUBLE ***,41,12) on the right, coalescing with following left insertion of SectionImplementation(*** STUNT DOUBLE ***,53,4).
11:43:26.977 [main] DEBUG c.s.kineticmerge.core.merge$ -- Edit conflict of SectionImplementation(*** STUNT DOUBLE ***,41,5) into SectionImplementation(*** STUNT DOUBLE ***,53,4) on the left and SectionImplementation(*** STUNT DOUBLE ***,41,12) on the right.
11:43:26.982 [main] DEBUG c.s.kineticmerge.core.merge$ -- Merge yielded:
MergeResultDetectingMotion(
coreMergeResult = MergedWithConflicts(
leftElements = Vector(
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 0, size = 28),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 28, size = 11),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 39, size = 6),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 45, size = 1),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 46, size = 7),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 53, size = 4)
),
rightElements = Vector(
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 0, size = 28),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 28, size = 11),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 39, size = 6),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 41, size = 12)
)
),
changesPropagatedThroughMotion = Map()
)
override val showTypes: Boolean =
was matched and was considered to be preserved in the same spot, contrary to the intuitive view that it would move on the left side towards the start of the file. It appeared as (..., 35, 6) on the base and right and (..., 39, 6) on the left.override val showLocation: Boolean = true
was considered to be moved to the end of the file instead, being dealt with as a left-edit at its source and with the moved section appearing in a conflict at the end. It appeared as (..., 28, 7) on the base and right and (..., 46, 7) on the left.true
for override val showTypes: Boolean = true
on the base and right was expected to match as a base + right match, being edited into false
. This didn't happen (and as a single token, wouldn't be picked up anyway as a match), but puzzlingly, the merge algorithm didn't get the chance to make a last-minute fallback match either...true<newline>end assert<newline>end ExpectyFlavouredAssert
became a section in its own right (..., 41, 5) ...true<newline>
and then end assert<newline>end ExpectyFlavouredAssert
became two sections (..., 45, 1) and (..., 53, 4) with the moved section (..., 46, 7) sitting between them ...false // This edit should propagate.<newline>end assert<newline>end ExpectyFlavouredAssert
came in as a final section (..., 41, 12).So we got a conflict at the end.
The minimum sure-fire match size was 5, and as there was just the one file, no smaller matches would have been considered. So the obvious match of end assert<newline>end ExpectyFlavouredAssert
across all sides didn't happen as that would be four tokens long with whitespace condensed into the preceding token.
The final section on each side was generated as a filler, hence the lack of consistency between them.
What happens with a minimum match size of 4? (Git commit SHA: 5a9a82be8b8efb3cbebd369a28f0a77deafd25c0)
Now we see an all-sides match for end assert<newline>end ExpectyFlavouredAssert
, good.
13:26:03.567 [main] DEBUG c.s.k.core.MappedContentSources -- Filling gap on side: base at path: *** STUNT DOUBLE *** prior to following section with: SectionImplementation(*** STUNT DOUBLE ***,41,1).
13:26:03.567 [main] DEBUG c.s.k.core.MappedContentSources -- Filling gap on side: left at path: *** STUNT DOUBLE *** prior to following section with: SectionImplementation(*** STUNT DOUBLE ***,28,11).
13:26:03.568 [main] DEBUG c.s.k.core.MappedContentSources -- Filling gap on side: left at path: *** STUNT DOUBLE *** prior to following section with: SectionImplementation(*** STUNT DOUBLE ***,45,1).
13:26:03.568 [main] DEBUG c.s.k.core.MappedContentSources -- Filling gap on side: right at path: *** STUNT DOUBLE *** prior to following section with: SectionImplementation(*** STUNT DOUBLE ***,41,8).
13:26:03.618 [main] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(*** STUNT DOUBLE ***,0,28) as it is common to all three sides.
13:26:03.620 [main] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(*** STUNT DOUBLE ***,28,7) into SectionImplementation(*** STUNT DOUBLE ***,28,11).
13:26:03.620 [main] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(*** STUNT DOUBLE ***,39,6) as it is common to all three sides.
13:26:03.621 [main] DEBUG c.s.kineticmerge.core.merge$ -- Right edit of SectionImplementation(*** STUNT DOUBLE ***,41,1) into SectionImplementation(*** STUNT DOUBLE ***,41,8).
13:26:03.621 [main] DEBUG c.s.kineticmerge.core.merge$ -- Left insertion of SectionImplementation(*** STUNT DOUBLE ***,46,7).
13:26:03.622 [main] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(*** STUNT DOUBLE ***,53,4) as it is common to all three sides.
13:26:03.625 [main] DEBUG c.s.kineticmerge.core.merge$ -- Merge yielded:
MergeResultDetectingMotion(
coreMergeResult = FullyMerged(
elements = Vector(
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 0, size = 28),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 28, size = 11),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 39, size = 6),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 41, size = 8),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 46, size = 7),
SectionImplementation(path = "*** STUNT DOUBLE ***", startOffset = 53, size = 4)
)
),
changesPropagatedThroughMotion = Map()
)
That is fully merged, seems promising ... but the test still fails!
Ah - the test expectations were too rigid - was using default equality to compare tokens, which takes whitespace into account. The tests were cutover to use the whitespace-insensitive Token.equality
and after backpatching the test to check whether the changes in expectations had any effect on legacy testing, this was merged in via Git commit SHA: a694056de420681c4de0f20e65e183e77c43482a.
Test now passes.
This is good, but I'm uneasy about how much has been proven.
Another wrinkle is that the nice intuitive picture of a single section moving on one side of the merge and being edited on the other generally won't be the case unless the minimum match size is quite large.
What will happen instead is that there will be three sections on each side - a leading and trailing one that are both matched on all sides, moving in tandem, and a middle one that has a pairwise match between the base and the side with the move - and an unmatched section that is the precisely the edit change. I expect this to work with the current implementation as of a694056de420681c4de0f20e65e183e77c43482a, but this needs testing...
As of Git commit SHA: d3e02cbe84038cda1245293b526447d8f96eb47c, CodeMotionAnalysisExtensionTest.codeMotion
has been bolstered to show the propagation of an edit through a section move, and it passes.
Code motion has arrived!
Pushing all further work on supporting edge cases for propagating edits and deletions through moves into a separate ticket #28 .
Favouring moves of small sections over moves of large sections by modifying LongestCommonSubsequence
results in a merge failure in CodeMotionAnalysisExtensionTest.codeMotion
, investigating this in case the change in behaviour of the LCS is exposing a fundamental defect in the downstream merge...
As of Git commit SHA: 9a2fddb45e1231b24a167707c55bf7378a4edc31 the longest common subsequence algorithm favours subsequences with longer section when choosing between longest common subsequences of the same length.
This is reflected in the downstream merge, where it is the smaller sections that get picked up as code motion when they interchange with larger sections.
The test failure noted above in CodeMotionAnalysisExtensionTest.codeMotion
is due to the lack of support for edit anchoring. What happened was that a large construct was moved on one side and a small suffix of it edited on the other - the unchanged prefix moved cleanly through the merge, being picked up as an all-sides match, but the small suffix was not picked up as a pairwise match on the base and on the other side from the edit.
This led to the suffix moving on the other side from the edit as is, and an ordinary delete versus edit conflict at the source of the move. Had the suffix been large enough to match, then this would have resulted in a move of the suffix in tandem with the prefix; the edit being propagated through the move.
Edit anchoring is a possible way of dealing with this - unmatched sections are considered to be anchored to adjacent matched sections, so as long as the matched sections move in tandem, they could pick up the unmatched section as a move, thus allowing edit propagation.
A new ticket #29 has been made for edit anchoring; for now, we have something worth getting out there...
This went out in release 1.1.0, Git commit SHA: 26a9c9a64451864b3373536b8f8d934ca869a71c.
Evidence (screencast of a demonstration of Kinetic Merge):
https://github.com/sageserpent-open/kineticMerge/assets/1765601/2d676e20-dce0-441a-b4a2-fcbaa8aff35d
https://github.com/sageserpent-open/kineticMerge/assets/1765601/783172ef-7688-45ab-83ab-2e0aa633ce01
https://github.com/sageserpent-open/kineticMerge/assets/1765601/370ddbce-98c7-42d2-8c81-f5c2756d051d
Closing now that evidence is uploaded.
This follows on from #1 , adding in the requirement dropped from that ticket:
Given a Git repository of files in multiple directories that has been worked on in two branches, provide a command line utility that merges one branch into another, taking into account code motion inside files.
Any divergent code motion is treated as an error and causes the merge to be aborted - the repository should be left as it was prior to the merge being attempted.
Code motion is only considered inside a file on a file-by-file basis - so when a method is extracted and placed elsewhere in the same file, it should be picked up as code motion in this ticket. Similarly for code rearrangement due to method sorting in something like IntelliJ.
Moving a class into its own file would not be picked up in this ticket, that job is for #26.