Closed sageserpent-open closed 2 weeks ago
The job here is to add a failing test that exposes any such problem in CodeMotionAnalysisExtensionTest
, then to implement to fix it.
MatchesContextTest
will probably need some work too as the implementation proceeds.
While we're at it, maybe it's time to join up the tracking of moves with change migration?
I'm not sure if this is feasible, though, and there is some code here that is probably saving the day for the change migration situation - perhaps that is good enough?
Idea: cutover the association of changes with potential destination keys to being an association with potential source keys.
Generalise the association, so that even straight moves without any changes have their sources added as a key without a change (neither deletion nor edit, so this becomes an optional associated value).
This allows potential move destinations to be tied back to their sources by looking at the relevant ambiguous matches and seeing if a corresponding key exists in the association.
Issue #91 raised this:
The assertion occurs after
CodeMotionAnalysis.of
has completed; it is inCodeMotionAnalysisExtension.merge
after the per-file merges have completed.As such, it can be picked up as a prelude to working on #90.
For now 78c6ba0 disables that postcondition so that the third point from the previous comment can be addressed...
Debugging this assertion reveals a malformed MoveDestinations
that contains a single destination on the right hand side without any source:
MoveDestinations(Set(),Set(),Set(Section(
label = "THEIRS: 7b4fd68698f8ed6b13b067cf87f6abb25db52da1",
path = /Users/gerardmurphy/IdeaProjects/kineticMerge-dogfood-examples/src/test/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysisExtensionTest.scala,
start = TextPosition(line = 629, characterOffset = 48),
onePastEnd = TextPosition(line = 636, characterOffset = 14),
startTokenIndex = 2541,
sizeInTokens = 18,
content summary = """).get
),
... (
contentsByPath = Map(
"""
)),Set())
This is referenced in MoveDestinationsReport.moveDestinationsByMatches
with the key:
LeftAndRight(Section(
label = "OURS: 5dbb4faca2a2df4351c9f4892da59601913489f3",
path = /Users/gerardmurphy/IdeaProjects/kineticMerge-dogfood-examples/src/test/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysisExtensionTest.scala,
start = TextPosition(line = 736, characterOffset = 61),
onePastEnd = TextPosition(line = 742, characterOffset = 12),
startTokenIndex = 3543,
sizeInTokens = 18,
content summary = """).get),
... (
contentsByPath = Map(
"""
),Section(
label = "THEIRS: 7b4fd68698f8ed6b13b067cf87f6abb25db52da1",
path = /Users/gerardmurphy/IdeaProjects/kineticMerge-dogfood-examples/src/test/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysisExtensionTest.scala,
start = TextPosition(line = 629, characterOffset = 48),
onePastEnd = TextPosition(line = 636, characterOffset = 14),
startTokenIndex = 2541,
sizeInTokens = 18,
content summary = """).get
),
... (
contentsByPath = Map(
"""
))
So what happened to the left hand side of the left-right match?
To start with, consider the right - that was picked up as a right edit, being part of a coalesced run:
17:23:13.584 [io-compute-3] DEBUG c.s.kineticmerge.core.merge$ -- Right edit of Section(
label = "BASE: 1e55332f",
path = /Users/gerardmurphy/IdeaProjects/kineticMerge-dogfood-examples/src/test/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysisExtensionTest.scala,
start = TextPosition(line = 666, characterOffset = 55),
onePastEnd = TextPosition(line = 669, characterOffset = 33),
startTokenIndex = 3266,
sizeInTokens = 10,
content = """expected)))
mergeResult match
case FullyMerged(result"""
) into Section(
label = "THEIRS: 7b4fd68698f8ed6b13b067cf87f6abb25db52da1",
path = /Users/gerardmurphy/IdeaProjects/kineticMerge-dogfood-examples/src/test/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysisExtensionTest.scala,
start = TextPosition(line = 629, characterOffset = 48),
onePastEnd = TextPosition(line = 636, characterOffset = 14),
startTokenIndex = 2541,
sizeInTokens = 18,
content summary = """).get
),
... (
contentsByPath = Map(
"""
), coalescing with following insertion of Section(
label = "THEIRS: 7b4fd68698f8ed6b13b067cf87f6abb25db52da1",
path = /Users/gerardmurphy/IdeaProjects/kineticMerge-dogfood-examples/src/test/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysisExtensionTest.scala,
start = TextPosition(line = 636, characterOffset = 14),
onePastEnd = TextPosition(line = 636, characterOffset = 47),
startTokenIndex = 2559,
sizeInTokens = 6,
content = "renamedPath -> tokens(leftContent"
) on the right.
The left hand side of the match however was picked up as an unrelated coincident insertion:
17:23:13.594 [io-compute-3] DEBUG c.s.kineticmerge.core.merge$ -- Coincident insertion of Section(
label = "OURS: 5dbb4faca2a2df4351c9f4892da59601913489f3",
path = /Users/gerardmurphy/IdeaProjects/kineticMerge-dogfood-examples/src/test/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysisExtensionTest.scala,
start = TextPosition(line = 736, characterOffset = 61),
onePastEnd = TextPosition(line = 742, characterOffset = 12),
startTokenIndex = 3543,
sizeInTokens = 18,
content summary = """).get),
... (
contentsByPath = Map(
"""
) on the left and Section(
label = "THEIRS: 7b4fd68698f8ed6b13b067cf87f6abb25db52da1",
path = /Users/gerardmurphy/IdeaProjects/kineticMerge-dogfood-examples/src/test/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysisExtensionTest.scala,
start = TextPosition(line = 832, characterOffset = 39),
onePastEnd = TextPosition(line = 839, characterOffset = 8),
startTokenIndex = 3279,
sizeInTokens = 18,
content summary = """).get
),
... (
contentsByPath = Map(
"""
) on the right.
Regarding that last comment, it seems that not only can preservations be accidentally picked up by the move destination machinery, so can coincident insertions. Now it can be correct for a coincident insertion to form a move destination - a coincident move, to be specific, but here we have some unrelated right edit that is ambiguous as a match with the one constituting a plain coincident insertion that in itself is a degenerate move.
It doesn't seem right to consider the right edit as being a move destination when the other end of the match has been claimed by the coincident insertion.
A further oddity is why degenerate moves are even modelled - if there is no source for the move, we would not consider a left- or a right-insertion is being a degenerate move, so why consider coincident insertions (or edits) as defining a degenerate move if no source can be found? We can't propagate changes this way, nor can we anchor. What was the point of this?
As of 3d3ede67e3afb28471ac194c8177f5424532a95d, MoveDestinations
has been cut over to not allowing degenerate moves in its model, so all moves must have a source. This means that the coincident insertion and the right edit discussed amove are no longer considered to be moves, as neither have a source.
The pesky invariant and then later post-condition mentioned in #91 has been removed altogether - there is no longer anything to police, as the invariant in MoveDestinations
has been tightened to ensure there is always at least one source location.
The problematic merge from #91 now goes through - albeit very slowly, its performance is not in the purview of this ticket.
So back to main thrust of this ticket...
Now have a failing test with a preservation being mixed up with a genuine move in 5735cc4ac28427ee03e6d6745b37b90dc759ab2d...
TODO:
CodeMotionAnalysisExtension.merge
accordingly and gets its tests (and those in MainTest
) passing again.As of 73c3260bd55ec89aef4522c66fe4a96cc62edb55, all the tests pass bar CodeMotionAnalysisExtensionTest.codeMotionAmbiguousWithAPreservation
.
There is also some weirdness in how the move destinations report is populated, where there are multiple move destinations that share the same sources. I suspect this has been the case for some time; this is due to the code in MatchesContext
working back from a destination element to the matches that involve it - that's fine in itself, but means that if a source or sources gets moved to multiple destinations on the same side, then each destination will possess its own match (or ambiguous matches) that in reality are ambiguous with the other destination's matches too.
Given how MoveDestinations
tries to fuse divergent destinations as well as coincident destinations into the same MoveDestinations
instance sharing a common source or sources, any partitioning seems wrong...
Yes, looking on the main branch at c0087160df5be776af04479067b519b0e4525236 reveals the same behaviour - move destinations sharing the same sources are being partitioned. Let's fix that on the side and merge in...
Trying to fix the partitioning of move destinations sharing the same sources leads to 8b1217f125919e3b56fe8287ee38b546dca5cd2c. This introduced a breaking invariant and then fixes that, but has broken other tests.
Merging that work with the branch for this ticket also changes the outstanding test failure in CodeMotionAnalysisExtensionTest.codeMotionAmbiguousWithAPreservation
to something else.
Need to stop and think about this all...
Currently looking at 064605638faf519d2fbf694f74b18cab62fc44a1, which is the merge of both branches, namely changes (and straight moves) being associated with sources, and the move destinations report being subject to an invariant that distinct move destinations instances cannot share sources.
What's gone wrong on the second branch is this code that works out the matches that form a key to a move destination.
The idea was to get back to the sources and then fan back out to grab all the matches, thus avoiding partitioning. The problem is that the fan-out pulls in all matches, and these aren't necessarily genuine code motion. This is a similar problem to the original one that motivated this ticket, only this time it's occurring in the code that notes the move destinations, as opposed to noting the sources of the moves and any content that might need to be migrated!
What I've learned so far...
MoveDestinationsReport.leftMoveOf/rightMoveOf/coincidentMoveOf
may define the source of a move - but only if a suitable destination or destinations can be found. The existence of a match or matches involving the 'moved' element is necessary for a move, but not sufficient. It may be the case that the match is ambiguous with a preservation and nothing else, in which case we're seeing a deletion of some sort. (BTW - those method names are misleading anyway. Even in the code on the main branch, they are called speculatively to pick up potential moves.)As of d1973bb25fbbea61ba6f8359f04a7ad42b74ec55, got a work-in-progress implementation that passes all tests in MainTest
but fails several in CodeMotionAnalysisExtensionTest
, including the all-important CodeMotionAnalysisExtensionTest.codeMotionAmbiguousWithAPreservation
.
However, that last test does show the correct behaviour of keeping preservations and moves separate as regards edit / delete migrations, so this is progress.
What is bad is:
MoveDestinationsReport.evaluateSpeculativeSourcesAndDestinations
should spoon-feed the anchors to the insertion migration machinery, as it has a much better idea of what's going on. That would lead nicely to #83 too.What is good is:
MergeResultDetectingMotion
's delegating merge algebra has already gone and done a two-way resolution, which makes the substitution process more complex. If only the merge could accommodate unresolved coincident entries...Regarding that last point about coincident entries being represented in unresolved form, perhaps we could represent then as insertion versus insertion conflicts or edit versus edit conflicts in the implementation of MergeResultDetectingMotion
?
If so, that would allow three-way resolution to be done in evaluateSpeculativeSourcesAndDestinations.evaluateSpeculativeSourcesAndDestinations
.
I'm not sure whether to add a pair of substitutions for the left- and right-side contribution to the three-way resolution there, and then to add additional pairs of substitutions referring to the two-way resolution for all the speculative coincident merge destinations that didn't turn out to be moves after all?
This may need breaking out into its own ticket...
Finally got the dreaded CodeMotionAnalysisExtensionTest.codeMotionAmbiguousWithAPreservation
to pass as of 6a45339e22b04ca9471fea9c1aca58b1024d0f8f, but there straggler test failures in both MainTest
and CodeMotionAnalysisExtensionTest
, concerning insertion migration.
The implementation is a bit hacked, and evaluateSpeculativeSourcesAndDestinations.evaluateSpeculativeSourcesAndDestinations
should probably be recast to start with the sources to find matches, whittling these down against the destinations. That would allow a cleaner treatment of coincident move destionations.
There is still the outstanding resolution work to finesse, but that can go into its own ticket.
The question is, given how quintessentially nasty the insertion migration code is, is it worth making a final push on this ticket to get the tests passing, or should this segue into #83?
The story so far as of 99e73229c2d5c2b68b25cd1e17cf820744f9fc63.
Got MainTest
to pass in its entirety, but CodeMotionAnalysisExtensionTest
is falling down on the tests that have intra-file versus inter-file code motion with associated edits. The idea was for the insertion migration to pick up the edit instead and then suppress the original edit migration, at least I think that is what CodeMotionAnalysisExtensionTest.furtherMigrationOfAMigratedEditAsAnInsertion
relies on.
It seems that the idea of evaluating speculative content migrations by source against speculative move destinations is sound in itself, but this doesn't play well at all with the insertion migration. This may be down to the way that divergent moves are dealt with (because we have an intra- versus inter-file move across the left- and right-side), or perhaps the awkward treatment of the speculative conflicts where a speculative edit migration is picked up isn't playing well with insertion migration.
Another gotcha lurking in the wings as how coincident insertions / edits are dealt with - resolution is applied on the fly in a desparate attempt to conform to MergeResult
wanting to have sequences of plain elements in it. There is no way of representing two or three elements in unresolved form in the merge result, and I'm starting to think this would be much better when #41 is worked on, because that won't just choose a section when resolving them, it will synthesize one to pick up the whitespace adjustment. That won't play well with the existing code's approach of searching-and-replacing sections to perform edit substitution, as well as suppressing the migrated edit's original location (and migrated insertion's original location too), as well as suppressing destinations for migrated edits competing with an insertion migration.
Similarly, conflicts are a bit hokey - each side gets its own linear sequence of sections, but we know from the conflict handling in the merge algebra exactly where the conflicting runs are. This isn't quite the same as handling unresolved elements, but again feels messy.
Finally, I wonder whether the merge algebra from MergeResultDetectingMotion
should be split into two algebra implementations, one that drives evaluation of moves, and another that forwards on to the core merge algebra. The idea is to perform the per-file merges, evaluate the moves on aggregate, then tell the forwarding merge algebras to delegate to the core merge algebra, but with the full picture of the evaluated moves. This brings us back in spirit to the old merge algebra in MatchesContext
that could make binding decisions as to whether to record a conflict as is, or as being resolved. That all worked nicely with the old code!
As of ec9fef6afcf3f78b01ef768cf63a47463fb100fd, there is just one test case failing in CodeMotionAnalysisExtensionTest.codeMotionAcrossAFileRename
, and that concerns this situation:
// BASE - ORIGINAL FILE.
protected val heyDiddleDiddleInModernForm: String =
"""
|Hey diddle diddle,
|The cat and the fiddle,
|The cow jumped over the moon;
|The little dog laughed
|To see such sport,
|And the dish ran away with the spoon.
|""".stripMargin
// LEFT - RENAMED FILE.
protected val heyDiddleDiddleInPsychoticForm: String =
"""
|Hey diddle diddle,
|The cat and the fiddle,
|The cow, laughing madly, ran away with the spoon and then jump'd over the moon;
|The little dog laugh'd
|To see such craft,
|And the fork ran away with the spoon.
|""".stripMargin
// RIGHT - ORIGINAL FILE.
protected val heyDiddleDiddleWithIntraFileMove: String =
"""
|Hey diddle diddle,
|The cat and the fiddle,
|The cow jumped!
|The little dog laughed
|To see such sport,
|And the dish ran away with the spoon over the moon.
|""".stripMargin
// EXPECTED - RENAMED FILE.
protected val heyDiddleDiddleInPsychoticFormExpectedMerge: String =
"""
|Hey diddle diddle,
|The cat and the fiddle,
|The cow, laughing madly, ran away with the spoon and then jump'd!
|The little dog laugh'd
|To see such craft,
|And the fork ran away with the spoon over the moon.
|""".stripMargin
What the merge comes up with is:
Hey diddle diddle,
The cat and the fiddle,
The cow, laughing madly, !
ran away with the spoon over the moonand then jump'd !
The little dog laugh'd
To see such craft,
And the fork ran away with the spoon over the moon.
The thing is, bar the duplication of the !
, the actual result seems better than then expected one - the text over the moon
is inserted after ran away with the spoon
, and that anchor is moved on the left as an inter-file move, but with two destinations.
So why shouldn't the inserted text be migrated to the two anchor move destinations? Yes, it is part of a divergent insertion on the left and right, but so what? We no longer model such things as being interesting as they have no source nor associated migration content, so they are not considered special.
The treatment of the !
is part of the general hokum that is the existing insertion migration implementation.
It's time to admit this has gone far enough.
The oustanding failing test case has been disabled, closing this to continue the work in the context of #83.
This came up when thinking about how to implement #83 .
Change migration relies on
MatchesContext
to recognise the source of moves when there are deletions and edits to be migrated.MatchesContext
also recognises the destination of moves in the context of handling insertions and edits. Amongst other things, this forms of part of finding anchors as a prelude to performing insertion migration. The move destinations report is also built up from this.So far, so good - only both areas of code examine matches to complete the picture of either where the move is going (for change migration) or where the move has come from (for move destinations).
This isn't always correct when there are ambiguous matches; yes, it is expected to have a valid situation where a change might be migrated to several destinations (all on the same side of the overall merge), or to have several anchors with the same content (all on the same side of the overall merge) being moved to the same location. A mix of these is possible with multiple sources and destinations too.
Where it falls down is when there is an obvious move of content from a source location to a destination location - either intra-file or inter-file, but elsewhere there is a simple preservation of content that matches the moved content. We do not want to consider the parts of the preservation is being candidates for a move source or destination.