Closed sageserpent-open closed 5 months ago
Exhibit A from merging the same repository as in #23.
Using the default match threshold results in an odd merge that on reflection may still be an orphaned text merge:
The text whatever
comes from a completely unrelated location in the file at around line 222, but has ended up at line 371 in a merge conflict.
Using an explicit threshold via --match-threshold=10%
leads to the same outcome.
Repeating the merge with --match-threshold=0.01
yields this:
At least the rename is in the right place, sort of...
Repeating the merge with --match-threshold=0.007
yields the obviously correct merge. It should be shaken, not stirred, apparently.
Enabled logging with a temporary patch...
(Incidentally, might it be useful to switch on logging permanently for the application when run loose
from an IDE, but suppressed when run via the packaged Coursier executable?)
Running with --match-threshold=0.01
yields:
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Minimum match window size across all files over all sides: 507
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Minimum sure-fire match window size across all files over all sides: 507
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Maximum match window size across all files over all sides: 5076
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 2791, number of matches is: 835, looking for a more optimal match with estimated window size of: 3625.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 3625, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 2065, number of matches is: 807, looking for a more optimal match with estimated window size of: 2871.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 2871, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 1097, number of matches is: 354, looking for a more optimal match with estimated window size of: 1450.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 1450, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 742, number of matches is: 2, looking for a more optimal match with estimated window size of: 743.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 743, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 565, number of matches is: 32, looking for a more optimal match with estimated window size of: 596.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 596, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-15] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search for matches whose size is no less than the sure-fire match window size of: 507 has terminated; results are:
Running with --match-threshold=0.007
yields:
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Minimum match window size across all files over all sides: 35
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Minimum sure-fire match window size across all files over all sides: 35
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Maximum match window size across all files over all sides: 5076
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 2555, number of matches is: 1071, looking for a more optimal match with estimated window size of: 3625.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 3625, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 1829, number of matches is: 1043, looking for a more optimal match with estimated window size of: 2871.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 2871, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 743, number of matches is: 708, looking for a more optimal match with estimated window size of: 1450.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 1450, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 742, number of matches is: 2, looking for a more optimal match with estimated window size of: 743.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 743, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 388, number of matches is: 209, looking for a more optimal match with estimated window size of: 596.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 596, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 174, number of matches is: 127, looking for a more optimal match with estimated window size of: 270.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 270, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 152, number of matches is: 52, looking for a more optimal match with estimated window size of: 203.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 203, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 118, number of matches is: 16, looking for a more optimal match with estimated window size of: 133.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 133, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 83, number of matches is: 10, looking for a more optimal match with estimated window size of: 92.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 92, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 48, number of matches is: 6, looking for a more optimal match with estimated window size of: 51.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 51, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an improved match at window size: 42, number of matches is: 8, looking for a more optimal match with estimated window size of: 49.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search has found an optimal match at window size: 49, number of matches is: 1, restarting search to look for smaller matches.
[io-compute-12] INFO com.sageserpent.kineticmerge.core.CodeMotionAnalysis$ - Search for matches whose size is no less than the sure-fire match window size of: 35 has terminated; results are:
There were a lot of missed opportunities when the match threshold was 10%.
Given that the low level merge algorithm knows nothing about the fine structure of a section (although it can do a simple comparison of one unmatched section against another), it perhaps not so surprising that it gets confused with long runs of unmatched content.
There is a odd situation where if the merge algorithm is presented with just tokens, then it a) chokes on the number of tokens and b) is perhaps suboptimal at that level of detail, which was presented as the cause of the problem dealt with in issue #19. That led to the full implementation of CodeMotionAnalysis
and cut over of the merge to work at section level.
However, a lot of work was done on the low-level merge algorithm in that ticket subsequent to the cutover to section level merging, so maybe the initial diagnosis wasn't correct.
Furthermore, it may be the case that it is optimal to work at section level from the point of view of not choking the merge with umpteen tokens, but perhaps the unmatched sections should be exploded into smaller ones, possibly right down to single token sections - that might let the merge algorithm see inside the detail in the unmatched content and improve the matches there.
Trying this out in a spike yielded very disappointing results in the tests, and caused a stack overflow when running the application on the example above.
Perhaps the default match threshold should be estimated from the file sizes so as to ensure that all files see potential matches of 20 tokens or more?
Or could the presence of overly large unmatched sections trigger a diagnostic telling the user to decrease the match threshold down to some suggested level?
Maybe work in terms of a minimal match size across all files and forget about the match threshold? That would simplify the match search logic into the bargain.
Something completely different - if two sections across the left and right are not equal in terms of being part of the same match or having the same content, could they be merged at the token level by a recursive invocation of the merge algorithm on the section contents?
CodeMotionAnalysisTest.matchingSectionsAreFound
has had its expectations cut over so that they check the consistency of matches separately from the coverage of the expected common sequences - this allows a more flexible approach to the latter that can cope with an unexpectedly larger match being found that subsumes a common sequence, as well as runs of abutting small matches that combine to cover the a common sequence.Using a match threshold of 0.01 yields the following merge:
21:11:42.498 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,0,743) as it is common to all three sides.
21:11:42.499 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,743,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,743,1).
21:11:42.499 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,744,596) as it is common to all three sides.
21:11:42.500 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1340,110) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1340,111).
21:11:42.500 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Conflict between left deletion of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1450,1) and its edit on the right into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1450,1).
21:11:42.501 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1451,51) as it is common to all three sides.
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1502,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1502,1).
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1503,203) as it is common to all three sides.
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1706,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1706,1).
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1707,270) as it is common to all three sides.
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1977,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1977,1).
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1978,92) as it is common to all three sides.
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2070,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2070,1).
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2071,133) as it is common to all three sides.
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2204,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2204,1).
21:11:42.502 [io-compute-12] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2205,2871) as it is common to all three sides.
Note the long edit on the left from start offset of 1340, size 110 to make a new sequence from start offset 1340, 111.
This is followed by a conflict between a left deletion of 1450, size 1 and its edit on the right into 1450, size 1.
Possibly this is the whenever
token being edited into whatever
on the right, preceded by a long stretch of tokens that were the same on the base and right, but which didn't match on the left (probably because of the test method renames). The whenever
was considered part of the left-hand sides version of this long stretch, hence the size of 111 instead of 110. So whenever
was considered to have vanished on the left and been edited into whatever
on the right.
Using match threshold of zero by contrast yields the following merge:
21:28:42.347 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,0,743) as it is common to all three sides.
21:28:42.347 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,743,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,743,1).
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,744,596) as it is common to all three sides.
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1340,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1340,1).
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1341,28) as it is common to all three sides.
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1369,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1369,1).
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1370,30) as it is common to all three sides.
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1400,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1400,1).
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1401,49) as it is common to all three sides.
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Right edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1450,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1450,1).
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1451,51) as it is common to all three sides.
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1502,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1502,1).
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1503,203) as it is common to all three sides.
21:28:42.348 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1706,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1706,1).
21:28:42.349 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1707,270) as it is common to all three sides.
21:28:42.349 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1977,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1977,1).
21:28:42.349 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,1978,92) as it is common to all three sides.
21:28:42.349 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2070,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2070,1).
21:28:42.349 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2071,133) as it is common to all three sides.
21:28:42.349 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Left edit of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2204,1) into SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2204,1).
21:28:42.349 [io-compute-7] DEBUG c.s.kineticmerge.core.merge$ -- Preservation of SectionImplementation(/Users/gerardmurphy/IdeaProjects/americium-242m/src/test/java/com/sageserpent/americium/java/junit5/TrialsApiTests.java,2205,2871) as it is common to all three sides.
This time, the long section from start offset 1340 is broken down into lots of smaller left edits and preservations, including a right edit from start offset of 1450, size 1 to make a new single token sequence from start offset 1450, 1. This is preceded by a preservation from start offset 1401, size 49. So in this situation the more precise breakdown exposes the edit on the right.
Looks like the same long base and right pairwise match with base start offset of 0 and size 1450 was made in both situations, but this was broken down crudely with the higher match threshold by a couple of large all-sides matches, resulting in a base and right pairwise match at base start offset 1340, size 110. With the zero match threshold the breakdown was finer, resulting in amongst other things a base and right pairwise match at base start offset 1400, size 1 - which was followed by (and due to) an all-sides match at base start offset 1401, size 49.
That all-sides match would not have been found in the first situation, as the minimal window size there was 50!
The manually tested merge runs in a reasonable amount of time with a match threshold of zero, and this default doesn't break any tests. So the default match threshold has been moved down to zero, on the provision that monograph and digraph matches are now forbidden.
This is the state of play as of Git commit SHA d0677159f98a6e76dd2a704a5653be4834f1dde2.
Time to ship it!
Maybe not quite yet ... the current default match threshold of zero relies on their being a backstop minimal window size of 3 to prevent terrible performance / odd merges.
I'm no longer so sure that the concept of a match threshold is the best approach - it was inspired by Git's similarity setting, but when Kinetic Merge uses it, it means that the minimal window size varies from file to file - that makes it hard for an end user to know what it is they're getting. If say a piece of code moves from a very small file on one branch to a much bigger file on another (or the original file accumulates more content on another), then that code may pass the match threshold on the one branch and not on the other!
Another approach to tweaking the merge is to specify the backstop minimal window size as a command line parameter in its own right, setting the default to 3. Now if that is combined with the default match threshold of zero, then it will block all small-fry matches which due to the linear search are usually quite costly.
Although the window size is expressed in terms of tokens and not characters, as long as the user treats the setting as a magic adjustment, it may prove more workable.
The match threshold concept will be left as is, but will be overridden by the backstop if the threshold would lead to smaller window sizes. We'll see which provides the best form of control...
Introduced command line flag --minimum-match-size
, defaults to 4 in Git commit SHA: 30274b30fb95466648dac45727b4f0637e52392f.
Went out in release 0.2.0, Git commit SHA: 3c9f090f79de1bdc82c079c16e122f668685df29.
This follows on from #23.
Producing the manual test evidence for that ticket revealed some odd looking merges that didn't seem to have anything to do with the orphaned text problem that was the concern of that ticket.
So this ticket is about investigating those odd merges.
Are they bugs in the implementation? Or perhaps the implementation fufils its requirements, but the requirements themselves allow these odd cases?
Either way, the user expects to see a nice intuitive merge with no surprises.