Closed sageserpent-open closed 6 months ago
Looking at the breakage of MainTest.cleanMergeOfAFileModifiedInBothBranches
, something is wrong with merge
.
CodeMotionAnalysis
does the right thing (although more on that later), it takes the following content:
BASE
Hello, my old mucker!
Sections:
A - Hello, my old mucker!
6 tokens.
LEFT
Hello, my old mucker!
Pleased to see you, old boy.
Sections:
A - Hello, my old mucker!
6 tokens.
B - Pleased to see you, old boy.
8 tokens.
RIGHT
Hello, all and sundry.
Sections:
C - Hello, all and sundry.
6 tokens.
It matches A across the base and left side.
Note how the token Hello
did not form an all-sides match, as CodeMotionAnalysis
has deliberately engineered logic to stop an all-sides match when two of its putative sections are subsumed within a longer pairwise match - and this is exactly what is happening here.
Now, part of the justification for allowing a longer pairwise match to suppress a shorter all-pairs match is what the merge is expected to do. If we have a merge of:
BASE: A LEFT AB RIGHT C
Then I thought that this is interpreted as an edit of A into C on the right, plus a following insertion of B after A on the left - so the expected merge would be CB.
Applying that logic to presumably make it irrelevant as to whether we have a long pairwise match on Hello, my old mucker!
being edited into Hello, all and sundry.
, versus a short all-sides match of Hello,
being preserved and a short pairwise match of my old mucker!
being edited into all and sundry.
. The result is still the same.
So my suspicions about CodeMotionAnalysis
are allayed for now. It will be interesting though to see how this suppression of all-sides matches works out in the wild, but let's leave that for now.
Where it gets fishy is that the actual merge isn't CB - instead we see a conflict, the left hand being B and the right hand being C.
The merge algorithm has gone for a deletion of A on the right, followed by an insertion of C - the deletion goes through cleanly, but the insertion of C on the right conflicts with the insertion of B on the left.
I could have sworn a whole load of work went into supporting - and testing for - the detection of edits!
What happened... ?
Wrote MergeTest.simpleMergeOfAnEdit
in Git commit SHA 670a28bf63e7efc0e02d7bb0054d01cb5b061a9c, this reproduces the problem seen in MainTest
but down at the unit level.
This test was then back-ported to Git commit SHA fea076322a4c1de6a26216301116b66479cd683b which is based on an early version of Merge
, and that test passes. Modifying the test expectations as a temporary fault exhibits breakage, so it really is checking the result when it passes.
Did a painful linear progressive merge - couldn't use git bisect
as this would have to involve unshelving into each bisected commit - given issues with the SBT build compatibility and some merge conflicts, it was easier to just re-evolve the changes step-by-step on to a new branch.
This revealed that the test did pass in back-ported form, as I expected, but was broken in Git commit SHA 3c63dc3a6dd8260e736156e19a8dc5b110dadb12 (merged into 1acfb3ac226da58396d19189a7cf4b99d658ff37 for the reproduction against the back-ported test) Ironically, the commit message states:
Simplify the implementation of Merge
. Not sure if this entirely a good idea, as it trades extra code for explicit handling of corner cases against less code using implicit state transformations to do the same thing...
Ho-hum.
Given that all the other tests in MergeTest
pass, it is clear that MergeTest.fullMerge
isn't assiduous enough.
It would be worth detecting the failure via MergeTest.fullMerge
to restore confidence prior to going further.
Decided to just do some hacking, reverting the offending commit and merging it through to result in Git commit SHA 20c308b99e0d868ba29a5e2806162eda2d2f425b.
This is not fit for delivery, see below - but the results are encouraging.
Had to hack the demonstration repository (we'll get to that), but the part where Git reported a conflict and Kinetic Merge quietly made a mess now produces this:
packageExecutable := {
val packagingVersion = (ThisBuild / version).value
println(s"Packaging executable with version: $packagingVersion")
val localArtifactCoordinates =
s"${organization.value}:${name.value}_${scalaBinaryVersion.value}:$packagingVersion"
val executablePath = s"${target.value}${Path.sep}${name.value}"
s"cs bootstrap --verbose --bat=true --scala-version ${scalaBinaryVersion.value} -f $localArtifactCoordinates -o $executablePath" !
name.value
},
That's exactly what I wanted to see.
Exhibit D - CodeMotionAnalysisExtensionTest
console output as of Git commit SHA 20c308b99e0d868ba29a5e2806162eda2d2f425b:
"""
I wandered lonely as a cloud
That floats on high o'er vales and hills,
When all at once I saw a crowd,
A host, of small fishing boats;
Astride the sea, beneath the quay,
Rocking and swaying in the breeze.
Why this allusion?
I Havant a clue!
Along the margin of a bay:
Ten thousand (well, maybe not quite) I thought, 'Was this part of the job role?'.
'Should I be expected to deal with flowers?'
The waves beside them danced; but they
Out-did the sparkling waves in glee:
A poet could not but be gay,
In such a jocund company:
I gazed—and gazed—but thought only of
raising this in the next Zoom meeting.
For oft, when on my Aeron I slouch
In vacant or in pensive mood,
They flash upon that inward eye
Which is the bliss of solitude;
And then my heart with pleasure fills,
And sends an email to human resources.
"""
*********************************
"""
I wandered lonely as a cloud
That floats on high o'er vales and hills,
When all at once I saw a crowd,
A host, of small fishing boats;
Astride the sea, beneath the quay,
Rocking and swaying in the breeze.
Why this allusion?
I Havant a clue!
Along the margin of a bay:
Ten thousand (well, maybe not quite) sashays with the fishing boats.
"""
What were those caveats?
MergeTest.fullMerge
needs bolstering / rework to detect the bug introduced in 3c63dc3a6dd8260e736156e19a8dc5b110dadb12, then merging up with ongoing work.build.sbt
that set up library dependencies. It looks like this is a case of ambiguous matches that overlap, need to think about what exactly is overlapping and whether it could be avoided, or should somehow be disregarded. In the meantime, the repository has some nasty extra commits added to the destination and their branch to remove these pesky stanzas.MappedContentSources.filesByPathUtilising
, an exception of type OverlappingSections
is thrown. That was fine back when said class was actually FakeSources
and was a stunt double for CodeMotionAnalysisTest
- the exceptions were caught by the test and used to reject the test case. That doesn't play well in Main
and causes it to fault without rolling back the merge in progress. The code has been hacked to deal with this in Main
, but this needs thinking about and doing properly.build.sbt
after the merge - a common ancestor one in stage 1, the destination branch's one in stage 2 and theirs in stage 3. There is also a conflicted form of build.sbt
left in the working tree, only this doesn't actually have any conflicts. It appears that while Git can't resolve that merge when it has a go, the invocation of git merge-file
from Kinetic Merge is able to completely resolve the changes based on the staged files as created by Kinetic Merge. Now that is intended, but I'm surprised that was the outcome here - I was expecting either a direct, clean merge, or a conflicted one with markers. Here, the conflict is reported but git merge-file
quietly saves the day.CodeMotionAnalysis
? Hmm...git merge-file
might just be sorting out trivial conflicts due to the removed stanzas. Need to look into this.Test case 19 in CodeMotionAnalysis.matchingSectionsAreFound
- must try harder to find the optimal match in the genetic algorithm. Maybe there should be a single cutoff for match sizes across all files across all sides, and we can dispense with the genetic algorithm. Or use hinting there too...
The story so far as of Git commit SHA 420b84b6e87dc5f1780aed511f29278381623d13...
CodeMotionAnalysisExtensionTest
has been generalised to smoke test merging of prose and a simplified form of the SBT sources this ticket is about.MergeTest.fullMerge
.CodeMotionAnalysis
has been, 'enhanced' with a configuration parameter that allows it to either propagate exceptions - specifically the OverlappingSection
exceptions seen so often in the tests versus trapping them into an Either
so that Main
can utilise its roll-back logic. This also fixes the breakages in CodeMotionAnalysisTest
caused by the prevalance of small fry. The good news is that the previous enhancement results in a lot less of the rejected test cases without breaking anything.CodeMotionAnalysisExtensionTest
, possibly this is because the files are shortened.CodeMotionAnalysisExtensionTest
to pass. There should be some logic to configure how hard the genetic algorithm has to work based on file sizes and thresholds etc.I'm not really sure that the merge is in general trustworthy yet - need to do some general dogfooding, but the issue this ticket refers to seems to be on its way to being fixed.
Examining the number of evaluations of a phenotype shows that given the number of window sizes that come under the minimum sure-fire size, there are far more evaluations performed by the genetic algorithm than there are window sizes.
Example numbers from CodeMotionAnalysis.matchingSectionsAreFound
:
Sizes: [1, 45) - evaluations: 146 - 41 + 1 = 106 Sizes: [1, 8) - evaluations: 238 - 23 + 1 = 216 Sizes: [3, 140) - evaluations: 323 - 26 + 1 = 298 Sizes: [1, 68) - evaluations: 628 - 26 + 1 = 603 Sizes: [1, 27) - evaluations: 428 - 32 + 1 = 397 Sizes: [8, 154) - evaluations: 825 - 25 + 1 = 801 Sizes: [4, 106) - evaluations: 528 - 35 + 1 = 494
Example numbers from CodeMotionAnalysisExtensionTest.merging
:
Sizes: [1, 5) - evaluations: 40 - 17 + 1 = 24 Sizes: [1, 4) - evaluations: 24 - 17 + 1 = 8 Sizes: [1, 3) - evaluations: 0 Sizes: [1, 2) - evaluations: 0
In other cases, the genetic algorithm is not invoked at all.
Example numbers from MainTest
:
No invocation of genetic algorithm at all.
It is clear that a simple linear search down all the small-fry window sizes will be much more efficient than the genetic algorithm, and definitive in its result. Turning down the dials on the genetic algorithm to match performance is likely to miss a few matches, this has already been seen and has required the dials to go up.
The genetic algorithm is no longer used as of Git commit SHA c94b207d37c2a57ef074138c2332e0eb307f0b8e (and the tests pass, after fixing a bug in CodeMotionAnalysisTest.matchingSectionsAreFound
exposed by the implementation change in the SUT - this has been back-ported prior to the implementation change to check that the test passes in both cases, see Git commit SHA 87441fa1ea53d3a93f85962f6ce79ba72a101b29).
It was a surprise to see that performance is slightly worse overall in CodeMotionAnalysisTest.matchingSectionsAreFound
compared with the back-port. The back-port using the genetic algorithm achieves 34 test cases in 10 minutes, compared with 32 test cases in 10 minutes when using the linear search.
In general, using linear search beats the genetic algorithm, but there are two exceptions where the genetic algorithm commit takes 28 seconds / 23 seconds versus 1 minutes 20 seconds / 1 minute 36 seconds.
The window size ranges for the genetic algorithm / linear search are [10, 264), [8, 504). That's a far bit larger than the examples above.
Given that the linear search is systematic, and that CodeMotionAnalysisExtensionTest
completes 30 test cases in 8 seconds, I'm of a mind to just accept those two odd test cases and carry on with linear search.
NOTE: remember to replace DONEPhenotype
with something simpler, now that the genetic algorithm has gone. Would be nice to keep the invariant check and the helper method API, though.
What's been going on? MergeTest
has been overhauled significantly in an attempt to make it more comprehensible, as I couldn't follow what it was doing after having not worked on it for over two months. There's a lesson there...
The rework exposed a new bug in merge
in addition to the one that had slipped in; this led to a round of back and forth between merge
and MergeTest
where either a new bug in the SUT was found and corrected, or the test itself was shown to have a bug in its logic and was generating either faulty expected merges or misleading move sequences not consistent with a correct expected merge.
It is fair to say even with the rework, MergeTest
is still a very opaque and truculent thing; I've added comments to help demystify it, but it isn't going to serve the role as 'test-as-specification'. It is however really good now at rooting out bugs in both the SUT and also in my own thinking about the edge-cases for the merge. Line coverage of merge
is now 100% with the minor exception of - wait for it - an exception class for future divergence detection.
Consequently, I've added several explicit small tests to help plug the gap in a lack of specification-style tests, usually adding these after MergeTest.fullMerge
has shrunk down a failure to something that is easy to analyze. Got a bit delinquent about this towards the end, so I'm going to write up what the merge tries to achieve as code comments and cross-check with the specification tests once the dust has settled.
In the meantime, we have the unwrapping of Phenotype
from CodeMotionAnalysis
and the threshold configuration to do, and then I think we're good.
MainTest
and CodeMotionAnalysisExtensionTest
are still passing - the latter now fully merge the SBT sources in all test cases, which is a great result. The SBT source merges look sensible when randomly sampling them, which is heartening.
Manual testing of the example above using the build from Git commit SHA 491d8d6541b34dabbd9a67cd625cbbdbef31bfc5 yields the following excerpt:
Compile / resourceGenerators += Def.task {
val location = versionResource.value
val packagingVersion = (ThisBuild / version).value
println(
s"Generating version resource: $location for version: $packagingVersion"
)
IO.write(location, packagingVersion)
Seq(location)
}.taskValue,
packageExecutable := {
val packagingVersion = (ThisBuild / version).value
println(s"Packaging executable with version: $packagingVersion")
val localArtifactCoordinates =
s"${organization.value}:${name.value}_${scalaBinaryVersion.value}:$packagingVersion"
val executablePath = s"${target.value}${Path.sep}${name.value}"
s"cs bootstrap --verbose --bat=true --scala-version ${scalaBinaryVersion.value} -f $localArtifactCoordinates -o $executablePath" !
name.value
},
packageExecutable := (packageExecutable dependsOn publishLocal).value,
libraryDependencies += "com.github.scopt" %% "scopt" % "4.1.0",
libraryDependencies += "org.typelevel" %% "cats-collections-core" % "0.9.6",
libraryDependencies += "org.typelevel" %% "cats-core" % "2.10.0",
libraryDependencies += "org.typelevel" %% "cats-effect" % "3.5.2",
libraryDependencies ++= Seq(
"dev.optics" %% "monocle-core" % "3.2.0",
"dev.optics" %% "monocle-macro" % "3.2.0"
),
libraryDependencies += "org.scala-lang.modules" %% "scala-parser-combinators" % "2.3.0",
libraryDependencies += "com.lihaoyi" %% "os-lib" % "0.9.1",
That will do nicely for this ticket.
A couple of things left:
git merge-file
is makes to produce the conflicted merge file in the working directory yields a fully merged file. Looking at this, it seeems that a conflicting merge is generated, but the right hand file is just the same as the left hand one, only with an extra part appended. The contents of the appended part are also a bit puzzling. I'm unsure as to whether this is down to the work done on this ticket, or has always been there, but I'm inclined to put and end to this one and investigate under a following ticket.CodeMotionAnalysisTest.matchingSectionsAreFound:
With the cache: 32 cases in 10 minutes 2 seconds, 29 cases in 10 minutes 5 seconds, 29 cases in 10 minutes 2 seconds.
Without the cache: 33 cases in 10 minutes 3 seconds, 33 cases in 10 minutes 4 seconds, 30 cases in 10 minutes.
CodeMotionAnalysisExtensionTest:
With the cache: 8 seconds, 7 seconds
Without the cache: 7 seconds, 7 seconds
It is fair to say that the cache adds no value - probably because it is part of a chain of computation that already involves a cache for some heavyweight computation.
At long last, delivered in Git commit SHA: 0e12c1f0c3b819f1fa1ea7aec78da7bde7259ef3.
This went out in release 0.1.10, Git commit SHA 64fd5448f2f9e6d36953c86345d069d62d1d3a2f.
Doing some dogfood testing of Kinetic Merge on its own repository yielded this peculiar (and incorrect) successful merge result:
(The left hand is the base, the middle is what Git's merge produced and the right hand is what Kinetic Merge produced).
Doing the same with Git's merge produced a conflicted merge, which to be honest seems a better result.
Need to analyse this and decide: