sageserpent-open / kineticMerge

Merge a heavily refactored codebase and stay sane.
MIT License
12 stars 2 forks source link

Bad Merge of Kinetic Merge Repository. #37

Closed sageserpent-open closed 7 months ago

sageserpent-open commented 7 months ago

Version

1.1.1

Reproduction

git clone git@github.com:sageserpent-open/kineticMerge.git bugReproductionForKineticMerge
cd bugReproductionForKineticMerge
git checkout -b ourBranch problemMergeOurSide

<path to Kinetic Merge>/kinetic-merge --no-commit problemMergeTheirSide

git status

That command sequence yields the output:

Conflict - file .../bugReproductionForKineticMerge/src/main/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysis.scala was modified on our branch ourBranch and modified on their branch problemMergeTheirSide.
Merge conflicts found, handing over for manual resolution...

On branch ourBranch
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Changes to be committed:
    modified:   .gitignore
    modified:   src/main/scala/com/sageserpent/kineticmerge/ProgressRecording.scala

Unmerged paths:
  (use "git add <file>..." to mark resolution)
    both modified:   src/main/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysis.scala

Expected Behaviour

It would have been nice to get a clean merge, but even a conflicted merge with a few minor tweaks required would be OK. What results is much worse...

First, the good news - the changes on our side are a refactoring, making the method defined in the class MatchesAndTheirSections.withAllMatchesOfAtLeastTheSureFireWindowSize into a factory method of the same name within the companion object. On the other side there are various changes made in the body of the method in its original location.

These changes made on the other side merge seamlessly into the moved method's new location. Great!

However, the rest of the file is mangled:

Manually editing out these problems and compiling reveals that the methods MatchesAndTheirSections.withoutTheseMatches and MatchesAndTheirSections.withMatches have vanished!

sageserpent-open commented 7 months ago

Staging the conflicted merged file and doing a diff yields this:

diff --git a/src/main/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysis.scala b/src/main/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysis.scala
index eb9bc0b..b4d8b49 100644
--- a/src/main/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysis.scala
+++ b/src/main/scala/com/sageserpent/kineticmerge/core/CodeMotionAnalysis.scala
@@ -140,15 +140,11 @@ object CodeMotionAnalysis extends StrictLogging:
         Caffeine.newBuilder().build()

       def withAllMatchesOfAtLeastTheSureFireWindowSize() =
-        def progressForMatchSize(matchSize: Int): Int =
-          maximumFileSizeAcrossAllFilesOverAllSides - matchSize
-
         Using(
           progressRecording.newSession(
-            progressForMatchSize(
-              minimumSureFireWindowSizeAcrossAllFilesOverAllSides
-            )
-          )
+            label = "Minimum match size considered:",
+            maximumProgress = maximumFileSizeAcrossAllFilesOverAllSides
+          )(initialProgress = maximumFileSizeAcrossAllFilesOverAllSides)
         ) { progressRecordingSession =>
           @tailrec
           def withAllMatchesOfAtLeastTheSureFireWindowSize(
@@ -225,8 +221,7 @@ object CodeMotionAnalysis extends StrictLogging:
                       s"Search has found an optimal match at window size: $candidateWindowSize, number of matches is: $numberOfMatchesForTheGivenWindowSize, restarting search to look for smaller matches."
                     )
                     progressRecordingSession.upTo(
-                      progressForMatchSize(candidateWindowSize)
-                    )
+                      candidateWindowSize)
                     withAllMatchesOfAtLeastTheSureFireWindowSize(
                       stateAfterTryingCandidate,
                       looseExclusiveUpperBoundOnMaximumMatchSize =
@@ -252,9 +247,7 @@ object CodeMotionAnalysis extends StrictLogging:
                   s"Search for matches whose size is no less than the sure-fire match window size of: $minimumSureFireWindowSizeAcrossAllFilesOverAllSides has terminated; results are:\n${pprintCustomised(fallbackImprovedState)}"
                 )
                 progressRecordingSession.upTo(
-                  progressForMatchSize(
-                    minimumSureFireWindowSizeAcrossAllFilesOverAllSides
-                  )
+                  minimumSureFireWindowSizeAcrossAllFilesOverAllSides
                 )
                 fallbackImprovedState
               else
@@ -266,8 +259,7 @@ object CodeMotionAnalysis extends StrictLogging:
                   s"Search has found optimal matches at window size: $bestMatchSize, restarting search to look for smaller matches."
                 )
                 progressRecordingSession.upTo(
-                  progressForMatchSize(bestMatchSize)
-                )
+                  bestMatchSize)
                 withAllMatchesOfAtLeastTheSureFireWindowSize(
                   fallbackImprovedState,
                   looseExclusiveUpperBoundOnMaximumMatchSize = bestMatchSize
@@ -788,231 +780,35 @@ object CodeMotionAnalysis extends StrictLogging:
           .matchesAndTheirSections
       end withPairwiseMatchesEatenInto

-      private def withoutTheseMatches(
-          matches: Iterable[Match[Section[Element]]]
-      ): MatchesAndTheirSections =
-        matches.foldLeft(this) {
-          case (
-                matchesAndTheirSections,
-                allSides @ Match.AllSides(
-                  baseSection,
-                  leftSection,
-                  rightSection
-                )
-              ) =>
-            val basePath  = base.pathFor(baseSection)
-            val leftPath  = left.pathFor(leftSection)
-            val rightPath = right.pathFor(rightSection)
-            matchesAndTheirSections
-              .focus(_.baseSectionsByPath)
-              .modify(_.updatedWith(basePath) { case Some(sectionsSeen) =>
-                Some(sectionsSeen - baseSection)
-              })
-              .focus(_.leftSectionsByPath)
-              .modify(_.updatedWith(leftPath) { case Some(sectionsSeen) =>
-                Some(sectionsSeen - leftSection)
-              })
-              .focus(_.rightSectionsByPath)
-              .modify(_.updatedWith(rightPath) { case Some(sectionsSeen) =>
-                Some(sectionsSeen - rightSection)
-              })
-              .focus(_.sectionsAndTheirMatches)
-              .modify(
-                _.remove(baseSection, allSides)
-                  .remove(leftSection, allSides)
-                  .remove(rightSection, allSides)
-              )
-
-          case (
-                matchesAndTheirSections,
-                baseAndLeft @ Match.BaseAndLeft(baseSection, leftSection)
-              ) =>
-            val basePath = base.pathFor(baseSection)
-            val leftPath = left.pathFor(leftSection)
-            matchesAndTheirSections
-              .focus(_.baseSectionsByPath)
-              .modify(_.updatedWith(basePath) { case Some(sectionsSeen) =>
-                Some(sectionsSeen - baseSection)
-              })
-              .focus(_.leftSectionsByPath)
-              .modify(_.updatedWith(leftPath) { case Some(sectionsSeen) =>
-                Some(sectionsSeen - leftSection)
-              })
-              .focus(_.sectionsAndTheirMatches)
-              .modify(
-                _.remove(baseSection, baseAndLeft)
-                  .remove(leftSection, baseAndLeft)
-              )
-
-          case (
-                matchesAndTheirSections,
-                baseAndRight @ Match.BaseAndRight(baseSection, rightSection)
-              ) =>
-            val basePath  = base.pathFor(baseSection)
-            val rightPath = right.pathFor(rightSection)
-            matchesAndTheirSections
-              .focus(_.baseSectionsByPath)
-              .modify(_.updatedWith(basePath) { case Some(sectionsSeen) =>
-                Some(sectionsSeen - baseSection)
-              })
-              .focus(_.rightSectionsByPath)
-              .modify(_.updatedWith(rightPath) { case Some(sectionsSeen) =>
-                Some(sectionsSeen - rightSection)
-              })
-              .focus(_.sectionsAndTheirMatches)
-              .modify(
-                _.remove(baseSection, baseAndRight)
-                  .remove(rightSection, baseAndRight)
-              )
-
-          case (
-                matchesAndTheirSections,
-                leftAndRight @ Match.LeftAndRight(leftSection, rightSection)
-              ) =>
-            val leftPath  = left.pathFor(leftSection)
-            val rightPath = right.pathFor(rightSection)
-            matchesAndTheirSections
-              .focus(_.leftSectionsByPath)
-              .modify(_.updatedWith(leftPath) { case Some(sectionsSeen) =>
-                Some(sectionsSeen - leftSection)
-              })
-              .focus(_.rightSectionsByPath)
-              .modify(_.updatedWith(rightPath) { case Some(sectionsSeen) =>
-                Some(sectionsSeen - rightSection)
-              })
-              .focus(_.sectionsAndTheirMatches)
-              .modify(
-                _.remove(leftSection, leftAndRight)
-                  .remove(rightSection, leftAndRight)
-              )
-        }
-      end withoutTheseMatches
-
-      private def withMatches(
-          matches: collection.Set[Match[Section[Element]]]
-      ): MatchingResult =
-        val (
-          updatedMatchesAndTheirSections,
-          matchesWithoutRedundantPairwiseMatches
-        ) =
-          matches
-            .foldLeft(this)(_.withMatch(_))
-            .withoutRedundantPairwiseMatchesIn(matches)
-
-        MatchingResult(
-          matchesAndTheirSections = updatedMatchesAndTheirSections,
-          numberOfMatchesForTheGivenWindowSize =
-            matchesWithoutRedundantPairwiseMatches.size,
-          estimatedWindowSizeForOptimalMatch = updatedMatchesAndTheirSections
-            .estimateOptimalMatchSize(matchesWithoutRedundantPairwiseMatches)
-        )
-      end withMatches
-
-      // Cleans up the state when a putative all-sides match that would have
-      // been ambiguous on one side with another all-sides match is partially
-      // suppressed by a larger pairwise match. This situation results in a
-      // pairwise match that shares its sections on both sides with the other
-      // all-sides match; remove any such redundant pairwise matches.
-      private def withoutRedundantPairwiseMatchesIn(
-          matches: collection.Set[Match[Section[Element]]]
-      ): (MatchesAndTheirSections, collection.Set[Match[Section[Element]]]) =
-        val (redundantMatches, usefulMatches) =
-          val isAnAllSidesMatch: Match[Section[Element]] => Boolean = {
-            case _: Match.AllSides[Section[Element]] => true
-            case _                                   => false
-          }
+      // Eating into pairwise matches can create smaller pairwise matches that
+      // are partially subsumed by other larger pairwise matches. Prefer keeping
+      // the larger matches and remove the subsumed ones.
+      def cleanedUp: MatchesAndTheirSections =
+        val subsumedBaseSections = baseSectionsByPath.values
+          .flatMap(_.iterator)
+          .filter(subsumesBaseSection)
+        val subsumedLeftSections = leftSectionsByPath.values
+          .flatMap(_.iterator)
+          .filter(subsumesLeftSection)
+        val subsumedRightSections = rightSectionsByPath.values
+          .flatMap(_.iterator)
+          .filter(subsumesRightSection)

-          matches.partition {
-            case Match.BaseAndLeft(baseSection, leftSection) =>
-              sectionsAndTheirMatches
-                .get(baseSection)
-                .intersect(sectionsAndTheirMatches.get(leftSection))
-                .exists(isAnAllSidesMatch)
-            case Match.BaseAndRight(baseSection, rightSection) =>
-              sectionsAndTheirMatches
-                .get(baseSection)
-                .intersect(sectionsAndTheirMatches.get(rightSection))
-                .exists(isAnAllSidesMatch)
-            case Match.LeftAndRight(leftSection, rightSection) =>
-              sectionsAndTheirMatches
-                .get(leftSection)
-                .intersect(sectionsAndTheirMatches.get(rightSection))
-                .exists(isAnAllSidesMatch)
-            case _: Match.AllSides[Section[Element]] => false
-          }
-        end val
+        val matchesToRemove =
+          (subsumedBaseSections ++ subsumedLeftSections ++ subsumedRightSections)
+            .flatMap(sectionsAndTheirMatches.get)
+            .toSet

-        if redundantMatches.nonEmpty then
+        if matchesToRemove.nonEmpty then
           logger.debug(
-            s"Removing redundant pairwise matches:\n${pprintCustomised(redundantMatches)} as their sections also belong to all-sides matches."
+            s"Removing matches that have subsumed sections:\n${pprintCustomised(matchesToRemove)} as part of cleanup."
           )
         end if

-        withoutTheseMatches(redundantMatches) -> usefulMatches
-      end withoutRedundantPairwiseMatchesIn
-
-      // When the window size used to calculate matches is lower than the
-      // optimal match size, overlapping matches will be made that cover the
-      // elements of the optimal math. Estimate the size of the optimal match by
-      // coalescing the overlaps.
-      private def estimateOptimalMatchSize(
-          matches: collection.Set[Match[Section[Element]]]
-      ): Option[Int] =
-        // Deconstruct a throwaway instance of `MatchesAndTheirSections` made
-        // from just `matches` as a quick-and-dirty way of organising the
-        // matches' sections.
-        val MatchesAndTheirSections(
-          baseSectionsByPath,
-          leftSectionsByPath,
-          rightSectionsByPath,
-          _
-        ) = matches.foldLeft(empty)(_.withMatch(_))
-
-        val sectionsSeenOnAllPathsAcrossAllSides =
-          baseSectionsByPath.values ++ leftSectionsByPath.values ++ rightSectionsByPath.values
-
-        sectionsSeenOnAllPathsAcrossAllSides
-          .flatMap(maximumSizeOfCoalescedSections)
-          .maxOption
-      end estimateOptimalMatchSize
-
-      private def withMatch(
-          aMatch: Match[Section[Element]]
-      ): MatchesAndTheirSections =
-        aMatch match
-          case Match.AllSides(baseSection, leftSection, rightSection) =>
-            copy(
-              baseSectionsByPath = withBaseSection(baseSection),
-              leftSectionsByPath = withLeftSection(leftSection),
-              rightSectionsByPath = withRightSection(rightSection),
-              sectionsAndTheirMatches =
-                sectionsAndTheirMatches + (baseSection -> aMatch) + (leftSection -> aMatch) + (rightSection -> aMatch)
-            )
-          case Match.BaseAndLeft(baseSection, leftSection) =>
-            copy(
-              baseSectionsByPath = withBaseSection(baseSection),
-              leftSectionsByPath = withLeftSection(leftSection),
-              sectionsAndTheirMatches =
-                sectionsAndTheirMatches + (baseSection -> aMatch) + (leftSection -> aMatch)
-            )
-          case Match.BaseAndRight(baseSection, rightSection) =>
-            copy(
-              baseSectionsByPath = withBaseSection(baseSection),
-              rightSectionsByPath = withRightSection(rightSection),
-              sectionsAndTheirMatches =
-                sectionsAndTheirMatches + (baseSection -> aMatch) + (rightSection -> aMatch)
-            )
-          case Match.LeftAndRight(leftSection, rightSection) =>
-            copy(
-              leftSectionsByPath = withLeftSection(leftSection),
-              rightSectionsByPath = withRightSection(rightSection),
-              sectionsAndTheirMatches =
-                sectionsAndTheirMatches + (leftSection -> aMatch) + (rightSection -> aMatch)
-            )
-        end match
-      end withMatch
+        this.withoutTheseMatches(matchesToRemove)
+      end cleanedUp

-      // Eating into pairwise matches can create smaller pairwise matches that
+      private def // Eating into pairwise matches can create smaller pairwise matches that
       // are partially subsumed by other larger pairwise matches. Prefer keeping
       // the larger matches and remove the subsumed ones.
       def cleanedUp: MatchesAndTheirSections =
@@ -1651,7 +1447,11 @@ object CodeMotionAnalysis extends StrictLogging:
           )
         )
       end leftAndRightMatchOf
+<<<<<<< ourBranch
     end MatchesAndTheirSections
+=======
+    private def end MatchesAndTheirSections
+>>>>>>> problemMergeTheirSide

     given potentialMatchKeyOrder: Order[PotentialMatchKey] =
       given Order[Element] = elementOrder
sageserpent-open commented 7 months ago

Ironically, if the conflicted file is marked as resolved and a the diff shown above of the staged resolved versus the original is made, it is pretty straightforward to selectively revert the damage to get a good merge with the edits propagated through the code motion.

sageserpent-open commented 7 months ago

Some background that may be relevant - IntelliJ likes to permute the order of methods in large files on practically every single commit. It seems there is something in the method reordering that can't make up its mind quite what the perfect order is - so methods tend to move back and forth with each new commit. I'm not sure if they keep swapping places or are being permuted afresh each time.

So it is possible that on both branches, there are methods that have moved to different locations wrt the base version of the file. Those would be divergent moves, and they are most definitely not supported yet.

If that was the case, a diagnostic would have been nice...

sageserpent-open commented 7 months ago

IntelliJ's method sorting when applied to the sources of CodeMotionAnalysis is bonkers...

To simplify the exposition, let's break down the changes:

  1. The method that was moved from the class MatchesAndTheirSection to the companion object on our side, being edited on their side - that is withAllMatchesOfAtLeastTheSureFireWindowSize.
  2. Methods that were neither moved nor edited that act as anchors (at least from the point of view of IntelliJ doing a difff between versions). These are in order of appearance in the file: rightSections, matchesForWindowSize, matchFrom and leftAndRightMatchOf.
  3. Methods that weren't edited on either side, but were moved on both sides but in different ways by IntelliJ. To preserve sanity, let's just number them according to their order of appearance in the base side: 1 to 8.
sageserpent-open commented 7 months ago

In the base:

PREFIX rightSections withAllMatchesOfAtLeastTheSureFireWindowSize matchesForWindowSize 1 2 3 4 5 matchFrom ... leftAndRightMatchOf 6 7 8 SUFFIX

The dotted section was unchanged across versions. The PREFIX and SUFFIX were merged nicely.

sageserpent-open commented 7 months ago

On our side:

PREFIX EDITED WITH MOVE DESTINATION OF withAllMatchesOfAtLeastTheSureFireWindowSize rightSections 6 7 3 1 2 4 5 8 matchesForWindowSize matchFrom ... leftAndRightMatchOf EDITED SUFFIX

So the method sorting grabbed all of 1 to 5 and 6 to 8 and stirred them around nicely into one contiguous mess of porridge.

sageserpent-open commented 7 months ago

On their side:

PREFIX rightSections withAllMatchesOfAtLeastTheSureFireWindowSize - EDITED 6 7 8 3 matchesForWindowSize matchFrom ... leftAndRightMatchOf 1 2 4 5 SUFFIX

This time the method sorting grabbed all of 1 to 5 and 6 to 8, shuffled them into a different order than from our side and threw them back into the original two piles with no regard to their original pile.

sageserpent-open commented 7 months ago

It is fair to say that the method reordering seemed to be dependent on whether the Sun was in the Seventh House, using the alignment of Jupiter and Mars as a tiebreaker.

sageserpent-open commented 7 months ago

Examination of the merge log reveals something odd from KineticMerge - it regards withAllMatchesOfAtLeastTheSureFireWindowSize as having stayed in place (albeit being edited). For that to happen, that would imply a whole lot of editing / deletion / insertion / movement of practically everything else around it, which does seem to be the case.

sageserpent-open commented 7 months ago

Based on the observation that withAllMatchesOfAtLeastTheSureFireWindowSize appeared to stay in place in the merge with other methods moving around it, tried an experiment where the ordering for CommonSubsequenceSize was modified to be: Ordering.by(size => size.elementSizeSumTiebreaker).

This means that the sum of the computed element sizes is the sole arbiter of how good a longest common subsequence is, rather than being a tiebreaker when two sequences have equal length.

Surprisingly, this didn't break any tests and resulted in the merge discussed by this ticket treating withAllMatchesOfAtLeastTheSureFireWindowSize as a moved section, which seems intuitively more satisfying.

However, the small edits made in that moved methods were propagated as deletions, so the end result there wasn't quite as good. I wonder whether #29 would have sorted that out?

The rest of the file was as mangled as ever - so it seems that while the lack of motion of withAllMatchesOfAtLeastTheSureFireWindowSize was of interest, it is the diverging code motion elsewhere that is the real problem.

For now, the metric used by LongestCommonSubsequence will be left as it is, but this is worth reconsidering.

sageserpent-open commented 7 months ago

Further examination of the merge (without the experimental hack mentioned in the previous comment) showed three divergent moves of long sections through which deletions were propagated. These sections include 1 withMatches, 3 withoutTheseMatches and 4 estimateOptimalMatchSize as well as lot of subsequent methods , and seem to line up nicely with what was dropped from the resulting conflicted merge.

Other methods such as 6 withAllSmallFryMatches and 7 withPairwiseMatchesEatenInto were moved without apparent divergence.

Method 8 cleanedUp was divergently moved, but without propagating any edits or deletions.

A conflict arose at our move source for 8 cleanedUp with their move destination of 4 estimateOptimalMatchSize. That needs reviewing; if 8 is moving off somewhere and 4 is moving in, then this isn't really a delete versus edit conflict, surely?

Looking more closely at the logs, the conflict in question was coalesced from two left-deletion versus right-edit conflicts and a following right insertion (of 4), so this probably one for ticket #28 when it lands.

sageserpent-open commented 7 months ago

So for now, we aren't supporting divergent moves yet. Nor are we handling the more complex propagation required by #28.

So apart from keeping the possibility of changing the metric for a longest common subsequence in mind, this is a won't-fix situation.

The moral of the story might be to disable this insane permutation of methods by (presumably) IntelliJ in the meantime.

sageserpent-open commented 7 months ago

Reopening after some contemplation.

This merge won’t ever be a clean one due to the divergent moves, but the outcome of the divergent moves was to have supposedly coincident deletions propagated to both move destinations for each of 1, 3 and 4.

This hides the divergent moves and leaves the user puzzled. Wouldn’t it be better to simply leave the two moves at their destinations?

The logic for propagating deletions (and edits, for that matter) is to pair a move with an obvious deletion (or edit) at the source of the move. That is fine in itself, but if a section moves on both sides of the merge, then it really hasn’t been deleted (or edited) on any side - it is present on both our and their side.

This is exactly the line of reasoning for convergent moves, but is currently not used for divergent ones. Instead we have phantom deletions being propagated, or edits that are really just insertions at the move source.

While it won’t solve the problem of divergence, changing the handling of divergent moves might make the handling of these problem merges easier - if both moves are honoured, all the user has to do is delete one of them to get a clean merge. All the better if the divergence can be flagged with an informative message.

Let’s try this out and see how it works in practice…

sageserpent-open commented 7 months ago

Git commit SHA: 457e24255fe99d4f4caa57fb016e4b2623dbb3cb changes the behaviour of MergeResultDetectingMotion to not propagate deletions or edits through divergent moves.

While working on this, I noticed that this approach was already used not only for convergent moves, but also for handling left/right edits and deletions when a section moves on the same side as its edit or deletion - the moved section is left as is without propagating the deletion or edit.

Making the change in behaviour (after changing MergeResultDetectingMotionTest first, then implementing to fix the breakages) does not cause any other test failures, so that is heartening.

The example for the ticket was redone, the result is still conflicted, but at least I can see the divergent moves. They are a mess though.

Next step - generate a diagnostic for each divergent move...

sageserpent-open commented 7 months ago

Generating a diagnostic can wait for #38.

One thing that is a bit worrying - it's all very well when there are divergent moves as regarding each side's move destination as blocking any potential edit or deletion on the same side from the point of view of the other side. However, what should happen in the future of one move destination is selected?

If the move on the opposite side as the edit or deletion is selected, does that then convert into a propagation of the edit or deletion, because it is no longer being blocked by the dropped move destination?

This was the reasoning for allowing propagations through divergent moves as a way of hedging bets - the edit or deletion would be propagated through the move on the other side, but left in place on its own side just in case that side ended up dropping its move.

It depends on whether one thinks of dropping a move on one side as leaving it in its original base location on that side - in which case there should be no propagation even after one move destination is dropped.

sageserpent-open commented 7 months ago

The changes to MergeResultDetectingMotion went out in release 1.1.2, Git commit SHA: a46c47794b9fcc2532400d0e321749384a909921.

sageserpent-open commented 6 months ago

This went out in release 1.2.0, Git commit SHA: 4d1935beef2b5923417904895884de8c2e749c6e.