sageserpent-open / kineticMerge

Merge a heavily refactored codebase and stay sane.
MIT License
9 stars 1 forks source link

Deal with small ambiguous matches gracefully. #31

Open sageserpent-open opened 3 months ago

sageserpent-open commented 3 months ago

This addresses the need for Git commit SHA: 75e445a4684b0fab5d4ee79a3280f49087c57e24.

The minimum match size had to be jemmied up to 5 to avoid ambiguous matches of the snippet:

            @Override
            public void 

that is a prefix of several methods in the test source inputs.

The presence of ambiguous matches isn't a problem in itself, but one of the methods is deleted, along with the ambiguous prefix. This should propagate the deletion, except that because of the ambiguity, execution reaches this dead end.

The job is to rectify this - either:

  1. Find a way of propagating the deletion to just the 'right' destination.
  2. Ignore ambiguous matches when propagating (and issue a useful diagnostic suggesting that the match size should be increased).
  3. Automatically retry the merge with a larger minimal match size.

Ticket #6 addresses this too, but something expedient should be done first.

sageserpent-open commented 3 months ago

There is an underlying theme here - when there are ambiguous matches that involve two or more moved sections, how do we know what the move destinations are? We can state that any section that is part of the longest common subsequence is not a source or destination of a move, but multiple moves can cross over.

There is a similar problem when an ambiguous match involves a section that is deleted and another that is moved. Which one is which?

Common sense helps in that we can try to correlate the sections via their containing file names - this can work for simple cases, at least.

sageserpent-open commented 3 months ago

The existing implementation as of Git commit SHA: 09e28d0263f9c23466461f3689ee9d2ad3661e8f does not exclude matched sections that are part of the longest common subsequence from being considered as move destinations, although it does exclude them from being treated as move sources.

Fixing that might provide a lightweight solution that could cover some of the more common problems, although it won't work with potentially crossed-over moves, or moves versus deletions.

sageserpent-open commented 3 months ago

A cheap and cheerful solution is to propagate all of the conflicting changes (including a deletion) for an ambiguous match to all of the destinations. The user has the context to remove all but the pertinent change at each destination, or even to revert the change altogether.

We don't want to have to do this all the time where moves aren't crossed-over, but once the change in the previous comment is in, this might be a workable solution for the time being.