Closed sageserpent-open closed 1 month ago
Looked at the logs with mimimum match sizes of 5 and 6 in CodeMotionAnalysisExtensionTest.merging
as of: aedc282e5d802f3082dd2b605694bfc81378ce6f, as well as of the earlier commit: 98ad215623561b7a39e5db0faeb1f37abb68f1bb when gap were filled with single sections only.
This reveals the same kind of problem seen back in #19 - the merge algorithm works well when content is gathered up into big sections, regardless of whether those are matched sections or gap fillers. Introducing runs of one-token gap filler sections not only clobbers the merge in terms of performance, but forces it to align on single tokens that may or may not be good choices.
Suppose we identify all gap filler content on the base that is a substring of some gap filler content on the left or right, splitting the left and right gap fills into chunkier sections. This is essentially the prefix / suffix idea from before, only generalises to allow matches in the middle. The point is, having chunky gap fills might yield decent merges in general, as with the older implementation, but still grant us the fix from #42.
That approach worked and was delivered in 1fc422108b89e39cc6608eb1b16027e0c3bf6e4f.
This went out in release 1.2.0, Git commit SHA: 4d1935beef2b5923417904895884de8c2e749c6e.
This follows on from #42.
The work done there introduced the notion of filling content gaps between matched sections in
MappedContentSources
with lots of one-token sections; previously the gaps were filled with single sections, each section spanning the gap in one go.While that sorted out the problems addressed by #42, it yields some odd looking merges in
CodeMotionAnalysisExtensionTest.merging
when the minimum match size is low.Running the manual test discussed here: https://github.com/sageserpent-open/kineticMerge/issues/30#issuecomment-2106144845 also yields mixed results - in some ways the merge is improved, in others it looks suspect.
The job here is to get nicer merge results without breaking the tests introduced in #42.
One approach might be to look for common prefixes and / or suffices in gaps, then to split the gaps into an optional prefix, body and an optional suffix - that should be enough to keep the tests from #42 happy, but might avoid the messy merges.