Question: how to automatically resolve whitespace differences where feasible?

This is essentially duplicating what git merge can do with its various options:

ignore-space-change
ignore-all-space
ignore-space-at-eol
ignore-cr-at-eol

These are passed to its ort strategy. Before proceeding with this, it would be nice to know exactly where in Git's merge plumbing these options get used and how they could be applied to Kinetic Merge.

Given that code may move between files, just how sacred is the whitespace anyway? I'm not sure that duplicating exactly what Git does is the best option here.

What are Git merge's rules for precedence when merging whitespace changes while an option to ignore whitespace is in force?

OURS	THEIRS	OUTCOME
Whitespace changes	Whitespace changes	Our whitespace changes
Whitespace changes	Non-trivial changes	Their non-trivial changes
Non-trivial changes	Whitespace changes	Our non-trivial changes
No change	Whitespace changes	No change
Whitespace changes	No change	Our whitespace changes
No change	Non-trivial changes	Their non-trivial changes
Non-trivial changes	No change	Our non-trivial changes
Non-trivial changes	Non-trivial changes	Conflict
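
For reference, the table above can be transcribed as a small decision function. This is only a sketch of the table as written here, not of Git's actual implementation; the names `WhitespacePrecedence`, `Change`, `Outcome` and `resolve` are made up for illustration, and the case where neither side changes anything (not listed in the table) is assumed to yield no change.

```java
// Hypothetical names throughout; this merely encodes the table above.
final class WhitespacePrecedence {
    enum Change { NONE, WHITESPACE_ONLY, NON_TRIVIAL }

    enum Outcome { NO_CHANGE, OURS, THEIRS, CONFLICT }

    static Outcome resolve(Change ours, Change theirs) {
        // Two non-trivial changes cannot be reconciled automatically.
        if (ours == Change.NON_TRIVIAL && theirs == Change.NON_TRIVIAL) {
            return Outcome.CONFLICT;
        }
        // A single non-trivial change beats anything else on the other side.
        if (ours == Change.NON_TRIVIAL) return Outcome.OURS;
        if (theirs == Change.NON_TRIVIAL) return Outcome.THEIRS;
        // Only whitespace changes (or nothing) remain: ours wins if present...
        if (ours == Change.WHITESPACE_ONLY) return Outcome.OURS;
        // ...and a whitespace-only change on their side alone is dropped.
        return Outcome.NO_CHANGE;
    }

    public static void main(String[] args) {
        for (Change ours : Change.values()) {
            for (Change theirs : Change.values()) {
                System.out.println(ours + " / " + theirs + " -> " + resolve(ours, theirs));
            }
        }
    }
}
```

Note the asymmetry in the two rows where only one side makes a whitespace change: ours is kept, theirs is discarded.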

This came up in #30 when reviewing the results of MainTest.anEditAndADeletionPropagatingThroughAFileSplit.

Need to review that in the context of the ticket. Speaking as a user, I would want to see the edit on the right in the tested merge make its way into the move destination contributed by the left, which isn't the case with the current dominance rule in the code (ca10050d36cbc7e956d287f6d905130d9decffa5).

Something to be mindful of - the example discussed in #30 is for a merge where our branch (the left) is the one that has the condensation, and their branch (the right) has the whitespace edit.

This is the first test case executed by that test.

If the sense of merging is reversed, so the condensation is merged into the edits, then I'd expect a better looking result.

The behaviour we want in the last comment - namely to propagate the whitespace change on the right (theirs) to the move destination on the left (ours) goes against what Git specifies in the table above - well, assuming that table fairly summarizes what Git does.

Given the merge would look better with the desired behaviour, let's go with it and not follow the table to the letter...

Scouting in the codebase, it seems that there are at least two parts to this last piece of work, corresponding to two use-cases:

Consider a simple edit of whitespace only in their branch. Now that we allow breaking strict compatibility with Git, the edit's changes should make their way into the merge result. This means in turn that the preservation code in the underlying merge algebra should sniff at the two alternative contributions and decide which one has whitespace differences wrt the base contribution. If both do, then the left contribution is taken, which aligns with what Git does, as is also the case when just the left contribution has a whitespace change. If only the right contribution is different, then it goes through to the merge result. This is a bit icky, because this whole notion of whitespace is completely abstracted away by the time we get to the merge algebra. Also, what about the base contribution? It is not passed to the preservation operation - although this could be arranged.
Consider an edit of whitespace only in their branch with a move of that section in our branch. We have to use the same logic as for the simple edit case, but for the destination's contribution versus their edit. Similarly for when the edit is on our side and the move is on theirs. So the leftDeletion code and rightDeletion code need to be cut over to find the destination and then do the required sniffing.
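
The sniffing described in the first use-case could look roughly like the sketch below. It works on plain strings rather than on whatever the merge algebra really passes around, it assumes the base contribution is made available, and it assumes that "whitespace-only difference" means "equal once all whitespace is stripped". The names `WhitespaceSniffing`, `sameIgnoringWhitespace` and `preserve` are hypothetical and not taken from the Kinetic Merge codebase.

```java
// Hypothetical sketch: picks which of two whitespace-equivalent
// contributions should survive a preservation.
final class WhitespaceSniffing {
    // Assumption: two texts match if they are equal with all whitespace removed.
    static boolean sameIgnoringWhitespace(String first, String second) {
        return first.replaceAll("\\s+", "").equals(second.replaceAll("\\s+", ""));
    }

    static String preserve(String base, String left, String right) {
        if (!sameIgnoringWhitespace(base, left)
            || !sameIgnoringWhitespace(base, right)) {
            throw new IllegalArgumentException(
                    "A preservation should only differ in whitespace.");
        }
        boolean leftChanged = !left.equals(base);
        boolean rightChanged = !right.equals(base);
        // The right contribution only wins when it alone changed whitespace;
        // otherwise the left is taken, which is also what Git's table says.
        return rightChanged && !leftChanged ? right : left;
    }

    public static void main(String[] args) {
        System.out.println(preserve("a b", "a b", "a  b"));
    }
}
```

For the second use-case, the same function would be applied with the move destination's contribution standing in for one of the sides.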

For the sake of centralising logic (and testing of that logic), I'm wondering whether to treat the first situation as being a kind of edit (because from the user's point of view, that's what it is) when just one side makes a whitespace edit wrt the base. If both or neither do, then it gets treated as a preservation. Then again, what about coincident edits? Do we do the same thing there? Or push it all down into the underlying merge algebra? Um, ah...

Musing ... consider using an abstract strategy to fuse the left and right contributions and pass this into the underlying merge algebra. Cut over MergeTest so that it double-checks that the resulting merge actually uses the strategy by re-using the strategy in its expectations - or switching from a left-bias to a right-bias in different trials. This allows preservations and coincident edits to be handled in the usual fashion.
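
To make the musing concrete, here is a toy sketch of a pluggable fusion strategy. It is not the real merge algebra: `fuseAll` simply walks two already-aligned lists of equal length, and `FusionSketch`, `leftBiased`, `rightBiased` and `fuseAll` are invented names. The point is only that the code doing the merging never decides the bias itself, so a test can swap the bias and re-use the same strategy when computing its expectations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical sketch of passing a fusion strategy into a merge.
final class FusionSketch {
    static <E> BinaryOperator<E> leftBiased() {
        return (left, right) -> left;
    }

    static <E> BinaryOperator<E> rightBiased() {
        return (left, right) -> right;
    }

    // Toy stand-in for preservations / coincident edits: the two sides are
    // assumed to be aligned element by element.
    static <E> List<E> fuseAll(List<E> left, List<E> right, BinaryOperator<E> fusion) {
        if (left.size() != right.size()) {
            throw new IllegalArgumentException("Sides must be aligned.");
        }
        List<E> result = new ArrayList<>();
        for (int index = 0; index < left.size(); index++) {
            result.add(fusion.apply(left.get(index), right.get(index)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> left = List.of("a", " ", "b");
        List<String> right = List.of("a", "\t", "b");
        System.out.println(fuseAll(left, right, FusionSketch.<String>leftBiased()));
        System.out.println(fuseAll(left, right, FusionSketch.<String>rightBiased()));
    }
}
```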

The code up in the merge algebra for MatchesContext needs to use the same strategy to build an edit to be migrated to the move destination when handling a deletion (specifically when one side has made a whitespace-only edit and the other has moved the section). The edit is the fusion of the move destination's section with the whitespace-edited section.

It would be nice to suppress any migrated whitespace-only edits when the fusion leads to the same result as the move destination. Could this be done with a simple equality check?
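
As a sketch of what that equality check might look like, assuming the fusion is available as a plain binary function and that `equals` on the section type is sensitive to whitespace (if it is not, the check would suppress every whitespace-only migration). `MigratedEditSketch` and `migratedEdit` are hypothetical names, and strings stand in for sections.

```java
import java.util.Optional;
import java.util.function.BinaryOperator;

// Hypothetical sketch: only migrate an edit if the fusion changes something.
final class MigratedEditSketch {
    static <S> Optional<S> migratedEdit(S moveDestination,
                                        S whitespaceEdited,
                                        BinaryOperator<S> fusion) {
        S fused = fusion.apply(moveDestination, whitespaceEdited);
        // Assumption: equals() distinguishes sections differing in whitespace.
        return fused.equals(moveDestination)
               ? Optional.empty()
               : Optional.of(fused);
    }

    public static void main(String[] args) {
        BinaryOperator<String> preferEdit = (destination, edited) -> edited;
        System.out.println(migratedEdit("/**", "    /**", preferEdit));
        System.out.println(migratedEdit("    /**", "    /**", preferEdit));
    }
}
```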

TODO: fix the logging message for preservations in merge.of - it currently refers to the left side only, which needs to change given this ticket...

As of b5188269cb933218c85f8dbd73235f1bd49ecd86, got support for the simple edit case without code motion.

Evidence from manual inspection of merged file when running MainTest.anEditAndADeletionPropagatingThroughAFileCondensation via a debugger and suspending before cleanup of the temporary repository (using Git commit SHA f890d3181764d034c7989ae0b4e3624b8867498a):


package com.sageserpent.americium.java;

import com.google.common.base.Preconditions;

import java.time.Duration;
import java.time.Instant;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

import static scala.jdk.javaapi.DurationConverters.toJava;

/**
 * Strategy used to limit the emission of cases by the implementation of
 * {@link Trials}. These are supplied by client code when calling
 * {@link Trials#withStrategy(Function)}.
 *
 * @apiNote Instances are expected to be stateful, so they should not be
 * reused when calling the aforementioned overloads.
 */
public interface CasesLimitStrategy {
    /**
     * Limits test case emission using a time budget that starts when the
     * strategy is first consulted via {@link CasesLimitStrategy#moreToDo()}.
     *
     * @param timeBudget How long to allow a testing cycle to continue to
     *                   emit cases.
     * @return A fresh strategy instance - the time budget is not consumed
     * until the first call to {@link CasesLimitStrategy#moreToDo()}.
     */
    static CasesLimitStrategy timed(final Duration timeBudget) {
        return new CasesLimitStrategy() {
            Instant deadline = Instant.MAX;

            @Override
            public boolean moreToDo() {
                if (deadline.equals(Instant.MAX)) {
                    deadline = Instant.now().plus(timeBudget);
                }

                return !Instant.now().isAfter(deadline);
            }

            @Override
            public void noteRejectionOfCase() {

            }

            @Override
            public void noteEmissionOfCase() {

            }

            @Override
            public void noteStarvation() {

            }

            @Override
            public boolean legacyMethod(int whatIsThisFor){ return true; }
        };
    }

    /**
     * Limits test case emission using a time budget that starts when the
     * strategy is first consulted via {@link CasesLimitStrategy#moreToDo()}.
     *
     * @param timeBudget How long to allow a testing cycle to continue to
     *                   emit cases.
     * @return A fresh strategy instance - the time budget is not consumed
     * until the first call to {@link CasesLimitStrategy#moreToDo()}.
     */
    static CasesLimitStrategy timed(
            final scala.concurrent.duration.FiniteDuration timeBudget) {
        return timed(toJava(timeBudget));
    }

    /**
     * Emulation of Scalacheck's approach to limiting emission of test cases.
     *
     * @param maximumNumberOfCases   *Upper* limit on the number of cases
     *                               emitted. <b>For Scalacheck aficionados:
     *                               the name reflects the fact that this is
     *                               a limit, contrast with Scalacheck's
     *                               {@code minSuccessfulTests}.</b>
     * @param maximumStarvationRatio Maximum ratio of case starvation versus
     *                               case emission.
     * @return A fresh strategy instance.
     * @implNote Like Scalacheck, the strategy will allow {@code
     * maximumNumberOfCases * maximumStarvationRatio} starvation to take
     * place before giving up.
     */
    static CasesLimitStrategy counted(int maximumNumberOfCases,
                                      double maximumStarvationRatio) {
        return new CasesLimitStrategy() {
            int numberOfCasesEmitted = 0;
            int starvationCount = 0;

            {
                Preconditions.checkArgument(0 <= maximumNumberOfCases);
                Preconditions.checkArgument(0 <= maximumStarvationRatio);
            }

            @Override
            public boolean moreToDo() {
                return maximumNumberOfCases > numberOfCasesEmitted &&
                       starvationCount <=
                       maximumNumberOfCases * maximumStarvationRatio;
            }

            @Override
            public void noteRejectionOfCase() {
                numberOfCasesEmitted -= 1;
                starvationCount += 1;
            }

            @Override
            public void noteEmissionOfCase() {
                numberOfCasesEmitted += 1;
            }

            @Override
            public void noteStarvation() {
                starvationCount += 1;
            }

            @Override
            public boolean legacyMethod(int whatIsThisFor){ return true; }
        };
    }

/**
     * Query used by the implementation of {@link Trials} to control the
     * emission of new cases.
     *
     * @return True to signal that more cases should be emitted, false to
     * stop emission.
     * @apiNote Once a call returns false, there should be no further
     * interaction with the strategy by the implementation of {@link Trials}
     * except for additional calls to this method.
     */
    boolean moreToDo();

    /**
     * Notes that inlined case filtration in a test body has rejected a case.
     *
     * @apiNote This is <b>not</b> called when the filtration provided by
     * {@link Trials#filter(Predicate)} rejects a case. When this method is
     * called, there should have been a corresponding call to
     * {@link CasesLimitStrategy#noteEmissionOfCase} concerning the same
     * implied test case that is being backed out of by this method's call.
     */
    void noteRejectionOfCase();

    /**
     * Notes that a case has been successfully emitted. The case is
     * guaranteed to have been constructed in a different way from all others
     * emitted within a call to
     * {@link Trials.SupplyToSyntax#supplyTo(Consumer)}.
     *
     * @apiNote Although each emitted case has been uniquely constructed,
     * this does not mean that it is definitely unique in terms of equality;
     * for one thing, the equality may be unable to distinguish between
     * instances constructed in different ways and for another, the rendition
     * of a test case may flatten information causing collisions between test
     * cases built in different ways.
     */
    void noteEmissionOfCase();

    /**
     * Notes that a case has not been successfully emitted. This can be due
     * to it being a duplicate of an earlier case emitted previously in a
     * call to {@link Trials.SupplyToSyntax#supplyTo(Consumer)}, or may be
     * due to the filtration provided by {@link Trials#filter(Predicate)}
     * rejecting a case, or may be due to the complexity limit being breached.
     *
     * @apiNote This is  <b>not</b> called due to inlined test filtration -
     * that is handled by {@link CasesLimitStrategy#noteRejectionOfCase}.
     */
    void noteStarvation();

    boolean legacyMethod(int whatIsThisFor);
}

The same file, only this time with the sense of the merge reversed:


package com.sageserpent.americium.java;

import com.google.common.base.Preconditions;

import java.time.Duration;
import java.time.Instant;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

import static scala.jdk.javaapi.DurationConverters.toJava;

/**
 * Strategy used to limit the emission of cases by the implementation of
 * {@link Trials}. These are supplied by client code when calling
 * {@link Trials#withStrategy(Function)}.
 *
 * @apiNote Instances are expected to be stateful, so they should not be
 * reused when calling the aforementioned overloads.
 */
public interface CasesLimitStrategy {
    /**
     * Limits test case emission using a time budget that starts when the
     * strategy is first consulted via {@link CasesLimitStrategy#moreToDo()}.
     *
     * @param timeBudget How long to allow a testing cycle to continue to
     *                   emit cases.
     * @return A fresh strategy instance - the time budget is not consumed
     * until the first call to {@link CasesLimitStrategy#moreToDo()}.
     */
    static CasesLimitStrategy timed(final Duration timeBudget) {
        return new CasesLimitStrategy() {
            Instant deadline = Instant.MAX;

            @Override
            public boolean moreToDo() {
                if (deadline.equals(Instant.MAX)) {
                    deadline = Instant.now().plus(timeBudget);
                }

                return !Instant.now().isAfter(deadline);
            }

            @Override
            public void noteRejectionOfCase() {

            }

            @Override
            public void noteEmissionOfCase() {

            }

            @Override
            public void noteStarvation() {

            }

            @Override
            public boolean legacyMethod(int whatIsThisFor){ return true; }
        };
    }

    /**
     * Limits test case emission using a time budget that starts when the
     * strategy is first consulted via {@link CasesLimitStrategy#moreToDo()}.
     *
     * @param timeBudget How long to allow a testing cycle to continue to
     *                   emit cases.
     * @return A fresh strategy instance - the time budget is not consumed
     * until the first call to {@link CasesLimitStrategy#moreToDo()}.
     */
    static CasesLimitStrategy timed(
            final scala.concurrent.duration.FiniteDuration timeBudget) {
        return timed(toJava(timeBudget));
    }

    /**
     * Emulation of Scalacheck's approach to limiting emission of test cases.
     *
     * @param maximumNumberOfCases   *Upper* limit on the number of cases
     *                               emitted. <b>For Scalacheck aficionados:
     *                               the name reflects the fact that this is
     *                               a limit, contrast with Scalacheck's
     *                               {@code minSuccessfulTests}.</b>
     * @param maximumStarvationRatio Maximum ratio of case starvation versus
     *                               case emission.
     * @return A fresh strategy instance.
     * @implNote Like Scalacheck, the strategy will allow {@code
     * maximumNumberOfCases * maximumStarvationRatio} starvation to take
     * place before giving up.
     */
    static CasesLimitStrategy counted(int maximumNumberOfCases,
                                      double maximumStarvationRatio) {
        return new CasesLimitStrategy() {
            int numberOfCasesEmitted = 0;
            int starvationCount = 0;

            {
                Preconditions.checkArgument(0 <= maximumNumberOfCases);
                Preconditions.checkArgument(0 <= maximumStarvationRatio);
            }

            @Override
            public boolean moreToDo() {
                return maximumNumberOfCases > numberOfCasesEmitted &&
                       starvationCount <=
                       maximumNumberOfCases * maximumStarvationRatio;
            }

            @Override
            public void noteRejectionOfCase() {
                numberOfCasesEmitted -= 1;
                starvationCount += 1;
            }

            @Override
            public void noteEmissionOfCase() {
                numberOfCasesEmitted += 1;
            }

            @Override
            public void noteStarvation() {
                starvationCount += 1;
            }

            @Override
            public boolean legacyMethod(int whatIsThisFor){ return true; }
        };
    }

/**
     * Query used by the implementation of {@link Trials} to control the
     * emission of new cases.
     *
     * @return True to signal that more cases should be emitted, false to
     * stop emission.
     * @apiNote Once a call returns false, there should be no further
     * interaction with the strategy by the implementation of {@link Trials}
     * except for additional calls to this method.
     */
    boolean moreToDo();

    /**
     * Notes that inlined case filtration in a test body has rejected a case.
     *
     * @apiNote This is <b>not</b> called when the filtration provided by
     * {@link Trials#filter(Predicate)} rejects a case. When this method is
     * called, there should have been a corresponding call to
     * {@link CasesLimitStrategy#noteEmissionOfCase} concerning the same
     * implied test case that is being backed out of by this method's call.
     */
    void noteRejectionOfCase();

    /**
     * Notes that a case has been successfully emitted. The case is
     * guaranteed to have been constructed in a different way from all others
     * emitted within a call to
     * {@link Trials.SupplyToSyntax#supplyTo(Consumer)}.
     *
     * @apiNote Although each emitted case has been uniquely constructed,
     * this does not mean that it is definitely unique in terms of equality;
     * for one thing, the equality may be unable to distinguish between
     * instances constructed in different ways and for another, the rendition
     * of a test case may flatten information causing collisions between test
     * cases built in different ways.
     */
    void noteEmissionOfCase();

    /**
     * Notes that a case has not been successfully emitted. This can be due
     * to it being a duplicate of an earlier case emitted previously in a
     * call to {@link Trials.SupplyToSyntax#supplyTo(Consumer)}, or may be
     * due to the filtration provided by {@link Trials#filter(Predicate)}
     * rejecting a case, or may be due to the complexity limit being breached.
     *
     * @apiNote This is  <b>not</b> called due to inlined test filtration -
     * that is handled by {@link CasesLimitStrategy#noteRejectionOfCase}.
     */
    void noteStarvation();

    boolean legacyMethod(int whatIsThisFor);
}

The expected content, as discussed in #30:

package com.sageserpent.americium.java;

import com.google.common.base.Preconditions;

import java.time.Duration;
import java.time.Instant;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

import static scala.jdk.javaapi.DurationConverters.toJava;

/**
 * Strategy used to limit the emission of cases by the implementation of
 * {@link Trials}. These are supplied by client code when calling
 * {@link Trials#withStrategy(Function)}.
 *
 * @apiNote Instances are expected to be stateful, so they should not be
 * reused when calling the aforementioned overloads.
 */
public interface CasesLimitStrategy {
    /**
     * Limits test case emission using a time budget that starts when the
     * strategy is first consulted via {@link CasesLimitStrategy#moreToDo()}.
     *
     * @param timeBudget How long to allow a testing cycle to continue to
     *                   emit cases.
     * @return A fresh strategy instance - the time budget is not consumed
     * until the first call to {@link CasesLimitStrategy#moreToDo()}.
     */
    static CasesLimitStrategy timed(final Duration timeBudget) {
        return new CasesLimitStrategy() {
            Instant deadline = Instant.MAX;

            @Override
            public boolean moreToDo() {
                if (deadline.equals(Instant.MAX)) {
                    deadline = Instant.now().plus(timeBudget);
                }

                return !Instant.now().isAfter(deadline);
            }

            @Override
            public void noteRejectionOfCase() {

            }

            @Override
            public void noteEmissionOfCase() {

            }

            @Override
            public void noteStarvation() {

            }

            @Override
            public boolean legacyMethod(int whatIsThisFor){ return true; }
        };
    }

    /**
     * Limits test case emission using a time budget that starts when the
     * strategy is first consulted via {@link CasesLimitStrategy#moreToDo()}.
     *
     * @param timeBudget How long to allow a testing cycle to continue to
     *                   emit cases.
     * @return A fresh strategy instance - the time budget is not consumed
     * until the first call to {@link CasesLimitStrategy#moreToDo()}.
     */
    static CasesLimitStrategy timed(
            final scala.concurrent.duration.FiniteDuration timeBudget) {
        return timed(toJava(timeBudget));
    }

    /**
     * Emulation of Scalacheck's approach to limiting emission of test cases.
     *
     * @param maximumNumberOfCases   *Upper* limit on the number of cases
     *                               emitted. <b>For Scalacheck aficionados:
     *                               the name reflects the fact that this is
     *                               a limit, contrast with Scalacheck's
     *                               {@code minSuccessfulTests}.</b>
     * @param maximumStarvationRatio Maximum ratio of case starvation versus
     *                               case emission.
     * @return A fresh strategy instance.
     * @implNote Like Scalacheck, the strategy will allow {@code
     * maximumNumberOfCases * maximumStarvationRatio} starvation to take
     * place before giving up.
     */
    static CasesLimitStrategy counted(int maximumNumberOfCases,
                                      double maximumStarvationRatio) {
        return new CasesLimitStrategy() {
            int numberOfCasesEmitted = 0;
            int starvationCount = 0;

            {
                Preconditions.checkArgument(0 <= maximumNumberOfCases);
                Preconditions.checkArgument(0 <= maximumStarvationRatio);
            }

            @Override
            public boolean moreToDo() {
                return maximumNumberOfCases > numberOfCasesEmitted &&
                       starvationCount <=
                       maximumNumberOfCases * maximumStarvationRatio;
            }

            @Override
            public void noteRejectionOfCase() {
                numberOfCasesEmitted -= 1;
                starvationCount += 1;
            }

            @Override
            public void noteEmissionOfCase() {
                numberOfCasesEmitted += 1;
            }

            @Override
            public void noteStarvation() {
                starvationCount += 1;
            }

            @Override
            public boolean legacyMethod(int whatIsThisFor){ return true; }
        };
    }

    /**
     * Query used by the implementation of {@link Trials} to control the
     * emission of new cases.
     *
     * @return True to signal that more cases should be emitted, false to
     * stop emission.
     * @apiNote Once a call returns false, there should be no further
     * interaction with the strategy by the implementation of {@link Trials}
     * except for additional calls to this method.
     */
    boolean moreToDo();

    /**
     * Notes that inlined case filtration in a test body has rejected a case.
     *
     * @apiNote This is <b>not</b> called when the filtration provided by
     * {@link Trials#filter(Predicate)} rejects a case. When this method is
     * called, there should have been a corresponding call to
     * {@link CasesLimitStrategy#noteEmissionOfCase} concerning the same
     * implied test case that is being backed out of by this method's call.
     */
    void noteRejectionOfCase();

    /**
     * Notes that a case has been successfully emitted. The case is
     * guaranteed to have been constructed in a different way from all others
     * emitted within a call to
     * {@link Trials.SupplyToSyntax#supplyTo(Consumer)}.
     *
     * @apiNote Although each emitted case has been uniquely constructed,
     * this does not mean that it is definitely unique in terms of equality;
     * for one thing, the equality may be unable to distinguish between
     * instances constructed in different ways and for another, the rendition
     * of a test case may flatten information causing collisions between test
     * cases built in different ways.
     */
    void noteEmissionOfCase();

    /**
     * Notes that a case has not been successfully emitted. This can be due
     * to it being a duplicate of an earlier case emitted previously in a
     * call to {@link Trials.SupplyToSyntax#supplyTo(Consumer)}, or may be
     * due to the filtration provided by {@link Trials#filter(Predicate)}
     * rejecting a case, or may be due to the complexity limit being breached.
     *
     * @apiNote This is  <b>not</b> called due to inlined test filtration -
     * that is handled by {@link CasesLimitStrategy#noteRejectionOfCase}.
     */
    void noteStarvation();

    boolean legacyMethod(int whatIsThisFor);
}

All that remains is to bolster MatchesContextTest so that all the code paths added in to handle one-sided deletions and edits have injected faults handled...

Merged into main; fast-forwarded to Git commit SHA: d91e5675bf1b11bfe40c6e1587f93c6e20d769cc.

sageserpent-open / kineticMerge #9