hcoles / pitest

State of the art mutation testing system for the JVM
Apache License 2.0
1.65k stars 351 forks source link

Pitest vs Defensive Programming #89

Open gvsmirnov opened 10 years ago

gvsmirnov commented 10 years ago

Consider the following snippet (reduced from actual production code):

public int evaluate(final int distance) {
    if(distance == 0) {
        throw new IllegalArgumentException("Distance should be non-zero");
    }

    //...
    if(distance > 0) {
        result += calculateDistanceAdjustment(distance);
    }
    //...

    return result;
}

To which PIT will generate a false positive:

> if(distance > 0) {
changed conditional boundary → SURVIVED

(distance is guaranteed to be non-zero in that line)

Surely, that is not something easily fixed, but the issue arises in projects which use defensive programming. Or just check arguments in the beginning of the method, for that mater.

StefanPenndorf commented 10 years ago

What you request in gerneral is to determine equivalent mutations or to determine equivalence of different programs which is np complete and thus cannot be determined.

Of course there are simple cases and special cases that can be solved like the problem you described. But what about removing the method call in line 6 in the following programm?

public int compute() {
  if(inCache()) {
    return valueFromCache();
  }
  final int value = expensiveComputation();
  putInCache(value); // line 6
  return value;
}

Removing line 6 does not change the functional correctness of the program. It changes performance only.

Even the program you described can be refactored to a similar version which is hard to check for an equivalent mutation (for example by introducing additional methods).

The direct approach

A direct check for the equivalence in your sample must be smart enough to recognize, that distance is a final variable and thus cannot be changed as well as that it cannot be zero at the second check because otherwise an IAE whould have been thrown before. Such a check requires to track all variables and the complete flow of the program as well as a smart deduction engine to determine the possible values of each and every variable at each and every line.

Such a deduction engine might find simple equivalent mutations but will fail if more public methods come into play or if logic is split accross multiple classes (or do you check post conditions of all return values as well?).

Detecting equivalent mutants via effect on coverage

Another approach proposed by David Schuler and Andreas Zeller (see their paper) is to meassure the effect of the mutant to code coverage: If a mutant does not effect the coverage of the tests run it is more likely to be an equivalent mutant. This approach will not produce exact classifications like the "direct approach" but might provide a ranking which survivors should be examined first.

Avoiding equivalent mutants (in your example)

I think that problem can be addressed by separating the calculation and the contract/sanity check by introducing a new concept "Distance". If you create an immutable value class Distance you can do the check for zero in the constructor.

public class Distance {
  private final int distanceValue;

  public Distance(final int distanceValue) {
    if(distanceValue == 0) {
      throw new IllegalArgumentException("Distance value must be non-zero.");
    }
    this.distanceValue=distanceValue;
  }

  public boolean isPositive() {
    return distanceValue > 0;
  }
}
.....
public int evaluate(final Distance distance) {
    if(distance == null) {
        throw new IllegalArgumentException("Distance must not be null.");
    }
    //...
    if(distance.isPositive()) {
        result += calculateDistanceAdjustment(distance);
    }
    //...
    return result;
}

But I also know this approach with lots of small classes is not every one's cup of tea.

gvsmirnov commented 10 years ago

@KyleRogers Thanks for the detailed answer. Yeah, I do realize that it is (probably) not something that can be solved in general. On the bright side, there are always some heuristics which can help reduce the number of false positives. What I have posted here is just a mention of a relatively large cluster of those that do occur in real projects.