infection / infection

PHP Mutation Testing library
https://infection.github.io
BSD 3-Clause "New" or "Revised" License
2.05k stars 161 forks source link

Add config option to count timeouts as mutant escapes #1133

Open CyberiaResurrection opened 4 years ago

CyberiaResurrection commented 4 years ago

Is your feature request related to a problem? Please describe. I'm concerned that mutation runs on CI environments may inadvertently be overstating MSI scores.

I keep seeing a wide disparity in number of timeouts between running Infection locally (2 timeouts out of ~1200 mutants) vs running on CI (98 timeouts out of ~1000 mutants - that job only mutates files that were changed in the PR) - a timeout rate nearly 60x greater on CI. This is most likely due to the differing processing power available to each case.

Describe the solution you'd like As I understand it, Infection currently counts a mutant against the run's MSI score if it didn't produce some visible misbehaviour (test failure, fatal error, timeout).

I'm asking for an option, settable in either infection-json.dist or supplied via command line, that alters the sense of MSI calculation. This would simply not count timeouts as "visible misbehaviour".

Referring back to the sample Travis job I attached, only the 871 killed mutants (and the zero fatal errors) would count towards the overall MSI (which would be 87.6% under my proposal) and covered-code MSI (which would be 89.1%), not the reported values of 97% and 98%, respectively.

Although my initial use case arises out or running under a CI environment, I would be reluctant to default it to being set for CI runs. That feels too much like setting policy, not merely providing mechanism.

Describe alternatives you've considered

Additional context Sample CI timeout-laden run: https://travis-ci.org/Algo-Web/POData-Laravel/jobs/659174053?utm_medium=notification&utm_source=github_status

theofidry commented 4 years ago

I don't think a timeout should count as an escape: if you have a mutant that makes your for condition an endless loop, you don't have any test that can effectively kill it (and neither should you).

But having a big timeout disparity in the CI definitely indicates an issue. Have you tried with increasing the timeout from 10 to a bigger number to see if it reduces them?

sanmai commented 4 years ago

Let's consider why we have timeouts. Change < to > in a cycle, and there you have it, an infinite cycle. Add some expensive and memory intensive operations inside this cycle, and there is a bomb, waiting to blow. Now you have a mutated process, running in an infinite loop, trying to consume every resource your machine has. And one can reasonably expect CI machines to not have much in this regard.

Now, consider you decided to "fix" this problem by raising a timeout. What's going to happen? Now you will have this vicious process running not for 25 seconds, but for a full minute, all the time putting your system under even greater stress. Even more stress will make every other process slower, hence causing even more timeouts. So, raising a timeout is almost never a solution.

But if you lower the timeout, or introduce a stricter memory limit, you will limit damage caused by runaway mutations, and thus the number of other timing out mutations.

Bottom line

What else can we do?

Seeing this issue, and experiencing this problem myself, I can tell this is actual problem, which is not well understood by the users.

CyberiaResurrection commented 4 years ago

Bumping the timeout to 30 seconds does roughly halve the timeouts - 46 timeouts out of 1000 mutants on the run below, compared to 98 out of 994 mutants on the original run I linked.

https://travis-ci.org/Algo-Web/POData-Laravel/jobs/659388718?utm_medium=notification&utm_source=github_status

CyberiaResurrection commented 4 years ago

@sanmai , I originally didn't want to "fix" the problem by boosting the timeout - that's a diagnostic step that @theofidry asked me to do. My original question/proposal was a config option to count timeouts, at whatever threshold you may have set, as failed-to-detect mutants.

sanmai commented 4 years ago

If you'd want to add something like --max-runaway=N, I'd say it's a good idea. It will let you weed out botched builds for sure, where including these processes into MSI scores may or may not offer SNR improvement. Remember, there are probabilities involved: depending on the state of your project Infection may run tests and/or mutations in a different order.

But don't take only mine word for it. Let's hear what others want to say.

CyberiaResurrection commented 4 years ago

Oh yeah, definitely.

I have a co-maintainer who does tend to get a little over-excited at sharp increases in coverage, whether statement or mutant.

theofidry commented 4 years ago

I took a bit of time to think about the issues and have a bit of chat with others on the topic. There is far from being a consensus on the subject, but there is one thing that is clear to me: I'm less and less convinced that relying on the MSI, min MSI or another additional option is a good idea.

Ultimately a MSI is a very loose metric. It's a nice indicator, but people tend to take too much as "it's over X, so my code is good I can just merge it". Maybe the work here should be done not on trying to add another option to attempt to control that metric, but on the reporting itself. For example:

danepowell commented 7 months ago

I would love a --max-timeouts option. On my project, I'd have zero tolerance for timeouts or runaways since they are insidious: over time, they gradually make your tests slower and slower, especially if you already have a high timeout.