stryker-mutator / stryker-js

Mutation testing for JavaScript and friends
https://stryker-mutator.io
Apache License 2.0

Small variation in mutation score between coverageAnalysis perTest and all #249

Closed nicojs closed 6 years ago

nicojs commented 7 years ago

There is a small variation in mutation score between coverageAnalysis 'perTest' and 'all'. I cannot explain this right away; we need more investigation.

On the master branch:

Per test:

1591 total mutants.
603 mutants survived.
  of which 288 were not covered by the tests.
4 mutant(s) caused an error and were therefore not accounted for in the mutation score.
47 mutants timed out.
937 mutants killed.
Ran 27.32 tests per mutant on average.
Mutation score based on covered code: 75.75%
Mutation score based on all code: 62.00%
[2017-03-02 07:54:47.168] [INFO] Stryker - Done in 2 minutes 29 seconds.

all:

1591 total mutants.
595 mutants survived.
  of which 288 were not covered by the tests.
4 mutant(s) caused an error and were therefore not accounted for in the mutation score.
47 mutants timed out.
945 mutants killed.
Ran 108.93 tests per mutant on average.
Mutation score based on covered code: 76.37%
Mutation score based on all code: 62.51%
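
The two percentages in each report can be reproduced from the counts printed above them. The sketch below is only an inference from these numbers, not code taken from Stryker: it assumes that timed-out mutants count as detected, that errored mutants are excluded from the total, and that the "covered code" score additionally excludes the mutants not covered by tests. The function and property names are hypothetical.

```javascript
// Assumed formula, inferred from the reports in this issue (not from Stryker's source).
function mutationScores({ total, killed, timedOut, errored, notCovered }) {
  const detected = killed + timedOut;   // timed-out mutants are treated as detected
  const valid = total - errored;        // errored mutants are left out of the score
  const covered = valid - notCovered;   // only mutants that some test covers
  return {
    coveredCode: ((100 * detected) / covered).toFixed(2),
    allCode: ((100 * detected) / valid).toFixed(2),
  };
}

// Counts from the two master-branch runs above.
const perTestScores = mutationScores({ total: 1591, killed: 937, timedOut: 47, errored: 4, notCovered: 288 });
const allScores = mutationScores({ total: 1591, killed: 945, timedOut: 47, errored: 4, notCovered: 288 });

console.log(perTestScores); // { coveredCode: '75.75', allCode: '62.00' }
console.log(allScores);     // { coveredCode: '76.37', allCode: '62.51' }
```

Under these assumptions the whole gap between the two runs comes from the 8 mutants that 'all' kills and 'perTest' lets survive.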

On the "feat-baseline-coverage" branch:

Per test:

1624 total mutants.
617 mutants survived.
  of which 296 were not covered by the tests.
4 mutant(s) caused an error and were therefore not accounted for in the mutation score.
47 mutants timed out.
956 mutants killed.
Ran 48.65 tests per mutant on average.
Mutation score based on covered code: 75.76%
Mutation score based on all code: 61.91%
[2017-03-02 07:39:22.690] [INFO] Stryker - Done in 2 minutes 41 seconds.

all:

1624 total mutants.
611 mutants survived.
  of which 296 were not covered by the tests.
4 mutant(s) caused an error and were therefore not accounted for in the mutation score.
47 mutants timed out.
962 mutants killed.
Ran 110.40 tests per mutant on average.
Mutation score based on covered code: 76.21%
Mutation score based on all code: 62.28%
[2017-03-02 07:35:56.898] [INFO] Stryker - Done in 3 minutes 39 seconds.

riezebosch commented 7 years ago

all:

Ran all tests for this mutant.
73 total mutants.
1 mutants survived.
3 mutants timed out.
69 mutants killed.
Ran 6.11 tests per mutant on average.
Mutation score based on covered code: 98.63%
Mutation score based on all code: 98.63%
[2017-05-01 09:04:50.656] [INFO] Stryker - Done in 9 seconds.

perTest:

73 total mutants.
28 mutants survived.
2 mutants timed out.
43 mutants killed.
Ran 3.81 tests per mutant on average.
Mutation score based on covered code: 61.64%
Mutation score based on all code: 61.64%
[2017-05-01 09:05:44.889] [INFO] Stryker - Done in 9 seconds.

package.json:

    "stryker": "^0.5.9",
    "stryker-api": "^0.4.2",
    "stryker-html-reporter": "^0.3.0",
    "stryker-mocha-runner": "^0.2.0",

Provided a repo via an alternate channel.

nicojs commented 7 years ago

I had some time, so I generated two HTML reports. See attachment: reports.zip

An example of the differences is in the ConfigValidator class. I plan to investigate this further in the (near) future.

nicojs commented 7 years ago

I found what the problem was after a marathon debug session.

The bug lies within the stryker-mocha-framework plugin. This plugin is responsible for filtering out tests to run. We do this by id, which is the ordinal index of the test in the initial test run.

The way stryker-mocha-framework filters the tests is by intercepting addTest (which is called every time it() is executed). The assumption was that the order in which the it() calls are executed corresponds to the index of the test results reported from the initial test run. However, this assumption is wrong.

Take this small suite for example:

describe('a', () => {
  describe('a.b', () => {
    it('a.b.0', () => {
    });
  });
  it('a.0', () => {
  });
});

In this example the test results would be in this order: ['a.b.0', 'a.0']. However, the order in which the it() calls are executed is: ['a.0', 'a.b.0'].
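
To make the consequence concrete, here is a minimal sketch that does not use Mocha or the plugin at all. It takes the two orders quoted above as plain arrays and applies an index-based filter to each. The helper names (selectByIndex, selectByName) are hypothetical, and the name-based variant is only one possible way to avoid the mismatch, not necessarily what the actual fix does.

```javascript
// Orders as quoted in this comment.
const reportedOrder = ['a.b.0', 'a.0'];  // index here is the test id from the initial run
const interceptOrder = ['a.0', 'a.b.0']; // order in which the filter sees the tests

// Hypothetical index-based filter: keep the tests whose position is in `ids`.
function selectByIndex(order, ids) {
  return order.filter((_, index) => ids.includes(index));
}

// Hypothetical alternative: translate ids to names first, then filter by name.
function selectByName(order, ids) {
  const wanted = new Set(ids.map((id) => reportedOrder[id]));
  return order.filter((name) => wanted.has(name));
}

const idsCoveringMutant = [0]; // per the initial run, id 0 means 'a.b.0'

const intended = selectByIndex(reportedOrder, idsCoveringMutant);
const actuallyRun = selectByIndex(interceptOrder, idsCoveringMutant);
const runByName = selectByName(interceptOrder, idsCoveringMutant);

console.log(intended);    // [ 'a.b.0' ]
console.log(actuallyRun); // [ 'a.0' ]
console.log(runByName);   // [ 'a.b.0' ]
```

With the index-based filter a different test is run than the one that covers the mutant, so a mutant that 'all' would kill can survive under 'perTest'.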

This is a major bug. We should fix it ASAP.

nicojs commented 7 years ago

reports_different_coverage_analysis.zip

I've run the 3 coverage analysis strategies after the fix of #413 and they work great! See the upload.