lemurheavy / coveralls-public

The public issue tracker for coveralls.io
http://coveralls.io

Make coverage assessment sensitive to concise rewrites. #1325

Open alexandermorgan opened 5 years ago

alexandermorgan commented 5 years ago

Feature request: It would be nice if coverage assessments took into account whether new changes really introduce any new untested lines. There are several other issues about the way the coverage number is arrived at, and the threshold for a build failure, but this is different. Imagine a repo has 100 functions, and one of those functions looks like this:

def stringifyRange(x):
    result = []
    for i in range(x):
        result.append(str(i))
    return result

Let's assume this function is tested and has 100% line coverage, but the overall repo has 90% coverage. Then someone rewrites the function like so:

def stringifyRange(x):
    return [str(i) for i in range(x)]

That function just went from 4 lines to 1 with no functional change, and all the tests still pass. But the coverage will still decrease marginally because three fewer lines are now tested in the whole repo. A repo can change the minimum coverage decrease for a build to be considered failed, but a change like this should probably never trigger a failure. A decrease should not be considered a failure if all changed lines are contained in functions that are 100% covered both before and after the change.

There's a big difference between a coverage decrease where a new untested line of code is added, and one that reduces the number of tested lines without introducing any untested lines, leaving every altered function at 100% coverage. Since you would presumably still calculate coverage in the same way, the change would only be to the assessment: qualifying changes would not be considered failed builds, and maybe the coverage decrease would appear in gray or black instead of red.
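To make the arithmetic concrete, here is a quick sketch; the repo-wide line totals are assumed for illustration, since the issue only gives the function count and the 90% figure:

# Assumed totals: 1000 relevant lines, 900 covered -> 90% coverage.
relevant_before, covered_before = 1000, 900
print(covered_before / relevant_before)   # 0.9

# The rewrite removes 3 covered lines and introduces no uncovered ones.
relevant_after, covered_after = relevant_before - 3, covered_before - 3
print(covered_after / relevant_after)     # ~0.8997

# Coverage drops by about 0.03 percentage points even though no untested
# line was added and the rewritten function is still 100% covered.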

ped7g commented 5 years ago

I wouldn't mind an even less intelligent hack: when the (per-file) missed-lines counter is the same as before but the relevant/covered line count went down, it would use that black/grey classification (for the particular file, and for the whole build, unless there's a "real" decrease in another modified file).

For example build like this: https://coveralls.io/builds/25027942

(I refactored almost-identical code from three places into a single function in a different file, lowering the total number of relevant/covered lines while not touching missed lines at all.)
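A minimal sketch of that heuristic, assuming per-file missed/relevant line counts are available for the previous and current builds; the field names and the simple red/green fallback are hypothetical, not Coveralls' actual logic:

def classify_file(prev, curr):
    # Grey: missed lines unchanged but fewer relevant lines, i.e. a concise
    # rewrite that left nothing new untested.
    if curr["missed"] == prev["missed"] and curr["relevant"] < prev["relevant"]:
        return "grey"
    # Otherwise fall back to the usual per-file decision.
    return "red" if curr["missed"] > prev["missed"] else "green"

def classify_build(file_pairs):
    # The whole build only stays grey if no modified file shows a "real" decrease.
    results = [classify_file(prev, curr) for prev, curr in file_pairs]
    if "red" in results:
        return "red"
    return "grey" if "grey" in results else "green"

# Example: same missed count, fewer relevant lines -> grey
print(classify_file({"missed": 5, "relevant": 120}, {"missed": 5, "relevant": 110}))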

I understand this may, in extreme cases, lead to reporting a grey build when the real code quality went slightly down, but if basically all correct refactors that remove duplicated code lead to failed builds, that feels more wrong to me than the occasional grey build that actually deserved red.