JuliaCI / Coverage.jl

Take Julia code coverage and memory allocation results, do useful things with them
MIT License
174 stars 68 forks source link

Coverage information does not seem to be right #187

Closed ronisbr closed 5 years ago

ronisbr commented 5 years ago

Hi guys!

After the new version of Coverage, the coverage information does not seem to be right. Take a look, for example, at the function compose_rotation here:

https://coveralls.io/builds/19787299/source?filename=src/compose_rotations.jl#L53

It says that it is not tested at all. However, take a look at the test here:

https://github.com/JuliaSpace/ReferenceFrameRotations.jl/blob/6fd8162b51958ed809fe36c268eb11bd0a506a43/test/test_compose_rotations_quaternions.jl#L49

The last version of Coverage was saying that ReferenceFrameRotations.jl has a coverage of 100% and now it is 74%.

Notice that the same thing happened with Codecov.

ronisbr commented 5 years ago

By the way, I am also wondering why it is marking this line as it was not executed:

https://codecov.io/gh/JuliaSpace/ReferenceFrameRotations.jl/src/master/src/DCM.jl#L345

whereas the surrounding lines were executed.

ronisbr commented 5 years ago

Using --inline=no seems to fix the problem. However, the tests take ages to complete. Is there any other workaround?

ronisbr commented 5 years ago

And I could not use --inline=no in Travis. It is working only on local analysis. The command I used was:

julia --check-bounds=yes --color=yes --inline=no -e "VERSION >= v\"0.7.0-DEV.5183\" && using Pkg; Pkg.test(\"ReferenceFrameRotations\", coverage=true)"
ararslan commented 5 years ago

Sounds like https://github.com/JuliaLang/julia/issues/28192

ronisbr commented 5 years ago

@ararslan yes, maybe. However, I started to see the problem today. The last coverage result, which was 2 days ago, was working perfectly.

ararslan commented 5 years ago

cc @fingolfin

ararslan commented 5 years ago

Yeah looking around at various PRs with coverage integration, I think we're going to need to revert the latest changes to this package unless we can come up with a quick fix. Something is very wrong.

fingolfin commented 5 years ago

The old coverage was "accidentally correct". The problem is that due to a bug in Coverage.jl, it reported lines that do not contain code, and lines that are not covered, the same way. You can see that in e.g. https://coveralls.io/builds/18398285/source?filename=src/compose_rotations.jl only one line is marked green, but no lines red; so that means it thinks that exactly that one line contains code. Which is of course wrong.

When I fixed this, the other lines of code started to be marked as "is code". But Julia still reports that there is no coverage for them, so now they get marked as "uncovered code".

That's unfortunate, but other than either enabling --inline=no, the only other fix I can think of is to improve how Julia reports code coverage. Specifically, it could try to attribute coverage to inlined functions (I think for C/C++, GCC somehow manages that quite well).

I do not see any way JuliaCoverage.jl can help with that, other than reverting the fix and returning to broken coverage reporting. Which will result in great coverage scores for everybody, with a lot of uncovered code not reported as such. I.e., pick what kind of error you prefer, first or second order :/.

fingolfin commented 5 years ago

@ararslan I am not sure anything is wrong in Coverage.jl, other than it not hiding the effects of https://github.com/JuliaLang/julia/issues/28192 anymore.

That said, one could of course try to mixup things further, and e.g. not report macros as "code"; then the example @ronisbr reports for src/compose_rotations.jl would be "fixed". But not https://codecov.io/gh/JuliaSpace/ReferenceFrameRotations.jl/src/master/src/DCM.jl#L345

ronisbr commented 5 years ago

I understood, thanks @fingolfin . Indeed, running with --inline=no fixed the problem. However, I was not able to run this no Travis. When I tried to enable, it made no difference.

fingolfin commented 5 years ago

As a matter of fact, I wonder if any project using Coverage.jl with Julia 0.7 or 1.0 had a coverage score other than 100% prior to my fixes; I don't see how, at least. While that looks great, it is also completely useless...

ronisbr commented 5 years ago

I see. The local runs with inline=no showed 98% of coverage. I just need then to make this work in Travis somehow... ideas?

fingolfin commented 5 years ago

What did you try? I am looking at https://github.com/JuliaSpace/ReferenceFrameRotations.jl/blob/master/.travis.yml now. I assume you edited the four calls to Julia there to insert --inline=no? And that then did not help (tested how? via a PR to your own project?)

KristofferC commented 5 years ago

Currently, Pkg would have to be tweaked to run the tests with --inline=no.

ronisbr commented 5 years ago

Hi @fingolfin,

I modified the script in travis to execute the following command in my project:

julia --check-bounds=yes --color=yes --inline=no -e "VERSION >= v\"0.7.0-DEV.5183\" && using Pkg; Pkg.test(\"ReferenceFrameRotations\", coverage=true)"

I have reverted the commit because I saw no difference.

@KristofferC,

I understood that something inside Pkg must change to fix this. I have two questions:

  1. Is there any workaround, since it works locally?
  2. Should I open an issue at Julia project?
KristofferC commented 5 years ago

No super good workaround at the moment.

Please open an issue at Pkg.jl.

ronisbr commented 5 years ago

Done, should we close this one since it is not related to this package?

tkoolen commented 5 years ago

While I get the sentiment, I personally prefer accidentally correct over way off until the real issue is fixed in Base. It wasn't perfect, but it certainly was closer to the truth.

fingolfin commented 5 years ago

But always reporting 100% is only closer to the truth if you already know your coverage is above 50%. I'd also argue that it is always useless. You might as well turn off coverage completely.

fingolfin commented 5 years ago

I started a PR for Pkg here: https://github.com/JuliaLang/Pkg.jl/pull/866

tkoolen commented 5 years ago

Thank you for the Pkg PR and also your efforts in maintaining this package.

Coverage 0.6.0 was not always reporting 100% for me, or for many other packages. Check out e.g.

https://codecov.io/github/JuliaArrays/StaticArrays.jl?branch=master

I'm just saying that stuff like https://codecov.io/gh/JuliaArrays/StaticArrays.jl/src/1683f798e0dcb8725768407053a45ae48434fed7/src/eigen.jl#L39 makes me completely distrust the current results. Let's see how much not inlining helps. I'm certainly not saying that the situation before was good though.

ronisbr commented 5 years ago

For ReferenceFrameRotations, when I tested the coverage off-line using inline=no, the results seem much more accurate. I could check this because the functions and the tests are simple.

fingolfin commented 5 years ago

@tkoolen yes, for some functions, the coverage will indeed be less than 100%; but still, coverage is massively overreported in most cases. Just try adding a big function to your code which is not called from anywhere. Doing that should decrease coverage, obviously. But it won't with Coverage.jl 0.6.0, because that function won't be recognized as code.

I agree that https://codecov.io/gh/JuliaArrays/StaticArrays.jl/src/1683f798e0dcb8725768407053a45ae48434fed7/src/eigen.jl#L39 looks bad, but I am also pretty sure it'll improved dramatically with inlining turned off. What you see there is typical for what happens to coverage with inlining turned on.

But please also consider the converse: https://codecov.io/gh/JuliaArrays/StaticArrays.jl/src/86097997bd846f0b9f4331cd55ba8a233375173b/src/FixedSizeArrays.jl reports coverage of 82.14% -- this is with Coverage.jl 0.6.0. With 0.7.0 it is down to 38.33% -- and indeed, note the 24 line function fsa_ast which is completely uncovered.

Personally, I'd rather get lower coverage statics, but not miss uncovered code, than false statistics which lure me into a false sense of code being covered. Of course what I really prefer is accurate coverage data, as I get it for my code written in other languages. I hope Julia and the Julia ecosystem can catch up to that (which is why I got active on this -- the coverage reports we were seeing were all completely useless).

fingolfin commented 5 years ago

I guess a compromise workaround for now could be to do this: instead of marking all code in all functions as, well, code (which causes the coverage drop people are seeing), "only" do so for code in functions for which Julia has marked zero lines as covered. That would hide e.g. the problem in https://codecov.io/gh/JuliaArrays/StaticArrays.jl/src/1683f798e0dcb8725768407053a45ae48434fed7/src/eigen.jl or in https://codecov.io/gh/JuliaSpace/ReferenceFrameRotations.jl/src/master/src/DCM.jl#L345 but would still at least report functions which are never called.

However, I suspect that it still be incorrect in the sense that there would still be code in functions that has no coverage, and yet is reported as having coverage. However, I don't have an example for that right now, so I might be wrong.

It would also still be wrong the other direction, because some functions apparently may be called but completely inlined and thus not reported, as happens e.g. here: https://coveralls.io/builds/19787299/source?filename=src/compose_rotations.jl#L53

tkoolen commented 5 years ago

instead of marking all code in all functions as, well, code (which causes the coverage drop people are seeing), "only" do so for code in functions for which Julia has marked zero lines as covered

Yeah, that's not a bad idea.

sbromberger commented 5 years ago

I wonder if any project using Coverage.jl with Julia 0.7 or 1.0 had a coverage score other than 100% prior to my fixes

Prior to yesterday/today, LightGraphs had a coverage score of 99.8% but it should've been 100%. Now it's got a coverage score of 66%, and no coverage changes have occurred. The coverage reports show lines that are definitely covered but show no coverage.

bramtayl commented 5 years ago

I miss coverage. Anything I can do to help out here?

tkoolen commented 5 years ago

Should be much better on Julia 1.1 due to https://github.com/JuliaLang/julia/pull/30251, although I haven't tried yet. Are you still seeing issues on 1.1?

tpapp commented 5 years ago

I can confirm that coverage is much more accurate with 1.1.

rschwarz commented 5 years ago

By the way (slightly off-topic), when I use Travis to run my tests on both Julia 1.0 and 1.1, both will submit coverage reports. Which ones will be used by codecov or coveralls? The one that is executed most recently?

bramtayl commented 5 years ago

Hmm. Well I'm trying to push up coverage of LightQuery right now. It's tested on 1.1 and nightly. Some parts of the coverage report definitely do not look right, e.g. see https://codecov.io/gh/bramtayl/LightQuery.jl/src/master/src/LightQuery.jl#L277

adamglos92 commented 5 years ago

I've seen improvement in my code as well, unfortunately some trivial lines are not counted, e.g., see https://coveralls.io/builds/21400908/source?filename=src/dynamics.jl#L86

fingolfin commented 5 years ago

@adamglos92 I believe this is an issue in Julia, not in Coverage.jl

fingolfin commented 5 years ago

@bramtayl so the issue is that the module LightQuery line is marked as code? That seems indeed a bit odd (but also harmless).

fingolfin commented 5 years ago

@leethargo Codecov accumulates all coverage reports together into one; that's very handy if one has code that only is used in certain configurations or environments. Coveralls also in principle can do this, but it's very easy to trip up, as I recently learned the hard way, so I'll refrain from making claims about here ;-)

bramtayl commented 5 years ago

I did some digging into this a bit ago. I ran coverage locally and took a look at the results. What I found was that locally, a lot of lines that were code weren't actually marked as code, so my coverage was 100%. And then online at codecov, there were definitely many lines marked as uncovered that were indeed covered. I spent a lot of time trying to get coverage up to 100% but it currently sits at 73% on codecov.

sbromberger commented 5 years ago

I'm still reporting out 93% even though we've got ~100% code coverage. There are simple lines that are simply not counted. I don't know how to work around or fix this, but we're going to have to remove the PR checks that ensure a certain level of coverage unless we can get more accurate reporting.

fingolfin commented 5 years ago

You may want to check out PR #208. In it, I added an environment variable DISABLE_AMEND_COVERAGE_FROM_SRC which you can set to yes to disable the heuristic which marks additional lines as code. While I still think this is problematic on its own, I think it's only fair to let everybody choose which way they prefer, at least until we one day hopefully have a proper solution

Example configuration for AppVeyor:

environment:
  DISABLE_AMEND_COVERAGE_FROM_SRC: yes

Example configuration for Travis:

env:
  global:
    - DISABLE_AMEND_COVERAGE_FROM_SRC=yes

It'd be great if some of you could test if this works for you as intended; then we could put it into the next Coverage.jl release (hopefully out this week), at least assuming @ararslan is fine with it, too.

sbromberger commented 5 years ago

It'd be great if some of you could test if this works for you as intended

ref: https://github.com/JuliaGraphs/LightGraphs.jl/pull/1163

fingolfin commented 5 years ago

I forgot to mention: of course to test with DISABLE_AMEND_COVERAGE_FROM_SRC right now, you'll have to make sure that my mh/DISABLE_AMEND_COVERAGE_FROM_SRC branch of Coverage.jl is installed, not that last release version, which of course does not yet have this experimental change... I'll try to either provide instructions for that, or perhaps we can just make a release with DISABLE_AMEND_COVERAGE_FROM_SRC in it, to make it easier for people to test this... @ararslan thoughts?

ararslan commented 5 years ago

Having the change on master with some documentation and a note clearly stating that it's experimental seems like a reasonable course of action to me.

bramtayl commented 5 years ago

set to yes to disable the heuristic which marks additional lines as code. While I still think this is problematic on its own

Can you give more details on this, @fingolfin ?

fingolfin commented 5 years ago

@bramtayl details on what? on why the heuristic is needed (and thus why disabling it is problematic)? This was extensively discussed in this issue already, and also in several other issues and PRs. In addition, I've added something about it to the README in PR #208.

fingolfin commented 5 years ago

I have just released Coverage.jl version 0.9.0. I hope many of you will see some improvement in coverage thanks to PR #210. I urge you to try that first. If you are still unhappy, you can try the DISABLE_AMEND_COVERAGE_FROM_SRC env var described above (and also described in README.md), to see if helps; but I'll repeat once more that this also risks missing completely uncovered functions, as those won't be counted towards coverage anymore.

ronisbr commented 5 years ago

@fingolfin Thanks! It is indeed much better on my tests.

sbromberger commented 5 years ago

yup. we're back up to 99.11% coverage now. Thanks!

fingolfin commented 5 years ago

Glad to hear this. And @sbromberger tested DISABLE_AMEND_COVERAGE_FROM_SRC, and you can see the bad thing that happens if our workaround heuristic is completely disabled using DISABLE_AMEND_COVERAGE_FROM_SRC: the coverage percentage goes up, because unused functions are ignored. Example: https://codecov.io/gh/JuliaGraphs/LightGraphs.jl/src/c376fd3c947b41f1eea0315a05fff6c069711cd0/src/Parallel/shortestpaths/johnson.jl#L40>

Without the option, and just Coverage.jl 0.9.0, the function is correctly marked as code that is not executed: https://codecov.io/gh/JuliaGraphs/LightGraphs.jl/src/master/src/Parallel/shortestpaths/johnson.jl#L40

fingolfin commented 5 years ago

I think we could close this now... @ararslan ?

sbromberger commented 5 years ago

Just a final question: what should we expect if we turn off DISABLE_AMEND_COVERAGE_FROM_SRC ?

fingolfin commented 5 years ago

I am not sure I understand the question... turn off the DISABLE flag? As in, not use it? And what do you want to know about that?!