twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0
3.48k stars 703 forks source link

Enable Codecoverage Upload (Jacoco/Codecov) #1981

Closed daniel-sudz closed 2 years ago

daniel-sudz commented 2 years ago

basically moving changes over from https://github.com/twitter/scalding/pull/1922 based on https://github.com/twitter/scalding/issues/1977

codecov-commenter commented 2 years ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (develop@525321c). Click here to learn what that means. The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             develop    #1981   +/-   ##
==========================================
  Coverage           ?   40.49%           
  Complexity         ?     1353           
==========================================
  Files              ?      356           
  Lines              ?    26221           
  Branches           ?     4579           
==========================================
  Hits               ?    10617           
  Misses             ?    13938           
  Partials           ?     1666           

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 525321c...5db2df2. Read the comment docs.

johnynek commented 2 years ago

Coverage seems quite low. Are we sure it is aggregating all the projects correctly? Many subprojects wind up testing the others.

daniel-sudz commented 2 years ago

Coverage seems quite low. Are we sure it is aggregating all the projects correctly? Many subprojects wind up testing the others.

there's currently one part in the CI that is failing. It should be a bit higher because of it. I'm not 100% sure why it's failing just get a "step failed" in cascading. I suspect it's due to very high memory usage due to coverage. I can currently reproduce the failure locally when switching coverage on/off before running the test. I'm playing with memory settings but currently not much success. See https://github.com/scoverage/sbt-scoverage/issues/50

daniel-sudz commented 2 years ago

Coverage seems quite low. Are we sure it is aggregating all the projects correctly? Many subprojects wind up testing the others.

there's currently one part in the CI that is failing. It should be a bit higher because of it. I'm not 100% sure why it's failing just get a "step failed" in cascading. I suspect it's due to very high memory usage due to coverage. I can currently reproduce the failure locally when switching coverage on/off before running the test. I'm playing with memory settings but currently not much success. See scoverage/sbt-scoverage#50

after tuning the JVM opts, jacking the stack frame size to 64M (crazy!) seemed to let the failing test pass. It looks like others have had similar issues: https://github.com/scoverage/sbt-scoverage/issues/276 https://github.com/scoverage/sbt-scoverage/issues/422

edit: I though I had coverage enabled for above but I did not so I still have no resolution for sbt-scoverage

johnynek commented 2 years ago

huh:

https://github.com/twitter/scalding/runs/5664208563?check_suite_focus=true#step:7:1367

seems one of the jobs fails (in both 2.11 and 2.12) with coverage... I wonder if somehow the instrumentation is making something non-serializable...

daniel-sudz commented 2 years ago

huh:

https://github.com/twitter/scalding/runs/5664208563?check_suite_focus=true#step:7:1367

seems one of the jobs fails (in both 2.11 and 2.12) with coverage... I wonder if somehow the instrumentation is making something non-serializable...

that's completely possible. I've tested https://www.scala-sbt.org/sbt-jacoco/index.html as an alternative on the failing test and it seems to work. I'm going to try it in CI.

The downside of jacoco is that it knows nothing about scala (targets bytecode) so it might be slightly less accurate.

daniel-sudz commented 2 years ago

@johnynek does 40% sound reasonable? It's recording 0% for maple (there is no test folder) and 0% for scalding-avro (no test folder) and 0% for scalding-thrift-macros (there are some tests but maybe it can't trace macros well).

I think this is similar to what scoverage was reporting.

Report is here: https://codecov.io/gh/twitter/scalding/tree/5db2df2a50b4a92c0fbf934a5dc13b06229624e9 Screen Shot 2022-03-24 at 12 28 30 PM

the reference from 2 years ago was 68%: https://codecov.io/gh/twitter/scalding/tree/f1e9670175047724bf23727ab17f807303281150

Screen Shot 2022-03-24 at 12 34 38 PM

johnynek commented 2 years ago

I don't see how it could have fallen so much, but maybe.

I was looking through, it marks a lot of lines as partially covered, e.g.:

https://app.codecov.io/gh/twitter/scalding/blob/5db2df2a50b4a92c0fbf934a5dc13b06229624e9/scalding-core/src/main/scala/com/twitter/scalding/typed/memory_backend/MemoryBackend.scala

are those being counted?

Anyway, this is an improvement, and we can iterate. Thanks for spending the time on this. We can later see if we can find examples of tests that definitely exercise code, but it doesn't see and add an issue at that time.

daniel-sudz commented 2 years ago

I don't see how it could have fallen so much, but maybe.

I was looking through, it marks a lot of lines as partially covered, e.g.:

https://app.codecov.io/gh/twitter/scalding/blob/5db2df2a50b4a92c0fbf934a5dc13b06229624e9/scalding-core/src/main/scala/com/twitter/scalding/typed/memory_backend/MemoryBackend.scala

are those being counted?

Anyway, this is an improvement, and we can iterate. Thanks for spending the time on this. We can later see if we can find examples of tests that definitely exercise code, but it doesn't see and add an issue at that time.

my guess would be that the partial lines are the ones that can throw exceptions. Ie. they are being executed but not all paths covered

johnynek commented 2 years ago

Thanks!