scoverage / gradle-scoverage

A plugin to enable the use of Scoverage in a gradle Scala project
Apache License 2.0
53 stars 38 forks source link

Make Scoverage "test" task cacheable and fix aggregated module scoverage report. #202

Open SheliakLyr opened 1 year ago

SheliakLyr commented 1 year ago

Hi,

I have discovered that "test" task (when running scoverage) could not be cached in gradle build cache. This is a big problem for me and I cannot use this plugin without this feature.

In order to make this task cacheable, I needed to correctly handle additional outputs (workDir). Unfortunately, it is difficult:

  1. compile task produces file dataDir/scoverage.coverage with metadata.
  2. test task produces files dataDir/scoverage.measurements.* with measurement data (basically a list of statements ids).

The fact that both tasks write to the same directory (it cannot be changed) makes it difficult to correctly describe task outputs. I tried using fileTree().matching(), but this cannot be cached (there is an open issue in gradle for that: https://github.com/gradle/gradle/issues/9131). Thus, a complex workaround is needed :(

Solution:

  1. Compile task defines dataDir/scoverage.coverage as an output. It is cached.
  2. Test task has an additional action that copies scoverage.measurements files to another directory. That directory is registered an an output.
  3. For building a report, we need to join scoverage.coverage and measurements files, so I added an additional task scoverageSyncMetaWithOutputs.
  4. Output from scoverageSyncMetaWithOutputs is then passed to report generation.

This could be simplified: Step (3) may be replaced by different report generation than CoverageAggregator.aggregate, which assumes that metadata and measurements are located in the same directory.

This solves the caching problems and my test build is dramatically faster when rebuilding - as expected. However, some tests in this repository started failing. I discovered that current report generation for multiple test tasks is incorrect.

The problem lies in the method CoverageAggregator.aggregate. It can only be used to merge disjointed scoverage data. By "disjointed" I mean reports for different modules (separate compilation units). It cannot be used to merge reports from multiple "test" tasks in a single module.

To understand why, check its implementation. When merging, statements ids are are converted in order to make them unique. However, when scoverage.coverage files are the same, statements are supposed to overlap. That is expected because sources are the same.

I solved this problem by changing globalReportTask. Instead of collecting all dataDirs from a single module, it uses a single file that is populated by measurement files from all test tasks in that module.

Lessons learned and ideas:

  1. I think that the best solution would be to change scoverage-scalac-plugin. It should have two configuration options instead of just one "dataDir". One for metadata and another for measurements files. Also, second one could be runtime-configurable (via env property).
  2. The test task is cacheable, but probably on just one machine. I can address it in following MR.

Known(expected?) problems:

Test tasks share the same workdir, which may cause problems when running them in parallel. Improving scoverage-scalac-plugin should solve this issue, but right know user must enforce some ordering between those tasks.

maiflai commented 1 year ago

thanks, sorry for the delay. I still need to think a little more about this.

I also think that fixing upstream would be the best solution.

SheliakLyr commented 9 months ago

@maiflai Any further thougths? Should I create an alternative MRs for upstreams (scala3 and scoverage-plugin for scala 2)?

BTW: We are using these proposed changes daily without any problems.