jenkinsci / code-coverage-api-plugin

Deprecated Jenkins Code Coverage Plugin
https://plugins.jenkins.io/code-coverage-api/
MIT License

Stored source code files should use a unique path #657

Open medianick opened 1 year ago

medianick commented 1 year ago

When multiple coverage files are merged, then the covered files might have the same name (and relative path) but use a different absolute path. The plugin currently has no way to choose the correct file.

Original issue:

Jenkins and plugins versions report

Jenkins 2.387.2

Plugins
- plugin-util-api: 3.1.0
- font-awesome-api: 6.3.0-1
- bootstrap5-api: 5.2.2-1
- jquery3-api: 3.6.3-1
- mina-sshd-api-common: 2.9.2-50.va_0e1f42659a_a
- configuration-as-code: 1612.v1d254764cf6b
- mina-sshd-api-core: 2.9.2-50.va_0e1f42659a_a
- echarts-api: 5.4.0-2
- resource-disposer: 0.21
- ws-cleanup: 0.44
- workflow-job: 1284.v2fe8ed4573d4
- build-timeout: 1.28
- credentials-binding: 523.vd859a_4b_122e6
- timestamper: 1.22
- aws-java-sdk-minimal: 1.12.406-370.v8f993c987059
- aws-java-sdk-elasticbeanstalk: 1.12.406-370.v8f993c987059
- workflow-durable-task-step: 1234.v019404b_3832a
- aws-java-sdk-sqs: 1.12.406-370.v8f993c987059
- workflow-cps: 3641.vf58904a_b_b_5d8
- pipeline-groovy-lib: 629.vb_5627b_ee2104
- aws-java-sdk-sns: 1.12.406-370.v8f993c987059
- aws-java-sdk-iam: 1.12.406-370.v8f993c987059
- aws-java-sdk-ecs: 1.12.406-370.v8f993c987059
- aws-java-sdk-ecr: 1.12.406-370.v8f993c987059
- aws-java-sdk-efs: 1.12.406-370.v8f993c987059
- github-branch-source: 1701.v00cc8184df93
- aws-java-sdk-ssm: 1.12.406-370.v8f993c987059
- aws-java-sdk-logs: 1.12.406-370.v8f993c987059
- pipeline-rest-api: 2.31
- authentication-tokens: 1.4
- pipeline-stage-view: 2.31
- aws-java-sdk-codebuild: 1.12.406-370.v8f993c987059
- aws-java-sdk-cloudformation: 1.12.406-370.v8f993c987059
- aws-java-sdk-ec2: 1.12.406-370.v8f993c987059
- aws-java-sdk: 1.12.406-370.v8f993c987059
- pipeline-model-api: 2.2123.va_31cb_3b_80ef8
- pipeline-model-extensions: 2.2123.va_31cb_3b_80ef8
- pipeline-stage-tags-metadata: 2.2123.va_31cb_3b_80ef8
- pipeline-model-definition: 2.2123.va_31cb_3b_80ef8
- datadog: 5.3.0
- jersey2-api: 2.39-1
- embeddable-build-status: 339.v1edb_5e63da_45
- gradle: 2.3.2
- artifactory: 3.18.0
- implied-labels: 0.11
- pipeline-build-step: 487.va_823138eee8b_
- data-tables-api: 1.13.3-2
- forensics-api: 2.0.1
- prism-api: 1.29.0-3
- code-coverage-api: 4.0.1
- analysis-model-api: 11.0.2
- warnings-ng: 10.0.3
- pipeline-graph-view: 175.vb_1a_b_b_cd0fb_86
- pipeline-github: 2.8-138.d766e30bb08b
- text-finder: 1.23
- throttle-concurrents: 2.12
- versioncolumn: 95.v82f3985cd6e1
- email-ext: 2.95
- ec2: 2.0.6
- lockable-resources: 1131.vb_7c3d377e723
- jfrog: 1.2.1

What Operating System are you using (both controller, and any agents involved in the problem)?

Amazon Linux 2

Reproduction steps

I switched from the deprecated publishCoverage step, which was configured like this:

publishCoverage(
    adapters: [coberturaReportAdapter(mergeToOneReport: true, path: '**/*cobertura*.xml')],
    calculateDiffForChangeRequests: true,
    failNoReports: true,
    globalThresholds: [[thresholdTarget: 'Line', unstableThreshold: 89.0]]
)

to what I expected to be its equivalent:

recordCoverage(
    qualityGates: [[metric: 'LINE', threshold: 89.0]],
    tools: [[parser: 'COBERTURA', pattern: '**/*cobertura*.xml']]
)

Expected Results

Expected absolute line-level coverage (89%) to be reported.

Actual Results

Got an error stating this:

Cannot compute maximum of coverages LINE: 100.00% (9/9) and LINE: 100.00% (3/3) since total differs

Anything else?

The build generates 10 different Cobertura report files that are then combined in a single publishCoverage step (which I'm aiming to replace with recordCoverage). None of the Line coverages are 100%; the only 100% coverages are for Module (by definition, I expect), and in a few cases, Package.

medianick commented 1 year ago

Upon further investigation, this was actually the result of the pattern (**/*cobertura*.xml) matching two copies of the same file -- we use ReportGenerator to merge separate reports, and had overlooked the fact that the same file was being archived twice from different locations. Apparently, publishCoverage did not care about that situation, but recordCoverage now does. At any rate, this doesn't seem like a bug anymore.

medianick commented 1 year ago

I'm actually going to reopen this, after additional investigation. There had indeed been an unwanted duplication of a file, which the **/*cobertura*.xml pattern was finding, but after eliminating that duplicate, this error still occurs. This now appears to be the result of two different (and legitimate) cobertura-coverage.xml files being included, coming from two different projects in the build. If I alter the pattern: argument so that only one of them is selected, the error goes away.

My best guess is that this is occurring because both coverage reports contain some of the same filenames, e.g., src/Routes.tsx, even though those are legitimately different files from different locations (as one can see in the differing <source> values in the respective files).
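A minimal sketch of how one might confirm this guess before touching the pipeline: it parses Cobertura XML and reports every relative filename that appears under more than one <source> root. The function name, the sample reports, and the /ws/... paths are hypothetical; it also assumes each report carries a single <source> element, which real reports need not do.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, heavily trimmed Cobertura reports for two projects.
REPORT_A = """<coverage><sources><source>/ws/ModuleA</source></sources>
<packages><package name="src"><classes>
<class name="Routes.tsx" filename="src/Routes.tsx" line-rate="0" branch-rate="1"/>
</classes></package></packages></coverage>"""

REPORT_B = """<coverage><sources><source>/ws/ModuleB</source></sources>
<packages><package name="src"><classes>
<class name="Routes.tsx" filename="src/Routes.tsx" line-rate="0.8056" branch-rate="0.9375"/>
<class name="App.tsx" filename="src/App.tsx" line-rate="1" branch-rate="1"/>
</classes></package></packages></coverage>"""


def find_colliding_files(reports):
    """Return {relative filename: sorted source roots} for names seen under 2+ roots.

    `reports` maps an arbitrary label to the XML text of one Cobertura report.
    Assumption: only the first <source> element of each report is considered.
    """
    roots_by_file = defaultdict(set)
    for label, xml_text in reports.items():
        root = ET.fromstring(xml_text)
        sources = [s.text.strip() for s in root.iter("source") if s.text and s.text.strip()]
        source_root = sources[0] if sources else label  # fall back to the label
        for cls in root.iter("class"):
            roots_by_file[cls.get("filename")].add(source_root)
    return {name: sorted(roots) for name, roots in roots_by_file.items() if len(roots) > 1}
```

Calling find_colliding_files({"a": REPORT_A, "b": REPORT_B}) flags src/Routes.tsx, which both reports list under different roots, and leaves src/App.tsx alone.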

medianick commented 1 year ago

I'm happy to supply more information, including redacted samples of the actual XML files, but don't know what would be the most useful.

uhafner commented 1 year ago

When there are multiple files specified for one coverage step, the results of those files are merged. If these files use different module names, a new tree node is created that contains all the modules. But if the modules have the same name, they are merged file by file. It looks like in your case the results of different files that share the same unique ID (relative path) are being merged, which causes this error.
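The merge behaviour described above can be illustrated with a small model. This is not the plugin's code: the class, the function names, and the error text are assumptions made for illustration, and the "keep the larger covered count" rule is only inferred from the wording of the reported error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineCoverage:
    covered: int
    total: int


def merge_file(left, right):
    """Merge two results for what is assumed to be the same file."""
    if left.total != right.total:
        # Different totals mean these are really two different files.
        raise ValueError(
            f"Cannot compute maximum of coverages ({left.covered}/{left.total}) "
            f"and ({right.covered}/{right.total}) since total differs"
        )
    return left if left.covered >= right.covered else right


def merge_reports(reports):
    """Merge a list of (module_name, {relative_path: LineCoverage}) pairs.

    Reports with distinct module names end up side by side; reports that
    share a module name are merged file by file, keyed by relative path.
    """
    merged = {}
    for module, files in reports:
        target = merged.setdefault(module, {})
        for path, coverage in files.items():
            target[path] = merge_file(target[path], coverage) if path in target else coverage
    return merged
```

With two reports that share a module name and both contain src/Routes.tsx with different line totals, merge_reports raises; giving the reports distinct module names keeps the two files apart.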

Workarounds for your XML reports:

medianick commented 1 year ago

Yes, my intent had been to have one recordCoverage step that consumes data from a set of different Cobertura XML files, just as I previously had one publishCoverage step (now deprecated). I could use multiple recordCoverage calls but would then have different GitHub checks published and different Coverage Reports in Jenkins; I was hoping to have only one.

In this context, what do you mean by "module"? That is not a label in the Cobertura XML reports, which use terms like "package", "class", "method", and "line". I'm open to different module names if it's something I can easily control.

uhafner commented 1 year ago

You are right, it seems that Cobertura does not provide a name for a report in its DTD. So currently I am using "-" as the name for each report (and this produces a name clash if the same packages are present). Maybe it makes more sense to use the actual report XML file name as the module name.

When I change that, the exception should no longer be thrown. However, the UI will still not be able to show the results per module. Currently, the UI collects the results of all reports into a single model. In your case this means that the results of different files with the same package and file name will be shown on a single page. I'm not sure what the best way to change that for users would be. We discussed that in #236 and concluded that in such cases the best approach is to use multiple steps. If these results are recorded in a single step, we need a UI concept for selecting the modules.

medianick commented 1 year ago

Yes, my top-level package name in this case is src -- reflecting the "src" folder that exists under each of the two projects that are experiencing this phenomenon. I will look into whether I can run the tests from a higher directory level to force those package names to be more meaningful, as two levels above "src" is indeed a directory with a unique, project-specific name.

uhafner commented 1 year ago

Which tools or languages are you using to create the reports? I think it would be helpful if all users got better package or directory names out of the box. I'm using a package detector in my warnings plugin to identify packages for tools and languages that do not support them natively. Maybe we can use something similar for your tool/language chain. I'm not familiar with TypeScript and React, but I would assume that they have a similar concept of modules, packages, and classes?

medianick commented 1 year ago

We're running Jest via npm. This issue seems to be a similar one; in that case one of the solutions was to modify the report files via PowerShell so that the package name attributes used the folder structure, which would allow for disambiguation (in case of collisions) and better visualization regardless.

medianick commented 1 year ago

We've resolved this for now like this:

sed -e 's/package name="src/package name="MyModuleName/g' -i coverage/cobertura-coverage.xml

This is run immediately after the coverage reports are generated, and prior to any use of recordCoverage. In this case, it was a Linux-based build, so sed was available, but we have similar options via PowerShell if needed on Windows.
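For builds where sed is not available, the same substitution could be done with a short script. The sketch below mirrors the sed expression above as a plain text replacement; the function names are hypothetical, and like the sed command it assumes the attribute is written exactly as package name="src... in the report.

```python
from pathlib import Path


def rename_package_prefix(xml_text, old_prefix, new_prefix):
    """Replace the leading part of every <package name="..."> value.

    Plain text replacement, equivalent to:
      sed -e 's/package name="OLD/package name="NEW/g'
    """
    return xml_text.replace(f'package name="{old_prefix}', f'package name="{new_prefix}')


def rewrite_report(path, old_prefix, new_prefix):
    """Apply the replacement to a report file in place (like sed -i)."""
    report = Path(path)
    text = report.read_text(encoding="utf-8")
    report.write_text(rename_package_prefix(text, old_prefix, new_prefix), encoding="utf-8")
```

For example, rewrite_report("coverage/cobertura-coverage.xml", "src", "MyModuleName") would be run after the report is generated and before recordCoverage is invoked.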

uhafner commented 1 year ago

I see, thanks for sharing!

Can those module names somehow be derived from your workspace? For Maven, Gradle, and OSGi I have scanners that detect those modules automatically. Maybe I can add another one for TypeScript (or npm?) projects.

medianick commented 1 year ago

There are a few possibilities I can think of:

What isn't obvious is the way you would determine a simple module name (e.g., ModuleA or ModuleB) from these paths, given the things that could differ (workspace path, coverage directory). But there is at least a reliable way to see that these reports pertain to different paths, so any files within that happen to have the same names (e.g., Routes.tsx) should not be treated as if they were the same file.

medianick commented 1 year ago

Following up on my sed approach above; this is sufficient to avoid the exception described at the start of this issue, as the package names themselves in the files are now disambiguated. However, files with the same name (i.e., the same <class name and filename attributes) are still combined -- or rather, one replaces the other -- when viewing the Coverage Report. For example, I have two projects with a Routes.tsx file, whose entries in the Cobertura XML reports look like this:

ModuleA:

<class name="Routes.tsx" filename="src/Routes.tsx" line-rate="0" branch-rate="1">

ModuleB:

<class name="Routes.tsx" filename="src/Routes.tsx" line-rate="0.8056" branch-rate="0.9375">

These are now underneath different <package name values, with my sed trick above, and the Coverage Report UI correctly lists them separately, but the painted source code itself is duplicated -- it shows the contents of the file from ModuleA twice.

I imagine this is because it searched for a matching file twice (since it appears twice, across these two reports) but found the same one each time. That might not be solvable with a single invocation of recordCoverage as I'm doing now; the only thing in the XML file itself that distinguishes the two is the source element value.
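Since the source element is the only distinguishing value, one conceivable approach is to key each file by the source root joined with its relative filename. The sketch below shows that idea only; it is not how the plugin resolves files, the function name and the /ws/... paths are hypothetical, and it assumes one <source> element per report with POSIX-style paths.

```python
import posixpath
import xml.etree.ElementTree as ET


def resolve_source_files(xml_text):
    """Return (relative filename, source root + filename) for each <class>.

    Assumption: the first non-empty <source> element is the root for every
    class in the report; reports with several roots need a lookup on disk.
    """
    root = ET.fromstring(xml_text)
    sources = [s.text.strip() for s in root.iter("source") if s.text and s.text.strip()]
    base = sources[0] if sources else ""
    return [
        (cls.get("filename"), posixpath.join(base, cls.get("filename")))
        for cls in root.iter("class")
    ]


def make_report(source_root):
    """Build a tiny hypothetical Cobertura report for demonstration."""
    return (
        f"<coverage><sources><source>{source_root}</source></sources>"
        '<packages><package name="src"><classes>'
        '<class name="Routes.tsx" filename="src/Routes.tsx"/>'
        "</classes></package></packages></coverage>"
    )
```

Two reports built with make_report("/ws/ModuleA") and make_report("/ws/ModuleB") share the relative name src/Routes.tsx but resolve to different full paths, which is the distinction a per-file lookup would need.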

markferry commented 1 year ago

This also breaks for python projects when combining coverage reports produced by test runs using different python versions - a common scenario.

The workaround is to combine coverage databases beforehand instead of relying on this plugin to combine reports. The pytest-cov project itself provides an example using tox.

[testenv]
setenv =
    py{38,39}: COVERAGE_FILE = .coverage.{envname}

[testenv:report]
skip_install = true
deps = coverage
commands =
    coverage combine   # reads .coverage.*, outputs .coverage
    coverage xml     # reads .coverage, outputs coverage.xml

EDIT: but note you'll have to handle path remapping between nodes yourself. Fortunately coverage.py makes this painless:

# .coveragerc
[paths]
source =
    src/
    */site-packages/
    */src/

uhafner commented 1 year ago

"This also breaks for python projects when combining coverage reports produced by test runs using different python versions - a common scenario."

You are referring to the exception, aren't you? For this exception please also see #729.