MicrosoftPremier / VstsExtensions

Documentation and issue tracking for Microsoft Premier Services Visual Studio Team Services Extensions
MIT License

How to use the 'Build Quality Checks' task in a multi-job Azure DevOps YAML pipeline. #244

Open suryawanshi1999 opened 2 weeks ago

suryawanshi1999 commented 2 weeks ago

Structure and expected behavior

I have a YAML pipeline with five build jobs:

  1. Build Job 1 (Microsoft-hosted): Builds the .NET solution and publishes test libraries to pipeline artifacts.

  2. Build Jobs 2 and 3 (Self-hosted): Download the test library artifact and run Oracle tests with VSTest@3 (codeCoverageEnabled: true).

  3. Build Job 4 (Microsoft-hosted): Downloads the test library artifact and runs MSS tests with VSTest@3 (codeCoverageEnabled: true).

    Jobs 2, 3, and 4 run in parallel.

  4. Build Job 5: Runs the 'Build Quality Checks' task (version 9) after Jobs 1-4 are complete.
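The job layout described above might look roughly like the following sketch. This is not the actual pipeline from this issue: the job names, pool names, artifact name, test file patterns, and threshold are all placeholders, and the Build Quality Checks input names are assumed from the task documentation.

```yaml
# Hypothetical layout of the five jobs; every name below is a placeholder.
jobs:
  - job: Build                           # Job 1, Microsoft-hosted
    pool:
      vmImage: 'windows-latest'
    steps:
      # ... build the .NET solution here ...
      - publish: '$(Build.ArtifactStagingDirectory)/tests'
        artifact: test-libraries

  - job: OracleTests1                    # Job 2, self-hosted
    dependsOn: Build
    pool:
      name: 'SelfHostedPool'             # hypothetical pool name
    steps:
      - download: current
        artifact: test-libraries
      - task: VSTest@3
        inputs:
          testSelector: 'testAssemblies'
          testAssemblyVer2: '**\*Oracle*Tests.dll'
          searchFolder: '$(Pipeline.Workspace)/test-libraries'
          codeCoverageEnabled: true

  - job: OracleTests2                    # Job 3, self-hosted
    dependsOn: Build
    pool:
      name: 'SelfHostedPool'
    steps:
      - download: current
        artifact: test-libraries
      - task: VSTest@3
        inputs:
          testSelector: 'testAssemblies'
          testAssemblyVer2: '**\*Oracle*Tests.dll'
          searchFolder: '$(Pipeline.Workspace)/test-libraries'
          codeCoverageEnabled: true

  - job: MssTests                        # Job 4, Microsoft-hosted
    dependsOn: Build
    pool:
      vmImage: 'windows-latest'
    steps:
      - download: current
        artifact: test-libraries
      - task: VSTest@3
        inputs:
          testSelector: 'testAssemblies'
          testAssemblyVer2: '**\*Mss*Tests.dll'
          searchFolder: '$(Pipeline.Workspace)/test-libraries'
          codeCoverageEnabled: true

  - job: QualityChecks                   # Job 5, waits for Jobs 1-4
    dependsOn:
      - Build
      - OracleTests1
      - OracleTests2
      - MssTests
    pool:
      vmImage: 'windows-latest'          # pool is not stated in the issue
    steps:
      - task: BuildQualityChecks@9
        inputs:
          checkCoverage: true
          coverageFailOption: 'fixed'
          coverageType: 'lines'
          coverageThreshold: '80'        # placeholder threshold
```

Because the three test jobs depend only on `Build`, they are free to run in parallel, which matches the behavior described above.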

Issue/Problem

The 'Build Quality Checks' task reports fluctuating code coverage percentages, sometimes higher and sometimes lower, and these inconsistent readings often cause the pipeline to fail.

Analysis conducted so far (on my side)

The 'Build Quality Checks' task calculates the percentage based on the last successful build job. For instance, since Jobs 2, 3, and 4 run in parallel, the coverage percentage is determined by whichever job completes last.

This is what I have observed so far.

Expected result

We want the 'Build Quality Checks' task to calculate and display an accurate code coverage percentage that takes into account the results from all of Build Jobs 2, 3, and 4.

ReneSchumacher commented 2 weeks ago

Hi @suryawanshi1999,

this issue cannot be solved with the current implementation of BQC, at least based on my current understanding of your pipeline setup.

BQC does not evaluate coverage internally. Instead, it relies on Azure DevOps and its APIs to read the coverage values from Azure DevOps Services. Azure DevOps does not store coverage per job. Coverage data is stored per pipeline run and coverage values from different jobs are (sometimes) merged into a coverage summary object. I write "sometimes" because it depends on the type of coverage and the structure of your pipeline.

E.g., .NET coverage created by VSTest (.coverage files) should be merged based on platform and configuration. I.e., if you run two test suites, one for x64/debug and one for x64/release, the coverage summary object will contain two individual results for the two platform/configuration combinations. On the other hand, the coverage values will be summed up if both test suites target the same platform/configuration (e.g., x64/release) and you won't see the individual coverage results from the two test runs. For other coverage types (e.g., Cobertura), Azure DevOps usually only stores the first coverage result that was published and discards all additional coverage results. The latest versions of the built-in tasks (e.g., the publish code coverage results task) have the ability to merge multiple coverage result files on the agent and then send the aggregated values to Azure DevOps.

As you can see, the way Azure DevOps tracks your coverage depends on the type of coverage, the task/tool that creates the coverage, and the task/tool that is used to publish the coverage to Azure DevOps. In addition, what is displayed in the Coverage tab of the pipeline summary page is not necessarily what is stored in Azure DevOps since the coverage report is generated by a tool called Report Generator, which processes all coverage files that have been uploaded as pipeline artifacts to generate the report. Thus, you will often see different values on the build summary page (upper right corner) and the coverage tab as shown below:

[Screenshots not preserved: Summary Page, BQC Data (API), Report Generator Results]

Currently, the only way to fix this is to ensure that all data is properly merged. This can be achieved by either making sure that only .coverage files are used or all coverage results are merged on the agent before they are published to Azure DevOps.
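For coverage formats other than .coverage, merging on the agent could be sketched as follows. This is an illustration under assumptions, not a verified setup: it assumes Cobertura files named `coverage.cobertura.xml` were published as pipeline artifacts by the hypothetical jobs `TestsA` and `TestsB`, and that version 2 of the publish code coverage results task accepts a file pattern matching several summary files.

```yaml
# Hypothetical job that collects coverage files from earlier jobs and
# publishes them once, so that only merged values reach Azure DevOps.
jobs:
  - job: PublishCoverage
    dependsOn:
      - TestsA                           # placeholder job names
      - TestsB
    pool:
      vmImage: 'windows-latest'
    steps:
      # Without an artifact name, this downloads all artifacts of the
      # current run into $(Pipeline.Workspace).
      - download: current
      - task: PublishCodeCoverageResults@2
        inputs:
          # Placeholder pattern; adjust to where the files actually land.
          summaryFileLocation: '$(Pipeline.Workspace)/**/coverage.cobertura.xml'
```

In this arrangement the individual test jobs would not publish coverage themselves. They would only upload their coverage files as artifacts.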

We are working on a version of BQC that uses custom coverage parsing but don't have an ETA for this yet.

suryawanshi1999 commented 2 weeks ago

Hi @ReneSchumacher, thanks for updating this information. You mentioned that this can be achieved by either ensuring only .coverage files are used or merging all coverage results on the agent before publishing to Azure DevOps.

To confirm, are you suggesting that, in each build job, my VSTest@3 task should publish its individual .coverage files, like these examples:

F:\Agents\02\_work\_temp\TestResults\c36de822-bc76-4c07-87e3-3818a8acb563\aksingh_2024-11-01.13_11_19.coverage
F:\Agents\02\_work\_temp\TestResults\ced3eb0b-b0c3-44f9-9073-be95c558dd09\aksingh_2024-11-01.13_13_13.coverage
F:\Agents\02\_work\_temp\TestResults\b64270e1-2c07-4882-9c79-e16abe1048f2\aksingh_2024-11-01.13_16_44.coverage
F:\Agents\01\_work\_temp\TestResults\b780b819-2cc0-4bb4-a6f9-e33fabeefdad\aksingh_2024-11-01.13_12_52.coverage
F:\Agents\01\_work\_temp\TestResults\f5056638-c0b8-4d34-9cfb-30df2ce34a68\aksingh_2024-11-01.13_18_16.coverage
D:\a\_temp\TestResults\0c5f52e2-df49-42d7-a996-0229426a6abe\VssAdministrator_fv-az497-392_2024-11-01.11_26_41.coverage

Would I then use a task to merge these files into a single .coverage file and use a Publish Pipeline Artifacts task to make it available to the final build job where the BQC task is added?

If so, could you please tell me the name of the task that can merge .coverage files, and could you provide an example of its usage?

ReneSchumacher commented 2 weeks ago

If you ensure that all tests are using the .coverage format, then the merge job in Azure DevOps Services should take care of merging the results correctly. You just need to decide if you want one aggregated value (i.e., all coverage results are tagged with the same platform and configuration) or multiple values for the individual test runs (i.e., each coverage result uses a different platform and configuration).

Platform and configuration can be configured in the VSTest task and similarly in BQC. If you aggregate everything (most likely by leaving platform and configuration empty), you should use a single BQC task with empty platform and configuration. When you want to distinguish between the test runs, you need multiple VSTest tasks with different values for platform and configuration and the same number of BQC tasks with matching platform and configuration values set.
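A minimal sketch of the second variant (separate values per test run) is shown here. The job names, test patterns, platform/configuration labels, and threshold are placeholders, and the BQC input names `buildPlatform` and `buildConfiguration` are assumed from the task documentation, so please verify them against the installed task version.

```yaml
# Sketch only: each VSTest task tags its coverage with a distinct
# platform/configuration pair, and one BQC task per pair checks it.
jobs:
  - job: OracleTests
    steps:
      - task: VSTest@3
        inputs:
          testAssemblyVer2: '**\*Oracle*Tests.dll'
          codeCoverageEnabled: true
          platform: 'x64'                # placeholder label
          configuration: 'oracle'        # placeholder label

  - job: MssTests
    steps:
      - task: VSTest@3
        inputs:
          testAssemblyVer2: '**\*Mss*Tests.dll'
          codeCoverageEnabled: true
          platform: 'x64'
          configuration: 'mss'

  - job: QualityChecks
    dependsOn:
      - OracleTests
      - MssTests
    steps:
      - task: BuildQualityChecks@9
        displayName: 'Coverage check (Oracle tests)'
        inputs:
          checkCoverage: true
          coverageFailOption: 'fixed'
          coverageType: 'lines'
          coverageThreshold: '80'        # placeholder threshold
          buildPlatform: 'x64'           # must match the VSTest task
          buildConfiguration: 'oracle'
      - task: BuildQualityChecks@9
        displayName: 'Coverage check (MSS tests)'
        inputs:
          checkCoverage: true
          coverageFailOption: 'fixed'
          coverageType: 'lines'
          coverageThreshold: '80'
          buildPlatform: 'x64'
          buildConfiguration: 'mss'
```

For the aggregated variant, the same sketch would drop `platform` and `configuration` from every VSTest task and keep a single BQC task without `buildPlatform` and `buildConfiguration`.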