MicrosoftPremier / VstsExtensions

Documentation and issue tracking for Microsoft Premier Services Visual Studio Team Services Extensions
MIT License
56 stars 14 forks source link

Build Quality Checks - add option to ignore absence of coverage data #207

Open melezhik opened 1 year ago

melezhik commented 1 year ago

Right now in case of absence of code coverage data, the plugins times out and prints out:

Waiting for code coverage data...

I would like to have an option to not to wait for a coverage data and mark entire task status as success in case of no coverage data presents: ( something like dont_wait_for_data or whatever you like to name it)

    - task: mspremier.BuildQualityChecks.QualityChecks-task.BuildQualityChecks@8
      displayName: 'Check build quality'
      inputs:
        checkCoverage: true
        coverageFailOption: fixed
        coverageType: lines
        coverageThreshold: 50
        dont_wait_for_data: true

This would be helpful in case developers for some reasons have not yet implemented test coverage in their unit tests but we still want to run those tests and don't hang on this step. Once test coverage is implemented by a developer this plugin should work as normal ...

ReneSchumacher commented 1 year ago

Hi @melezhik,

this might be a problem. Azure DevOps processes coverage data asynchronously in the background. Therefore, there are many situations in which the data is not yet available wen BQC task runs. If we would give up waiting right away, the task would just end successfully even if there was coverage data and it should probably have failed.

My recommendation would be to always just enable code coverage when you have tests. You won't need any special implementation, it's usually just a switch to turn on coverage data. You should then switch to previous build coverage instead of a fixed threshold. This gives teams the time to slowly increase coverage instead of having to put in all the effort to reach the minimum.

I'll play around with a couple scenarios for build without coverage to see if the task can know that there definitely won't be any coverage and there is no need to wait.

melezhik commented 1 year ago

this might be a problem. Azure DevOps processes coverage data asynchronously in the background. Therefore, there are many situations in which the data is not yet available wen BQC task runs.

And this is not a problem IF a user explicitly says that they don't want to wait and effectively skip BQC task by specifying the flag:

dont_wait_for_data: true

In that case, BQC wait, say, for a 10 seconds, no data? then just stop here and mark BQC step as a success.

The reason I want that, is some of our repos don't have unit tests at all, so I I add BQC task to our pipelines now, they will hang and eventually fail . But if we had this flag, I'd add BQC with that flag enabled and pipelines would work normally as they do, and when developer enable code coverage in their side, BQC tasks in pipelines will "switch" to check mode, where if any code coverage data comes they just check it and this won't require any modifications in pipelines code

ReneSchumacher commented 1 year ago

I do get the general idea, but BQC will not just "switch" over. What might happen instead is this:

  1. You start with no coverage data and BQC set to 50% coverage threshold with no_wait set to true -> everything is fine
  2. You add the first tests and Azure DevOps is fast enough, so BQC picks up the coverage data (let's say 50%) -> everything is fine
  3. You add more code and decrease coverage to 40%, but this time Azure DevOps is slow and BQC does not pick up the data -> pipeline succeeds while it should have failed because coverage is below threshold

The third part is the problem. We want to make sure that BQC is as reliable as possible. And it might not be, unless you remove the no_wait property. However, if you do this, you could also just disable and enable the task.

I'm not saying we won't do anything like this, I just want to make sure that the task does not become unreliable when we add an option like this. We have enough issues with the coverage backend of Azure DevOps already 😞

melezhik commented 1 year ago

You add more code and decrease coverage to 40%, but this time Azure DevOps is slow and BQC does not pick up the data -> pipeline succeeds while it should have failed because coverage is below threshold

It is all fine, I doubt this is going to be the case for us. We publish code coverage as local files data, why this feeding of data to BQC should be slow? it happens right before this BQC step and data is located on the same build agent, the same file system

ReneSchumacher commented 1 year ago

Coverage data is not evaluated on the agent. Your test task publishes coverage data to Azure DevOps (either automatically when using the Visual Studio Test task or something similar or you do it explicitly by running the Publish Code Coverage Results task). BQC then uses the Azure DevOps API to query that data from Azure DevOps. That way, we don't have to parse the various coverage formats ourselves, but rely on the functionality in Azure DevOps to aggregate and normalize the data.

This publishing and the processing of the coverage data in Azure DevOps takes place asynchronously. In most cases, it's fast enough, so that BQC is able to read data on first try. However, depending on the size of your coverage data sets and the load in Azure DevOps, processing the data in the backend may take a couple seconds (in rare cases even close to a minute or more). Therefore, we built a waiting mechanism into BQC that queries data until the backend signals that all processing is done. Only then we run our policies.

Thus, we cannot simply stop waiting. We would have to figure out if data has been published in the first place, but there is currently no API available to do this. That's why it is slightly more complicated.

melezhik commented 1 year ago

Thus, we cannot simply stop waiting.

I am not saying to stop waiting. I am saying wait a bit (we can configure this timeout also btw) and if no data arrived by reaching a timeout - just don't fail and mark a step as success if dont_wait_for_data is true

we can even rephrase this using slightly better words:

timeout => 10 # timeout 10 secound
fail_if_no_data => false # true by default 

this will absolutely cover our case