Azure / bicep

Bicep is a declarative language for describing and deploying Azure resources

Proposal: Bicep Testing Framework #11966

Open sydkar opened 9 months ago

sydkar commented 9 months ago

Proposal - Testing in Bicep

By Emily Redmond

Note: This document represents a product proposal for the future direction of testing in Bicep. You can find an overview of existing experimental testing functionality here. All testing functionality and future proposals are subject to change. If you have any feedback, please leave a comment below so we can track it for future discussion.

Executive Summary

Testing provides the ability to confirm code acts as a programmer intends. With test-driven development, a well-adopted and well-loved principle in software development, a developer can be confident that the actual results of a program are in line with expected results.

The Bicep language strives to draw inspiration from the best practices of software development and make them available in Infrastructure as Code to improve the DevOps experience. Thus, we will introduce a client-side unit-testing framework to the Bicep language through two new keywords, test and assert, and a new file type (.biceptest) that requires no Azure connection or deployment to test Bicep code.

Customer research inspired and validated this approach, as Bicep users have indicated that saving time, failing fast, and being confident in deployment changes are important to their workflow. Moreover, customers report that long authoring cycles that end in failure or in incorrectly configured resources are a pain point.

Problem Statement

Long authoring-deployment cycles, i.e. dev-test cycles, are a pain point for Bicep users. Users report that some deployments can exceed 10 minutes, nearing 30 minutes in cases such as AKS and Cosmos DB deployments. Frustratingly, these deployments may then fail halfway through or near the end of this long process due to an incorrectly configured resource.

Users have no way to “fail fast” and catch errors before deployment today. Further, they have no way to validate their Bicep code offline on the client side (i.e. locally, without a connection to Azure).

If a deployment completes successfully and a user realizes after the resources have been created that they used an incorrect naming convention, misconfigured certain properties, or simply made a typo, there is no way to easily change the resource settings or roll back the deployment. In many cases, the user must completely tear down or delete the created resources and return to the Bicep authoring stage, thus restarting the 10+ minute deployment cycle.

For example, imagine a Bicep dev has named a resource “${location}-${env}-service-app” and realizes after successfully deploying the resource that they did not follow their organization’s expected naming convention “${env}-${location}-service-app”.
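Under the proposed syntax, a unit test could catch this mistake before deployment. The sketch below is hypothetical: the test block name, resource symbol (serviceApp), and property path are illustrative assumptions, not confirmed syntax.

```bicep
// Hypothetical .biceptest sketch; block name, resource symbol, and
// property path are illustrative assumptions under the proposed syntax.
test nameTest 'main.bicep' = {
  params: {
    location: 'eastus'
    env: 'prod'
  }
}

// Fails for 'eastus-prod-service-app'; passes only when the name
// begins with the environment segment, per the convention above.
assert nameStartsWithEnv = startsWith(nameTest.resources.serviceApp.name, 'prod-')
```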

Goals

The overarching goal of introducing testing to Bicep is to further improve the quality and maturity of the Bicep language. Goals specific to our proposed testing solution include:

A non-goal of this project is to replicate test frameworks in other established software development languages. While we want to draw inspiration from and learn from users’ perception of existing testing frameworks, we want to build a cloud infrastructure-specific approach.

Target Personas

Bicep DevOps Engineers

User persona: DevOps engineers and Cloud Architects who deploy infrastructure for app teams and the broader organization at the enterprise level, particularly teams or contractors with multiple developers contributing to complex templates.

Testing need:

Definitions

Push left – To move testing, or any component of development, ‘left’ in the process diagram toward the beginning of the development cycle, saving the developer time and effort and catching errors or inconsistencies as early as possible with less friction. “Failing fast” is related to pushing testing left.

Assert / Assertion – A common testing-framework construct that allows a developer to provide an expected or intended output and compare it to the actual result of the code via a Boolean expression.

Mocking – Run-time functions and results can be mocked on the client side (i.e. without a connection to Azure), allowing a developer to test code components that usually depend on run-time functions.

Test Scopes:

Cloud Test Phases:

Solution

Proposed Solution:

Introduce assert and test keywords to Bicep supported by a new .biceptest file type. The focus of introducing testing to Bicep is to provide client-side test functionality, addressing a currently unmet user need for offline, pre-deployment validation.

After developing client-side unit test blocks and corresponding assert statements, we will introduce mocking within test blocks to further the capability of client-side testing, including common Bicep functions that depend on run-time values (e.g. resourceGroup().location).

Throughout development of all testing solutions, testing should be supported in both the CLI (Command Line Interface) and in CI/CD (Continuous Integration/Continuous Deployment) pipelines.

With unit tests, Bicep users can confirm they have met (a) organization-defined conventions, such as company naming standards, and (b) Azure-defined requirements, such as Azure maximum character lengths, before deploying.
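As a concrete sketch of an Azure-defined requirement, storage account names are limited to 24 characters; a hypothetical assert could encode that limit. The block and symbol names below are assumed for illustration.

```bicep
// Hypothetical sketch: 'storageTest' and 'storage' are assumed names.
// Azure storage account names must be 3-24 characters long.
assert storageNameWithinLimit = length(storageTest.resources.storage.name) <= 24
```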

Assert statements allow users to fail fast and iterate quickly when their defined tests fail. This shortens the author-feedback, or dev-test, deployment cycle by allowing users to short-circuit long deployment waits that would end in failure or in incorrectly configured resources that are difficult to edit once created in Azure. The Expected vs. Actual test output clearly guides users’ debugging.

This solution provides value for our users by empowering individual Bicep developers to be confident in their deployments. Testing reduces friction when creating Azure resources, thus improving the Azure cloud experience.

test and assert keywords in .biceptest files

test mainTest 'main.bicep' = {
  params: {
    location: 'eastus' 
    env: 'prod' 
  }
}
assert mainServiceAppName = contains(mainTest.resources.appServiceApp.name, 'prod')

Mocking

A common Bicep scenario is to refer to resourceGroup().location within a Bicep template, which requires a connection to ARM to resolve. We can test components like these that depend on functions requiring a connection to ARM by mocking them on the client side in test blocks. We see mocking as an important tool to continue pushing testing functionality to the client side in future test iterations. See Milestone 3 for implementation details and possible syntax.

Note: Mocking syntax has not been carefully considered, but one proof of concept is to use a standard lambda function syntax, as shown below.

test mainTest 'main.bicep' = {
  params: {
    location: 'eastus' 
    env: 'prod' 
  }
  mocks: {
    resourceGroup: () => {location: 'westus'} 
  }
}

Running Tests, Output, and Results:

Tests can be run through the CLI, CI/CD pipelines, and the VS Code integrated debugger. The output format is standardized regardless of test results so it can be easily and maintainably parsed and interpreted. See output requirements in Milestone 1.

To run tests in the CLI, the user runs the command bicep test <filepath_to_.biceptest_file>.

Proposed Milestones & User Workflows

Milestones:

  1. Introduce a new file type .biceptest in which users author test blocks and corresponding unit-test assert statements that can reference test blocks. Implement (1) client-side test blocks that take params either as objects or as a reference to a .bicepparam file and (2) unit-test assert statements that can reference Bicep template resources within test blocks. Note: Goal functionality for initial release.
  2. Expand on unit-test functionality to add mocking (see definition) of run-time functionality on the client-side. Note: Some initial progress made on a proof of concept of mocking.
  3. Expand on client-side functionality to execute static and run-time assert statements as part of the deployment process.
  4. Implement an end-to-end automatic author → unit test → deploy → post-deployment test → teardown loop for users to be fully confident in the expected and actual results of their deployment.

Throughout the development of each milestone, attention should be paid to supporting testing in both the CLI (Command Line Interface) and in CI/CD (Continuous Integration/Continuous Deployment) pipelines.

Milestone 1 User Workflow:

  1. Users author .bicep template file
  2. Users author .biceptest file and write test blocks to reference .bicep files, passing in parameters either as an object or as a reference to an existing .bicepparam file.
test mainTest 'main.bicep' = {
  params: {
    location: 'eastus' 
    env: 'prod' 
  }
}
  3. Users author assert statements within the .biceptest file referencing resources within test blocks.  
assert mainServiceAppName = mainTest.resources.appServiceApp.name == 'prod-solution-app' 
  4. Users run the client-side test command bicep test <filepath_to_test_file> to run assert statements (without deploying code).
  5. Test results are output. The output summary includes:
    • Full breakdown of assertion results for each test block, in the following format: test name followed by number of Passed assertions (i.e. assert statements) / total assertions for that test block
    • If 1 or more assertions in the test block Failed or were Skipped, output a detailed account of the assert name for all Passed and Failed assertions within the test block
    • For all Failed assertions, output Expected vs. Actual values
    • Summary of all test blocks showing the total number of test blocks, number of Passed test blocks / number of total test blocks, number of Failed test blocks, and number of Skipped test blocks. These results are defined as:
      • Total tests: All assert statements related to the test block, including Passed, Failed, and Skipped
      • Passed test: All assert statements related to the test block Passed. Represented by [ ✓ ]
      • Failed test: At least 1 assert statement related to the test block Failed. Represented by [ ✗ ]
      • Skipped test: At least 1 assert statement related to the test block was Skipped. Represented by [ - ]
    • If any test is Skipped, display a note explaining the reason for the test being Skipped, such as:
      • Note: 1 or more Assertions were skipped because you reference properties only available at run-time.
[V] testStorage: Passed 5/5 Assertions
[x] testMain: Passed 1/3 Assertions 
  [V] Assert 'envInName': Passed 
  [x] Assert 'latestVersionInName': Failed
     Expected: testMain.resources.appServiceApp.name contains 'v3'
     Actual: testMain.resources.appServiceApp.name == 'v2.09-prod-solution-app' 
  [-] Assert 'resourceGroupLocation': Skipped

Test Summary: 2 Total - Passed 1/2 Tests 
Note: 1 or more Assertions were skipped because you reference properties only available at run-time. 

Note: "[ V ]" represents [ ✓ ]

  6. User debugs the .bicep file corresponding to the Failed assert statements
  7. User repeats steps 4–6 until all assertions Pass

Open Questions

Gijsreyn commented 9 months ago

I'm really curious if the test results can be reported through CI/CD systems, like the SARIF format used for linting.

riosengineer commented 9 months ago

+1 on @Gijsreyn suggestion, having a Sarif format would be valuable.

Also, is this imminently being released as an experimental feature without the need for the 'nightly build'? I am getting a 403 when trying to install the nightly CLI & VSCode extension so cannot test this anymore (and the latest Azure CLI & VSCode do not recognise the bicep test cmdlet). Keen to continue testing.

Xitric commented 8 months ago

First of all, this is a very interesting proposal and we are excited to see how this feature continues to develop. I would like to add our thoughts on testing in Bicep, and perhaps also give some examples of what we already do today to improve our confidence in our own modules.

Testing resource naming conventions

With unit tests, Bicep users can confirm they have met (a) organization-defined conventions, such as company naming standards, and (b) Azure-defined requirements, such as Azure maximum character lengths, before deploying.

I can certainly understand if the focus on testing resource names currently stems from the fact that it was probably the easiest thing to implement first - after all, resource names in a module are required to be calculated at the start of deployment time. However, if I author a module, a good practice is to take ownership of how resource names are generated, such that the user of the module needs not be concerned about that. I might have something like this:

resource rg 'Microsoft.Resources/resourceGroups@2023-07-01' = {
  name: 'rg-${env}-${location}-${appName}'
  location: location
}

To then write a test that my resource group name begins with the substring rg-${env}-${location}- does not seem to add much value, honestly. After all, in writing the test I have to become aware of my company's naming convention, so I might as well get the name right in the Bicep template first. And if I allow the consumer of my module to pass their own resource group name as a parameter, I won't be able to detect that in a unit test of my module. What I will be able to detect is if someone edits the module and changes how the resource group name is generated - but we could probably come up with a linter for detecting the same thing without the Bicep unit tests. The benefit of the linter would be that it only needs to be configured once, whereas the Bicep unit tests need to be (re)written (I assume) for every resource, even of the same type.

Something else related to resource naming that could be quite interesting relates to Ariel Silverstein's comment on having built-in assertions in Bicep. What I often find myself doing when naming new resources is to reference Naming rules and restrictions for Azure resources to determine maximum length and valid characters for the name. If I could rely on built-in tests to do this for me, it would certainly remove one obstacle when naming resources. But then again, a linter could probably achieve much of the same, and even warn me while I write the name instead of later when I run the tests.

Assertions, however, seem rather interesting for enforcing additional policies on parameter values which cannot be expressed using existing annotations.
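One sketch of such a policy is a cross-parameter constraint that decorators like @allowed or @minLength cannot express on their own, e.g. restricting premium SKUs to production environments. The parameter and assert names below are hypothetical.

```bicep
// Hypothetical cross-parameter policy: premium App Service plan SKUs
// (P1v3, P2v3, ...) are only permitted when env is 'prod'.
// 'env' and 'skuName' are assumed parameters of the template under test.
assert premiumOnlyInProd = env == 'prod' || !startsWith(skuName, 'P')
```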

Relation to what-if and confidence in deployments

When we author a new Bicep module, initially it starts off quite simple. We have a set of parameters that are used for configuring a well-defined set of resources. However, as more people start using the module and new customer requirements arise, the number of conditional expressions or loops in the template grows. Here are some examples from our current modules:

  1. Depending on the type of environment, we choose to either integrate against Elastic Cloud (expensive but reliable) or simply deploy a container instance with Elastic (quick and cheap during development). This is abstracted away in an elastic Bicep module.
  2. Depending on the type of environment, we could use different SKU's for our resources.
  3. Some of our modules manage secrets. Our pattern is typically that if you pass a value for the secret to the module via a parameter, we store it in your key vault, but if you leave the value empty/null, we won't modify the secret already in the key vault. This makes it easier for us to handle both the initial and future deployments of the module. If we ever need to update the secret, we just pass in a new value to the deployment.
  4. Some of our applications have multiple features that can be enabled/disabled at deployment time. Depending on which features are enabled, certain resources must be deployed - but there is no reason deploying resources for a feature that is disabled.
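The secret pattern in item 3 can be sketched in Bicep as a conditional resource. The resource names, parameter, and API version below are illustrative, not taken from the modules described above.

```bicep
// Sketch of the conditional-secret pattern; names and API version are
// illustrative. When sqlPassword is empty, the secret resource is not
// deployed and the existing Key Vault secret is left untouched.
@secure()
param sqlPassword string = ''

resource kv 'Microsoft.KeyVault/vaults@2023-07-01' existing = {
  name: 'kv-example'
}

resource sqlSecret 'Microsoft.KeyVault/vaults/secrets@2023-07-01' = if (!empty(sqlPassword)) {
  parent: kv
  name: 'sql-password'
  properties: {
    value: sqlPassword
  }
}
```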

Having test coverage of such conditional paths and loops is vital to improving our confidence in our modules. Currently, we have "solved" this issue by using What-if together with a homemade testing approach written in Pester:

  1. Deploy a Bicep module with a certain set of parameters to an empty resource group in Azure using the what-if command and with the --no-pretty-print flag enabled.
    $deployment = az deployment group what-if -g "rg-ci-test-bicep" -f $bicepFile -p $parameterFile --no-pretty-print 2>$null | ConvertFrom-Json
  2. Perform assertions against the resulting $deployment object to verify that resources will be created as expected based on the set of parameters.

    It "should succeed" {
      $deployment.status | Should -Be "Succeeded"
    }
    
    It "should deploy a SQL server without public network access" {
      $sqlServer = Get-BicepResources $deployment.changes `
        -OfType "Microsoft.Sql/servers" | Select-Object -first 1
    
      $sqlServer.properties.publicNetworkAccess | Should -Be "Disabled"
    }
  3. Repeat with a new permutation of parameters and new assertions...

The downsides of this approach are:

  1. It is homemade
  2. We cannot always trust the output of what-if, making some things impossible to test without an actual deployment
  3. Every permutation of parameters takes 20-40 seconds to complete the what-if operation, which ultimately:
    1. Discourages writing more tests due to delay on pull requests. We could write a lot more tests and improve our test coverage if only it wasn't so "expensive"
    2. Encourages testing multiple things at once (anti-pattern) to make more "efficient" use of the 40 second delay

So ultimately, if it would be possible in the future to use offline Bicep tests for asserting what resources will be created and with what properties (we are fine to test properties that can be determined at the start of deployment time) based on a set of parameters, that would be of tremendous value to our team!

oliverlabs commented 5 months ago

Are there any updates to this feature? Are there any plans to implement it as part of bicep?

sydkar commented 5 months ago

@oliverlabs No updates on this currently. We have paused development on this, but we are open to revisiting it in the future should there be continued interest.

chris-kruining commented 2 months ago

I would like to add my +1 for continued interest. I am working on an internal repo that offers a load of functions so that our other devs can develop faster and adhere to conventions more easily, and I would like to write tests to make sure these functions do what they need to.
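A use case like this would map naturally onto the proposed assert statements combined with Bicep user-defined functions (the func keyword). A hypothetical sketch, with assumed function and assert names:

```bicep
// A shared user-defined function (real Bicep 'func' syntax) plus a
// hypothetical assert pinning its behavior, per the proposed framework.
func resourceName(env string, location string, app string) string =>
  '${env}-${location}-${app}'

assert nameFollowsConvention = resourceName('prod', 'eastus', 'svc') == 'prod-eastus-svc'
```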