Azure / bicep

Bicep is a declarative language for describing and deploying Azure resources

Proposal: Bicep Testing Framework #11966

Open sydkar opened 9 months ago

sydkar commented 9 months ago

Proposal - Testing in Bicep

By Emily Redmond

Note: This document represents a product proposal for the future direction of testing in Bicep. You can find an overview of existing experimental testing functionality here. All testing functionality and future proposals are subject to change. If you have any feedback, please leave a comment below so we can track it for future discussion.

Executive Summary

Testing provides the ability to confirm code acts as a programmer intends. With test-driven development, a well-adopted and well-loved principle in software development, a developer can be confident that the actual results of a program are in line with expected results.

The Bicep language strives to draw inspiration from the best practices of software development and make them available in Infrastructure as Code to improve the DevOps experience. Thus, we will introduce a client-side unit-testing framework to the Bicep language through two new keywords, test and assert, and a new file type (.biceptest) that requires no Azure connection or deployment to test Bicep code.

Customer research inspired and validated this approach, as Bicep users have indicated that saving time, failing fast, and being confident in deployment changes are important to their workflow. Moreover, customers report that long authoring cycles that end in failure or in incorrectly configured resources are a pain point.

Problem Statement

Long authoring-deployment cycles, i.e. dev-test cycles, are a pain point for Bicep users. Users report that some deployments can exceed 10 minutes, nearing 30 minutes in cases such as AKS and Cosmos DB deployments. Frustratingly, these deployments may then fail halfway through or near the end of this long process due to an incorrectly configured resource.

Users have no way to “fail fast” and catch errors before deployment today. Further, they have no way to validate their Bicep code offline on the client side (i.e. locally, without a connection to Azure).

If a deployment completes successfully and a user realizes after the resources have been created that they used an incorrect naming convention, misconfigured certain properties, or simply made a typo, there is no way to easily change the resource settings or roll back the deployment. In many cases, the user must completely tear down or delete the created resources and return to the Bicep authoring stage, thus restarting the 10+ minute deployment cycle.

For example, imagine a Bicep dev has named a resource “${location}-${env}-service-app” and realizes after successfully deploying the resource that they did not follow their organization’s expected naming convention “${env}-${location}-service-app”.
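Under the proposed syntax, a unit test could catch this mistake before deployment. The sketch below is hypothetical: the test block name, resource symbol (serviceApp), and property path are illustrative assumptions, not confirmed syntax.

```bicep
// Hypothetical .biceptest sketch; block name, resource symbol, and
// property path are illustrative assumptions under the proposed syntax.
test nameTest 'main.bicep' = {
  params: {
    location: 'eastus'
    env: 'prod'
  }
}

// Fails for 'eastus-prod-service-app'; passes only when the name
// begins with the environment segment, per the convention above.
assert nameStartsWithEnv = startsWith(nameTest.resources.serviceApp.name, 'prod-')
```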

Goals

The overarching goal of introducing testing to Bicep is to further improve the quality and maturity of the Bicep language. Goals specific to our proposed testing solution include:

A non-goal of this project is to replicate test frameworks in other established software development languages. While we want to draw inspiration from and learn from users’ perception of existing testing frameworks, we want to build a cloud infrastructure-specific approach.

Target Personas

Bicep DevOps Engineers

User persona: DevOps engineers and Cloud Architects who deploy infrastructure for app teams and the broader organization at the enterprise level, particularly teams or contractors with multiple developers contributing to complex templates.

Testing need:

Definitions

Push left – To move testing, or any component of development, ‘left’ in the process diagram toward the beginning of the development cycle, saving the developer time and effort and catching errors or inconsistencies as early as possible with less friction. “Failing fast” is related to pushing testing left.

Assert / Assertion – A common testing-framework construct that allows a developer to provide an expected or intended output and compare it to the actual result of the code via a Boolean expression.

Mocking – Run-time functions and results can be mocked on the client side (i.e. without a connection to Azure), allowing a developer to test code components that usually depend on run-time functions.

Test Scopes:

Cloud Test Phases:

Solution

Proposed Solution:

Introduce assert and test keywords to Bicep supported by a new .biceptest file type. The focus of introducing testing to Bicep is to provide client-side test functionality, addressing a currently unmet user need for offline, pre-deployment validation.

After developing client-side unit test blocks and corresponding assert statements, we will introduce mocking within test blocks to further the capability of client-side testing, including common Bicep functions that depend on run-time values (e.g. resourceGroup().location).

Throughout development of all testing solutions, testing should be supported in both the CLI (Command Line Interface) and in CI/CD (Continuous Integration/Continuous Deployment) pipelines.

With unit tests, Bicep users can confirm they have met (a) organization-defined conventions, such as company naming standards, and (b) Azure-defined requirements, such as Azure maximum character lengths, before deploying.
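As a concrete sketch of an Azure-defined requirement, storage account names are limited to 24 characters; a hypothetical assert could encode that limit. The block and symbol names below are assumed for illustration.

```bicep
// Hypothetical sketch: 'storageTest' and 'storage' are assumed names.
// Azure storage account names must be 3-24 characters long.
assert storageNameWithinLimit = length(storageTest.resources.storage.name) <= 24
```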

Assert statements allow users to fail fast and iterate quickly when their defined tests fail. This shortens the author-feedback, or dev-test, deployment cycle by allowing users to short-circuit long deployment waits that would end in failure or in incorrectly configured resources that are difficult to edit once created in Azure. The Expected vs. Actual test output clearly guides users’ debugging.

This solution provides value for our users by empowering individual Bicep developers to be confident in their deployments. Testing reduces friction when creating Azure resources, thus improving the Azure cloud experience.

test and assert keywords in .biceptest files

test mainTest 'main.bicep' = {
  params: {
    location: 'eastus' 
    env: 'prod' 
  }
}
assert mainServiceAppName = contains(mainTest.resources.appServiceApp.name, 'prod')

Mocking

A common Bicep scenario is to refer to resourceGroup().location within a Bicep template, which requires a connection to ARM to resolve. We can test components like these that depend on functions requiring a connection to ARM by mocking them on the client side in test blocks. We see mocking as an important tool to continue pushing testing functionality to the client side in future test iterations. See Milestone 3 for implementation details and possible syntax.

Note: Mocking syntax has not been carefully considered, but one proof of concept is to use a standard lambda function syntax, as shown below.

test mainTest 'main.bicep' = {
  params: {
    location: 'eastus' 
    env: 'prod' 
  }
  mocks: {
    resourceGroup: () => {location: 'westus'} 
  }
}

Running Tests, Output, and Results:

Tests can be run through the CLI, CI/CD pipelines, and the VS Code integrated debugger. The output format is standardized regardless of test results so it can be easily and maintainably parsed and interpreted. See output requirements in Milestone 1.

To run tests in the CLI, the user runs the command bicep test <filepath_to_.biceptest_file>.

Proposed Milestones & User Workflows

Milestones:

  1. Introduce a new file type .biceptest in which users author test blocks and corresponding unit-test assert statements that can reference test blocks. Implement (1) client-side test blocks that take params either as objects or as a reference to a .bicepparam file and (2) unit-test assert statements that can reference Bicep template resources within test blocks. Note: Goal functionality for initial release.
  2. Expand on unit-test functionality to add mocking (see definition) of run-time functionality on the client-side. Note: Some initial progress made on a proof of concept of mocking.
  3. Expand on client-side functionality to execute static and run-time assert statements as part of the deployment process.
  4. Implement an end-to-end automatic author → unit test → deploy → post-deployment test → teardown loop for users to be fully confident in the expected and actual results of their deployment.

Throughout the development of each milestone, attention should be paid to supporting testing in both the CLI (Command Line Interface) and in CI/CD (Continuous Integration/Continuous Deployment) pipelines.

Milestone 1 User Workflow:

  1. Users author .bicep template file
  2. Users author .biceptest file and write test blocks to reference .bicep files, passing in parameters either as an object or as a reference to an existing .bicepparam file.
test mainTest 'main.bicep' = {
  params: {
    location: 'eastus' 
    env: 'prod' 
  }
}
  3. Users author assert statements within the .biceptest file referencing resources within test blocks.  
assert mainServiceAppName = mainTest.resources.appServiceApp.name == 'prod-solution-app' 
  4. Users run the client-side test command bicep test <filepath_to_test_file> to run assert statements (without deploying code).
  5. Test results are output. The output summary includes:
    • Full breakdown of assertion results for each test block, in the following format: test name followed by number of Passed assertions (i.e. assert statements) / total assertions for that test block
    • If 1 or more assertions in the test block Failed or were Skipped, output a detailed account of the assert name for all Passed and Failed assertions within the test block
    • For all Failed assertions, output Expected vs. Actual values
    • Summary of all test blocks showing the total number of test blocks, number of Passed test blocks / number of total test blocks, number of Failed test blocks, and number of Skipped test blocks. These results are defined as:
      • Total tests: All assert statements related to the test block, including Passed, Failed, and Skipped
      • Passed test: All assert statements related to the test block Passed. Represented by [ ✓ ]
      • Failed test: At least 1 assert statement related to the test block Failed. Represented by [ ✗ ]
      • Skipped test: At least 1 assert statement related to the test block was Skipped. Represented by [ - ]
    • If any test is Skipped, display a note explaining the reason for the test being Skipped, such as:
      • Note: 1 or more Assertions were skipped because you reference properties only available at run-time.
[V] testStorage: Passed 5/5 Assertions
[x] testMain: Passed 1/3 Assertions 
  [V] Assert 'envInName': Passed 
  [x] Assert 'latestVersionInName': Failed
     Expected: testMain.resources.appServiceApp.name contains 'v3'
     Actual: testMain.resources.appServiceApp.name == 'v2.09-prod-solution-app' 
  [-] Assert 'resourceGroupLocation': Skipped

Test Summary: 2 Total - Passed 1/2 Tests 
Note: 1 or more Assertions were skipped because you reference properties only available at run-time. 

Note: "[ V ]" represents [ ✓ ]

  6. User debugs the .bicep file corresponding to the Failed assert statements
  7. User repeats steps 4–6 until all assertions Pass

Open Questions

Gijsreyn commented 9 months ago

I'm really curious if the test results can be reported through CI/CD systems, like the SARIF format used for linting.

riosengineer commented 9 months ago

+1 on @Gijsreyn suggestion, having a Sarif format would be valuable.

Also, is this imminently being released as an experimental feature without the need for the 'nightly build'? I am getting a 403 when trying to install the nightly CLI & VSCode extension so cannot test this anymore (and the latest Azure CLI & VSCode do not recognise the bicep test cmdlet). Keen to continue testing.

Xitric commented 8 months ago

First of all, this is a very interesting proposal and we are excited to see how this feature continues to develop. I would like to add our thoughts on testing in Bicep, and perhaps also give some examples of what we already do today to improve our confidence in our own modules.

Testing resource naming conventions

With unit tests, Bicep users can confirm they have met (a) organization-defined conventions, such as company naming standards, and (b) Azure-defined requirements, such as Azure maximum character lengths, before deploying.

I can certainly understand if the focus on testing resource names currently stems from the fact that it was probably the easiest thing to implement first - after all, resource names in a module are required to be calculated at the start of deployment time. However, if I author a module, a good practice is to take ownership of how resource names are generated, such that the user of the module needs not be concerned about that. I might have something like this:

resource rg 'Microsoft.Resources/resourceGroups@2023-07-01' = {
  name: 'rg-${env}-${location}-${appName}'
  location: location
}

To then write a test that my resource group name begins with the substring rg-${env}-${location}- does not seem to add much value, honestly. After all, in writing the test I have to become aware of my company's naming convention, so I might as well get the name right in the Bicep template first. And if I allow the consumer of my module to pass their own resource group name as a parameter, I won't be able to detect that in a unit test of my module. What I will be able to detect is if someone edits the module and changes how the resource group name is generated - but we could probably come up with a linter for detecting the same thing without the Bicep unit tests. The benefit of the linter would be that it only needs to be configured once, whereas the Bicep unit tests need to be (re)written (I assume) for every resource, even of the same type.

Something else related to resource naming that could be quite interesting relates to Ariel Silverstein's comment on having built-in assertions in Bicep. What I often find myself doing when naming new resources is to reference Naming rules and restrictions for Azure resources to determine maximum length and valid characters for the name. If I could rely on built-in tests to do this for me, it would certainly remove one obstacle when naming resources. But then again, a linter could probably achieve much of the same, and even warn me while I write the name instead of later when I run the tests.

Assertions, however, seem rather interesting for enforcing additional policies on parameter values which cannot be expressed using existing annotations.
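One sketch of such a policy is a cross-parameter constraint that decorators like @allowed or @minLength cannot express on their own, e.g. restricting premium SKUs to production environments. The parameter and assert names below are hypothetical.

```bicep
// Hypothetical cross-parameter policy: premium App Service plan SKUs
// (P1v3, P2v3, ...) are only permitted when env is 'prod'.
// 'env' and 'skuName' are assumed parameters of the template under test.
assert premiumOnlyInProd = env == 'prod' || !startsWith(skuName, 'P')
```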

Relation to what-if and confidence in deployments

When we author a new Bicep module, initially it starts off quite simple. We have a set of parameters that are used for configuring a well-defined set of resources. However, as more people start using the module and new customer requirements arise, the number of conditional expressions or loops in the template grows. Here are some examples from our current modules:

  1. Depending on the type of environment, we choose to either integrate against Elastic Cloud (expensive but reliable) or simply deploy a container instance with Elastic (quick and cheap during development). This is abstracted away in an elastic Bicep module.
  2. Depending on the type of environment, we could use different SKU's for our resources.
  3. Some of our modules manage secrets. Our pattern is typically that if you pass a value for the secret to the module via a parameter, we store it in your key vault, but if you leave the value empty/null, we won't modify the secret already in the key vault. This makes it easier for us to handle both the initial and future deployments of the module. If we ever need to update the secret, we just pass in a new value to the deployment.
  4. Some of our applications have multiple features that can be enabled/disabled at deployment time. Depending on which features are enabled, certain resources must be deployed - but there is no reason deploying resources for a feature that is disabled.
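The secret pattern in item 3 can be sketched in Bicep as a conditional resource. The resource names, parameter, and API version below are illustrative, not taken from the modules described above.

```bicep
// Sketch of the conditional-secret pattern; names and API version are
// illustrative. When sqlPassword is empty, the secret resource is not
// deployed and the existing Key Vault secret is left untouched.
@secure()
param sqlPassword string = ''

resource kv 'Microsoft.KeyVault/vaults@2023-07-01' existing = {
  name: 'kv-example'
}

resource sqlSecret 'Microsoft.KeyVault/vaults/secrets@2023-07-01' = if (!empty(sqlPassword)) {
  parent: kv
  name: 'sql-password'
  properties: {
    value: sqlPassword
  }
}
```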

Having test coverage of such conditional paths and loops is vital to improving our confidence in our modules. Currently, we have "solved" this issue by using What-if together with a homemade testing approach written in Pester:

  1. Deploy a Bicep module with a certain set of parameters to an empty resource group in Azure using the what-if command and with the --no-pretty-print flag enabled.
    $deployment = az deployment group what-if -g "rg-ci-test-bicep" -f $bicepFile -p $parameterFile --no-pretty-print 2>$null | ConvertFrom-Json
  2. Perform assertions against the resulting $deployment object to verify that resources will be created as expected based on the set of parameters.

    It "should succeed" {
      $deployment.status | Should -Be "Succeeded"
    }
    
    It "should deploy a SQL server without public network access" {
      $sqlServer = Get-BicepResources $deployment.changes `
        -OfType "Microsoft.Sql/servers" | Select-Object -first 1
    
      $sqlServer.properties.publicNetworkAccess | Should -Be "Disabled"
    }
  3. Repeat with a new permutation of parameters and new assertions...

The downsides of this approach are:

  1. It is homemade
  2. We cannot always trust the output of what-if, making some things impossible to test without an actual deployment
  3. Every permutation of parameters takes 20-40 seconds to complete the what-if operation, which ultimately:
    1. Discourages writing more tests due to delay on pull requests. We could write a lot more tests and improve our test coverage if only it wasn't so "expensive"
    2. Encourages testing multiple things at once (anti-pattern) to make more "efficient" use of the 40 second delay

So ultimately, if it would be possible in the future to use offline Bicep tests for asserting what resources will be created and with what properties (we are fine to test properties that can be determined at the start of deployment time) based on a set of parameters, that would be of tremendous value to our team!

oliverlabs commented 5 months ago

Are there any updates to this feature? Are there any plans to implement it as part of bicep?

sydkar commented 5 months ago

@oliverlabs No updates on this currently. We have paused development on this, but we are open to revisiting it in the future should there be continued interest.

chris-kruining commented 2 months ago

I would like to add my +1 for continued interest. I am working on an internal repo that offers a load of functions so that our other devs can develop faster and adhere to conventions more easily, and I would like to write tests to make sure these functions do what they need to.
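A use case like this would map naturally onto the proposed assert statements combined with Bicep user-defined functions (the func keyword). A hypothetical sketch, with assumed function and assert names:

```bicep
// A shared user-defined function (real Bicep 'func' syntax) plus a
// hypothetical assert pinning its behavior, per the proposed framework.
func resourceName(env string, location string, app string) string =>
  '${env}-${location}-${app}'

assert nameFollowsConvention = resourceName('prod', 'eastus', 'svc') == 'prod-eastus-svc'
```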