Proposal - 'anonymous' module definitions

anthony-c-martin commented 3 years ago

Motivation

Bicep greatly simplifies code reuse with modules. However, a sometimes-undesirable side effect of using modules is that a nested deployment is used to represent the module resources in the template JSON. This means that:

You're forced to give a name when defining a module in Bicep. It's not clear to the end-user that this value has semantic value, and picking a 'bad' value could introduce problems.
Viewing related deployment operations today is complex as you have to manually traverse the nested deployments and collect the data yourself.
There are possible race conditions - e.g. deploying two modules with the same name inside a single deployment scope can lead to failures or data being overwritten/lost.
The above also makes modules an unattractive option for more granular forms of code reuse (e.g. sharing constants between files).

My theory is that 95% of Bicep users don't care about how their resources are deployed, and simply use modules as a means of code reuse. For many of these users, treating all resources across module boundaries as part a single deployment in fact makes the deployment model conceptually simpler, and improves debuggability & auditability.

Suggestion

Introduce a deployment engine change such that it's possible to deploy an anonymous nested template, which is expanded into the current deployment scope at runtime, rather than generating an independent deployment.
```
<TBD what the intermediate JSON would look like!>
```
Simplify the Bicep module experience by removing the need for the name field:
```
module myMod './myMod.bicep' = {
params: {
paramA 'valueA'
}
}
```
When viewing deployments & deployment operations for a .bicep file containing one or more anonymous modules, Portal, Pwsh, CLI, SDK would only surface a single deployment containing all of the resources deployed - perhaps with some grouping to match the module structure.

Notes

Another approach is for Bicep to expand module resources directly into the parent template during codegen. To me this feels less attractive as it results in a loss of fidelity (may particularly be relevant when we start to think about external modules), and would make Bicep codegen much more complex.
This approach may not be viable when crossing scope boundaries with modules, as 'scope escaping' with resources is not currently supported and has some hard challenges around auditability.

afscrome commented 3 years ago

Bicep users don't care about how their resources are deployed, and simply use modules as a means of code reuse

+1 from me. I can also think of 3 issues I've encountered that I think anonymous modules would have avoided/mitigated:

2529
2760
Various what-if issues around resources being ignored becaus passing a resoruce id between modules resolves down to a reference call (which what-if can't handle). (At least I think many of these would be resolved if you took the crude inline approach - not so sure if handling these in the deployment engine would help these)

afscrome commented 3 years ago

@anthony-c-martin Could this also allow for more optimal parallelism in deployments (reducing deployment times).

Right now, if module B is dependant on the output from module A, B can't start until everything in A is finished. If A and B are inlined into a single deployment, would the deployment engine be clever enough to start resources in module B before A is fully complete?

majastrz commented 3 years ago

I definitely agree that inlining a module into the parent deployment would be challenging. The nested deployment today acts as a nice JSON language scope that makes a lot of the codegen logic pretty simple. Even the making the decision whether something could be inlined or not will be complex because the nested deployments act like a boundary for our deploy-time constant analysis.

alex-frankel commented 3 years ago

This approach may not be viable when crossing scope boundaries with modules, as 'scope escaping' with resources is not currently supported and has some hard challenges around auditability.

we should consider allowing scope escaping for all resource types in the deployments runtime as a prereq for this. I think it'll be complex to understand that "if I need to scope escape, I must name my module"

miqm commented 3 years ago

This approach may not be viable when crossing scope boundaries with modules, as 'scope escaping' with resources is not currently supported and has some hard challenges around auditability.

we should consider allowing scope escaping for all resource types in the deployments runtime as a prereq for this. I think it'll be complex to understand that "if I need to scope escape, I must name my module"

I was thinking about an additional keyword like inline module moduleSymbolName 'file.bicep' = {...} - so you as developer will know that this module will not be a separate deployment but a fragment of the module. Then we can omit the name and scope, restrict some of the features that require nested deployments (loops, conditions) and potentially ignore module outputs or translate them to direct reference() calls.

But this will make code-gen more complicated or we would need to do the inlining before the code gen - on the semantic tree (but then we might have problems with scoping variables).

On the other hand, same as inline methods in C# - this might seem to many dev's like more advanced feature and will not use it / will not know it.

askew commented 3 years ago

There are two very distinct things here.

A Bicep module is an ARM nested deployment, that has parameters and outputs, and deploys separately.
A desire to separate the code into separate files.

I think bicep does need to make it clear to the author what the end result will be.

The first is a feature of ARM, the latter is [could be] a compile time feature of bicep.

If you are not using a nested deployment, then you are just combining all the resources into a single template. You could also achieve this by passing multiple .bicep files to bicep build, or even a directory. You could also have an include statement that would pull in and combine another bicep file.

The nice thing about the anonymous module suggestion is could allow scoped parameters, but I'm not sure how this would be handled in the resultant ARM template.

miqm commented 3 years ago

The nice thing about the anonymous module suggestion is could allow scoped parameters, but I'm not sure how this would be handled in the resultant ARM template.

I think not parameters will be a problem, but variables. Parameters of the module we could easily replace by repeating assigned values ("unfold" them), but variables would need to be global for the deployment. In the output ARM we probably need to prefix them with some unique but deterministic identifier (hash of a relative location from the main file?).

bmoore-msft commented 3 years ago

Need to think about the impact of this on limits that exist in a single deployment - today, modules/nesting are used to "work around" those limits. We need to be careful not to block larger scale deployments that rely on this.

mitchdenny commented 3 years ago

I just added a related issue suggesting a kind of inline module synax: #4555

To me one of the benefits of having a compiler that front-ends ARM is being able to avoid having to author a separate template file for simple little things (in my example I used child DNS Zone delegation which requires projecting the values from the nameservers property of a child zone into a dnsZone/NS resource.

miqm commented 3 years ago

I just added a related issue suggesting a kind of inline module synax: #4555

To me one of the benefits of having a compiler that front-ends ARM is being able to avoid having to author a separate template file for simple little things (in my example I used child DNS Zone delegation which requires projecting the values from the nameservers property of a child zone into a dnsZone/NS resource.

I think there are couple different cases: By deployment scope:

desired nested deployment when you want to change the scope (different Rg, sub, mg, etc)
same scope (modularise for authoring convenience and code re-use)

By authoring:

separate files
module inside a module.

The main goal purpose behind using 'short' modules inside bicep file is to change the scope for a single resource.

The purpose for 'inlining' modules is to split big files into smaller but they still will be deployed to a single scope.

Inlining modules by code gen can be beneficial as it'd reduce the file size removing some nested deployment 'bloatware', however we'd hit some limitations on deployment time constants (using reference function in resource names).

On the other hand, with extensive use of modules, all the benefits of running pre-deployment validation is gone as it only validates the top level module and sub-deployments still fail.

It seems to me like this problem is not solvable without runtime change. I'd happy to give my PoV as person who writes and deploys bicep code on daily basis. I'm really looking forward for improving this. Multiple modules are great for code reuse but they can make deployments run few times longer comparing to a one-file.

afscrome commented 2 years ago

A few other scenarios where anonymous modules could help

Accidental Secret revealing

If you think of bicep modules as code reuse, and being like includes / function calls, then it can be surprising behaviour that secrets passed as params to a sub module could be leaked. Whilst a number of linter rules have been written to detect these cases, they are still ultimately best guesses and will miss things. If a module was inlined, there is no risk of secrets being leaked from the nested module's inputs or outputs.

Circular dependencies

I've encountered a few situations where I've created bicep templates that end up being undeployable due to a circular dependency. I've found this particularly easy to do with role assignments around managed identity.

If inlined, this circular dependency could be resolved:

Azure / bicep