Azure / bicep

Bicep is a declarative language for describing and deploying Azure resources
MIT License

Bicep's public registry feature should include security features to protect enterprise environments #5803

Open yobyot opened 2 years ago

yobyot commented 2 years ago

Problem description

Bicep is planning to release a version containing functionality that implements a public registry. While this is both commonplace and useful, from a security standpoint it is far from an unalloyed good.

Public registries, in the wrong hands (not just black hats but also those of amateur and non-security-oriented developers), can present a real threat to enterprise environments. Consider the experience of Alex Birsan, who found it astonishingly easy to hack Apple and others, including Microsoft. His experience led him to coin the term "dependency confusion," which describes the fundamental challenges of public registries.

In Bicep's case, the prospect of a supply chain attack is extraordinarily frightening. Because Bicep can deploy anything ARM can request of a back-end Azure provider, it's easy to imagine some disturbing attacks and not just of the supply chain (dependency confusion) variety. Some examples include:

(I just made these up as I write this -- making the case this is scarier than one might at first think. I'm not half or even a quarter as clever as the bad guys so just think what they could come up with.)

What should be done?

I'll leave it to the community and the Microsoft team to debate both the merits of this issue (though I think you have to believe in "alternate facts" to think this isn't a pressing issue) as well as the potential solution.

However, I hope any solution would offer at least some of the following capabilities:

alex-frankel commented 2 years ago

Some notes:

One solution could look like the following:

A workflow like that should be able to be used to be completely restrictive (no external references) or partially restrictive. Thoughts?

shenglol commented 2 years ago

Package managers like NuGet and npm allow users to specify package registries to use in config files. Although Bicep is a programming language, the package manager is built into the language server and CLI, so IMHO it makes sense to add support for whitelisting or blacklisting module registries. We would have to create a dedicated configuration section for this instead of adding a new linter rule though, since the Bicep linter is independent from module restoration (blocked modules would be downloaded anyway, regardless of linting errors).
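To make the allow/block idea concrete, here is a minimal sketch of how a restore step might decide whether a module reference is permitted. It assumes the `br:<registry>/<path>:<tag>` reference form; `RegistryPolicy`, `registry_host`, and `is_allowed` are hypothetical names for illustration, not part of Bicep or of any existing bicepconfig.json schema.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class RegistryPolicy:
    """Hypothetical policy object; not an existing Bicep configuration schema."""
    allowed: Set[str] = field(default_factory=set)  # empty set means "no allow-list"
    blocked: Set[str] = field(default_factory=set)


def registry_host(reference: str) -> Optional[str]:
    """Return the registry host of a 'br:<host>/<path>:<tag>' reference.

    Anything that does not start with 'br:' (for example a local file path)
    returns None. Alias references are not resolved in this sketch.
    """
    if not reference.startswith("br:"):
        return None
    host, _, _ = reference[len("br:"):].partition("/")
    return host.lower()


def is_allowed(reference: str, policy: RegistryPolicy) -> bool:
    """Apply the block-list first, then the allow-list if one is configured."""
    host = registry_host(reference)
    if host is None:
        return True  # local modules are out of scope here
    if host in policy.blocked:
        return False
    if policy.allowed:
        return host in policy.allowed
    return True
```

Because the check runs before anything is downloaded, it would sit in module restoration rather than in the linter, which is the distinction drawn above.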

yobyot commented 2 years ago

CI at the organization ensures that this linter rule is enabled with a specific list such that it is impossible to check in code without this rule being evaluated.

I think this still devolves too much control to the individual developer's workstation. A developer could still attempt to use a non-approved registry and, on receiving the "error," try to work around it.

If I may, you may be overthinking the requirement a bit, @alex-frankel. While I understand you have to consider all the possible implementations of Bicep across many different enterprises, users with very large, dispersed development groups will want the ability to "just shut it off." IOW, a more blunt-force, absolute and no-nuance way to control this. Managing lists and encouraging users to work around something that's still there -- only generating errors -- isn't as complete a solution. A binary "on" or "off" is better, IMHO.

alex-frankel commented 2 years ago

IOW, a more blunt-force, absolute and no-nuance way to control this.

What about a property in bicepconfig.json, such as allowModules, that accepts a boolean (true or false)? It would be on the org to have a CI test which would enforce that a bicepconfig.json exists and has this one property (set to false). Would that work better for you @yobyot?
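A CI test along those lines could be quite small. The sketch below assumes the file is plain JSON at the repository root and that the property is named `allowModules`, which is only the name floated in this comment, not a shipped setting; `check_bicepconfig` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import List


def check_bicepconfig(repo_root: str) -> List[str]:
    """Return a list of problems; an empty list means the check passes."""
    config_path = Path(repo_root) / "bicepconfig.json"
    if not config_path.is_file():
        return ["bicepconfig.json is missing"]
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"bicepconfig.json is not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["bicepconfig.json must contain a JSON object"]
    # 'allowModules' is the property proposed in this thread (an assumption).
    # A missing property or any value other than the boolean false fails.
    if config.get("allowModules") is not False:
        return ["allowModules must be present and set to false"]
    return []
```

A pipeline step would call `check_bicepconfig(".")` and fail the build if the returned list is non-empty.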

wsmelton commented 2 years ago

I think it is unrealistic to put the whole ownership of security on Microsoft and Bicep itself. No other cloud provider has these type of controls for their deployment mechanisms (GCP or AWS). Minor things could be accomplished with config options, but in the end it is a tool that has to fully support performing a deployment as the user wants. Business rules and security controls have to be done elsewhere, which is why Azure offers RBAC controls on resources and lock mechanisms to prevent unwanted deployments that have not been checked.

You are not going to be able to fully control what a dev can do from their workstation (or Azure Cloud Shell) except through the permissions of their account to your tenants. If you want full control over developers, don't give them rights to deploy anything from their work accounts. Put locks on all of your resources in Azure...there are a ton of other things that can be done here, and I think that is where the responsibility should be.

yobyot commented 2 years ago

What about a property in bicepconfig.json...

I don't think this will cut the mustard, @alex-frankel. Consider this scenario:

That's all because Bicep views its role in an enterprise narrowly and/or places too heavy a requirement on the enterprise to monitor deployment pipelines.

I know it's not what you and your colleagues want to hear. But the best solution for Bicep is to make it very hard to use any external libraries by default.

yobyot commented 2 years ago

No other cloud provider has these type of controls for their deployment mechanisms...

Thanks for joining the conversation, @wsmelton.

Sorry, I respectfully disagree that the failure of other cloud systems to address their configuration language security vulnerabilities means that Microsoft isn't obligated to do better.

alex-frankel commented 2 years ago

If they have compromised an organization so deeply that they can bypass the CI system and have owner permissions on a subscription, then they can deploy malicious resources through any number of channels and tools. They could cut bicep entirely out of the picture at that point, no?

I think the most actionable next step is to look at some prior art. Can you share some examples of good systems you have seen deployed for managing this problem in other programming languages? How is this handled in the C# and PowerShell ecosystems?

yobyot commented 2 years ago

How is this handled in the C# and PowerShell ecosystems?

It's not -- and that's the problem. Just in the last few weeks, we've seen reports of government-linked hackers pivoting from the Log4j debacle to PowerShell to gain/maintain a foothold.

Can you share some examples of good systems you have seen deployed for managing this problem in other programming languages?

I'll leave this up to you and the team. Consider it a chance to innovate. :-) All I know is that you guys should spend the time and effort to get ahead of this and think about it fundamentally, not as a tack-on or accommodation.

wsmelton commented 2 years ago

How is this handled in the C# and PowerShell ecosystems?

The PowerShell ecosystem provides extremely detailed logging (more than any other language) and configuration options that the Enterprise would use to monitor how user accounts are utilizing PowerShell. The responsibility is placed fully on the Enterprise to implement. Lee Holmes has presented on this: "Defending against PowerShell attacks - in theory, and in practice."

PowerShell Gallery scans (PSScriptAnalyzer is used for some of this scanning) are performed when modules get published and if those scans find critical issues the owner is notified. If the owner never fixes it the module will be removed, or unpublished. (PowerShell Team can share further details as I'm only aware of a few details).

wsmelton commented 2 years ago

I know it's not what you and your colleagues want to hear. But the best solution for Bicep is to make it very hard to use any external libraries by default.

Terraform does not prevent the use of external providers, and in fact requires it for those deploying to AWS and Azure (and not all of the Azure providers are owned by MS either). That is the biggest feature it offers in extensibility. I know a number of highly secure-minded companies (military/gov't branches, banks, hospitals, etc.) that utilize this tool for deployments. They have no issues meeting their compliance and security standards.

Bicep is just a DSL. ARM is the controlling arm of what gets deployed in Azure and does not even know the deployment came from a Bicep file (the JSON is generated before it is even sent to the ARM API). I am finding it difficult to understand the belief that Bicep suddenly needs to be the security guard for all of this.

What industry are you speaking of that needs the level of security you are describing? That may help in finding the right path to discuss within MS.

alex-frankel commented 2 years ago

I tend to agree that this is beyond the scope of Bicep to solve. Dependency management is a very large problem to solve and we'd be happy to snap to industry-accepted solutions, but don't feel comfortable charting new territory here since Bicep is such a small project.

I think it makes sense to add some easy on/off switches for allowing external registries (or a specific set) and provide a way to test that the policy is being enforced, but beyond that I am not sure there is much more we can do here.
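One way to test that such a policy is being enforced would be a repository scan for module declarations that point at a registry. The sketch below is a regex pass, not a Bicep parser, so it ignores comments and unusual formatting. It assumes registry references start with `br:` or `br/` (the alias form); `external_module_refs` and `violations` are hypothetical helper names.

```python
import re
from typing import Iterable, List

# Matches declarations such as: module stg 'br:host/path:tag' = { ... }
MODULE_DECL = re.compile(r"^\s*module\s+\w+\s+'([^']+)'", re.MULTILINE)


def external_module_refs(bicep_source: str) -> List[str]:
    """Return module references that point at a registry rather than a local file."""
    refs = MODULE_DECL.findall(bicep_source)
    return [ref for ref in refs if ref.startswith(("br:", "br/"))]


def violations(bicep_source: str, allowed_prefixes: Iterable[str] = ()) -> List[str]:
    """Return registry references that match none of the allowed prefixes.

    With no allowed prefixes this acts as the 'off' switch: every registry
    reference is reported. Prefixes should end in '/' so that one host
    cannot be spoofed by a longer look-alike host name.
    """
    allowed = tuple(allowed_prefixes)
    return [ref for ref in external_module_refs(bicep_source)
            if not ref.startswith(allowed)]
```

Calling `violations(source)` models the fully restrictive mode, while `violations(source, ["br:contoso.azurecr.io/"])` models the "specific set" mode, with the registry name here being a placeholder.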

BernieWhite commented 2 years ago

In terms of PowerShell and .NET there is code signing. While it is up to the organization to configure the specifics, some restrictions can be put in place.

Some longer term options might be:

Currently ARM/Policy has no specific way to validate these, but maybe that is possible in the future or can be added to the tool chain as an easy step such as a bicep verify.
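As a rough illustration of what a verification step in the tool chain could check, the sketch below compares a module's SHA-256 digest against a list of digests the organization has approved. This is digest pinning, a simpler stand-in for real signature verification; it is not how code signing works in PowerShell or .NET, and `sha256_digest` and `verify_module` are hypothetical names rather than anything a `bicep verify` command is known to do.

```python
import hashlib
from typing import Set


def sha256_digest(data: bytes) -> str:
    """Return the digest in the 'sha256:<hex>' form commonly used by registries."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_module(data: bytes, approved_digests: Set[str]) -> bool:
    """Accept the module only if its digest is on the approved list (assumed input)."""
    return sha256_digest(data) in approved_digests
```

Any change to the module content changes the digest, so a tampered or substituted module would be rejected unless its new digest had also been approved.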

alex-frankel commented 2 years ago

These are great points and suggestions @berniewhite! Thinking about this problem from the perspective of supply chain security is probably the most promising approach. @SteveLasker I know supply chain is top of mind for container registries. Are there any best practices for securing the supply chain of OCI artifacts?

SteveLasker commented 2 years ago

Hi folks, Thanks @yobyot for the great summary, and @alex-frankel for the ping.

The asks, from auditing to signing, are right in line with the secure supply chain work we've been doing here at Microsoft with the SCITT project and contributing to the broad community through ORAS Artifacts and Notary v2.

Your private registry isn't crazy; it's actually a best practice to own the content you depend on. See Consuming Public Content for more details. @alex-frankel references this as well here.

Versioning is also greatness. Here are some opinions on how to version public content in registries.

For the importing/approval process, ACR is building a gated/import cache feature, that aligns with the consuming public content post above. Ralph is driving that feature and can provide more insight.

@shenglol is right-on with the configuration for the registry endpoint. This is a huge gap with container registries. Here are some early thoughts around this: (Is It Time to Change How We Reference Container Images?) (Enabling Artifact CLIs to Reference Environment Specific Registries Through Configuration)

With great power comes responsibility, and trust

@wsmelton and @yobyot, it looks like you're kinda hitting at the core. Only allow users to do as much as required (least privilege requirements), while also assuring you only consume content from entities you trust (signed and verified). And, I'm excited to see the team push forward with better security, and not be in the taillights of what others do. Security is about constantly moving forward, as the bad-folks keep catching up. Terraform is a good model to consider, as they have a great model that we see a LOT of use.

@BernieWhite seems to have teased in the ORAS Artifacts and Notary v2 work.

Notary v2 is on the cusp of releasing. We will have support across Azure and AWS, with others hopefully joining soon. The Zot project has also added support for Notary v2 through ORAS Artifact support.

Here's a script for how you can sign any artifact (bicep, sbom, container image, kids photos) and push the signature to a registry.

BernieWhite commented 3 months ago

Without having to add the whole set of features within the Bicep CLI alone, some thoughts on how we could make this easier would be: