Investigate and disambiguate the semantics of an `existing` reference for a Bicep Extensibility Resource

Context

While investigating https://github.com/project-radius/radius/issues/3876, we uncovered a knowledge gap about how the `existing` keyword in Bicep applies to Bicep Extensibility Resources in general, and to AWS Provider resources specifically.

The Bicep engine references an existing resource based on the concept of a Scope (as documented in: Existing resources in Bicep). The concept of a Scope is well defined for Azure resources to be the scope of an Azure resource group. If a user doesn't specify the scope for the resource reference, Bicep assumes the resource belongs to the same resource group as the current deployment. A user may choose to specify scope to bind to a resource belonging to a separate resource group.

Gaps

The notion of scope for a Bicep extensibility resource is not explicitly defined.
The algorithm for the resolution of a resource reference of a Bicep extensibility resource is not explicitly defined.
The function to fetch a scope other than the current deployment scope is not defined for a Bicep extensibility resource.

Desired Outcome / Definition of Done

[ ] Provide answers for the gaps described above.
[ ] Create a test that exercises an AWS resource using the `existing` keyword in Bicep and asserts that the behavior is as expected: the reference resolves to the resource in the target region configured in the AWS Bicep extensibility provider and/or the default region set during AWS Provider installation. AB#4144

In a team discussion, @jkotalik mentioned that logic may already exist to handle `existing` references for Bicep extensibility resources, and therefore for AWS resources. He suggested that, to keep things simple, we assume the scope is the region configured for the AWS Bicep extensibility provider and/or the default region set during AWS Provider installation.

The ability to set the scope field may be a future optimization/feature and is currently not guaranteed to work.

Adding a task in this issue to create a test that exercises an AWS resource using the existing keyword in Bicep to assert the behavior is as expected.
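As a starting point for that test, the Bicep file it deploys might look roughly like the sketch below. This is only a sketch under assumptions: the `alias` field, the `properties` wrapper, and the `Name` and `Arn` property names are guesses at the shape of the AWS extensibility types and have not been checked against the provider. The stream is assumed to have been created beforehand in the region the provider is configured with.

```bicep
import aws as aws

// Name of a Kinesis stream that the test creates ahead of time (hypothetical parameter).
param streamName string

// GET only: the provider is expected to look this up in its configured region.
resource stream 'AWS.Kinesis/Stream@default' existing = {
  alias: streamName // assumption: the AWS types may not use an alias field
  properties: {
    Name: streamName // assumption: property name taken from the CloudFormation schema
  }
}

// The test would assert on this output to confirm the lookup found the right resource.
output streamArn string = stream.properties.Arn
```

If the lookup is scoped as described above, deploying this against a provider configured for a different region should fail to find the stream, which gives the test a negative case as well.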

I can probably answer any questions that are lingering about this. I think the analysis in the original issue includes quite a few misunderstandings that I can hopefully clear up. I think reality is much simpler than this issue suggests 😁

The Bicep engine references an existing resource based on the concept of a Scope (as documented in: Existing resources in Bicep).

This is a misunderstanding. Scope is a property of an Azure/ARM resource and we're using a similar concept in UCP, but none of this is fundamental to Bicep.

Resources in Bicep have two primary operations which I'll call GET and PUT for simplicity. GET is used for existing resources and PUT is used for non-existing resources. GET logically "looks up" a resource, and PUT logically "creates/updates" a resource.

How GET (existing resource) behaves is defined by its provider. The provider can implement whatever logic it wants to decide what resource to return.
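To make the GET/PUT distinction concrete, here is a minimal sketch using Azure storage accounts, the same resource type quoted elsewhere in this thread. The resource names, SKU, and kind are placeholders chosen for illustration.

```bicep
// PUT: a declaration without `existing` asks the provider to create or update the resource.
resource newStg 'Microsoft.Storage/storageAccounts@2019-06-01' = {
  name: 'examplestoragenew' // placeholder name
  location: resourceGroup().location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
}

// GET: `existing` asks the provider to look the resource up; nothing is deployed.
resource stg 'Microsoft.Storage/storageAccounts@2019-06-01' existing = {
  name: 'examplestorage' // placeholder name
}

output blobEndpoint string = stg.properties.primaryEndpoints.blob
```

How the second declaration is resolved (which resource group, which account, which region) is entirely up to the provider that owns the type.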

The concept of a Scope is well defined for Azure resources to be the scope of an Azure resource group.

There are actually multiple types of scopes in Azure/ARM, not just resource group. For example the scope could be set to a subscription - this is just less common than a resource group.
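For illustration, a subscription-scoped deployment that looks up an existing resource group could be sketched like this. The resource group name is a placeholder.

```bicep
// The deployment itself targets a subscription rather than a resource group.
targetScope = 'subscription'

// Resource groups live at subscription scope, so this lookup resolves against the subscription.
resource rg 'Microsoft.Resources/resourceGroups@2021-04-01' existing = {
  name: 'exampleRG' // placeholder name
}

output rgLocation string = rg.location
```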

If a user doesn't specify the scope for the resource reference, Bicep assumes the resource belongs to the same resource group as the current deployment.

This is true for Azure resources, and only for resource-group-scoped deployments. It is not true in general for extensibility.

There's also a special-case here for extension-resources, which are resources whose scope is another resource most of the time.
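A management lock is a typical extension resource. In this rough sketch the lock's `scope` is the storage account itself, not a resource group. The names are placeholders.

```bicep
resource stg 'Microsoft.Storage/storageAccounts@2019-06-01' existing = {
  name: 'examplestorage' // placeholder name
}

// Extension resource: its scope is another resource.
resource stgLock 'Microsoft.Authorization/locks@2016-09-01' = {
  name: 'examplelock' // placeholder name
  scope: stg
  properties: {
    level: 'CanNotDelete'
  }
}
```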

A user may choose to specify scope to bind to a resource belonging to a separate resource group.

This is only really true when narrowing the scope (eg: subscription -> resource group) while invoking a module. The definition of scope used by the ARM/Azure provider is tightly coupled with how RBAC works for the Azure version of the deployment engine. When a deployment is created the DE creates a token that's only valid for the scope that you provided and its children. This means that you can't actually look up resources in a different resource group than the currently-executing scope. The compiler and the backend include a really large amount of sophisticated logic to block you from this. Again, this all applies only to Azure resources.
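Narrowing the scope when invoking a module looks roughly like the following. `./storage.bicep` is a hypothetical module file and `exampleRG` is a placeholder resource group name.

```bicep
// The outer deployment runs at subscription scope.
targetScope = 'subscription'

// The module runs at a narrower scope: one resource group inside the subscription.
// './storage.bicep' is a hypothetical module whose own targetScope is 'resourceGroup'.
module storage './storage.bicep' = {
  name: 'storageDeployment'
  scope: resourceGroup('exampleRG')
}
```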

1. The notion of `scope` for a Bicep extensibility resource is not explicitly defined.

This is by-design. Each provider defines its own concepts, and each provider gets to define how it behaves.

Azure is a hierarchy, but not all resource management systems are hierarchies or leverage hierarchies to the same degree Azure does. If you want to feel better about this, consider that Terraform has no such concept 😁.

2. The algorithm for the resolution of a resource reference of a Bicep extensibility resource is not explicitly defined

This is also by design for the same reasons.

3. The function to fetch a scope other than the current deployment scope is not defined for a Bicep extensibility resource

I'm not totally sure what this means so I'm going to guess what you meant and provide an example.

If you wanted to work on two Kubernetes clusters at once, the way to do this is to declare two instances of the Kubernetes provider. If you wanted to work on two AWS accounts or regions at once, the way to do this is to declare two instances of the AWS provider. This is the design of both Bicep and Terraform.

For the reasons I described above, the Azure provider in Bicep does not support this. I've asked them and they said they have no requests for it.

3. The function to fetch a scope other than the current deployment scope is not defined for a Bicep extensibility resource
I'm not totally sure what this means so I'm going to guess what you meant and provide an example.

Happy to clarify. What I meant is that in the Azure case, a call to resourceGroup(exampleRG) returns a reference to a resource group, which then scopes the resource reference. I am not sure what the equivalent would be for an AWS Bicep extensibility resource.

source: https://learn.microsoft.com/en-us/azure/azure-resource-manager/bicep/existing-resource#different-scope

resource stg 'Microsoft.Storage/storageAccounts@2019-06-01' existing = {
  name: 'examplestorage'
  scope: resourceGroup(exampleRG)
}

output blobEndpoint string = stg.properties.primaryEndpoints.blob

Happy to clarify. What I meant is that in the Azure case, a call to resourceGroup(exampleRG) returns a reference to a resource group, which then scopes the resource reference. I am not sure what the equivalent would be for an AWS Bicep extensibility resource.

OK cool this makes a lot of sense what you're asking about 😁

I think this is a pretty open design space where we have a few options. I can tell from the examples you're using that you're thinking about this as ARM/Azure-like, but it doesn't have to work that way.

I've put a little bit of thinking into these options, but I don't have strong feelings right now about which is best. I think it would also be good to get some input from the Bicep team as well. eg: why did they choose the design they chose, what do they think about these ideas, etc. Good topic for our sync with them.

I have a slight preference for options 1 & 2. I think the 'scope' concept is pretty complicated for users, and they will already be familiar with account and region as concepts.

Option 1 - Separate Providers

This is the design Terraform uses. Each TF AWS provider is configured with a region and account. If you want to work with more than one of those things, then you need to use multiple providers.

// Account/Region could be configured in bicep code or as part of 'provider config'
// it doesn't have a big impact on the design
import aws as region1 
import aws as region2

resource thing1 'region1:AWS.Kinesis/Stream@default' = {
  ....
}

resource thing2 'region2:AWS.Kinesis/Stream@default' = {
  ....
}

This is the reason why Bicep has the `as` construct: it configures a provider alias in case there are multiple providers. The alias is used as a prefix for the resource type, which associates the resource with the provider.

Option 2 - Support Account/Region properties

This design makes the 'single provider' case more flexible by making the account and region information configurable per-resource.

// Account/Region could be configured in bicep code or as part of 'provider config'
// it doesn't have a big impact on the design
import aws as aws 

resource thing1 'AWS.Kinesis/Stream@default' = {
  ....
}

resource thing2 'AWS.Kinesis/Stream@default' = {
  ....
  Region: 'us-east-1'
}

I'm showing a hardcoded example for simplicity, but I think most users would not hardcode the region. They can use Bicep parameters 😆
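With a parameter, the same idea might look like the sketch below. `Region` is still only the property proposed in this option, not something the AWS provider supports today, and `region` is a hypothetical parameter name.

```bicep
import aws as aws

// Hypothetical parameter; the default is only an example value.
param region string = 'us-east-1'

resource thing2 'AWS.Kinesis/Stream@default' = {
  // other properties omitted, as in the example above
  Region: region // proposed per-resource property, not an existing feature
}
```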

Option 3 - Support Scope property

// Account/Region could be configured in bicep code or as part of 'provider config'
// it doesn't have a big impact on the design
import aws as aws 

resource thing1 'AWS.Kinesis/Stream@default' = {
  ....
}

resource thing2 'AWS.Kinesis/Stream@default' = {
  ....
  scope: '/accounts/<accountnumber>/regions/us-east-1'
}

I'm showing a hardcoded example for simplicity, but I think most users would not hardcode the account/region. They can use Bicep parameters 😆 OR

Option 3 - Support Scope property with functions for lookup

// Account/Region could be configured in bicep code or as part of 'provider config'
// it doesn't have a big impact on the design
import aws as aws 

resource thing1 'AWS.Kinesis/Stream@default' = {
  ....
}

resource thing2 'AWS.Kinesis/Stream@default' = {
  ....
  scope: aws.scope('<accountnumber>', 'us-east-1')
}

This augments option 3 with a function for formatting the scope into a resource ID.
