Azure / bicep-registry-modules

Bicep registry modules
MIT License

[AVM Module Issue]: Mixing workspace types in avm/res/container-service/managed-cluster #3670

Open mriuttam opened 3 weeks ago

mriuttam commented 3 weeks ago

Check for previous/existing GitHub issues

Issue Type?

Bug

Module Name

avm/res/container-service/managed-cluster

(Optional) Module Version

0.4.1

Description

In Azure, Log Analytics Workspace and Monitor Workspace are two different types of resources. However, in the current main branch of container-service/managed-cluster, log analytics workspace is used interchangeably with monitor workspace at the following code lines:

Is this intentional?

In practice, this leads to a situation where the first deployment of an AKS cluster seems to succeed, but all subsequent deployments fail with errors such as:

"AddContainerInsightsSolutionError": 'Unable to add ContainerInsights solution. […] Message="The value supplied is not a valid Workspace resource Id." […] Target="properties.workspaceResourceId" […]

A possible reason for this behaviour is that on the first run the monitor workspace is empty (blob storage), whereas on subsequent runs it has already been initialised as a “monitoring blob storage”.

Either way, ARM seems to expect the resource ID of a log analytics workspace for container insights, but a monitor workspace ID is passed instead at line 770.

For container insights, I was able to fix the issue by adding a new parameter, containerInsightsLawResourceId, to the managed-cluster AVM module and passing it a valid log analytics workspace ID:

// Adding new param..
@description('Optional. Resource ID of the log analytics workspace.')
param containerInsightsLawResourceId string?

[…]

// ..And then, at the lines [766…776](https://github.com/Azure/bicep-registry-modules/blob/main/avm/res/container-service/managed-cluster/main.bicep#L766-L776):
      containerInsights: enableContainerInsights
        ? {
            enabled: enableContainerInsights
            logAnalyticsWorkspaceResourceId: !empty(containerInsightsLawResourceId)
              ? containerInsightsLawResourceId
              : null
            disableCustomMetrics: disableCustomMetrics
            disablePrometheusMetricsScraping: disablePrometheusMetricsScraping
            syslogPort: syslogPort
          }
        : null
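
Since the error above complains about an invalid workspace resource ID, a resource-type check on the new parameter might catch a wrong ID before deployment. The following is only a sketch: the function name isLogAnalyticsWorkspaceId and the output lawIdLooksValid are hypothetical and not part of the AVM module, and it assumes a Bicep version that supports user-defined functions.

```bicep
// Hypothetical helper, not part of the AVM module.
// Returns true when the ID contains the Log Analytics workspace resource type segment.
func isLogAnalyticsWorkspaceId(resourceId string) bool => contains(toLower(resourceId), '/providers/microsoft.operationalinsights/workspaces/')

@description('Optional. Resource ID of the log analytics workspace.')
param containerInsightsLawResourceId string?

// Hypothetical output: false when the parameter is omitted or points at another
// resource type, for example 'Microsoft.Monitor/accounts'.
output lawIdLooksValid bool = isLogAnalyticsWorkspaceId(containerInsightsLawResourceId ?? '')
```

This only inspects the shape of the ID string. It does not verify that the workspace exists.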

Here are the parameters I’m using when calling the managed-cluster AVM module:

module managedClusters 'br/public:avm/res/container-service/managed-cluster:0.4.1' = {
  name: 'deploy-aks'
  params: {
    name: clusterName
    tags: tags
    location: location
    agentPools: […]
    primaryAgentPoolProfiles: […]
    networkPlugin: 'kubenet'
    enableTelemetry: false
    roleAssignments: []
    monitoringWorkspaceResourceId: mw.id // points to an instance of 'Microsoft.Monitor/accounts'
    omsAgentEnabled: true
    enableContainerInsights: false
    enableAzureMonitorProfileMetrics: true
    diagnosticSettings: [
      {
        name: 'aks'
        workspaceResourceId: law.id // points to an instance of 'Microsoft.OperationalInsights/workspaces'
        logCategoriesAndGroups: […]
      }
    ]
    autoUpgradeProfileUpgradeChannel: 'patch'
    managedIdentities: {
      userAssignedResourcesIds: […]
    }
    dnsPrefix: clusterName
    networkPolicy: 'calico'
    nodeResourceGroup: nodeResourceGroup
    enablePrivateCluster: true
    privateDNSZone: privateDNSZone
    outboundType: outboundType
    enableKeyvaultSecretsProvider: true
    enableSecretRotation: true
    enableOidcIssuerProfile: true
    aadProfileAdminGroupObjectIDs: aadProfileAdminGroupObjectIDs
    enableWorkloadIdentity: true
  }
}
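
For context, the symbols law and mw used above could be declared roughly as follows. This is a sketch under assumptions: the parameter names logAnalyticsWorkspaceName and monitorWorkspaceName are hypothetical, both workspaces are assumed to already exist in the deployment's resource group, and the API versions are examples rather than the ones from my actual template.

```bicep
// Hypothetical parameter names; replace with the names of your own workspaces.
param logAnalyticsWorkspaceName string
param monitorWorkspaceName string

// Log Analytics workspace ('Microsoft.OperationalInsights/workspaces').
// Its ID ends with /providers/Microsoft.OperationalInsights/workspaces/<name>.
resource law 'Microsoft.OperationalInsights/workspaces@2022-10-01' existing = {
  name: logAnalyticsWorkspaceName
}

// Azure Monitor workspace ('Microsoft.Monitor/accounts').
// Its ID ends with /providers/Microsoft.Monitor/accounts/<name>.
resource mw 'Microsoft.Monitor/accounts@2023-04-03' existing = {
  name: monitorWorkspaceName
}

// The two IDs refer to different resource types and are not interchangeable.
output lawId string = law.id
output mwId string = mw.id
```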

For the hotfixed module version, I'm passing the same parameters plus containerInsightsLawResourceId, which is passed on to container insights:

module managedClusters 'managed-cluster-0.4.1-hotfix/main.bicep' = {
  name: 'deploy-aks'
  params: {
    …
    containerInsightsLawResourceId: law.id // points to an instance of 'Microsoft.OperationalInsights/workspaces'
    …
  }
}

Are you able to reproduce the issue?

Thanks in advance

(Optional) Correlation Id

No response

microsoft-github-policy-service[bot] commented 3 weeks ago

[!IMPORTANT] The "Needs: Triage :mag:" label must be removed once the triage process is complete!

[!TIP] For additional guidance on how to triage this issue/PR, see the BRM Issue Triage documentation.

avm-team-linter[bot] commented 3 weeks ago

@mriuttam, thanks for submitting this issue for the avm/res/container-service/managed-cluster module!

[!IMPORTANT] A member of the @Azure/avm-res-containerservice-managedcluster-module-owners-bicep or @Azure/avm-res-containerservice-managedcluster-module-contributors-bicep team will review it soon!

microsoft-github-policy-service[bot] commented 2 weeks ago

[!WARNING] Tagging the AVM Core Team (@Azure/avm-core-team-technical-bicep) due to a module owner or contributor having not responded to this issue within 3 business days. The AVM Core Team will attempt to contact the module owners/contributors directly.

[!TIP]

  • To prevent further actions from taking effect, the "Status: Response Overdue 🚩" label must be removed once this issue has been responded to.
  • To avoid this rule being (re)triggered, the "Needs: Triage :mag:" label must be removed as part of the triage process (when the issue is first responded to)!
microsoft-github-policy-service[bot] commented 2 weeks ago

[!CAUTION] This issue requires the AVM Core Team's (@Azure/avm-core-team-technical-bicep) immediate attention as it hasn't been responded to within 6 business days.

[!TIP]

  • To avoid this rule being (re)triggered, the "Needs: Triage :mag:" and "Status: Response Overdue :triangular_flag_on_post:" labels must be removed when the issue is first responded to!
  • Remove the "Needs: Immediate Attention :bangbang:" label once the issue has been responded to.