hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0
4.46k stars 4.54k forks

azurerm_container_app - Support for managed TLS certificate #21866

Open jtatum opened 1 year ago

jtatum commented 1 year ago

Description

Container Apps has a public preview of managed SSL certificates (see microsoft/azure-container-apps#607). The azurerm_container_app resource is already awkward to use with external domains: it requires either complex logic to decide when to include a custom domain block plus multiple Terraform runs, or manually commenting out the custom domain block until the validation records exist. The validation records are almost exactly the same as in Azure App Service, so this would be much easier if we could follow the App Service pattern via new resources: azurerm_container_app_custom_hostname_binding and azurerm_container_app_certificate_binding. Based on the CLI in microsoft/azure-container-apps#607, I don't think a separate azurerm_container_app_managed_certificate resource is needed; the binding appears to create a managed certificate by default if one isn't specified.

New or Affected Resource(s)/Data Source(s)

azurerm_container_app

Potential Terraform Configuration

data "azurerm_dns_zone" "example" {
  name                = "mydomain.com"
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_log_analytics_workspace" "example" {
  name                = "acctest-01"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
}

resource "azurerm_container_app_environment" "example" {
  name                       = "example-environment"
  location                   = azurerm_resource_group.example.location
  resource_group_name        = azurerm_resource_group.example.name
  log_analytics_workspace_id = azurerm_log_analytics_workspace.example.id
}

resource "azurerm_container_app" "example" {
  name                         = "example-app"
  container_app_environment_id = azurerm_container_app_environment.example.id
  resource_group_name          = azurerm_resource_group.example.name
  revision_mode                = "Single"

  template {
    container {
      name   = "examplecontainerapp"
      image  = "mcr.microsoft.com/azuredocs/containerapps-helloworld:latest"
      cpu    = 0.25
      memory = "0.5Gi"
    }
  }
}

resource "azurerm_dns_txt_record" "example" {
  name                = "asuid.mycustomhost"
  zone_name           = data.azurerm_dns_zone.example.name
  resource_group_name = data.azurerm_dns_zone.example.resource_group_name
  ttl                 = 300

  record {
    value = azurerm_container_app.example.custom_domain_verification_id
  }
}

resource "azurerm_dns_cname_record" "example" {
  name                = "mycustomhost"
  zone_name           = data.azurerm_dns_zone.example.name
  resource_group_name = data.azurerm_dns_zone.example.resource_group_name
  ttl                 = 300
  record              = azurerm_container_app.example.ingress[0].fqdn
}

resource "azurerm_container_app_custom_hostname_binding" "example" {
  hostname            = join(".", [azurerm_dns_cname_record.example.name, azurerm_dns_cname_record.example.zone_name])
  container_app_name  = azurerm_container_app.example.name
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_container_app_certificate_binding" "example" {
  hostname_binding_id = azurerm_container_app_custom_hostname_binding.example.id
  ssl_state           = "SniEnabled"
  managed_certificate = true  # if needed?
}

References

No response

delconis commented 1 year ago

Since Container Apps are very opinionated by design, I would argue that the only change required is to add a managed value to custom_domain.certificate_binding_type and require a reference to your hosted zone. The provider could then add the DNS entries needed to complete the certificate binding and challenge.

resource "azurerm_container_app" "example" {
  name                         = "junk-client-${var.environment}"
  container_app_environment_id = data.azurerm_container_app_environment.example.id
  resource_group_name          = data.azurerm_resource_group.current.name
  revision_mode                = "Single"

  ingress {
    external_enabled = true
    target_port      = 80

    custom_domain {
      name                     = "app.env.domain.tld"
      certificate_binding_type = "managed"
      hosted_zone_id           = data.azurerm_dns_zone.example.id
    }
    traffic_weight {
      percentage = 100
    }
  }
}

The name would be the full DNS name, as it should be. If it matches the hosted zone's name, the record would be created at the zone root (@). Otherwise the hosted zone name, which must be a suffix of the name provided, would be trimmed and the remainder used as the record name, so that records for roots and subdomains are created appropriately (@ vs app).
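The record-name rule described above can be sketched with plain Terraform expressions. This only illustrates the trimming logic and is not provider behaviour; the local names (custom_domain, zone_name, record_name) are made up for the example.

```hcl
# Hypothetical locals showing how a record name could be derived from a
# full hostname and the zone it lives in.
locals {
  custom_domain = "app.env.domain.tld" # full DNS name given in custom_domain.name
  zone_name     = "env.domain.tld"     # name of the hosted zone

  # Zone root -> "@"; otherwise strip the ".<zone>" suffix from the hostname.
  record_name = (
    local.custom_domain == local.zone_name
    ? "@"
    : trimsuffix(local.custom_domain, ".${local.zone_name}")
  )
}

output "record_name" {
  value = local.record_name
}
```

With the values above, record_name evaluates to app; setting custom_domain equal to zone_name gives @.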

I think the solution in your example has a great use case, such as an externally hosted DNS zone, and aligns with other products. But I would also argue that the container app definition should still gain a new custom_domain.certificate_binding_type value, such as external_managed, that would work with your solution. It would at least complete the hostname binding and output a new attribute, custom_domain_binding_id, and you would then bind your certificate externally, as in your example.

resource "azurerm_container_app" "example" {
  name                         = "junk-client-${var.environment}"
  container_app_environment_id = data.azurerm_container_app_environment.shared.id
  resource_group_name          = data.azurerm_resource_group.current.name
  revision_mode                = "Single"

  ingress {
    external_enabled = true
    target_port      = 80

    custom_domain {
      certificate_binding_type = "external_managed"
      name                     = "app.env.domain.tld"
    }
    traffic_weight {
      percentage = 100
    }
  }
}

## Child Zone of `env.domain.tld`
resource "azurerm_dns_txt_record" "example" {
  name                = "asuid.app"
  zone_name           = data.azurerm_dns_zone.example.name
  resource_group_name = data.azurerm_dns_zone.example.resource_group_name
  ttl                 = 300

  record {
    value = azurerm_container_app.example.custom_domain_verification_id
  }
}

## Child Zone of `env.domain.tld`
resource "azurerm_dns_cname_record" "example" {
  name                = "app"
  zone_name           = data.azurerm_dns_zone.example.name
  resource_group_name = data.azurerm_dns_zone.example.resource_group_name
  ttl                 = 300
  record              = azurerm_container_app.example.ingress[0].fqdn
}

resource "azurerm_container_app_certificate_binding" "example" {
  container_app_id    = azurerm_container_app.example.id
  ssl_state           = "SniEnabled"
  managed_certificate = true  # if needed?
}

delconis commented 1 year ago

As a somewhat separate topic, I would rename the external_managed value of custom_domain.certificate_binding_type in my example above to external, and have external_managed support a set of providers such as Let's Encrypt, or integration with ACME for Terraform.

delconis commented 1 year ago

We used the UI to add a managed certificate. When we then tried to reconcile the resulting drift in our Terraform source, the referenced certificate ID failed to parse on terraform validate.

Drift we had from adding the cert via the UI:

          - custom_domain {
              - certificate_binding_type = "SniEnabled" -> null
              - certificate_id           = "[Excluded] Example provided below" -> null
              - name                     = "app.env.domain.tld" -> null
            }

The drift showed a certificate_id of this form:

/subscriptions/[EXCLUDED]/resourceGroups/shared-dev/providers/Microsoft.App/managedEnvironments/cae-shared-dev/managedCertificates/[EXCLUDED_CERT_ID]

When the drift was added to the Terraform source, it failed with this error:

parsing segment "staticCertificates": expected the segment "managedCertificates" to be "certificates"

This is expected, since the provider is not yet aligned with this new preview feature, but I thought it worth noting: in its current state the provider does not accept the managedCertificates resource type as a certificate ID. We are ignoring changes to ingress[0].custom_domain in the interim.

jtatum commented 1 year ago

I might be wrong, but with a single resource I think you'd wind up with issues creating the validation records when creating apps from scratch. You need the app to get created in order to get the ingress address and the validation address, but since you also define the certificate/custom domain in the app, you have to run tf more than once to work around this (and either add the custom domain after creating the app, or do some locals/dynamic block tricks to get it to work).
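For readers unfamiliar with the "dynamic block tricks" mentioned here, a minimal sketch follows. It assumes the custom_domain block inside ingress as used elsewhere in this thread (its attributes may differ in other provider versions), reuses the azurerm_resource_group.example and azurerm_container_app_environment.example resources from the issue description, and introduces two hypothetical variables, enable_custom_domain and certificate_id.

```hcl
# Hypothetical toggle: leave false on the first apply, set to true once
# the DNS validation records exist.
variable "enable_custom_domain" {
  type    = bool
  default = false
}

# Hypothetical input: ID of a certificate already available in the environment.
variable "certificate_id" {
  type    = string
  default = null
}

resource "azurerm_container_app" "example" {
  name                         = "example-app"
  container_app_environment_id = azurerm_container_app_environment.example.id
  resource_group_name          = azurerm_resource_group.example.name
  revision_mode                = "Single"

  template {
    container {
      name   = "examplecontainerapp"
      image  = "mcr.microsoft.com/azuredocs/containerapps-helloworld:latest"
      cpu    = 0.25
      memory = "0.5Gi"
    }
  }

  ingress {
    external_enabled = true
    target_port      = 80

    # Emits zero or one custom_domain blocks depending on the toggle.
    dynamic "custom_domain" {
      for_each = var.enable_custom_domain ? [1] : []
      content {
        name           = "mycustomhost.mydomain.com"
        certificate_id = var.certificate_id
      }
    }

    traffic_weight {
      percentage      = 100
      latest_revision = true
    }
  }
}
```

This still needs two applies (one to create the app and DNS records, one with the toggle enabled), which is exactly the inconvenience the separate binding resources are meant to remove.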

LynnAU commented 1 year ago

Is there any update on this?

Been a month since the last comment and this issue is more than a month old now.

jtatum commented 1 year ago

@LynnAU don't forget to react to the issue with a 👍 to help the maintainers prioritize this issue. We need about 40 of them to get on the second page of issues.

fardarter commented 10 months ago

For anyone needing a workaround, az containerapp hostname add and az containerapp hostname bind can be run in a provisioner.

loadaverage commented 10 months ago

I might be wrong, but with a single resource I think you'd wind up with issues creating the validation records when creating apps from scratch. You need the app to get created in order to get the ingress address and the validation address, but since you also define the certificate/custom domain in the app, you have to run tf more than once to work around this (and either add the custom domain after creating the app, or do some locals/dynamic block tricks to get it to work).

One-time apply works perfectly with Azure Static Web Apps (the azurerm_static_site and azurerm_static_site_custom_domain resources), where a custom domain can be validated with a CNAME record alone. I don't see any reason why the same approach isn't implemented for Container Apps (except for infrastructure constraints on the Azure side, of course).

E.g. with this simple approach I can make my domain (hosted on Azure DNS) accessible over TLS:

resource "azurerm_resource_group" "myapp" {
  name     = local.name
  location = local.azurerm_resource_group_location
  tags     = local.tags
}

resource "azurerm_static_site" "myapp" {
  name                = local.name
  resource_group_name = azurerm_resource_group.myapp.name
  location            = local.app_location
  sku_tier            = local.app_sku.tier
  sku_size            = local.app_sku.size
  tags                = local.tags
}

resource "azurerm_static_site_custom_domain" "myapp" {
  static_site_id  = azurerm_static_site.myapp.id
  domain_name     = local.custom_domain
  validation_type = "cname-delegation"
}

resource "azurerm_dns_zone" "myapp" {
  name                = local.dns_zone
  resource_group_name = azurerm_resource_group.myapp.name
  tags                = local.tags
}

resource "azurerm_dns_cname_record" "myapp" {
  name                = local.custom_domain
  zone_name           = azurerm_dns_zone.myapp.name
  resource_group_name = azurerm_resource_group.myapp.name
  ttl                 = local.default_ttl
  record              = azurerm_static_site.myapp.default_host_name
}

canegru commented 10 months ago

For anyone needing a workaround, az containerapp hostname add and az containerapp hostname bind can be run in a provisioner.

Thanks for the lead, I was able to solve it for now by using a provisioner as you've mentioned.

For anyone interested, here is what it looks like. You'll need to make sure to include lifecycle, because your terraform config will null out the custom domain on change and the provisioner needs to execute on every modification.

resource "null_resource" "configure-hostname" {
  provisioner "local-exec" {
    command    = "az containerapp hostname add --resource-group ${data.azurerm_resource_group.app.name} --name ${azurerm_container_app.app.name} --hostname yoursubdomain.domain.com"
    on_failure = continue
  }

  provisioner "local-exec" {
    command    = "az containerapp hostname bind --resource-group ${data.azurerm_resource_group.app.name} --name ${azurerm_container_app.app.name} --hostname yoursubdomain.domain.com --environment ${azurerm_container_app_environment.env.name} --validation-method CNAME"
    on_failure = continue
  }

  lifecycle {
    replace_triggered_by = [azurerm_container_app.app]
  }
}

rawi96 commented 9 months ago

We deployed the container app and the container app environment via Terraform and then created the managed certificate manually. After that, we added ignore_changes to the lifecycle block.

It's not the cleanest solution, but it works for the moment, until it is hopefully possible to deploy managed certificates via Terraform. A cleaner way would probably be to use the Let's Encrypt Terraform provider to create certificates and use those instead of certificates managed by Azure.

  lifecycle {
    ignore_changes = [ingress] // Required so the manually created custom domain is not deleted, since it is not possible to create a managed certificate for a custom domain with Terraform
  }

fardarter commented 9 months ago

I'm using the following to check for existence and then do a data query if it exists.

check-for-container-app.sh

#!/bin/bash

# See example: 
# - https://gist.github.com/irvingpop/968464132ded25a206ced835d50afa6b

# Exit if any of the intermediate steps fail
set -e

function error_exit() {
  echo "$1" 1>&2
  exit 1
}

function check_deps() {
  test -f "$(which jq)" || error_exit "jq command not detected in path, please install it"
  test -f "$(which az)" || error_exit "az command not detected in path, please install it"
  az extension add -n containerapp || error_exit "az extension add -n containerapp failed"
}

function parse_input() {
  # jq reads from stdin so we don't have to set up any inputs, but let's validate the outputs
  eval "$(jq -r '@sh "export CONTAINER_APP_NAME=\(.container_app_name) RESOURCE_GROUP=\(.resource_group)"')"
  if [[ -z "${CONTAINER_APP_NAME}" ]]; then export CONTAINER_APP_NAME=none; fi
  if [[ -z "${RESOURCE_GROUP}" ]]; then export RESOURCE_GROUP=none; fi
}

check_deps
az login --service-principal --username "$ARM_CLIENT_ID" --password "$ARM_CLIENT_SECRET" --tenant "$ARM_TENANT_ID" >> /dev/null
parse_input

export EXISTS=false
len=$(az containerapp list -g "$RESOURCE_GROUP" --query "[?name=='$CONTAINER_APP_NAME']" | jq '. | length')

re='^[0-9]+$'
if ! [[ "$len" =~ $re ]]; then
  error_exit "len is not an integer"
fi

if [ "$len" -gt 0 ]; then
  export EXISTS=true
fi

jq -n \
  --arg name "$CONTAINER_APP_NAME" \
  --arg resource_group "$RESOURCE_GROUP" \
  --arg exists "$EXISTS" \
  '{"name":$name,"resource_group":$resource_group,"exists":$exists}'

data "external" "query_cht_ingress_exists" {
  # See https://registry.terraform.io/providers/hashicorp/external/latest/docs/data-sources/external
  # Output .exists will be a string: "true" or "false"
  program = ["bash", "${path.module}/scripts/check-for-container-app.sh"]

  query = {
    container_app_name = var.containers.apps.ingress.name
    resource_group     = var.resource_groups.target.name
  }
}
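The "data query if it exists" part is not shown in the comment. One way it could look is sketched below, assuming the azurerm_container_app data source and the same variables as the external query; the local name ingress_app_id is hypothetical.

```hcl
# Looks up the app only when the external check reported that it exists.
# The external data source returns strings, hence the comparison with "true".
data "azurerm_container_app" "ingress" {
  count = data.external.query_cht_ingress_exists.result.exists == "true" ? 1 : 0

  name                = var.containers.apps.ingress.name
  resource_group_name = var.resource_groups.target.name
}

locals {
  # null when the app does not exist yet (hypothetical local name).
  ingress_app_id = one(data.azurerm_container_app.ingress[*].id)
}
```

Using count keeps the lookup from failing on a first apply, when the app has not been created yet.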

AndreasMWalter commented 7 months ago

Going to add my findings from the last few weeks of trying to implement the whole thing with the AzAPI provider. The whole managed certificate process seems a little wonky. To get it running via the API (which is what the Portal and Azure CLI do...), you need to execute the following steps:

  1. Create a container app with the following in its lifecycle block:
    ignore_changes = [
      ingress[0].custom_domain
    ]
  2. Patch the container app with "Custom Hostname" using the desired subject name
    • Note that I didn't try adding a managed wildcard certificate; I'm not sure that even works
    • You need to specifically Patch with the binding disabled and without a certificate reference. If you carefully trace the Portal action, you'll notice it does the same step...
      resource "azapi_resource_action" "container_app_hostname" {
        type        = "Microsoft.App/containerApps@2023-05-02-preview"
        resource_id = azurerm_container_app.container_app.id
        method      = "PATCH"
        body = jsonencode({
          properties = {
            configuration = {
              ingress = {
                customDomains = [
                  {
                    name        = "CUSTOM.FQDN"
                    bindingType = "Disabled"
                  }
                ]
              }
            }
          }
        })
        depends_on = [
          azurerm_container_app.container_app,
        ]
      }
  3. Add the certificate for the custom domain
    • Note that, as I mention in the code, domain control validation doesn't seem to work yet; the Portal will also only create HTTP-based validation no matter what you select...
      resource "azapi_resource" "container_app_certificate" {
        type      = "Microsoft.App/managedEnvironments/managedCertificates@2023-05-02-preview"
        name      = "CUSTOM.FQDN"
        location  = var.location
        parent_id = azapi_resource.managed_environment.id
        tags      = var.tags
        body = jsonencode({
          properties = {
            domainControlValidation = "HTTP" # It seems verification via TXT either doesn't work or runs too long for IaC
            subjectName             = "CUSTOM.FQDN"
          }
        })
        depends_on = [
          azapi_resource_action.container_app_hostname
        ]
      }
  4. Add another Patch for your container app; this time you can actually activate the binding and configure the certificate.

    resource "azapi_resource_action" "container_app_custom_domain" {
      # On-destroy provisioner necessary to remove the domain before removing the certificate, see error in comments below
      type        = "Microsoft.App/containerApps@2023-05-02-preview"
      resource_id = azurerm_container_app.container_app.id
      method      = "PATCH"
      body = jsonencode({
        properties = {
          configuration = {
            ingress = {
              customDomains = [
                {
                  name          = "CUSTOM.FQDN"
                  bindingType   = "SniEnabled"
                  certificateId = azapi_resource.container_app_certificate.id
                }
              ]
            }
          }
        }
      })
      depends_on = [
        azapi_resource.container_app_certificate,
      ]
      provisioner "local-exec" {
        /*
        Note: you will need provisioner magic; this resource will not unconfigure on delete, and you will run into issues if you don't remove it before deletion
        */
        when    = destroy
        command = <<PROVISIONER
    PROVISIONER
      }
    }

miceg commented 3 months ago

Thanks for the tips @AndreasMWalter, with your comment I got something usable. 🎉

A couple of small things:

If you want to automate removing managed certificates without a local-exec provisioner, you can have a "tombstone" variable which triggers removing the custom domain from the Container App first (edit: assuming a one-to-one mapping of resources to apps to domains).

This needs an extra resource to remove the custom domain:

resource "azapi_resource_action" "container_app_remove_custom_domain" {
  # Trigger on removal only
  count       = var.custom_domain_tombstone ? 1 : 0
  type        = "Microsoft.App/containerApps@2023-05-02-preview"
  resource_id = azurerm_container_app.container_app.id
  method      = "PATCH"
  body = jsonencode({
    properties = {
      configuration = {
        ingress = {
          customDomains = []
        }
      }
    }
  })
  depends_on = [
    azurerm_container_app.container_app,
  ]
}

Then you need two extra steps before you can remove a managed certificate:

  1. Create a plan with custom_domain_tombstone = true. This adds the container_app_remove_custom_domain resource.
  2. Apply the plan. This will remove the custom domain from the app, allowing the managed certificate to be deleted.
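The tombstone variable itself is not declared in the snippet above. A minimal declaration might look like the following; the description text and the default are assumptions.

```hcl
# Hypothetical declaration for the tombstone flag used above.
variable "custom_domain_tombstone" {
  description = "Set to true to detach the custom domain from the Container App before its managed certificate is removed."
  type        = bool
  default     = false
}
```

The flag could then be set for a single run, for example with terraform plan -var="custom_domain_tombstone=true", and left at its default otherwise.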

It would be nice if you could link these pseudo-resources to the infrastructure lifecycle, or check that the tombstone is in place before actually removing the resources... but this is really a giant hack anyway 😅

I haven't attempted to automate the validation process, but I think I will wait for this provider to support managed certificates properly first. Just being able to automate this part is a small win.

As a side note, as of last week, Container Apps Managed Certificates is now in GA.

AndreasMWalter commented 3 months ago

Thanks for the tips @AndreasMWalter, with your comment I got something usable. 🎉

A couple of small things:

  • Your second step has a syntax error in customDomains, there's a } and ] swapped around.
  • In your third step, you can use domainControlValidation = "CNAME" for DNS-based validation (TXT is in the API, but doesn't show up as an option in the Azure Portal).

If you want to automate removing managed certificates without a local-exec provisioner, you can have a "tombstone" variable which triggers removing the custom domain from the Container App first.

This needs an extra resource to remove the custom domain:

resource "azapi_resource_action" "container_app_remove_custom_domain" {
  # Trigger on removal only
  count       = var.custom_domain_tombstone ? 1 : 0
  type        = "Microsoft.App/containerApps@2023-05-02-preview"
  resource_id = azurerm_container_app.container_app.id
  method      = "PATCH"
  body = jsonencode({
    properties = {
      configuration = {
        ingress = {
          customDomains = []
        }
      }
    }
  })
  depends_on = [
    azurerm_container_app.container_app,
  ]
}

Then you need two extra steps before you can remove a managed certificate:

  1. Create a plan with custom_domain_tombstone = true. This adds the container_app_remove_custom_domain resource.
  2. Apply the plan. This will remove the custom domain from the app, allowing the managed certificate to be deleted.

It would be nice if you could link these pseudo-resources to the infrastructure lifecycle, or check that the tombstone is in place before actually removing the resources... but this is really a giant hack anyway 😅

I haven't attempted to automate the validation process, but I think I will wait for this provider to support managed certificates properly first. Just being able to automate this part is a small win.

As a side note, as of last week, Container Apps Managed Certificates is now in GA.

Oh, I don't know about the syntax error. It worked in my code, but it may have crept in when I anonymized the code; sorry about that.

I did try CNAME validation, but the certificate would never deploy. There are two possibilities:

  1. They have fixed it by now and you can use it
  2. I made a mistake earlier; I will check another time perhaps :)

Regarding your workaround for the provisioner, azapi_resource_action now supports a destroy action; it was released a few days after I tested my provisioner.

Note that this is just some code I ripped from my test. I haven't fully verified it yet, but it worked for one deploy and destroy. The resource must, however, use depends_on to reference the resource that needs the configuration removed first. As you can see below, managedEnvironments/managedCertificates needs the custom domain configuration deleted beforehand.

resource "azapi_resource_action" "container_app_delete_custom_domain" {
  for_each = {
    for key, value in var.container_apps : key => value
    if value.ingress.custom_domains != null
  }
  type        = "Microsoft.App/containerApps@2023-05-02-preview"
  resource_id = azurerm_container_app.container_app[each.key].id
  when        = "destroy"
  method      = "PATCH"
  body = jsonencode({
    properties = {
      configuration = {
        ingress = {
          customDomains = []
        }
      }
    }
  })
  depends_on = [ 
    azapi_resource.managedEnvironments_managedCertificates
  ]
}

Hope I explained that correctly

miceg commented 3 months ago

I did try with CNAME validation, however the certificate would never deploy (...)

Ah, OK. Based on my experience with Azure Container Apps thus far, I'd be willing to chalk something like that up to an Azure control plane reliability issue. I've had many operations (via the Azure Portal or via Terraform AzureRM) be unreasonably slow (10 to 20 minutes), even when they eventually fail due to a server error (i.e. HTTP 5xx), while at other times things succeed (or fail) reasonably quickly (under 1 minute).

Regarding your workaround for the provisioner, there is now a destroy action supported for azapi_resource_action

Cool, I'll have to give that a go. Thanks! 😄