pulumi / pulumi-azure-native

Azure Native Provider
Apache License 2.0
128 stars 34 forks source link

Circular dependency between WebAppHostNameBinding and Certificate #578

Open imod opened 4 years ago

imod commented 4 years ago

I'm trying to create a add custom domain to WebApp and have it use a managed certificate created by azure. This can be done with these two types:

This example shows how to do it, but there is a big issue:

const subdomain = `dummy`;

const verificationRecord = new cloudflare.Record("TXT verification record", {
  zoneId: cloudflareZoneId,
  name: `asuid.${subdomain}`,
  type: "TXT",
  value: `${azureAppVerificationToken}`,
});

const app = new web.WebApp(
  "Docker App",
  {
    name: `pulumi-config`,
    resourceGroupName: resourceGroup.name,
    location: resourceGroup.location,
    serverFarmId: plan.id,
    siteConfig: {
      alwaysOn: true,
      linuxFxVersion: `DOCKER|${dockerImage}`,
    },
  },
  { dependsOn: verificationRecord }
);

const dnsRecord = new cloudflare.Record("CNAME record", {
  name: subdomain,
  zoneId: cloudflareZoneId,
  type: "CNAME",
  value: app.defaultHostName,
  ttl: 300,
  proxied: false,
});

const cert = new web.Certificate("Certificate", {
  name: "mycert",
  password: "xxxx",
  location: resourceGroup.location,
  resourceGroupName: resourceGroup.name,
  serverFarmId: plan.id,
  canonicalName: `${subdomain}.mydomain.com`,
});

const hostNameBinding = new web.WebAppHostNameBinding("custom domain binding", {
  name: "custom-domain-binding",
  resourceGroupName: resourceGroup.name,
  hostName: `${subdomain}.mydomain.com`,
  thumbprint: cert.thumbprint,
  sslState: "SniEnabled",
});

This would do exactly what i want, if I omit/comment the creation of the cert and the thumbprint in the hostNameBinding on the first run.

If I don’t do that, I get this:

  azure-nextgen:web/latest:Certificate (mycert):
    error: Code="BadRequest" Message="Properties.CanonicalName is invalid.  Certificate creation requires hostname dummy.mydomain.com added to an app in the serverFarmId /subscriptions/55f6bc5c-cb2d-4352-b37e-6b3c0854adcf/resourceGroups/pulumi-rg/providers/Microsoft.Web/serverfarms/linux-asp" Details=[{"Message":"Properties.CanonicalName is invalid.  Certificate creation requires hostname dummy.mydomain.com added to an app in the serverFarmId /subscriptions/55f6bc5c-cb2d-4352-b37e-6b3c0854adcf/resourceGroups/pulumi-rg/providers/Microsoft.Web/serverfarms/linux-asp"},{"Code":"BadRequest"},{"ErrorEntity":{"Code":"BadRequest","ExtendedCode":"51021","Message":"Properties.CanonicalName is invalid.  Certificate creation requires hostname dummy.mydomain.com added to an app in the serverFarmId /subscriptions/55f6bc5c-cb2d-4352-b37e-6b3c0854adcf/resourceGroups/pulumi-rg/providers/Microsoft.Web/serverfarms/linux-asp","MessageTemplate":"{0} is invalid.  {1}","Parameters":["Properties.CanonicalName","Certificate creation requires hostname dummy.mydomain.com added to an app in the serverFarmId /subscriptions/55f6bc5c-cb2d-4352-b37e-6b3c0854adcf/resourceGroups/pulumi-rg/providers/Microsoft.Web/serverfarms/linux-asp"]}}]

So the problem is, that we have circular dependency between two resources. As we want to run pulumi without any manual interaction at all, this is not going to work - having to comment/uncomment resources in the correct order is just not practical.

I understand this is bad API design on Azure/MS site, but I kind of expect Pulumi provide a way to work around this. Maybe some post create action on the Certificate that allows to update an existing resource again?

How can this be done?

imod commented 4 years ago

The same problem currently exists in terraform (and this is one of the reasons why I’m looking at pulumi: https://github.com/terraform-providers/terraform-provider-azurerm/issues/4824 (the issue also has quite some +1s, so there is definitely need of this in terraform and i guess in pulumi too) There are some workarounds available, but the plan for terraform is to split azurerm_app_service_custom_hostname_binding to break the circular dependency: https://github.com/terraform-providers/terraform-provider-azurerm/issues/8069

mikhailshilkov commented 4 years ago

There are some ideas in this similar thread, albeit none of them are super-simple. This comment and the one below:

https://github.com/pulumi/pulumi/issues/3021#issuecomment-713270279

imod commented 4 years ago

@mikhailshilkov I hope there will be a simpler solution... don't think any of these two "solutions" a really practical. And looking at all the other comments, this seems a rather common problem in all different scenarios and a general solution needs to be found. Just thinking out loud... how about extending the pulumi.CustomResourceOptions with some kind of a callback definition which holds a reference to an already created resource and also has access to the currents object output so it can modify the hold object again.

imod commented 3 years ago

@mikhailshilkov this now solved on terraform site: a new resource azurerm_app_service_certificate_binding has been introduced to break the circular dependency:

I guess this could be used in the "old azure provider" - but how can we fix this issue with the nextgen provider?

tedvanderveen commented 3 years ago

Running into same chicken-and-egg situation here. Hate it, but for now going back to Pulumi Terraform provider.

gregtayl commented 3 years ago

Running into this same issue as well.

It's a bit hacky, but as a workaround I pass an 'is-initial-create' config value to the stack. When deploying a new stack for the first time, I call pulumi up twice, first with is-initial-create=true (which triggers logic in the stack to create the WebAppHostNameBinding first without the Thumbprint and the Certificate second), and then second with is-initial-create=false (which triggers logic in the stack to create the Certificate first and the WebAppHostNameBinding second with the certificate thumbprint output). Running pulumi up twice is only required for deploying a new stack - moving forward, pulumi up can be called a single time (always with is-initial-create=false).

markodjukic commented 3 years ago

Here's another workaround which avoids having to set flags to begin with, and does some logic in the code to figure out whether initial run or subsequent runs. Sets an output flag which allows for any automated pulumi up invocations to be run again:

try:
    # Try to fetch the hostname binding for the url.
    host_name_bind_check = web.get_web_app_host_name_binding(
        host_name=record_set_cname.fqdn.apply(lambda url: url.strip('.')),
        name=app.name,
        resource_group_name=resource_group.name,
    )
    # If we get this far, then fetching of hostname has not failed, which means there is one
    # already attached to the app, so go ahead and create an Azure managed certificate next.
    cert_bind = web.Certificate(
        'cert_bind_app',
        canonical_name=record_set_cname.fqdn.apply(lambda url: url.strip('.')),
        host_names=[
            app.default_host_name,
            record_set_cname.fqdn.apply(lambda url: url.strip('.')),
        ],
        opts=pulumi.ResourceOptions(
            delete_before_replace=True,
        ),
        resource_group_name=resource_group.name,
        server_farm_id=app.server_farm_id,
        tags=tags_common,
    )

    # Now bind our custom CNAME to the app with the cert details.
    hostname_bind = web.WebAppHostNameBinding(
        'host_bind_app',
        name=app.name,
        host_name=record_set_cname.fqdn.apply(lambda url: url.strip('.')),
        resource_group_name=resource_group.name,
        ssl_state=web.SslState.SNI_ENABLED,
        thumbprint=cert_bind.thumbprint,
    )
except Exception:
    # Host name is not bound yet, so do that first. Cert will be created in another run.
    hostname_bind = web.WebAppHostNameBinding(
        'host_bind_app',
        name=app.name,
        host_name=record_set_cname.fqdn.apply(lambda url: url.strip('.')),
        resource_group_name=resource_group.name,
        site_name=app.name,
    )
    # Export a flag to run pulumi update again.
    pulumi.export('pulumi_update_again', True)
imod commented 3 years ago

@markodjukic I see where you'r going with this and yes, this might actually work - but as you say this is also a workaround and I just hope we get a better, proper solution to this any time soon.

markodjukic commented 3 years ago

@markodjukic I see where you'r going with this and yes, this might actually work - but as you say this is also a workaround and I just hope we get a better, proper solution to this any time soon.

It does work, I'm already using it in production on our deployments. But I completely agree that it's a hack and not what I'd want as clean code.

I was looking at the Azure portal calls and it seems like it does a POST call to /Microsoft.Web/sites/{name} which simply updates the hostname binding to enable the SSL binding on the hostname. So logically it would make sense for Pulumi to somehow mimic that behaviour by allowing an existing WebAppHostNameBinding() resource update after the Certificate() creation.

Maybe something like this:

    hostname_bind = web.WebAppHostNameBinding(
        'host_bind_app',
        name=app.name,
        host_name=record_set_cname.fqdn.apply(lambda url: url.strip('.')),
        resource_group_name=resource_group.name,
        site_name=app.name,
    )

    cert_bind = web.Certificate(
        'cert_bind_app',
        canonical_name=record_set_cname.fqdn.apply(lambda url: url.strip('.')),
        host_names=[
            app.default_host_name,
            record_set_cname.fqdn.apply(lambda url: url.strip('.')),
        ],
        opts=pulumi.ResourceOptions(
            delete_before_replace=True,
        ),
        resource_group_name=resource_group.name,
        server_farm_id=app.server_farm_id,
        tags=tags_common,
    )

    hostname_bind.update(
        'host_bind_app',
        host_name=record_set_cname.fqdn.apply(lambda url: url.strip('.')),
        ssl_state=web.SslState.SNI_ENABLED,
        thumbprint=cert_bind.thumbprint,
    )
BabakScript commented 2 years ago

Any update or fix for this?

MattRipia commented 2 years ago

Also having this issue. We are unable to use traffic managers over https routing to WebApps in Azure; we have tried solving the circular dependency between the binding and the certificate without much luck.

donniehale-awh commented 1 year ago

Bump. Any updates here?

Is Pulumi's thinking on this (understandably, I suppose) that since Microsoft's primitives for these capabilities have the circular dependency, it's not something for Pulumi to solve?

My primary objective, which would be a response if that is Pulumi's thinking, is that at a minimum we need a workaround that works in all cases with only a single pulumi up.

Thanks.

donniehale-awh commented 1 year ago

Still no replies from anyone with Pulumi - even a comment like "we're not going to address this until Microsoft does" would be helpful.

In case anyone finds this useful, here's the "two pulumi up" workaround in C#.

// EXPLANATION OF WHAT'S GOING ON HERE...
//
// For background, see: https://github.com/pulumi/pulumi-azure-native/issues/578
//
// TL;DR - Azure has a circular dependency between WebAppHostNameBinding and Certificate
// a) A binding for the FQDN must exist to create a certificate for that FQDN
// b) For the binding to be TLS-enabled, it needs the thumbprint of the certificate
//
// The solution here requires *TWO* "pulumi up" executions:
// 1) Determines that the binding does not exist, so creates it without a thumbprint
// 2) Once the binding exist, creates the certificate and updates the binding with the thumbprint
// (Technically, per Pulumi, it replaces the binding)

var customDnsFqdn = config.Require("CustomDnsFqdn");
// apiAppService and rgName are available in this scope

try
{
    var inputArgs = Output.Tuple<string, string>(apiAppService.Name, rgName);

    // We don't have access to the app service or resource group names until
    // "Apply" can resolve them.
    //
    // And we can't use the non-async version of "Invoke" because the exception
    // thrown when the resource isn't found in that version aborts the program
    // immediately - "catch" doesn't work.
    inputArgs.Apply(args =>
    {
        var bindingLookupArgs = new GetWebAppHostNameBindingArgs
        {
            HostName = customDnsFqdn,
            Name = args.Item1,
            ResourceGroupName = args.Item2,
        };

        var bindingLookupResult = GetWebAppHostNameBinding.InvokeAsync(bindingLookupArgs).Result;

        return "found";
    });

    // If we get here, the initial binding exists; so we can create the certificate
    // and update the binding
    var managedCert = new Certificate("mycertname", new()
    {
        ServerFarmId = apiAppService.ServerFarmId!,
        CanonicalName = customDnsFqdn,
        HostNames = new[]
        {
            customDnsFqdn,
        },
        Location = "eastus",
        ResourceGroupName = rgName
    });

    var customDomain = new WebAppHostNameBinding("mybindingname", new()
    {
        HostName = customDnsFqdn,
        HostNameType = HostNameType.Verified,
        ResourceGroupName = rgName,
        Name = apiAppService.Name,
        SiteName = apiAppService.Name,
        SslState = SslState.SniEnabled,
        Thumbprint = managedCert.Thumbprint
    });
}
catch
{
    Log.Info($"***** WebAppHostNameBinding DOES NOT EXIST *****");
    // Do nothing - we've verified that the resource doesn't exist

    // If we get here, we create an initial binding for the FQDN so
    // that the subsequent "pulumi up" can create the certificate
    var prodCustomDomain = new WebAppHostNameBinding("mybindingname", new()
    {
        HostName = customDnsFqdn,
        HostNameType = HostNameType.Verified,
        ResourceGroupName = rgName,
        Name = apiAppService.Name,
        SiteName = apiAppService.Name
    });
}
xwipeoutx commented 1 year ago

Thanks for that @donniehale-awh , it got me unblocked <3

I went an extra step and make a component resource for it - an SSLAppServiceBinding, if anyone wants it, its yours:

using System;
using Pulumi;
using Pulumi.AzureNative.Web;

public class SslAppServiceBindingArgs : ResourceArgs
{
    public Input<string> AppServiceName { get; set; }
    public Input<string> ResourceGroupName { get; set; }
    public Input<string> HostName { get; set; }

    public Input<string> ServerFarmId { get; set; }
}

/// <summary>
/// Does some tricky logic to create a binding and certificate for an app service with SSL.
/// Requires a double `pulumi up`
///
/// See https://github.com/pulumi/pulumi-azure-native/issues/578
/// </summary>
public class SslAppServiceBinding : ComponentResource
{
    public SslAppServiceBinding(string name, SslAppServiceBindingArgs args, ComponentResourceOptions? options = null)
        : base("custom:azure:SslAppServiceBinding", name, args, options)
    {
        var inputArgs = Output.Tuple(args.HostName, args.AppServiceName, args.ResourceGroupName);

        // We don't have access to the app service or resource group names until
        // "Apply" can resolve them.
        //
        // And we can't use the non-async version of "Invoke" because the exception
        // thrown when the resource isn't found in that version aborts the program
        // immediately - "catch" doesn't work.
        var exists = inputArgs.Apply(args =>
        {
            var bindingLookupArgs = new GetWebAppHostNameBindingArgs
            {
                HostName = args.Item1,
                Name = args.Item2,
                ResourceGroupName = args.Item3,
            };

            try
            {
                var bindingLookupResult = GetWebAppHostNameBinding.InvokeAsync(bindingLookupArgs).Result;
            }
            catch (Exception)
            {
                return false;
            }

            return true;
        });

        exists.Apply(canAddSsl =>
        {
            if (canAddSsl)
            {
                // If we get here, the initial binding exists; so we can create the certificate
                // and update the binding
                var managedCert = new Certificate($"{name}-cert", new()
                {
                    ServerFarmId = args.ServerFarmId,
                    CanonicalName = args.HostName,
                    HostNames = new[]
                    {
                        args.HostName,
                    },
                    ResourceGroupName = args.ResourceGroupName
                }, new() { Parent = this });

                var customDomain = new WebAppHostNameBinding($"{name}-binding", new()
                {
                    HostName = args.HostName,
                    HostNameType = HostNameType.Verified,
                    ResourceGroupName = args.ResourceGroupName,
                    Name = args.AppServiceName,
                    SiteName = args.AppServiceName,
                    SslState = SslState.SniEnabled,
                    Thumbprint = managedCert.Thumbprint
                }, new() { Parent = this });
                return customDomain;
            }
            else
            {
                // If we get here, we create an initial binding for the FQDN so
                // that the subsequent "pulumi up" can create the certificate
                var customDomain = new WebAppHostNameBinding($"{name}-binding", new()
                {
                    HostName = args.HostName,
                    HostNameType = HostNameType.Verified,
                    ResourceGroupName = args.ResourceGroupName,
                    Name = args.AppServiceName,
                    SiteName = args.AppServiceName
                }, new() { Parent = this });
                return customDomain;
            }
        });
    }
}

And the usage:

var binding = new SslAppServiceBinding($"mybinding", new()
{
    HostName = domain,
    ResourceGroupName = resourceGroup.Name,
    AppServiceName = appService.Name,
    ServerFarmId = plan.Id
}, new() { Parent = app });
UnstoppableMango commented 1 year ago

Hilariously bicep has a not-too-hacky solution for this using modules. I wonder if the pulumi automation API could be used to implement a similar workaround in the interim? It would probably be more convoluted than the workaround solutions already proposed here though.

P-de-Jong commented 6 months ago

Ran into this today, pretty annoying that we have to run the up twice for the initial setup :(