Error while deploying extension with Terraform #285

Open BzSpi opened 1 year ago

BzSpi commented 1 year ago


While deploying with Terraform, I get the following error:

Helm installation failed : Unable to render the helm chart and substitue helm values to get a valid yaml : Recommendation Please check if the config settings provided are valid : InnerError [failed to install CRD crds/prometheus-crd.yaml: is forbidden: User "system:serviceaccount:kube-system:ext-installer-azureml-extension" cannot create resource "customresourcedefinitions" in API group "" at the cluster scope]

Here's a part of the Terraform code:

data "azurerm_key_vault_certificate_data" "aml" {
  key_vault_id = var.keyvault_id
  name         = var.machine_learning_extension.ssl_keyvault_certificate_name
}

resource "azurerm_resource_provider_registration" "kubernetes_configuration_registration" {
  name = "Microsoft.KubernetesConfiguration"
}

resource "azurerm_resource_provider_registration" "extension_manager_registration" {
  count = var.providers_registration_enabled ? 1 : 0

  name = "Microsoft.ContainerService"

  feature {
    name       = "AKS-ExtensionManager"
    registered = true
  }
}

resource "azurerm_kubernetes_cluster_extension" "machine_learning" {
  name           = "azureml-extension"
  cluster_id     =
  extension_type = "Microsoft.AzureML.Kubernetes"

  configuration_settings = {
    enableTraining               = true
    enableInference              = true
    inferenceRouterServiceType   = "loadBalancer"
    allowInsecureConnections     = false
    internalLoadBalancerProvider = "azure"
    privateEndpointILB           = true
    sslCname                     = var.machine_learning_extension.endpoint_fqdn
  }

  configuration_protected_settings = {
    sslKey  = data.azurerm_key_vault_certificate_data.aml.key
    sslCert = data.azurerm_key_vault_certificate_data.aml.pem
  }

  depends_on = [
    azurerm_resource_provider_registration.kubernetes_configuration_registration,
    azurerm_resource_provider_registration.extension_manager_registration,
  ]
}

Deployment is made with a Service Principal.

BzSpi commented 1 year ago

The issue is that the extension must be deployed in the azureml namespace; the default value does not work.
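
For illustration, a minimal sketch of pinning the namespace explicitly. It assumes the `release_namespace` argument of `azurerm_kubernetes_cluster_extension` is what controls where the extension is installed. `var.aks_cluster_id` is a hypothetical placeholder for the cluster ID, which is not shown in the snippet above.

```hcl
# Hypothetical variable standing in for the AKS cluster ID (not part of the original code).
variable "aks_cluster_id" {
  type        = string
  description = "Resource ID of the target AKS cluster"
}

resource "azurerm_kubernetes_cluster_extension" "machine_learning" {
  name           = "azureml-extension"
  cluster_id     = var.aks_cluster_id
  extension_type = "Microsoft.AzureML.Kubernetes"

  # Set the namespace explicitly instead of relying on the default.
  release_namespace = "azureml"

  # configuration_settings and configuration_protected_settings
  # would be set here as in the original snippet.
}
```

With this in place the namespace no longer depends on whatever default the provider or the extension applies.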

anwojcie commented 1 year ago

I do not think so. You are missing a couple of required configuration settings and are also using some that do not exist (privateEndpointILB); check and

Also, reconsider using "Microsoft.AzureML.Kubernetes".

BzSpi commented 1 year ago

Hi @anwojcie

Thanks for your response. It may be a behavior of the azurerm_kubernetes_cluster_extension resource, which uses the extension name as the namespace.

Also, I've added fully working code showing how to deploy a private LB in issue #284.