hashicorp / terraform-provider-azurerm

Can't create a Compute Instance without a Public IP on a Machine Learning Workspace with Managed Virtual Network enabled #27506

Open ervisab opened 6 days ago

ervisab commented 6 days ago

Terraform Version

1.0.1

AzureRM Provider Version

3.116

Affected Resource(s)/Data Source(s)

azurerm_machine_learning_compute_instance

Terraform Configuration Files

We have created a Machine Learning Workspace with Managed Virtual Network enabled.
Our goal is to create a Machine Learning compute instance without a public IP.

According to the official documentation:
https://registry.terraform.io/providers/hashicorp/azurerm/3.116.0/docs/resources/machine_learning_compute_instance
this can be done by setting the following argument to false:
node_public_ip_enabled = false

However, the documentation also states that when this argument is set to false, subnet_resource_id must be set:

"node_public_ip_enabled - (Optional) Whether the compute instance will have a public ip. To set this to false a subnet_resource_id needs to be set. Defaults to true. Changing this forces a new Machine Learning Compute Cluster to be created."

But setting subnet_resource_id conflicts with the workspace's Managed Virtual Network: we cannot provide a subnet resource ID when Managed Virtual Network is enabled.

Option nr1:
resource "azurerm_machine_learning_compute_instance" "ml-ci" {
  for_each                      = local.compute_instances
  name                          = each.value["name"]
  location                      = var.azureml_resource_group_location
  machine_learning_workspace_id = var.azureml_workspace_id
  virtual_machine_size          = each.value["vm_size"]
  subnet_resource_id            = null
  tags                          = each.value["tags"]
  identity {
    type = "SystemAssigned"
  }
  assign_to_user {
    object_id = each.value["user_object_id"]
    tenant_id = var.tenant_id
  }
  node_public_ip_enabled = var.node_public_ip_enabled
}

Option nr2:
resource "azurerm_machine_learning_compute_instance" "ml-ci" {
  for_each                      = local.compute_instances
  name                          = each.value["name"]
  location                      = var.azureml_resource_group_location
  machine_learning_workspace_id = var.azureml_workspace_id
  virtual_machine_size          = each.value["vm_size"]
  subnet_resource_id            = "/subscriptions/xxx/resourceGroups/xx/providers/Microsoft.Network/virtualNetworks/xx/subnets/xx"
  tags                          = each.value["tags"]
  identity {
    type = "SystemAssigned"
  }
  assign_to_user {
    object_id = each.value["user_object_id"]
    tenant_id = var.tenant_id
  }
  node_public_ip_enabled = var.node_public_ip_enabled
}

Debug Output/Panic Output

Error on option nr1:
This is the error we get if we do not provide a subnet_resource_id:
Error: `subnet_resource_id` must be set if `node_public_ip_enabled` is set to `false`
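
For reference, the option nr1 failure can be reduced to a smaller configuration. The following is a minimal sketch, assuming var.node_public_ip_enabled resolves to false and the workspace has Managed Virtual Network enabled. The resource label, instance name, and VM size are hypothetical placeholders and are not taken from the module above; the location is based on the West Europe region mentioned in this report.

```hcl
# Hypothetical minimal reproduction; all names and values are placeholders.
variable "azureml_workspace_id" {
  type        = string
  description = "Resource ID of a workspace with Managed Virtual Network enabled."
}

resource "azurerm_machine_learning_compute_instance" "repro" {
  name                          = "repro-ci"
  location                      = "westeurope"
  machine_learning_workspace_id = var.azureml_workspace_id
  virtual_machine_size          = "STANDARD_DS2_V2"

  # subnet_resource_id is deliberately omitted: the workspace uses a
  # Managed Virtual Network, so there is no customer subnet to reference.
  node_public_ip_enabled = false
}
```

With provider 3.116 this should be rejected with the same `subnet_resource_id` must be set message shown above, before any request reaches the Azure API.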

Error on option nr2:
│ Error: waiting for creation of Machine Learning Compute (Subscription: "xxx"
│ Resource Group Name: "xxx"
│ Workspace Name: "xxx"
│ Compute Name: "xxx"): polling failed: the Azure API returned the following error:
│
│ Status: "BadRequest"
│ Code: ""
│ Message: "Unsupported operation: Attempting to create ComputeInstance compute in custom vnet (/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/virtualNetworks/xxx/subnets/xxx) when the workspace is configured with a Managed Virtual Network. Please ensure the subnet is set to null or the workspace is not configured with a Managed Virtual Network."
│ Activity Id: ""
│
│ ---
│
│ API Response:
│
│ ----[start]----
│ {
│   "status": "Failed",
│   "error": {
│     "code": "BadRequest",
│     "message": "Unsupported operation: Attempting to create ComputeInstance compute in custom vnet (/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/virtualNetworks/xxx/subnets/xxx) when the workspace is configured with a Managed Virtual Network. Please ensure the subnet is set to null or the workspace is not configured with a Managed Virtual Network."
│   }
│ }
│ -----[end]-----
│
│
│   with module.compute_instance_dp["xxx"].azurerm_machine_learning_compute_instance.ml-ci["xxx"],
│   on ..\modules\compute-instance\main.tf line 15, in resource "azurerm_machine_learning_compute_instance" "ml-ci":
│   15: resource "azurerm_machine_learning_compute_instance" "ml-ci" {
│

Expected Behaviour

The compute instance should be created in the workspace without a public IP, with only a private IP that Azure manages itself, since Managed Virtual Network is enabled on the workspace.

The instance can be created manually without a public IP from the Machine Learning Workspace.

We expect the same to be possible via Terraform.

Actual Behaviour

Currently this is not possible via Terraform: both configurations above fail with the errors shown.

Steps to Reproduce

No response

Important Factoids

The region is West Europe

References

https://registry.terraform.io/providers/hashicorp/azurerm/3.116.0/docs/resources/machine_learning_compute_instance

ervisab commented 16 hours ago

Is there any update on this issue?