vmware / terraform-provider-nsxt

Terraform VMware NSX-T provider
https://www.terraform.io/docs/providers/nsxt/

nsxt_transport_node resource fails to create edge node when using a VLAN segment path in data_network_id #1053

Closed: liftconfig closed this issue 4 months ago

liftconfig commented 7 months ago

Describe the bug

NSX version: 4.1.1
Terraform provider version: 3.4.0
Resource: nsxt_transport_node (beta)

When creating an edge VM node with the nsxt_transport_node resource, if the path of a VLAN-backed segment is specified in the data_network_ids argument, Terraform fails to create the resource with the following error:

nsxt_transport_node.edge: Creating...
╷
│ Error:  Failed to create TransportNode edge: [Fabric] Network interface '6b67e800-144e-44ff-9b01-de8509c54a89' is either an overlay logical switch or in an unaccessible Transport zone. Only accessible VLAN logical switches are supported. (code 16031)
│ 
│   with nsxt_transport_node.edge,
│   on generated.tf line 18, in resource "nsxt_transport_node" "edge":
│   18: resource "nsxt_transport_node" "edge" {

The network interface ID specified in the error corresponds to the VLAN segment used in the data_network_id argument.

Creating the edge node through the NSX Manager UI with the exact same parameters and the same VLAN segments works. A GET request on the manually created edge node shows that the paths of the VLAN segments used in the data_network_ids argument are correct.

Reproduction steps

  1. Specify VLAN segment(s) in the data_network_ids argument when creating an edge nsxt_transport_node resource. For example, using two trunk-based VLAN segments created through NSX Manager:
 data_network_ids = [
   "/infra/segments/EDGE-UL1-TRUNK",
   "/infra/segments/EDGE-UL2-TRUNK",
 ]
  2. Run terraform apply (plan succeeds). A condensed sketch of a failing configuration follows below.
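
For reference, a condensed sketch of a failing configuration (the vCenter identifiers here are hypothetical; a full configuration from an affected environment appears later in this thread):

resource "nsxt_transport_node" "edge" {
  display_name = "edge"
  edge_node {
    deployment_config {
      node_user_settings {
        cli_password  = var.edge_admin_password
        root_password = var.edge_root_password
      }
      vm_deployment_config {
        compute_id            = "domain-c12345"     # hypothetical vCenter cluster MoRef
        management_network_id = "dvportgroup-12345" # hypothetical management portgroup
        storage_id            = "datastore-12345"   # hypothetical datastore
        vc_id                 = data.nsxt_compute_manager.vc.id
        data_network_ids = [
          "/infra/segments/EDGE-UL1-TRUNK",
          "/infra/segments/EDGE-UL2-TRUNK",
        ]
      }
    }
  }
  # standard_host_switch omitted for brevity; see the full configuration below
}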

Expected behavior

The edge node should provision successfully with its uplinks assigned to the specified VLAN segments.

Additional context

No response

salv-orlando commented 7 months ago

@ksamoray can you try and reproduce to check why we get this error from NSX when creating the node via Terraform?

liftconfig commented 7 months ago

Not sure if this is of any use, but I'm currently using an Ansible module to deploy the edge nodes using VLAN segment IDs. Python code below.

nsxt_transport_nodes.txt

ksamoray commented 7 months ago

Hi, does using the network ID (e.g. EDGE-UL1-TRUNK instead of /infra/segments/EDGE-UL1-TRUNK) help?
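
In other words, something like this (illustrative only):

data_network_ids = [
  "EDGE-UL1-TRUNK", # segment ID / display name instead of the policy path
  "EDGE-UL2-TRUNK",
]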

liftconfig commented 7 months ago

No, unfortunately not. It can't find the VLAN segment if you just use the network name instead of the ID (where ID = path).

ksamoray commented 7 months ago

Hi, I've tried to reproduce your issue; indeed, the paths of the segments should be used.

Anyway, I was able to create an edge node on a VLAN segment with the code below (the cluster should be connected to the VDS in vCenter):

resource "nsxt_policy_host_transport_node_profile" "tnp" {
  display_name = "tnp220"
  standard_host_switch {
    host_switch_id   = data.vsphere_distributed_virtual_switch.venv_vds.id
    # ... (additional host switch arguments omitted)
    transport_zone_endpoint {
      transport_zone = nsxt_policy_transport_zone.tz_overlay.path
    }
    transport_zone_endpoint {
      transport_zone = nsxt_policy_transport_zone.tz_vlan.path
    }
    host_switch_profile = [nsxt_policy_uplink_host_switch_profile.uplink_host_switch_profile.path]
  }
  depends_on = [data.nsxt_compute_manager_realization.vc1_realization]
}

data "nsxt_compute_collection" "edge_cluster_collection" {
  display_name = data.vsphere_compute_cluster.venv_edge_cluster.name
  origin_id    = data.nsxt_compute_manager_realization.vc1_realization.id
}

resource "nsxt_policy_host_transport_node_collection" "htnc2" {
  display_name                = "htnc2"
  compute_collection_id       = data.nsxt_compute_collection.edge_cluster_collection.id
  transport_node_profile_path = nsxt_policy_host_transport_node_profile.tnp.path
  depends_on                  = [data.nsxt_compute_manager_realization.vc1_realization]
}

resource "nsxt_policy_vlan_segment" "vlanseg" {
  display_name        = "vlanseg"
  transport_zone_path = nsxt_policy_transport_zone.tz_vlan.path
  vlan_ids            = ["140"]
}

resource "nsxt_transport_node" "edgenode1" {
  standard_host_switch {
    transport_zone_endpoint {
      transport_zone = nsxt_policy_transport_zone.tz_overlay.realized_id
    }
    transport_zone_endpoint {
      transport_zone = nsxt_policy_transport_zone.tz_vlan.realized_id
    }
    host_switch_profile = [nsxt_policy_uplink_host_switch_profile.uplink_host_switch_profile.realized_id]
  }
  edge_node {
    deployment_config {
      vm_deployment_config {
        data_network_ids      = [nsxt_policy_vlan_segment.vlanseg.path]
        compute_id            = data.vsphere_compute_cluster.venv_edge_cluster.id # Cluster is connected to VDS on vCenter
        # ... (remaining vm_deployment_config arguments omitted)

      }
    }
  # ... (remaining edge_node arguments and closing braces omitted)
}

ksamoray commented 7 months ago

@liftconfig any update?

liftconfig commented 6 months ago

@ksamoray Sorry for the late response as I've been away. Below is an example of the uplink segments and edge resource configuration when I hit this issue.

The main bits that differ from your example:

  1. The edge has 2 uplink VLAN segments assigned
  2. The VLAN trunk segments assigned to the edge uplink vNICs (data_network_ids) are in a VLAN Transport Zone (TZ) different from the one assigned to the edge's standard_host_switch. We use one VLAN TZ for segments/portgroups configured on the ESXi hosts, and one VLAN TZ for uplink segments on the edges.

data "nsxt_compute_manager" "vc" {
  display_name = "test-vc.domain.local"
}

data "nsxt_policy_uplink_host_switch_profile" "edge_uplink_profile" {
  display_name = "edge_uplink_profile"
}

data "nsxt_policy_ip_pool" "edge_tep_pool" {
  display_name = "edge_tep_pool"
}

# Overlay TZ assigned to hosts and edges
data "nsxt_policy_transport_zone" "tz_overlay" {
  display_name = "tz_overlay"
}

#VLAN TZ assigned to edges. Contains edge uplink VLAN segments
data "nsxt_policy_transport_zone" "tz_vlan_edge" {
  display_name = "tz_vlan_edge"
}

#VLAN TZ assigned to ESXi hosts. Contains infrastructure VLAN segments and trunk VLAN segments
data "nsxt_policy_transport_zone" "tz_vlan_host" {
  display_name = "tz_vlan_host"
}

# Portgroup configured on host ESXi VDS. Used for edge VM uplink 1. Uplink teaming policy maps to host PNIC uplink 1
resource "nsxt_policy_vlan_segment" "trunk-uplink1" {
  display_name        = "trunk-uplink1"
  transport_zone_path = data.nsxt_policy_transport_zone.tz_vlan_host.path
  vlan_ids            = ["2001", "100"]

  advanced_config {
    uplink_teaming_policy = "host-uplink1-active"
    connectivity          = "ON"
  }
}

# Portgroup configured on host ESXi VDS. Used for edge VM uplink 2. Uplink teaming policy maps to host PNIC uplink 2
resource "nsxt_policy_vlan_segment" "PER01B4NSX-C11UL2-TRUNK" {
  display_name        = "trunk-uplink2"
  transport_zone_path = data.nsxt_policy_transport_zone.tz_vlan_host.path
  vlan_ids            = ["2002", "100"]

  advanced_config {
    uplink_teaming_policy = "host-uplink2-active"
    connectivity          = "ON"
  }
}

resource "nsxt_transport_node" "edge_node" {
  display_name   = "edge"
  edge_node { 
    deployment_config {
      form_factor = "XLARGE"
      node_user_settings {
        cli_password   = var.edge_admin_password
        root_password  = var.edge_root_password
      }
      vm_deployment_config {
        compute_folder_id       = "group-v121234"
        compute_id              = "domain-c125678"
        data_network_ids        = [nsxt_policy_vlan_segment.trunk-uplink1.path,
                                   nsxt_policy_vlan_segment.trunk-uplink2.path]
        default_gateway_address = ["10.x.x.x"]
        ipv4_assignment_enabled = true
        management_network_id   = "dvportgroup-121248"
        storage_id              = "datastore-124857"
        vc_id                   = data.nsxt_compute_manager.vc.id
        management_port_subnet {
          ip_addresses  = ["10.x.x.x"]
          prefix_length = 24
        }
      }
    }
  }
  standard_host_switch {
    host_switch_mode         = "STANDARD"
    host_switch_profile      = [data.nsxt_policy_uplink_host_switch_profile.edge_uplink_profile.id]
    host_switch_type         = "NVDS"
    ip_assignment {
      assigned_by_dhcp = false
      static_ip_pool   = data.nsxt_policy_ip_pool.edge_tep_pool.id
    }
    pnic {
      device_name = "fp-eth0"
      uplink_name = "uplink-1"
    }
    pnic {
      device_name = "fp-eth1"
      uplink_name = "uplink-2"
    }
    transport_zone_endpoint {
      transport_zone         = data.nsxt_policy_transport_zone.tz_overlay.id
    }
    transport_zone_endpoint {
      transport_zone         = data.nsxt_policy_transport_zone.tz_vlan_edge.id
    }
  }
}

ksamoray commented 5 months ago

@liftconfig BTW, are the related elements fully realized? E.g., NSX installation on the hypervisors? Or are those preinstalled, with no such issues?

liftconfig commented 5 months ago

@ksamoray yep all fully realized / pre-installed and working with no issues.

ksamoray commented 5 months ago

@liftconfig as I can't reproduce the behavior you observe, can you do the following?

Obviously, clean up any info from the output which could be a security threat.

liftconfig commented 4 months ago

@ksamoray sorry for the late reply again. I've found that the ETN (using the renamed nsxt_edge_transport_node resource) deploys fine if I specify the "host_id" argument under "vm_deployment_config". If I omit this argument, I get the error described in the original bug post.

Looking at the differences between the JSON POST payloads sent by Terraform vs. the Web UI: if you don't specify the host_id argument in Terraform, it sends "host_id": "". If you don't specify the host in the Web UI, the "host_id" key isn't present in the JSON payload at all.
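
Illustrative fragments of the two request bodies (structure abbreviated to the relevant key):

Terraform (fails):
  "vm_deployment_config": { ..., "host_id": "" }
Web UI (works):
  "vm_deployment_config": { ... }   (no "host_id" key at all)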

I'm using NSX-T 4.1.1.0 now but had the same issue in 3.2.2.1
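
For anyone else hitting this, the workaround amounts to setting host_id explicitly inside vm_deployment_config; the host_id value below is a hypothetical ESXi host MoRef:

      vm_deployment_config {
        compute_id       = "domain-c125678"
        host_id          = "host-12345" # hypothetical; explicitly setting this avoids the error
        data_network_ids = [nsxt_policy_vlan_segment.trunk-uplink1.path,
                            nsxt_policy_vlan_segment.trunk-uplink2.path]
        # remaining arguments as in the earlier example
      }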

ksamoray commented 4 months ago

@liftconfig that seems easy to fix - so the issue isn't related to the VLAN segment assignment at all?

liftconfig commented 4 months ago

@ksamoray correct, it appears the error is a red herring:

"Error: Failed to create TransportNode edge: [Fabric] Network interface '6b67e800-144e-44ff-9b01-de8509c54a89' is either an overlay logical switch or in an unaccessible Transport zone. Only accessible VLAN logical switches are supported. (code 16031)"

Making sure the host is set in the vm_deployment_config resolves this, and the ETN deploys correctly with its network interfaces connected to the right VLAN trunk segments. The fix would be to not send a blank host_id when the host value is not specified.