OpenNebula / terraform-provider-opennebula

Terraform provider for OpenNebula
https://www.terraform.io/docs/providers/opennebula/
Mozilla Public License 2.0

F-477: add nic scheduling attributes #502

Closed — treywelsh closed this 6 months ago

treywelsh commented 8 months ago

Description

Add NIC scheduling attributes

References

Close #477

treywelsh commented 7 months ago

If you want to give it a try, @jamie-pate, here is a first shot; feedback is welcome.
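
For reference, a minimal sketch of how the new attributes from this PR could be used (resource and network names here are placeholders; the attribute values are taken from the discussion below):

```hcl
resource "opennebula_virtual_machine" "example" {
  name   = "example-vm"
  cpu    = 1
  memory = 1024

  nic {
    model              = "virtio"
    # Let OpenNebula pick the network at deployment time
    network_mode_auto  = true
    # Rank candidate networks (fewer used leases preferred)
    sched_rank         = "-USED_LEASES"
    # Only consider networks matching this requirement
    sched_requirements = "LABEL = \"devops2\""
  }
}
```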

jamie-pate commented 7 months ago

It seems to work during deployment, but when I re-run Terraform it fails with: Error attaching new VM NIC

It looks like SCHED_RANK and SCHED_REQUIREMENTS aren't saved in the VM template inside OpenNebula:

NIC = [
  AR_ID = "0",
  BRIDGE = "br1",
  BRIDGE_TYPE = "linux",
  CLUSTER_ID = "0",
  IP = "10.59.134.50",
  MAC = "02:00:0a:3b:86:32",
  MODEL = "virtio",
  NAME = "NIC0",
  NETWORK = "public-134",
  NETWORK_ID = "161",
  NIC_ID = "0",
  SECURITY_GROUPS = "0",
  TARGET = "one-2891-0",
  VN_MAD = "bridge" ]

Deploy works:

      + nic {
          + model                    = "virtio"
          + network                  = (known after apply)
          + network_id               = -1
          + network_mode_auto        = true
          + nic_id                   = (known after apply)
          + sched_rank               = "-USED_LEASES"
          + sched_requirements       = "LABEL = \"devops2\""
        }

This change works:

  ~ resource "opennebula_virtual_machine" "minio" {
        id                     = "2883"
        name                   = "fosexp-minio-test-test2"
        # (25 unchanged attributes hidden)

      ~ nic {
          ~ network_id               = 162 -> 161
            # (9 unchanged attributes hidden)
        }

        # (7 unchanged blocks hidden)
    }

This change does not work:

      ~ nic {
          ~ network_id               = 161 -> -1
          ~ network_mode_auto        = false -> true
          + sched_rank               = "-USED_LEASES"
          + sched_requirements       = "LABEL = \"devops2\""
            # (8 unchanged attributes hidden)
        }

This one also doesn't work:

      ~ nic {
          ~ sched_rank               = "-USED_LEASES" -> "USED_LEASES"
            # (11 unchanged attributes hidden)
        }
╷
│ Error: Failed to update NIC
│ 
│   with opennebula_virtual_machine.minio,
│   on main.tf line 153, in resource "opennebula_virtual_machine" "minio":
│  153: resource "opennebula_virtual_machine" "minio" {
│ 
│ virtual machine (ID: 2883): vm nic attach: network -1: Tue Nov 28 13:22:22 2023 : Error attaching new VM NIC: Missing VN_MAD, BRIDGE, TARGET or MAC in VM NIC
╵

Full log:

Tue Nov 28 15:15:30 2023 [Z0][VM][I]: New state is ACTIVE
Tue Nov 28 15:15:30 2023 [Z0][VM][I]: New LCM state is PROLOG
Tue Nov 28 15:15:44 2023 [Z0][VM][I]: New LCM state is BOOT
Tue Nov 28 15:15:44 2023 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/2889/deployment.0
Tue Nov 28 15:15:47 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Tue Nov 28 15:15:48 2023 [Z0][VMM][I]: pre: Executed "sudo -n ip link add name br03 type bridge ".
Tue Nov 28 15:15:48 2023 [Z0][VMM][I]: pre: Executed "sudo -n ip link set br03 up".
Tue Nov 28 15:15:48 2023 [Z0][VMM][I]: ExitCode: 0
Tue Nov 28 15:15:48 2023 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Tue Nov 28 15:15:50 2023 [Z0][VMM][I]: ExitCode: 0
Tue Nov 28 15:15:50 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: deploy.
Tue Nov 28 15:15:50 2023 [Z0][VMM][I]: ExitCode: 0
Tue Nov 28 15:15:50 2023 [Z0][VMM][I]: Successfully execute network driver operation: post.
Tue Nov 28 15:15:50 2023 [Z0][VM][I]: New LCM state is RUNNING
Tue Nov 28 15:18:54 2023 [Z0][VM][I]: New state is ACTIVE
Tue Nov 28 15:18:54 2023 [Z0][VM][I]: New LCM state is HOTPLUG_NIC
Tue Nov 28 15:18:56 2023 [Z0][VMM][I]: ExitCode: 0
Tue Nov 28 15:18:56 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: detach_nic.
Tue Nov 28 15:18:57 2023 [Z0][VMM][I]: clean: Executed "sudo -n ip link delete br03".
Tue Nov 28 15:18:57 2023 [Z0][VMM][I]: ExitCode: 0
Tue Nov 28 15:18:57 2023 [Z0][VMM][I]: Successfully execute network driver operation: clean.
Tue Nov 28 15:18:57 2023 [Z0][VMM][I]: ExitCode: 0
Tue Nov 28 15:18:57 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: prereconfigure.
Tue Nov 28 15:18:57 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Tue Nov 28 15:18:57 2023 [Z0][VMM][I]: ExitCode: 0
Tue Nov 28 15:18:57 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: reconfigure.
Tue Nov 28 15:18:57 2023 [Z0][VMM][I]: VM NIC Successfully detached.
Tue Nov 28 15:18:57 2023 [Z0][VM][I]: New LCM state is RUNNING
Tue Nov 28 15:18:59 2023 [Z0][VM][I]: New state is ACTIVE
Tue Nov 28 15:18:59 2023 [Z0][VM][I]: New LCM state is HOTPLUG_NIC
Tue Nov 28 15:18:59 2023 [Z0][VMM][E]: Error attaching new VM NIC: Missing VN_MAD, BRIDGE, TARGET or MAC in VM NIC
Tue Nov 28 15:18:59 2023 [Z0][VM][I]: New LCM state is RUNNING
Tue Nov 28 15:19:27 2023 [Z0][VM][I]: New LCM state is SHUTDOWN
Tue Nov 28 15:19:34 2023 [Z0][VMM][I]: ExitCode: 0
Tue Nov 28 15:19:34 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: shutdown.
Tue Nov 28 15:19:34 2023 [Z0][VMM][I]: Successfully execute network driver operation: clean.
Tue Nov 28 15:19:34 2023 [Z0][VM][I]: New LCM state is EPILOG
Tue Nov 28 15:19:35 2023 [Z0][VM][I]: New state is DONE
Tue Nov 28 15:19:35 2023 [Z0][VM][I]: New LCM state is LCM_INIT

oned.log

Tue Nov 28 15:18:59 2023 [Z0][ReM][D]: Req:256 UID:2 IP:172.16.198.245 one.vm.attachnic invoked , 2889, "NIC=[
    SCHED_RANK..."
Tue Nov 28 15:18:59 2023 [Z0][ReM][D]: Req:256 UID:2 one.vm.attachnic result SUCCESS, 2889
Tue Nov 28 15:18:59 2023 [Z0][VMM][D]: Message received: ATTACHNIC FAILURE 2889 Missing VN_MAD, BRIDGE, TARGET or MAC in VM NIC
treywelsh commented 7 months ago

Thank you for your feedback.

I'm able to reproduce the error Missing VN_MAD, BRIDGE, TARGET or MAC in VM NIC when I'm trying to attach a new NIC with network_mode_auto = true to an already existing VM. If I create a new VM with a NIC defining network_mode_auto = true, it works.

This error is returned by OpenNebula (you can read it in the VM USER_TEMPLATE under the ERROR key) when attaching a new NIC with NETWORK_MODE=auto: OpenNebula asks for more NIC attributes to be defined.

Maybe we didn't understand how this should be used. In the docs I read: "You can delay the network selection for each NIC in the VM to the deployment phase." and "This strategy is useful to prepare generic VM templates that can be deployed in multiple OpenNebula clusters." It talks about the deployment phase, VM templates that can be deployed, etc. Moreover, it seems that we can't attach a NIC to an existing VM in this way via Sunstone or via the CLI.
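
On the OpenNebula side, the corresponding raw VM template fragment would look roughly like this (a sketch based on the automatic network selection behavior described above; values reused from this thread). Per the docs, it is only evaluated at deployment time:

```
NIC = [
  NETWORK_MODE       = "auto",
  SCHED_REQUIREMENTS = "LABEL = \"devops2\"",
  SCHED_RANK         = "-USED_LEASES" ]
```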

treywelsh commented 7 months ago

@jamie-pate the more I read, the more it seems to me that we can neither attach a new NIC (with NETWORK_MODE=auto) to an already existing VM nor update an existing NIC (to replace a network ID with NETWORK_MODE=auto). So the errors you're seeing appear to be expected.

jamie-pate commented 7 months ago

Could this work if you just record the network_id when you create the resource, then re-use it for modifications?

It would need a note in the docs mentioning that it only works for VM creation, and that the network is fixed on updates unless you select a new network ID.

Otherwise it seems like it would need a new feature in OpenNebula to consider NETWORK_MODE during updates, I guess?

github-actions[bot] commented 6 months ago

This pull request is stale because it has been open for 30 days with no activity and it is not in a milestone. Remove 'status: stale' label or comment, or this will be closed in 5 days.

treywelsh commented 6 months ago

I think I'll make a minimal change to move forward: I'll add a note on the actual behavior in the VM documentation.

If you want something more convenient from OpenNebula, open an issue here: https://github.com/OpenNebula/one. To have this comment considered: Could this work if you just record the network_id when you create the resource, then re-use it for modifications?, prefer opening a new issue.