ipspace / netlab

Making virtual networking labs suck less
https://netlab.tools

Implement libvirt.uuid to influence the Serial Number of nodes #1407

Closed sdargoeuves closed 1 month ago

sdargoeuves commented 1 month ago

As suggested in issue #1405, here is the ticket to discuss the implementation of a libvirt.uuid node option.

I currently have a workaround: after spinning up a lab, I force the UUID of one or more specific nodes to influence their serial number. To do so:

  1. Back up the uuid I want to keep
  2. Bring netlab up
  3. Copy the dumpxml of the node(s): sudo virsh dumpxml --domain <node> > /tmp/<node>.xml
  4. Replace the uuid in the temporary file
  5. Destroy and undefine the node(s) using sudo virsh destroy <node> followed by sudo virsh undefine <node>
  6. Redefine the node, using the temporary file: sudo virsh define --file /tmp/<node>.xml
  7. Restart the VM: sudo virsh start <node>

There is a caveat: doing so affects netlab down. The VM needs to be destroyed and undefined manually before netlab down can complete. I will add some logs later.
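
For reference, the manual steps above could be scripted roughly as follows. This is only a sketch of the workaround, not how netlab does it: the function names replace_uuid and redefine_with_uuid are made up for illustration, and it assumes the domain XML contains a single <uuid> element on one line.

```shell
# Hypothetical helpers sketching the manual workaround (steps 3 to 7).

# Replace the <uuid> element of a libvirt domain XML file.
replace_uuid() {
  local xml_file="$1" new_uuid="$2"
  # Write to a temporary file first so a failed sed leaves the dump untouched.
  sed -E "s|<uuid>[^<]*</uuid>|<uuid>${new_uuid}</uuid>|" "$xml_file" \
    > "${xml_file}.new" && mv "${xml_file}.new" "$xml_file"
}

# Dump, patch, destroy, undefine, define and restart one domain.
redefine_with_uuid() {
  local domain="$1" new_uuid="$2"
  local xml_file="/tmp/${domain}.xml"
  sudo virsh dumpxml --domain "$domain" > "$xml_file" || return 1
  replace_uuid "$xml_file" "$new_uuid" || return 1
  sudo virsh destroy "$domain" || return 1
  sudo virsh undefine "$domain" || return 1
  sudo virsh define --file "$xml_file" || return 1
  sudo virsh start "$domain"
}
```

It would be called with the libvirt domain name and the uuid backed up in step 1, for example redefine_with_uuid ml_1_lb3x01 <uuid>. The netlab down caveat still applies afterwards.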

sdargoeuves commented 1 month ago

note 1:

I haven't investigated using sudo virsh edit <node> to modify the uuid. It might be easier, but you can't do that while the node is running:

  ~/code/netsim-main-lab on   revival_q3_2024 !6 ?1 ❯ sudo virsh edit ml_1_lb3x01
error: operation failed: domain 'ml_1_lb3x01' already exists with uuid 8e7406de-b67f-41ec-b7ef-3d14a761a04e
Failed. Try again? [y,n,i,f,?]:

note 2:

These are the logs from netlab down, showing the issue caused by the node I manually destroyed, undefined, redefined, and started:

  ~/code/netsim-main-lab on   revival_q3_2024 !6 ?1 ❯ netlab down
[SUCCESS] Read transformed lab topology from snapshot file netlab.snapshot.yml

┌──────────────────────────────────────────────────────────────────────────────────┐
│ CHECKING virtualization provider installation                                    │
└──────────────────────────────────────────────────────────────────────────────────┘
[SUCCESS] libvirt installed and working correctly
[SUCCESS] clab installed and working correctly

┌──────────────────────────────────────────────────────────────────────────────────┐
│ STOPPING clab nodes                                                              │
└──────────────────────────────────────────────────────────────────────────────────┘
INFO[0000] Parsing & checking topology file: clab-augment.yml
INFO[0000] Destroying lab: ml_1
INFO[0002] Removed container: clab-ml_1-s3xh04
[...]
INFO[0010] Removed container: clab-ml_1-s4xsw07
INFO[0010] Removing containerlab host entries from /etc/hosts file
INFO[0010] Removing ssh config for containerlab nodes

┌──────────────────────────────────────────────────────────────────────────────────┐
│ STOPPING libvirt nodes                                                           │
└──────────────────────────────────────────────────────────────────────────────────┘
==> s5xr06: Removing domain...
==> s5xr06: Deleting the machine folder
[...]
==> c1xr01: Deleting the machine folder
Name `ml_1_lb3x01` of domain about to create is already taken. Please try to run
`vagrant up` command again.
Error executing vagrant destroy -f:
  Command '['vagrant', 'destroy', '-f']' returned non-zero exit status 1.
[FATAL]   netlab down: vagrant destroy -f failed, aborting...

  ~/co/netsim-main-lab on   revival_q3_2024 !6 ?1 ❯ sudo virsh list --all
 Id   Name          State
------------------------------
 56   ml_1_lb3x01   running
 -    stormshield   shut off

To run netlab down successfully, you first need to destroy and undefine the node:

  ~/code/netsim-main-lab on   revival_q3_2024 !6 ?1 ❯ sudo virsh destroy ml_1_lb3x01 && sudo virsh undefine ml_1_lb3x01
Domain ml_1_lb3x01 destroyed

Domain ml_1_lb3x01 has been undefined

  ~/code/netsim-main-lab on   revival_q3_2024 !6 ?1 ❯ netlab down
[SUCCESS] Read transformed lab topology from snapshot file netlab.snapshot.yml

┌──────────────────────────────────────────────────────────────────────────────────┐
│ CHECKING virtualization provider installation                                    │
└──────────────────────────────────────────────────────────────────────────────────┘
[SUCCESS] libvirt installed and working correctly
[SUCCESS] clab installed and working correctly

┌──────────────────────────────────────────────────────────────────────────────────┐
│ STOPPING clab nodes                                                              │
└──────────────────────────────────────────────────────────────────────────────────┘
INFO[0000] no containerlab containers found
Error executing sudo ip link del dev ml_1_28:
  Command '['sudo', 'ip', 'link', 'del', 'dev', 'ml_1_28']' returned non-zero exit status 1.
[...]

┌──────────────────────────────────────────────────────────────────────────────────┐
│ STOPPING libvirt nodes                                                           │
└──────────────────────────────────────────────────────────────────────────────────┘
==> s5xr06: Domain is not created. Please run `vagrant up` first.
[...]
==> c1xr01: Domain is not created. Please run `vagrant up` first.
==> lb3x01: Remove stale volume...
==> lb3x01: Domain is not created. Please run `vagrant up` first.
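
The manual cleanup before netlab down could also be wrapped in a small helper. Again just a sketch: cleanup_domain is a made-up name, and it assumes virsh dominfo fails when the domain is not defined.

```shell
# Hypothetical helper: remove a leftover libvirt domain so that
# netlab down (vagrant destroy -f) no longer trips over it.
cleanup_domain() {
  local domain="$1"
  # virsh dominfo fails when the domain is not defined; nothing to do then.
  if ! sudo virsh dominfo "$domain" > /dev/null 2>&1; then
    echo "domain ${domain} not defined, skipping"
    return 0
  fi
  # destroy fails on a domain that is already shut off; ignore that case.
  sudo virsh destroy "$domain" > /dev/null 2>&1 || true
  sudo virsh undefine "$domain"
}
```

For the lab above this would be cleanup_domain ml_1_lb3x01, followed by netlab down.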

ipspace commented 1 month ago

You'll be glad to know that EOS VM uses UUID to generate the serial number (or I just got the same serial number twice by pure luck).

sdargoeuves commented 1 month ago

Thank you so much! I will start playing with the dev version of netlab to test this (and other fixes/improvements)

sdargoeuves commented 1 month ago

This is awesome, thank you so much @ipspace. What used to be a very long and complex script to automate this part is now one line in the topology file... I absolutely love what you are doing, thank you!