ipspace / netlab

Making virtual networking labs suck less
https://netlab.tools

[BUG] Version checking at "netlab up" seems bogus #1313

Closed DanPartelly closed 1 month ago

DanPartelly commented 1 month ago

Describe the bug

Running a lab built from multiple libvirt boxes, using both iosv and iosvl2 devices, produces conflicting version-check results. In my case the box versions were set like this: for iosv, the L3 image is at 159.3.M8, while iosvl2 is simply set to 2020. The version check seems to ignore the fact that there are two different device types and compares their versions against each other.

To Reproduce

Run a lab with both iosvl2 and iosv devices. Note that the check is somewhat inconsistent: there were several runs in which no warnings were emitted.

Expected behavior

A clear and concise description of what you expected to happen.

Lab topology

This skeleton should do it:

    defaults:
      device: iosv

    nodes:
      r1:
      r2:
      r3:
      r4:
      sw1:
        device: iosvl2

    links:

Output

    ==> r1: Checking if box 'cisco/iosv' version '159.3.M8' is up to date...
    ==> r1: A newer version of the box 'cisco/iosvl2' for provider 'libvirt' is
    ==> r1: available! You currently have version '159.3.M8'. The latest is version
    ==> r1: '2020'. Run vagrant box update to update.

Version

netlab version 1.9.0-post1

Note that I run the bleeding edge; my package was generated from the latest dev commits.

ipspace commented 1 month ago

@DanPartelly I cannot reproduce the bug. What version of Vagrant are you using? Also, could you please add the printout of "vagrant box list"?

DanPartelly commented 1 month ago

vagrant 2.3.7 with libvirt plugin 0.11.2

    vagrant box list
    cisco/iosv   (libvirt, 159.3.M8)
    cisco/iosvl2 (libvirt, 2020)

I tried again, and the message at netlab up is the same. The lab is brought up and r1 is running the correct L3 image. I'll read the previous emails you sent and try the branch you mentioned.

ipspace commented 1 month ago

I have Vagrant 2.4.1, and our installation script pins it to 2.4.0-1, so you're definitely using an older version. Same with vagrant-libvirt, our installation script pins it to 0.12.2, and that's what I'm using.

As it seems to be a Vagrant problem - can you upgrade Vagrant and retry? It doesn't make sense to use weird box names that are not equal to device names if we're dealing with a fixed Vagrant bug.

DanPartelly commented 1 month ago

Sure. This particular box runs openSUSE Linux, so some packages are updated more slowly than in a typical rolling distro. I'll branch the vagrant package and update it to 2.4.0+ if it can be built and used with the same ruby and rubygems versions the base system has. If not, tough luck: I won't update that whole chain of packages and maintain a separate branch. As the labs work properly, it makes no sense to do that, and there is nothing for either of us to worry about. And tbh, I would certainly prefer a bogus warning and a quirk note to weird device names.

My home NixOS box uses more recent packages, but I can't try networklab again on it for another week.

A side note: the whole SLES-derived family of Linuxes uses globally enforced crypto policies, as do Red Hat and its derivatives such as Alma Linux. So it is yet another system family where update-crypto-policies --set LEGACY is needed. I mention it because I recalled reading about Alma Linux in the networklab quirk docs.

DanPartelly commented 1 month ago

This is not really a bug in either networklab or vagrant IMO. But there are solutions.

It happens because both box images are generated locally with netlab libvirt package ... . If the boxes are built in the same folder in sequence, box.json is overwritten. Now, once the machines are added to .Vagrant, vagrant generates a metadata.url which references the box.json in the local file system.

The issue is that all boxes created in sequence now reference the latest box.json (since it gets overwritten every time a box is built in that folder). This leads to bogus version comparisons. If you delete the folder where you built the boxes, vagrant will rightfully complain that it cannot download the metadata to see whether a box update is needed.
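The overwrite described above can be reproduced without Vagrant. The following is a minimal sketch, assuming the metadata file follows the usual Vagrant box-metadata layout (a `name` plus a list of `versions`); `write_box_metadata` and `first_version` are hypothetical helpers written for this illustration, not netlab or Vagrant functions.

```python
import json
import tempfile
from pathlib import Path
from typing import Tuple


def write_box_metadata(build_dir: Path, name: str, version: str) -> Path:
    """Write box metadata to the shared, generic file name box.json."""
    metadata = {
        "name": name,
        "versions": [
            {"version": version, "providers": [{"name": "libvirt"}]},
        ],
    }
    path = build_dir / "box.json"
    path.write_text(json.dumps(metadata))
    return path


def first_version(metadata_path: Path) -> Tuple[str, str]:
    """Return (box name, first listed version) from a metadata file."""
    data = json.loads(metadata_path.read_text())
    return data["name"], data["versions"][0]["version"]


with tempfile.TemporaryDirectory() as tmp:
    build_dir = Path(tmp)
    # Build two boxes in sequence in the same folder.
    iosv_ref = write_box_metadata(build_dir, "cisco/iosv", "159.3.M8")
    iosvl2_ref = write_box_metadata(build_dir, "cisco/iosvl2", "2020")

    # Both boxes end up referencing the same file on disk ...
    same_file = iosv_ref == iosvl2_ref
    # ... so a lookup made on behalf of iosv reads the iosvl2 metadata.
    seen_for_iosv = first_version(iosv_ref)

print(same_file)       # True
print(seen_for_iosv)   # ('cisco/iosvl2', '2020')
```

The second result matches the warning in the report: the check for cisco/iosv is answered with the name and version of cisco/iosvl2.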

Solution 1: mention this in the docs. Advise users to build boxes in separate folders if they plan to maintain them locally. Also mention what happens when you delete the folder in which you created the VMs.

Solution 2: do not use the generic "box.json" name when building images with netlab libvirt, if vagrant allows it.
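For boxes that are already installed, a quick check can show whether several of them point at the same metadata file. This is a rough sketch under two assumptions that are not confirmed in this thread: that Vagrant keeps one directory per box under `~/.vagrant.d/boxes`, and that the metadata reference is stored there in a file named `metadata_url`. Both the path and the file name are parameters, so adjust them to your installation; `shared_metadata_refs` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def shared_metadata_refs(
    boxes_dir: Path, marker: str = "metadata_url"
) -> Dict[str, List[str]]:
    """Map each metadata reference used by more than one box to those boxes.

    Assumption: every box directory holds a small text file (named by
    `marker`) whose content is the metadata location for that box.
    """
    by_ref: Dict[str, List[str]] = defaultdict(list)
    for ref_file in sorted(boxes_dir.glob(f"*/{marker}")):
        by_ref[ref_file.read_text().strip()].append(ref_file.parent.name)
    # Keep only references shared by two or more boxes.
    return {ref: boxes for ref, boxes in by_ref.items() if len(boxes) > 1}


if __name__ == "__main__":
    default_dir = Path.home() / ".vagrant.d" / "boxes"  # assumed location
    for ref, boxes in shared_metadata_refs(default_dir).items():
        print(f"{ref} is shared by: {', '.join(boxes)}")
```

Any reference reported here is a candidate for the bogus comparison: rebuilding the affected boxes in separate folders gives each of them its own metadata file.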

ipspace commented 1 month ago

Wow, thanks a million for taking the time and figuring this out. That never happened to me because I'm building boxes in separate folders.

Changing 'box.json' to 'netlabdevice-version-box.json' is probably doable. Anyway, I'll delete the branch with the wrong solution and work on this one ;)
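A per-box file name could be derived along these lines. This is only a sketch of the idea: `metadata_filename` is a hypothetical helper, and the exact pattern netlab ended up using may differ from the one shown here.

```python
import re


def _safe(part: str) -> str:
    """Replace characters that are awkward in file names with underscores."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", part)


def metadata_filename(device: str, version: str) -> str:
    """Build a metadata file name that is unique per device and version."""
    return f"{_safe(device)}-{_safe(version)}-box.json"


names = [
    metadata_filename("iosv", "159.3.M8"),
    metadata_filename("iosvl2", "2020"),
]
print(names)  # ['iosv-159.3.M8-box.json', 'iosvl2-2020-box.json']
```

With distinct names, building several boxes in the same folder no longer overwrites an earlier box's metadata.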

ipspace commented 1 month ago

I changed the filename of the box.json file. Modified code works on my end; could you please check whether it works for you?

DanPartelly commented 1 month ago

It does, Ivan. No more errors at netlab up. Also, the output of vagrant box outdated --global is what is expected.

ipspace commented 1 month ago

Great. Thanks a million for the feedback!