Closed helpdeskdan closed 7 months ago
That's a generic problem. Tons of devices use virt-install to create Vagrant boxes without specifying --os-variant or --osinfo.
Affected devices seem to be: Aruba CX, Cisco ASAv, Dell OS10, IOS XR, Mikrotik RouterOS7, Juniper vSRX. Other devices either don't have libvirt build recipes or use XML templates.
We could set the environment variable in netlab libvirt module and hope for the best, but something might break down the line, so it would be better to fix the build instructions. For the moment I'll fix the documentation 🤷‍♂️
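A minimal sketch of what "set the environment variable in the module" could look like, assuming the tool launches virt-install through Python's subprocess module. The variable name comes from the virt-install error message quoted in this thread; the helper names `virt_install_env` and `run_virt_install` are hypothetical and are not netlab's actual code.

```python
import os
import subprocess

# Variable name copied from the virt-install error message in this thread.
OSINFO_OVERRIDE = "VIRTINSTALL_OSINFO_DISABLE_REQUIRE"


def virt_install_env(base_env=None):
    """Return a copy of the environment with the osinfo check relaxed."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep a value the user has already exported; otherwise default to "1".
    env.setdefault(OSINFO_OVERRIDE, "1")
    return env


def run_virt_install(args):
    """Hypothetical wrapper: run virt-install with the adjusted environment."""
    return subprocess.run(["virt-install", *args], env=virt_install_env(), check=True)
```

Using `setdefault` leaves the decision with the user whenever the variable is already exported, which keeps the workaround easy to override.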
Adding @ssasso as you own several devices in the above list
P.S. This netlab is a fantastic idea!
Thank you!
Is there, maybe, a newbie forum where people can ask/give assistance?
You can always open a discussion in this repo. There's also a Slack channel in network2code Slack team, but I don't know how active that is.
I can't get the config to stick in xr so vagrant fails when it tries to login. I'm still in the "trying" phase of that, not ready to file bug on that.
Weird. I don't remember having any issues along those lines, but then I only built the IOS XR box once with whatever old version I managed to get, moved on, and never looked back (I have better things to do in my life than to wait for IOS XR to boot 🤦‍♂️).
Once a VM is started with Vagrant, you can get the VM name with virsh list and connect to the console with virsh console to investigate what's going on. Hope you'll figure it out.
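The two virsh steps above can be sketched as follows. This is only an illustration: `parse_domain_names`, `running_domains`, and `console_command` are hypothetical helper names, and the sketch assumes `virsh` is on the PATH and that `virsh list --name` prints one domain name per line.

```python
import subprocess


def parse_domain_names(virsh_output):
    """Extract domain names from the output of 'virsh list --name'."""
    return [line.strip() for line in virsh_output.splitlines() if line.strip()]


def running_domains():
    """Ask libvirt for the names of running domains (requires virsh)."""
    result = subprocess.run(
        ["virsh", "list", "--name"], capture_output=True, text=True, check=True
    )
    return parse_domain_names(result.stdout)


def console_command(domain):
    """Build the command that attaches to a domain's serial console."""
    return ["virsh", "console", domain]
```

Running the list returned by `console_command` in a terminal attaches to the VM console, which is where a login prompt or a stuck boot would show up.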
Agreed - you have better things to do with your time than wait for ios-xr to boot! I will see what I can do and report back. Thank you for your time!
Is there a way to set config.vm.boot_timeout in Vagrantfile without it being written over?
Yes, you can take the system iosxr-domain.j2, copy it into the current directory and modify it. It's briefly mentioned in https://netlab.tools/customize/, the details are in https://blog.ipspace.net/2022/06/netsim-custom-vagrant-boxes.html.
One of these days I have to add links to all those blog posts to netlab documentation.
Step 1. Move to a much better box. Much more RAM, CPU, 22.04 - install netlab. Quick tests on cumulus - I have ospf neighbors - everything good! But, I need to work on... (sigh)... IOS XR. This is where things go south.
WARNING --os-variant/--osinfo OS name is required, but no value was set or detected.
This is now a fatal error. Specifying an OS name is required for modern, performant, and secure virtual machine defaults.
You can see a full list of possible OS name values with:
virt-install --osinfo list
If your Linux distro is not listed, try one of generic values such as: linux2022, linux2020, linux2018, linux2016
If you just need to get the old behavior back, you can use:
--osinfo detect=on,require=off
Or export VIRTINSTALL_OSINFO_DISABLE_REQUIRE=1
WARNING VIRTINSTALL_OSINFO_DISABLE_REQUIRE set. Skipping fatal error.
WARNING Using --osinfo generic, VM performance may suffer. Specify an accurate OS for optimal results.
Starting install...
ERROR Network not found: no network with matching name 'vagrant-libvirt'
Domain installation does not appear to have been successful.
If it was, you can restart your domain by running:
virsh --connect qemu:///system start vm_box
otherwise, please restart your installation.
Error executing virt-install --connect=qemu:///system --network network=vagrant-libvirt,model=e1000 --name=vm_box --cpu host --arch=x86_64 --vcpus=2 --ram=8192 --virt-type=kvm --disk path=vm.qcow2,format=qcow2,device=disk,bus=ide --graphics none --import: Command '['virt-install', '--connect=qemu:///system', '--network', 'network=vagrant-libvirt,model=e1000', '--name=vm_box', '--cpu', 'host', '--arch=x86_64', '--vcpus=2', '--ram=8192', '--virt-type=kvm', '--disk', 'path=vm.qcow2,format=qcow2,device=disk,bus=ide', '--graphics', 'none', '--import']' returned non-zero exit status 1.
[FATAL] Aborting
Looks like 'netlab libvirt' did not create the 'vagrant-libvirt' network. That's weird, have to look into the source code.
Quite perplexed - I did not have this problem on my ancient desktop. (I had different problems, but not that one) I'd be happy to test anything I can.
OK, I checked the source code and netlab libvirt package definitely tries to create the vagrant-libvirt network. Admittedly, those commands are not error-checked (have to fix that).
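For illustration, error-checking an external command could look roughly like the sketch below. It is not the netlab source: `run_checked` is a hypothetical helper, and the Python interpreter stands in for virsh so the example runs without libvirt installed.

```python
import subprocess
import sys


def run_checked(cmd):
    """Run a command and raise with its stderr if it exits non-zero."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{' '.join(cmd)} failed with exit status {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result.stdout


if __name__ == "__main__":
    # Stand-in commands so the sketch runs without libvirt installed.
    print(run_checked([sys.executable, "-c", "print('ok')"]), end="")
    try:
        run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
    except RuntimeError as err:
        print(f"caught: {err}")
```

With a check like this, a failed network-creation command would stop the script at the point of failure instead of surfacing later as a virt-install error.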
Anyway, just to be on the safe side, I created a brand-new Ubuntu 22.04 VM, ran netlab install ubuntu libvirt on it and started creating an IOS XR box. Apart from the horrible "we have no idea what OS you're using" error, the VM started, but then I killed the process.
It could be a permission problem. Did you use netlab install to install the virtualization software (kvm, libvirt, vagrant, vagrant plugin), or did you install it by yourself? If you used netlab install, did you logout after the installation (the group membership is evaluated only during the login procedure)?
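A quick way to check the group membership of the running process (as opposed to what the group database says) is sketched below. The group name `libvirt` matches the `id` output posted in this thread; `process_in_group` is a hypothetical helper, and the `grp` module is available on Unix only.

```python
import grp
import os


def process_in_group(group_name, gids=None):
    """Return True if the current process belongs to the named group."""
    try:
        wanted = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this system at all.
        return False
    if gids is None:
        # Supplementary groups plus the effective primary group.
        gids = set(os.getgroups()) | {os.getegid()}
    return wanted in gids
```

If `process_in_group("libvirt")` returns False in a shell that was opened before the installation, logging out and back in is the first thing to try.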
uid=1000(dans) gid=1000(dans) groups=1000(dans),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),133(lxd),134(sambashare),137(libvirt),999(docker)
That looks right, and I created a dir in tmp.... I've been running cumulus and frr just fine. Perhaps it is something specific to this box, but I am afraid I don't understand vagrant well enough to troubleshoot.
I'm slowly running out of ideas (as in "so far I was throwing spaghetti at the wall to see if anything sticks, but now I'm running out of pasta"). Please run netlab libvirt package -vvv and post the full printout.
Also, it's not a Vagrant problem, it's a libvirt one. Vagrant is not involved until the disk image is modified with the startup configuration.
FWIW, did you use netlab install ubuntu libvirt to set things up?
Yes, many apologies, complete newbie to all this. Even I should have realized vagrant is not at all libvirt.
I did use netlab install ubuntu ansible libvirt containerlab. However, and I am sorry for not mentioning this sooner, but I am doing this in miniconda. I didn't mention that because it should work - it worked last time I did it that way.....
I suppose I should learn how to do this manually and see why libvirt isn't working.
$ netlab libvirt package -vvv iosxr xrv9k-fullk9-x.vrr-7.11.1.qcow2
================= WARNING =================
This is an experimental script that does its best to build a Vagrant box for libvirt provider out of a VM disk. It might die a horrible death and leave all sorts of garbage behind that you'll have to clean up by hand (for example, libvirt 'vm_box' virtual machine).
It also assumes that it can wreak havoc in the current directory (although it will do its best not to damage the original virtual disk).
Do you want to continue? [y/n]y
error: failed to get domain 'vm_box'
error: failed to get domain 'vm_box'
creating libvirt management network vagrant-libvirt
Creating a copy of xrv9k-fullk9-x.vrr-7.11.1.qcow2
==================== Starting the VM ====================
We'll start the VM from the newly-created virtual disk. When the VM starts, execute 'netlab libvirt config iosxr' in another window and follow the instructions.
====================
ERROR --os-variant/--osinfo OS name is required, but no value was set or detected.
This is now a fatal error. Specifying an OS name is required for modern, performant, and secure virtual machine defaults.
You can see a full list of possible OS name values with:
virt-install --osinfo list
If your Linux distro is not listed, try one of generic values such as: linux2022, linux2020, linux2018, linux2016
If you just need to get the old behavior back, you can use:
--osinfo detect=on,require=off
Or export VIRTINSTALL_OSINFO_DISABLE_REQUIRE=1
Error executing virt-install --connect=qemu:///system --network network=vagrant-libvirt,model=e1000 --name=vm_box --cpu host --arch=x86_64 --vcpus=2 --ram=8192 --virt-type=kvm --disk path=vm.qcow2,format=qcow2,device=disk,bus=ide --graphics none --import: Command '['virt-install', '--connect=qemu:///system', '--network', 'network=vagrant-libvirt,model=e1000', '--name=vm_box', '--cpu', 'host', '--arch=x86_64', '--vcpus=2', '--ram=8192', '--virt-type=kvm', '--disk', 'path=vm.qcow2,format=qcow2,device=disk,bus=ide', '--graphics', 'none', '--import']' returned non-zero exit status 1.
[FATAL] Aborting
OK, I was hoping to get more debugging printouts :( Will fix the code to generate them; you'll have to clone the repo and run netlab from there.
The printout does indicate that the code to create the management network is executed, we just don't know what's going on inside it (and that's why I need those extra printouts). However, the management network is not there when netlab executes 'virt-install', which is totally weird, because based on "Cumulus Linux works" (assuming you're running it in VMs, not containers), obviously netlab successfully creates management network before starting Vagrant.
We must be doing something that something in your setup dislikes, but I can't figure out what it might be. Will write another comment to notify you once I have the debugging printouts in place.
Grasping at straws ;) -- can you do virsh net-list --all after netlab libvirt package fails? Because it fails, it doesn't do a cleanup, so we should see the virtual networks.
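To make that check scriptable, the net-list output can be parsed with a few lines of Python. This is a sketch under assumptions: `parse_net_list` is a hypothetical helper, and the sample text mirrors the column layout of the `virsh net-list --all` output posted later in this thread.

```python
def parse_net_list(output):
    """Map network name to state from 'virsh net-list --all' output."""
    networks = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the dashed separator, and the header row.
        if len(fields) < 2 or fields[0] == "Name":
            continue
        networks[fields[0]] = fields[1]
    return networks


SAMPLE = """\
 Name              State    Autostart   Persistent
----------------------------------------------------
 default           active   yes         yes
 vagrant-libvirt   active   no          yes
"""

networks = parse_net_list(SAMPLE)
print(networks.get("vagrant-libvirt", "missing"))
```

A result of "missing" for vagrant-libvirt would point at the network-creation step, while "inactive" would suggest the network was defined but never started.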
Oh, another potential gotcha that might explain the difference between netlab up and netlab libvirt package. Do export LIBVIRT_DEFAULT_URI=qemu:///system and retry.
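Some background on why the URI can matter: when no URI is given, libvirt clients run by a non-root user typically connect to qemu:///session, while the virt-install command in the log above is called with an explicit --connect=qemu:///system. A tool could sidestep the mismatch by always passing the URI explicitly, roughly as in this sketch (the helper `virsh_command` is hypothetical and is not netlab's code):

```python
import os

# Same URI that the virt-install command in the log above connects to.
SYSTEM_URI = "qemu:///system"


def virsh_command(*args, env=None):
    """Build a virsh command line that always names its connection URI."""
    env = os.environ if env is None else env
    # Respect an exported LIBVIRT_DEFAULT_URI, otherwise use the system URI.
    uri = env.get("LIBVIRT_DEFAULT_URI", SYSTEM_URI)
    return ["virsh", "--connect", uri, *args]
```

For example, `virsh_command("net-list", "--all")` yields a command that queries the same libvirt instance as virt-install, regardless of which user runs it.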
@helpdeskdan did the LIBVIRT_DEFAULT_URI environment variable help? Would love to add it to netlab libvirt in the next day or two, but it would be nice to know before that if it solved your problem or not.
Apologies, I have been ill.
$ export LIBVIRT_DEFAULT_URI=qemu:///system
(netlab) ╭─dans@TheReplacement /tmp/cisco
╰─$ netlab libvirt package -vvv iosxr xrv9k-fullk9-x.vrr-7.11.1.qcow2
================= WARNING =================
This is an experimental script that does its best to build a Vagrant box for libvirt provider out of a VM disk. It might die a horrible death and leave all sorts of garbage behind that you'll have to clean up by hand (for example, libvirt 'vm_box' virtual machine).
It also assumes that it can wreak havoc in the current directory (although it will do its best not to damage the original virtual disk).
Do you want to continue? [y/n]y
error: failed to get domain 'vm_box'
error: failed to get domain 'vm_box'
creating libvirt management network vagrant-libvirt
Creating a copy of xrv9k-fullk9-x.vrr-7.11.1.qcow2
==================== Starting the VM ====================
We'll start the VM from the newly-created virtual disk. When the VM starts, execute 'netlab libvirt config iosxr' in another window and follow the instructions.
====================
ERROR --os-variant/--osinfo OS name is required, but no value was set or detected.
This is now a fatal error. Specifying an OS name is required for modern, performant, and secure virtual machine defaults.
You can see a full list of possible OS name values with:
virt-install --osinfo list
If your Linux distro is not listed, try one of generic values such as: linux2022, linux2020, linux2018, linux2016
If you just need to get the old behavior back, you can use:
--osinfo detect=on,require=off
Or export VIRTINSTALL_OSINFO_DISABLE_REQUIRE=1
Error executing virt-install --connect=qemu:///system --network network=vagrant-libvirt,model=e1000 --name=vm_box --cpu host --arch=x86_64 --vcpus=2 --ram=8192 --virt-type=kvm --disk path=vm.qcow2,format=qcow2,device=disk,bus=ide --graphics none --import: Command '['virt-install', '--connect=qemu:///system', '--network', 'network=vagrant-libvirt,model=e1000', '--name=vm_box', '--cpu', 'host', '--arch=x86_64', '--vcpus=2', '--ram=8192', '--virt-type=kvm', '--disk', 'path=vm.qcow2,format=qcow2,device=disk,bus=ide', '--graphics', 'none', '--import']' returned non-zero exit status 1.
[FATAL] Aborting
(netlab) ╭─dans@TheReplacement /tmp/cisco
╰─$ virsh net-list --all 1 ↵
Name State Autostart Persistent
----------------------------------------------------
default active yes yes
vagrant-libvirt active no yes
Yet, it works:
$ cat topology.yml
defaults:
  device: cumulus
module: [ ospf ]
nodes: [ s1, s2, s3 ]
links: [ s1-s2, s2-s3, s1-s2-s3 ]
(netlab) ╭─dans@TheReplacement ~/test
╰─$ netlab up
yada yada... netlab connect s1
s1# show ip ospf neighbor
Neighbor ID  Pri  State         Dead Time  Address     Interface        RXmtL  RqstL  DBsmL
10.0.0.2     1    Full/DROther  36.790s    10.1.0.2    swp1:10.1.0.1    0      0      0
10.0.0.2     1    Full/Backup   36.790s    172.16.0.2  swp2:172.16.0.1  0      0      0
10.0.0.3     1    Full/DR       36.766s    172.16.0.3  swp2:172.16.0.1  1      0      0
Apologies, I have been ill.
So sorry to hear that :( Hope you're getting better and I apologize for bothering you!
I tried to find something that would work on Ubuntu 20.04 and 22.04. Unfortunately, the virt-install used in Ubuntu 20.04 does not accept the --osinfo parameter at all, while the 22.04 version made it (almost) mandatory.
Obviously we could create the virtual machines from XML templates (like we're doing for Arista vEOS or Cisco IOSv) but I'm not going to waste my time going down that path for IOS XR. I'll just keep hoping the current workaround does not result in too dismal performance.
Note to anyone who might read this in the future: please feel free to submit a pull request containing the XML VM definition template for any device that still uses virt-install to create the box-building VM.
Document we need to fix
https://netlab.tools/labs/iosxr/
What's wrong
You have to set the VIRTINSTALL_OSINFO_DISABLE_REQUIRE environment variable before you run netlab libvirt package, or it will not run. I'm on 22.04.3.
P.S. This netlab is a fantastic idea! Is there, maybe, a newbie forum where people can ask/give assistance? I can't get the config to stick in xr so vagrant fails when it tries to login. I'm still in the "trying" phase of that, not ready to file bug on that.