Closed. pramitagautam closed this issue 1 year ago
Which box do you use? Is it the same problem as this one https://github.com/ppggff/vagrant-qemu/wiki/Fix-ubuntu-hang ?
@ppggff
controlplane.vm.box = "roboxes/ubuntu2004"
https://app.vagrantup.com/roboxes/boxes/ubuntu2004
This is what I have in the vagrant file:-
controlplane.vm.host_name = "controlplane"
controlplane.vm.box = "roboxes/ubuntu2004"
controlplane.vm.network :private_network
controlplane.vm.synced_folder "./sync/shared", "/var/sync/shared", type: "rsync"
controlplane.vm.synced_folder "./forked", "/var/sync/forked", type: "rsync"
controlplane.vm.synced_folder "./sync/linux", "/var/sync/linux", type: "rsync"
controlplane.vm.boot_timeout = 800
controlplane.vm.provider :qemu do |qm1|
qm1.memory = 4096
qm1.cpus = 2
end
Are you sure this box works with qemu-system-aarch64 without vagrant? (you can find the img file from ~/.vagrant.d/)
As I wrote in the summary, I was able to log in to the ubuntu box using the command below:
qemu-system-aarch64 \
-machine virt,accel=hvf \
-cpu host \
-smp 8 \
-m 8G \
-drive if=virtio,cache=none,format=raw,file=./ubuntu.img \
-cdrom ../../.vagrant.d/boxes/roboxes-VAGRANTSLASH-ubuntu2004/4.2.10/libvirt/box.img \
-net user,hostfwd=tcp::10022-:22 -net nic -nographic \
-bios QEMU_EFI.fd
QEMU_EFI.fd was downloaded with: curl -L https://releases.linaro.org/components/kernel/uefi-linaro/latest/release/qemu64/QEMU_EFI.fd -o QEMU_EFI.fd
[ OK ] Finished Set console scheme.
[ OK ] Created slice Slice /system/getty.
[ OK ] Started Getty on tty1.
[ OK ] Reached target Login Prompts.
[ OK ] Finished Terminate Plymouth Boot Screen.
[ OK ] Reached target Multi-User System.
[ OK ] Reached target Graphical Interface.
Starting Record Runlevel Change in UTMP...
[ OK ] Finished Record Runlevel Change in UTMP.
Ubuntu 22.10 ubuntu-test ttyAMA0
ubuntu-test login: vagrant
Welcome to Ubuntu 22.10 (GNU/Linux 5.19.0-26-generic aarch64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Tue Feb 7 11:26:35 AM UTC 2023
System load: 0.162109375
Usage of /: 17.3% of 18.01GB
Memory usage: 3%
Swap usage: 0%
Processes: 154
Users logged in: 0
IPv4 address for docker0: 172.17.0.1
IPv4 address for enp0s1: 10.0.2.15
IPv6 address for enp0s1: fec0::5054:ff:fe12:3456
5 updates can be applied immediately.
To see these additional updates run: apt list --upgradable
The list of available updates is more than a week old.
To check for new updates run: sudo apt update
Last login: Tue Feb 7 11:23:49 UTC 2023 on ttyAMA0
vagrant@ubuntu-test:~$
This is the command vagrant is using:-
/opt/homebrew/bin/qemu-system-aarch64
-machine virt,accel=hvf,highmem=on
-cpu host
-smp 2
-m 4096
-device virtio-net-device,netdev=net0
-netdev user,id=net0,hostfwd=tcp::50022-:22
-drive if=virtio,format=qcow2,file=/Users/gpramita/Documents/sig-windows-dev-tools/.vagrant/machines/controlplane/qemu/T6lDCgg5abU/linked-box.img
-drive if=pflash,format=raw,file=/Users/gpramita/Documents/sig-windows-dev-tools/.vagrant/machines/controlplane/qemu/T6lDCgg5abU/edk2-aarch64-code.fd,readonly=on
-drive if=pflash,format=raw,file=/Users/gpramita/Documents/sig-windows-dev-tools/.vagrant/machines/controlplane/qemu/T6lDCgg5abU/edk2-arm-vars.fd
-chardev socket,id=mon0,path=/Users/gpramita/.vagrant.d/tmp/vagrant-qemu/T6lDCgg5abU/qemu_socket,server=on,wait=off
-mon chardev=mon0,mode=readline
-chardev socket,id=ser0,path=/Users/gpramita/.vagrant.d/tmp/vagrant-qemu/T6lDCgg5abU/qemu_socket_serial,server=on,wait=off
-serial chardev:ser0
-pidfile /Users/gpramita/Documents/sig-windows-dev-tools/.vagrant/machines/controlplane/qemu/T6lDCgg5abU/qemu.pid
-parallel null
-monitor none
-display none
-vga none
-daemonize
hi @ppggff !
Ok, I had added an extra -drive option to the command. Sorry for that. I corrected my command and this is the output I get:-
gpramita@gpramita6C7WG sig-windows-dev-tools % qemu-system-aarch64 \
-machine virt,accel=hvf \
-cpu host \
-smp 8 \
-m 8G \
-cdrom ../../.vagrant.d/boxes/roboxes-VAGRANTSLASH-ubuntu2004/4.2.10/libvirt/box.img \
-net user,hostfwd=tcp::10022-:22 -net nic -nographic \
-bios QEMU_EFI.fd
.PXE-E18: Server response timeout.
UEFI Interactive Shell v2.1
EDK II
UEFI v2.60 (EDK II, 0x00010000)
Mapping table
BLK5: Alias(s):
VenHw(F9B94AE2-8BA6-409B-9D56-B9B417F53CB3)
BLK4: Alias(s):
VenHw(8047DB4B-7E9C-4C0C-8EBC-DFBBAACACE8F)
BLK0: Alias(s):
PciRoot(0x0)/Pci(0x2,0x0)
BLK1: Alias(s):
PciRoot(0x0)/Pci(0x2,0x0)/HD(1,MBR,0xE6D9612C,0x800,0xF3800)
BLK2: Alias(s):
PciRoot(0x0)/Pci(0x2,0x0)/HD(2,MBR,0xE6D9612C,0xF4000,0x3D0800)
BLK3: Alias(s):
PciRoot(0x0)/Pci(0x2,0x0)/HD(3,MBR,0xE6D9612C,0x4C4800,0xFB3B000)
Press ESC in 1 seconds to skip startup.nsh or any other key to continue.
Shell>
This is the same when I use UTM (which uses QEMU internally).
This box/img failed to boot with qemu-system-aarch64 because it is an x86_64 image:
Linux ubuntu2004.localdomain 5.4.0-137-generic #154-Ubuntu SMP Thu Jan 5 17:03:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
It can be booted with the following Vagrantfile:
Vagrant.configure("2") do |config|
config.vm.box = "roboxes/ubuntu2004"
config.vm.synced_folder ".", "/vagrant", disabled: true
config.vm.provider "qemu" do |qe|
qe.qemu_dir = "/usr/local/share/qemu/"
qe.arch="x86_64"
# need for x86_64
qe.machine = "q35"
qe.cpu = "max"
qe.net_device = "virtio-net-pci"
qe.ssh_port = 50023
end
end
Ok, connection was established but got reset. This is what is happening on Mac. I used the above vagrant file you shared @ppggff
DEBUG ssh: == Net-SSH connection debug-level log START ==
DEBUG ssh: D, [2023-02-08T17:06:12.629406 #50429] DEBUG -- net.ssh.transport.session[a5438]: establishing connection to 127.0.0.1:50023
D, [2023-02-08T17:06:12.629813 #50429] DEBUG -- net.ssh.transport.session[a5438]: connection established
I, [2023-02-08T17:06:12.629842 #50429] INFO -- net.ssh.transport.server_version[a544c]: negotiating protocol version
D, [2023-02-08T17:06:12.629849 #50429] DEBUG -- net.ssh.transport.server_version[a544c]: local is `SSH-2.0-Ruby/Net::SSH_7.0.1 x86_64-darwin19'
DEBUG ssh: == Net-SSH connection debug-level log END ==
INFO ssh: SSH not ready: #<Vagrant::Errors::SSHConnectionReset: SSH connection was reset! This usually happens when the machine is
taking too long to reboot. First, try reloading your machine with
`vagrant reload`, since a simple restart sometimes fixes things.
If that doesn't work, destroy your machine and recreate it with
a `vagrant destroy` followed by a `vagrant up`. If that doesn't work,
contact support.>
INFO machine: Calling action: read_state on provider QEMU (n7Fva0ouN9Y)
INFO interface: Machine: action ["read_state", "start", {:target=>:controlplane}]
INFO runner: Running action: machine_action_read_state #<Vagrant::Action::Builder:0x00007fb8d10d7f50>
INFO warden: Calling IN action: #<Vagrant::Action::Builtin::ConfigValidate:0x00007fb8f3a6fac8>
INFO warden: Calling IN action: #<VagrantPlugins::QEMU::Action::ReadState:0x00007fb8f3a6faa0>
INFO warden: Calling OUT action: #<VagrantPlugins::QEMU::Action::ReadState:0x00007fb8f3a6faa0>
INFO warden: Calling OUT action: #<Vagrant::Action::Builtin::ConfigValidate:0x00007fb8f3a6fac8>
INFO interface: Machine: action ["read_state", "end", {:target=>:controlplane}]
DEBUG ssh: Checking key permissions: /Users/gpramita/.vagrant.d/insecure_private_key
INFO ssh: Attempting SSH connection...
INFO ssh: Attempting to connect to SSH...
INFO ssh: - Host: 127.0.0.1
INFO ssh: - Port: 50023
INFO ssh: - Username: vagrant
INFO ssh: - Password? false
INFO ssh: - Key Path: ["/Users/gpramita/.vagrant.d/insecure_private_key"]
DEBUG ssh: - connect_opts: {:auth_methods=>["none", "hostbased", "publickey"], :config=>false, :forward_agent=>false, :send_env=>false, :keys_only=>true, :verify_host_key=>:never, :password=>nil, :port=>50023, :timeout=>15, :user_known_hosts_file=>[], :verbose=>:debug, :logger=>#<Logger:0x00007fb8d09685f8 @level=0, @progname=nil, @default_formatter=#<Logger::Formatter:0x00007fb8d09685d0 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x00007fb8d0968580 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<StringIO:0x00007fb8d0968648>, @binmode=false, @mon_data=#<Monitor:0x00007fb8d0968558>, @mon_data_owner_object_id=676960>>, :keys=>["/Users/gpramita/.vagrant.d/insecure_private_key"], :remote_user=>"vagrant"}
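In the log above, Net::SSH reports `connection established` and its own version string, but no server version line follows. With QEMU user-mode port forwarding, the host side of the forwarded port typically accepts the TCP connection even when nothing in the guest is listening yet, so a successful connect alone says little about the guest. A minimal Ruby sketch for checking whether the guest actually sends an SSH banner; `ssh_banner` is a hypothetical helper, not part of Vagrant or vagrant-qemu:

```ruby
require "socket"

# Hypothetical helper: returns the SSH identification string sent by the
# server, or nil if nothing SSH-like arrives within `wait` seconds.
def ssh_banner(host, port, wait: 3)
  Socket.tcp(host, port, connect_timeout: wait) do |sock|
    # An SSH server speaks first, sending a line such as "SSH-2.0-...".
    return nil unless IO.select([sock], nil, nil, wait)
    line = sock.gets
    return line.strip if line && line.start_with?("SSH-")
    nil
  end
rescue SystemCallError, SocketError, IOError
  nil # refused, reset, timed out, or unresolved host
end

# Port 50023 is the qe.ssh_port value used in this thread.
banner = ssh_banner("127.0.0.1", 50023)
puts(banner ? "guest sshd answered: #{banner}" : "no SSH banner yet")
```

If this keeps printing "no SSH banner yet" long after `vagrant up` started, the guest is probably not booting at all, rather than SSH being misconfigured.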
I think the vm may not be fully booted, please provide the debug log of running vagrant up.
(Make sure there is no qemu-system-xxx process running before vagrant up.)
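One way to check for leftover QEMU processes before running vagrant up is to filter a process listing. A small Ruby sketch, assuming `ps -ef` style text as input; `qemu_processes` is a hypothetical helper and the sample lines are made up for illustration:

```ruby
# Hypothetical helper: pick out qemu-system-* processes from `ps -ef` style text.
def qemu_processes(ps_output)
  ps_output.each_line.map(&:strip).select do |line|
    line.match?(/\bqemu-system-[\w-]+/)
  end
end

# Made-up sample; on the host one would pass the real output of `ps -ef`.
sample = <<~PS
  501 50429 1 /opt/homebrew/bin/qemu-system-aarch64 -machine virt,accel=hvf
  501 50501 1 /bin/zsh
PS

leftover = qemu_processes(sample)
puts(leftover.empty? ? "no qemu-system-* process running" : leftover)
```

Any line it reports is a VM that may still hold the forwarded SSH port or the image files.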
Attached are the logs. @ppggff vagrant_linux_logs.log
This is the vagrant file:-
# -*- mode: ruby -*-
# vi: set ft=ruby :
require 'yaml'
require 'fileutils'
# Modify these in the variables.yaml file... they are described there in gory detail...
# This will get copied down later to synch/shared/variables... and read by the controlplane.sh etc...
settingsFile = ENV["VAGRANT_VARIABLES"] || "variables.yaml"
FileUtils.cp(settingsFile, "sync/shared/variables.yaml")
settings = YAML.load_file settingsFile
kubernetes_version=settings["kubernetes_version"]
k8s_linux_kubelet_nodeip=settings['k8s_linux_kubelet_nodeip']
linux_ram = settings['linux_ram']
linux_cpus = settings['linux_cpus']
windows_ram = settings['windows_ram']
windows_cpus = settings['windows_cpus']
windows_node_ip = settings['windows_node_ip']
cni = settings['cni']
Vagrant.configure(2) do |config|
puts "cni:"
puts cni
config.vm.define :controlplane do |controlplane|
controlplane.vm.host_name = "controlplane"
controlplane.vm.box = "roboxes/ubuntu2004"
controlplane.vm.synced_folder ".", "/vagrant", disabled: true
# controlplane.vm.synced_folder "./sync/shared", "/var/sync/shared", type: "rsync"
# controlplane.vm.synced_folder "./forked", "/var/sync/forked", type: "rsync"
# controlplane.vm.synced_folder "./sync/linux", "/var/sync/linux", type: "rsync"
controlplane.vm.provider "qemu" do |qe|
# qe.memory = linux_ram
qe.qemu_dir = "/opt/homebrew/share/qemu"
qe.arch = "x86_64"
# need for x86_64
qe.machine = "q35"
qe.cpu = "max"
qe.net_device = "virtio-net-pci"
qe.ssh_port = 50023
end
controlplane.vm.provision :shell, privileged: false, path: "sync/linux/controlplane.sh", args: "#{kubernetes_version} #{k8s_linux_kubelet_nodeip}"
if cni == "calico" then
controlplane.vm.provision "shell", path: "sync/linux/calico-0.sh"
else
controlplane.vm.provision "shell", path: "sync/linux/antrea-0.sh"
end
end
end
I have the same problem with the sig-windows-dev-tools/windows-2019 box as well.
Error occurred: Vagrant exited after cleanup due to external interrupt.
Did you kill vagrant before it finished?
I cancelled it because it eventually times out anyway.
INFO interface: error: Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.
If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.
If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.
If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.
If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.
If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.
If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
INFO interface: Machine: error-exit ["Vagrant::Errors::VMBootTimeout", "Timed out while waiting for the machine to boot. This means that\nVagrant was unable to communicate with the guest machine within\nthe configured (\"config.vm.boot_timeout\" value) time period.\n\nIf you look above, you should be able to see the error(s) that\nVagrant had when attempting to connect to the machine. These errors\nare usually good hints as to what may be wrong.\n\nIf you're using a custom box, make sure that networking is properly\nworking and you're able to connect to the machine. It is a common\nproblem that networking isn't setup properly in these boxes.\nVerify that authentication configurations are also setup properly,\nas well.\n\nIf the box appears to be booting properly, you may want to increase\nthe timeout (\"config.vm.boot_timeout\") value."]
Please try to run with qemu directly as:
qemu-system-x86_64 -machine q35 -cpu max -smp 2 -m 4G -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::50023-:22 -drive if=virtio,format=qcow2,file=/Users/gpramita/Documents/sig-windows-dev-tools/.vagrant/machines/controlplane/qemu/H_GkMcjl260/linked-box.img -pidfile /Users/gpramita/Documents/sig-windows-dev-tools/.vagrant/machines/controlplane/qemu/H_GkMcjl260/qemu.pid
Ok, after struggling with it, below is my understanding/observation:-
When I run vagrant up --provider=qemu it creates images and a pid file under the .vagrant directory.
I tried running the above command as suggested by @ppggff and I got the below error:-
gpramita@gpramita6C7WG sig-windows-dev-tools % qemu-system-x86_64 -machine q35 -cpu max -smp 2 -m 4G -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::50023-:22 -drive if=virtio,format=qcow2,file= .vagrant/machines/controlplane/qemu/gaMhOAh6M8s/linked-box.img -pidfile .vagrant/machines/controlplane/qemu/gaMhOAh6M8s/qemu.pid
qemu-system-x86_64: cannot create PID file: Cannot lock pid file: Resource temporarily unavailable
And when I run the vagrant destroy -f command, these img and pid files are deleted automatically.
gpramita@gpramita6C7WG sig-windows-dev-tools % qemu-system-x86_64 -machine q35 -cpu max -smp 2 -m 4G -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::50023-:22 -drive if=virtio,format=qcow2,file= .vagrant/machines/controlplane/qemu/gaMhOAh6M8s/linked-box.img -pidfile .vagrant/machines/controlplane/qemu/gaMhOAh6M8s/qemu.pid
qemu-system-x86_64: cannot create PID file: Could not create '.vagrant/machines/controlplane/qemu/gaMhOAh6M8s/qemu.pid': No such file or directory
After that I manually copied the pid file and the linked-box.img file to the current directory and tried the command:-
gpramita@gpramita6C7WG sig-windows-dev-tools % qemu-system-x86_64 -machine q35 -cpu max -smp 2 -m 4G -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::50023-:22 -drive if=virtio,format=qcow2,file=linked-box.img -pidfile qemu.pid
And hey the ubuntu image came up fine.. screenshot attached..
Now what's the problem when it gets created to .vagrant folder!! Some permissions? I checked the files get created by my user only... So, Need help!! :)
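The "Cannot lock pid file: Resource temporarily unavailable" error above usually points to another QEMU process that still holds the pid file, most likely the VM started by vagrant up, and not to a permissions problem. A minimal Ruby sketch for checking that; `pid_file_in_use?` is a hypothetical helper, and it only checks that the recorded process exists, not that it is QEMU:

```ruby
# Hypothetical helper: true if the PID recorded in `pid_file` belongs to a
# process that is still alive.
def pid_file_in_use?(pid_file)
  pid = Integer(File.read(pid_file).strip)
  return false unless pid.positive?
  Process.kill(0, pid) # signal 0 only checks for existence, nothing is sent
  true
rescue Errno::EPERM
  true  # the process exists but belongs to another user
rescue Errno::ENOENT, Errno::ESRCH, ArgumentError
  false # no such file, no such process, or unparsable contents
end

# Path taken from the command quoted above.
path = ".vagrant/machines/controlplane/qemu/gaMhOAh6M8s/qemu.pid"
puts(pid_file_in_use?(path) ? "a process still owns #{path}" : "pid file is free")
```

If it reports a live process, stopping that VM first (for example with vagrant halt) should release the lock.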
After many tests and learning from lima, I found that changing the cpu type from max to qemu64 works.
The following Vagrantfile comes up and accepts ssh successfully on a MacBook with an Apple M2.
Vagrant.configure("2") do |config|
config.vm.box = "roboxes/ubuntu2004"
config.vm.synced_folder ".", "/vagrant", disabled: true
config.vm.provider "qemu" do |qe|
# qe.qemu_dir = "/usr/local/share/qemu/"
qe.arch="x86_64"
# need for x86_64
qe.machine = "q35"
qe.cpu = "qemu64"
qe.net_device = "virtio-net-pci"
qe.ssh_port = 50023
end
end
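The settings that worked here can be kept in one place and applied per guest architecture. A sketch in plain Ruby; the table holds only the values reported in this thread, and `QEMU_OVERRIDES` and `qemu_overrides` are hypothetical names, not part of vagrant-qemu:

```ruby
# Values collected from this thread; not an official compatibility table.
QEMU_OVERRIDES = {
  "x86_64" => {
    arch: "x86_64",
    machine: "q35",
    cpu: "qemu64", # "max" did not boot on Apple Silicon in this thread
    net_device: "virtio-net-pci",
  },
}.freeze

# Hypothetical helper: provider overrides for an emulated guest architecture,
# or an empty hash when the plugin defaults should be kept.
def qemu_overrides(guest_arch)
  QEMU_OVERRIDES.fetch(guest_arch, {})
end

# Possible use inside a Vagrantfile (assumes the provider config exposes
# plain setters for these keys):
#   config.vm.provider "qemu" do |qe|
#     qemu_overrides("x86_64").each { |key, value| qe.public_send("#{key}=", value) }
#   end
puts qemu_overrides("x86_64")
```

An aarch64 box gets an empty hash, so the provider defaults stay untouched.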
Thanks @ppggff the solution worked for me as well. The ubuntu VM is up. I tried the windows one but that's failing for me. I am getting VM boot timeout and Vagrant::Errors::NetSSHException errors. Can you help me with that as well? Sharing the vagrant file below:-
puts "cni:"
puts cni
# WINDOWS WORKER (win server 2019)
config.vm.define :winw1 do |winw1|
winw1.vm.host_name = "winw1"
winw1.vm.box = "sig-windows-dev-tools/windows-2019"
winw1.vm.box_version = "1.0"
winw1.vm.synced_folder ".", "/vagrant", disabled:true
winw1.winrm.username = "vagrant"
winw1.winrm.password = "vagrant"
winw1.vm.provider :qemu do |qe|
qe.qemu_dir = "/opt/homebrew/share/qemu"
qe.arch = "x86_64"
# need for x86_64
qe.machine = "q35"
qe.cpu = "qemu64"
qe.net_device = "virtio-net-pci"
qe.no_daemonize = true
qe.control_port = 33333
qe.debug_port = 33334
end
end
end
Below is the error:-
DEBUG ssh: == Net-SSH connection debug-level log START ==
DEBUG ssh: D, [2023-03-01T12:50:43.741026 #16965] DEBUG -- net.ssh.transport.session[11364]: establishing connection to 127.0.0.1:50022
D, [2023-03-01T12:50:43.741567 #16965] DEBUG -- net.ssh.transport.session[11364]: connection established
I, [2023-03-01T12:50:43.741614 #16965] INFO -- net.ssh.transport.server_version[11378]: negotiating protocol version
D, [2023-03-01T12:50:43.741626 #16965] DEBUG -- net.ssh.transport.server_version[11378]: local is `SSH-2.0-Ruby/Net::SSH_7.0.1 x86_64-darwin19'
DEBUG ssh: == Net-SSH connection debug-level log END ==
INFO ssh: SSH not ready: #<Vagrant::Errors::NetSSHException: An error occurred in the underlying SSH library that Vagrant uses.
The error message is shown below. In many cases, errors from this
library are caused by ssh-agent issues. Try disabling your SSH
agent or removing some keys and try again.
If the problem persists, please report a bug to the net-ssh project.
If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.
If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.
If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.
If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
INFO interface: Machine: error-exit ["Vagrant::Errors::VMBootTimeout", "Timed out while waiting for the machine to boot. This means that\nVagrant was unable to communicate with the guest machine within\nthe configured (\"config.vm.boot_timeout\" value) time period.\n\nIf you look above, you should be able to see the error(s) that\nVagrant had when attempting to connect to the machine. These errors\nare usually good hints as to what may be wrong.\n\nIf you're using a custom box, make sure that networking is properly\nworking and you're able to connect to the machine. It is a common\nproblem that networking isn't setup properly in these boxes.\nVerify that authentication configurations are also setup properly,\nas well.\n\nIf the box appears to be booting properly, you may want to increase\nthe timeout (\"config.vm.boot_timeout\") value."]
Thanks a lot for this. The next step is the windows drivers.
It seems this windows box doesn't support a virtio boot device. I will add a new config option to customize it soon.
@ppggff I was trying to use qemu commands directly. Do you have any suggestions on what to include? I tried the commands below, but both failed to boot windows 2019:-
1. qemu-system-x86_64 -machine q35 -cpu max -smp 2 -m 4G -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::50023-:22 -drive if=virtio,format=qcow2,file=linked-box.img -pidfile qemu.pid
2. qemu-system-x86_64 -machine q35 -cpu max -smp 2 -m 4G -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::50023-:22 -drive if=scsi,format=qcow2,file=linked-box.img -pidfile qemu.pid
Please update to v0.3.4 with vagrant plugin update.
Then boot the vm with the following Vagrantfile:
Vagrant.configure("2") do |config|
config.vm.box = "sig-windows-dev-tools/windows-2019"
config.vm.synced_folder ".", "/vagrant", disabled: true
config.vm.provider "qemu" do |qe|
# qe.qemu_dir = "/usr/local/share/qemu/"
qe.arch="x86_64"
# need for x86_64
qe.machine = "q35"
qe.cpu = "qemu64"
# devices compatible with this box
qe.net_device = "e1000"
qe.drive_interface = "ide"
qe.ssh_port = 50023
end
# use password (use winrm?)
config.vm.provider "qemu" do |qe, override|
override.ssh.username = "vagrant"
override.ssh.password = "vagrant"
end
end
Then it will boot with the following messages:
The configured shell (config.ssh.shell) is invalid and unable
to properly execute commands. The most common cause for this is
using a shell that is unavailable on the system. Please verify
you're using the full path to the shell and that the shell is
executable by the SSH user.
Maybe you should set a different shell to fix this; I didn't try it.
Then you can use vagrant ssh to log in (the password is vagrant).
And you can find the actual qemu command by executing ps -ef|grep qemu on the host.
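To see how these provider settings end up on the QEMU command line, the mapping can be sketched in Ruby. This mirrors the shape of the commands quoted earlier in the thread and is not the plugin's actual implementation; `qemu_argv` and its parameters are hypothetical:

```ruby
# Sketch only: builds an argument list shaped like the commands in this thread.
def qemu_argv(arch:, machine:, cpu:, net_device:, drive_interface:,
              ssh_port:, image:, smp: 2, memory: "4G")
  [
    "qemu-system-#{arch}",
    "-machine", machine,
    "-cpu", cpu,
    "-smp", smp.to_s,
    "-m", memory,
    "-device", "#{net_device},netdev=net0",
    "-netdev", "user,id=net0,hostfwd=tcp::#{ssh_port}-:22",
    "-drive", "if=#{drive_interface},format=qcow2,file=#{image}",
  ]
end

# Settings from the Vagrantfile above: e1000 network card and an IDE disk.
windows = qemu_argv(arch: "x86_64", machine: "q35", cpu: "qemu64",
                    net_device: "e1000", drive_interface: "ide",
                    ssh_port: 50023, image: "linked-box.img")
puts windows.join(" ")
```

Compared with the ubuntu box, only the `-device` and `-drive if=` values change, which is what `qe.net_device` and `qe.drive_interface` control in the Vagrantfile above.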
@ppggff Thanks for providing the support in vagrant for both the boxes. v0.3.4 works fine for windows server 2019 box with the vagrant sample file you shared.
Also, just for the record, the following command also booted (apart from the one that vagrant generates) the windows server 2019 VM:-
qemu-system-x86_64 -machine q35 -cpu qemu64 -smp 2 -m 4G -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::50023-:22 -hda linked-box.img -pidfile qemu.pid
Closing this issue now.
I have an amd64 ubuntu box and I am using "vagrant up --provider qemu" on an M1 Mac running Monterey. But ssh is not working and it times out. Below is the error I am getting:-
Though the same works when I use the qemu-system-aarch64 command as below:-
So I believe that this is a vagrant error when using qemu.