ppggff / vagrant-qemu

Use Vagrant to manage machines using QEMU. Test with Apple Silicon / M1 and CentOS aarch64 image
MIT License

hostfwd ports (windows) fail/rejected by qemu , 127.0.0.1 gets injected/rejected... for some reason? #39

Closed jayunit100 closed 11 months ago

jayunit100 commented 1 year ago

Am finding it tricky to forward ports properly... getting this message

Invalid host forwarding rule 'tcp:127.0.0.1::55986-:5986' (Bad host port)

Vagrant file and Error message are pasted below for simplifying the repro of this issue...

Error:

sig-windows-dev-tools git:(main-qemu) ✗ vagrant up winw1        

cni: calico
qemu loopBringing machine 'winw1' up with 'qemu' provider...
==> winw1: Checking if box 'sig-windows-dev-tools/windows-2019' version '1.0' is up to date...
Password:
==> winw1: Preparing SMB shared folders...
    winw1: You will be asked for the username and password to use for the SMB
    winw1: folders shortly. Please use the proper username/password of your
    winw1: account.
    winw1:  
    winw1: Username (user[@domain]): 
    winw1: Password (will be hidden): 
==> winw1: Warning! The QEMU provider doesn't support any of the Vagrant
==> winw1: high-level network configurations (`config.vm.network`). They
==> winw1: will be silently ignored.
==> winw1: Starting the instance...
A command executed by Vagrant didn't complete successfully!
The command run along with the output from the command is shown
below.

Command: ["qemu-system-x86_64", "-machine", "q35", "-cpu", "qemu64", "-smp", "2", "-m", "4096", "-device", "e1000,netdev=net0", "-netdev", "user,id=net0,hostfwd=tcp::50023-:22,hostfwd=tcp:127.0.0.1::55985-:5985,hostfwd=tcp:127.0.0.1::55986-:5986,net=10.20.30.0/24,dhcpstart=10.20.30.20", "-drive", "if=ide,format=qcow2,file=/Users/jayunit100/SOURCE/sig-windows-dev-tools/.vagrant/machines/winw1/qemu/P7k65Hw0IQI/linked-box.img", "-chardev", "socket,id=mon0,path=/Users/jayunit100/.vagrant.d/tmp/vagrant-qemu/P7k65Hw0IQI/qemu_socket,server=on,wait=off", "-mon", "chardev=mon0,mode=readline", "-chardev", "socket,id=ser0,path=/Users/jayunit100/.vagrant.d/tmp/vagrant-qemu/P7k65Hw0IQI/qemu_socket_serial,server=on,wait=off", "-serial", "chardev:ser0", "-pidfile", "/Users/jayunit100/SOURCE/sig-windows-dev-tools/.vagrant/machines/winw1/qemu/P7k65Hw0IQI/qemu.pid", "-parallel", "null", "-monitor", "none", "-display", "none", "-vga", "none", "-daemonize", {:notify=>[:stdout, :stderr, :stdin]}]

And the exact error message happens here:

Stderr: qemu-system-x86_64: -netdev user,id=net0,hostfwd=tcp::50023-:22,hostfwd=tcp:127.0.0.1::55985-:5985,hostfwd=tcp:127.0.0.1::55986-:5986,net=10.20.30.0/24,dhcpstart=10.20.30.20: Invalid host forwarding rule 'tcp:127.0.0.1::55986-:5986' (Bad host port)

Vagrantfile:

# -*- mode: ruby -*-
# vi: set ft=ruby :
require 'yaml'
require 'fileutils'

# Modify these in the variables.yaml file... they are described there in gory detail...
# This will get copied down later to synch/shared/variables... and read by the controlplane.sh etc...
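# Use VAGRANT_VARIABLES when set; otherwise fall back to variables.yaml.
# (A non-empty string literal is always truthy, so the original order never consulted ENV.)
settingsFile = ENV["VAGRANT_VARIABLES"] || "variables.yaml"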
FileUtils.cp(settingsFile, "sync/shared/variables.yaml")
settings = YAML.load_file settingsFile

kubernetes_version=settings["kubernetes_version"]
k8s_linux_kubelet_nodeip=settings['k8s_linux_kubelet_nodeip']
pod_cidr=settings['pod_cidr']
calico_version=settings['calico_version']
containerd_version=settings['containerd_version']

linux_ram = settings['linux_ram']
linux_cpus = settings['linux_cpus']
windows_ram = settings['windows_ram']
windows_cpus = settings['windows_cpus']
windows_node_ip = settings['windows_node_ip']

cni = settings['cni']

Vagrant.configure(2) do |config|
  puts "cni: #{cni}"

#   LINUX Control Plane
  config.vm.define :controlplane do |controlplane|
    controlplane.vm.host_name = "controlplane"
    controlplane.vm.box = "roboxes/ubuntu2004"

    controlplane.vm.network :private_network, ip:"#{k8s_linux_kubelet_nodeip}"

    controlplane.vm.synced_folder ".", "/vagrant", disabled: true
    controlplane.vm.synced_folder "./sync/shared", "/var/sync/shared", type: "rsync"
    controlplane.vm.synced_folder "./forked", "/var/sync/forked", type: "rsync"
    controlplane.vm.synced_folder "./sync/linux", "/var/sync/linux", type: "rsync"
    controlplane.vm.network "private_network", ip: "10.20.30.10"

    controlplane.vm.provider "qemu" do |qe|
      qe.memory = linux_ram
      qe.arch = "x86_64"

      # need for x86_64
      qe.machine = "q35"
      qe.cpu = "qemu64"
      qe.net_device = "virtio-net-pci"
      qe.extra_netdev_args = "net=10.20.30.0/24,dhcpstart=10.20.30.10"

      print "qemu loop"
    end

    ### This allows the node to default to the right IP i think....
    # 1) this seems to break the ability to get to the internet

    controlplane.vm.provision :shell, privileged: false, path: "sync/linux/controlplane.sh", args: "#{kubernetes_version} #{k8s_linux_kubelet_nodeip} #{pod_cidr}"

    # TODO should we pass KubernetesVersion to the calico agent exe? and also the service cidr if needed?
    # don't run as privileged because we need the kubeconfig from the regular user
    if cni == "calico" then
      controlplane.vm.provision "shell", path: "sync/linux/calico-0.sh", args: "#{pod_cidr} #{calico_version}"
    else
      controlplane.vm.provision "shell", path: "sync/linux/antrea-0.sh"
    end
  end

  config.vm.define "winw1" do |winw1|
    winw1.vm.box = "sig-windows-dev-tools/windows-2019"
     windows_box_path = File.expand_path("../../boxes/sig-windows-dev-tools-windows-2019.qcow2", __FILE__)

     #winw1.vm.network :forwarded_port, guest: 5986, host: 55986
     #winw1.vm.network :forwarded_port, guest: 2222, host: 50024
     #winw1.vm.network :forwarded_port, guest:2222, host:50025
     #config.vm.network :forwarded_port, guest:2222, host:50026

     winw1.vm.provider :qemu do |qemu|
          qemu.arch = "x86_64"
          qemu.memory = windows_ram
          qemu.machine = "q35"
          qemu.cpu = "qemu64"
          qemu.net_device = "e1000"
          qemu.drive_interface = "ide"
          # without this you get port collision and vagrant vm wont come up
          qemu.ssh_port = 50023

          qemu.command_line = [
            "-device", "virtio-net-pci,netdev=net0",
            "-netdev", "user,id=net0,hostfwd=tcp::50023-:22,hostfwd=tcp::3333:22,hostfwd=tcp::55985-:5985,hostfwd=tcp::55986-:5986",
            "-drive", "file=#{windows_box_path},format=qcow2,if=none,id=hd0",
            "-device", "ide-hd,bus=ide.0,drive=hd0",
            "-object", "rng-random,filename=/dev/random,id=rng0",
            "-device", "virtio-rng-pci,rng=rng0",
          ]
          qemu.extra_netdev_args = "net=10.20.30.0/24,dhcpstart=10.20.30.20"
    end

    winw1.vm.provision "shell", inline: "echo Hello, World!"

    winw1.vm.communicator = "winrm"
    winw1.winrm.username = "vagrant"
    winw1.winrm.password = "vagrant"
    winw1.winrm.port = 5986 # WinRM HTTPS port
    winw1.winrm.transport = :ssl
    winw1.winrm.ssl_peer_verification = false
  end
end

Original issue

Update: there may be an issue when hostfwd is set up with 127.0.0.1 explicitly in the rule. Am investigating further.

In lib/vagrant-qemu/driver.rb there is a line that always adds a '-' after the host port in the hostfwd rule for SSH (guest port 22).

Somewhere after this, one of the generated rules is invalid:

          # ports
          hostfwd = "hostfwd=tcp::#{options[:ssh_port]}-:22"

Is that '-' necessary or required? It seems to work, but subsequent port forwarding rules fail and I'm not sure why. On Windows we want to forward the WinRM ports as well as the SSH port, so there are three or more hostfwd entries, and QEMU complains that the final one is invalid.

example 1

if i do

     winw1.vm.provider :qemu do |qemu|
          qemu.arch = "x86_64"
          qemu.memory = windows_ram
          qemu.machine = "q35"
          qemu.cpu = "qemu64"
          qemu.net_device = "e1000"
          qemu.drive_interface = "ide"
          qemu.ssh_port = 50023
          qemu.command_line = [
            "-device", "virtio-net-pci,netdev=net0",
            "-netdev", "user,id=net0,hostfwd=tcp::50023:22,hostfwd=tcp::3333:22,hostfwd=tcp:127.0.0.1::55985:5985,hostfwd=tcp:127.0.0.1::55986:5986",
            "-drive", "file=#{windows_box_path},format=qcow2,if=none,id=hd0",
            "-device", "ide-hd,bus=ide.0,drive=hd0",
            "-object", "rng-random,filename=/dev/random,id=rng0",
            "-device", "virtio-rng-pci,rng=rng0",
          ]
          qemu.extra_netdev_args = "net=10.20.30.0/24,dhcpstart=10.20.30.20"
    end

then I get a message that a generated rule is invalid. The rules contain 55985-:5985, but I didn't specify that '-'. Where does it come from?

Stderr: qemu-system-x86_64: -netdev user,id=net0,hostfwd=tcp::50023-:22,hostfwd=tcp:127.0.0.1::55985-:5985,hostfwd=tcp:127.0.0.1::55986-:5986,net=10.20.30.0/24,dhcpstart=10.20.30.20: Invalid host forwarding rule 'tcp:127.0.0.1::55986-:5986' (Bad host port)

example 2

In general this seems to be injected in all of them...

    winw1.vm.network "forwarded_port", guest: 22, host: 2222, auto_correct: true, id: "ssh"
    winw1.vm.network "forwarded_port", guest: 80, host: 8080, auto_correct: true, id: "http"
    winw1.vm.network "forwarded_port", guest: 443, host: 8443, auto_correct: true, id: "https"
    winw1.vm.network "forwarded_port", guest: 5986, host: 55986

I see an error message whose rules have what look like port ranges injected into them, which breaks QEMU:

Stderr: qemu-system-x86_64: -netdev user,id=net0,

hostfwd=tcp::50023-:22,
hostfwd=tcp::8080-:80,
hostfwd=tcp::8443-:443,
hostfwd=tcp::55986-:5986,
hostfwd=tcp::5985-:5985,
hostfwd=tcp::5986-:5986,
hostfwd=tcp:127.0.0.1::55985-:5985,
hostfwd=tcp:127.0.0.1::55986-:5986,net=10.20.30.0/24,dhcpstart=10.20.30.20: 

Invalid host forwarding rule 'tcp:127.0.0.1::55986-:5986' (Bad host port)

ppggff commented 1 year ago

The '-' comes from qemu's doc:

hostfwd=[tcp|udp]:[hostaddr]:hostport-[guestaddr]:guestport

(https://man.archlinux.org/man/qemu.1.en#hostfwd=_tcp_udp_:_hostaddr_:hostport)

Redirect incoming TCP or UDP connections to the host port hostport to the guest IP address guestaddr on guest port guestport. If guestaddr is not specified, its value is x.x.x.15 (default first address given by the built-in DHCP server). By specifying hostaddr, the rule can be bound to a specific host interface. If no connection type is set, TCP is used. This option can be given multiple times.
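
To make the documented syntax concrete, here is a minimal Ruby sketch that assembles a rule from its parts. The helper name hostfwd_rule and its keyword arguments are hypothetical (they are not part of vagrant-qemu or QEMU); only the resulting string format comes from the QEMU documentation quoted above.

```ruby
# Hypothetical helper, not part of vagrant-qemu: builds one hostfwd rule in the
# documented form  hostfwd=[tcp|udp]:[hostaddr]:hostport-[guestaddr]:guestport
# host_addr and guest_addr are optional; nil interpolates as an empty string,
# which leaves exactly one ':' on each side of the omitted address.
def hostfwd_rule(host_port:, guest_port:, proto: "tcp", host_addr: nil, guest_addr: nil)
  "hostfwd=#{proto}:#{host_addr}:#{host_port}-#{guest_addr}:#{guest_port}"
end

# No host address: two adjacent colons, then the host port.
puts hostfwd_rule(host_port: 50023, guest_port: 22)
# prints: hostfwd=tcp::50023-:22

# With a host address: a single colon between the address and the host port.
puts hostfwd_rule(host_addr: "127.0.0.1", host_port: 55986, guest_port: 5986)
# prints: hostfwd=tcp:127.0.0.1:55986-:5986
```

Note that the '-' always separates the host side from the guest side; it is part of the syntax, not a port range.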

ppggff commented 1 year ago

I think there is a bug when 127.0.0.1 appears... Please provide the original Vagrantfile for debugging.

jayunit100 commented 1 year ago

Thanks! The Vagrantfile is at https://github.com/kubernetes-sigs/sig-windows-dev-tools, branch "master-qemu".

I did some playing around, and I get different errors with and without the dash.

When I looked at the generated hostfwd directives passed to QEMU, they seemed identical.

It wasn't obvious to me whether dashes are expected or not, so I couldn't figure it out.

jayunit100 commented 1 year ago

trying the experiment of removing 127.0.0.1 from the Vagrantfile... will post what happens

jayunit100 commented 1 year ago

Thanks again @ppggff ! I posted the Vagrantfile and Error message at the top of this issue.

dixudx commented 1 year ago

trying the experiment of removing 127.0.0.1 from the Vagrantfile... will post what happens

From the error log, it seems the hostfwd syntax is wrong.

Below is part of my Vagrantfile, which works well. You can give it a try.

Vagrant.configure("2") do |config|
  config.vm.provider "qemu" do |qe|
    ...

    qe.extra_netdev_args = "net=192.168.51.0/24,dhcpstart=192.168.51.10,hostfwd=tcp:127.0.0.1:6443-:6443"

    ... 
  end

end

rgl commented 11 months ago

The bug is in this provider code that is generating the QEMU hostfwd entries at:

https://github.com/ppggff/vagrant-qemu/blob/v0.3.4/lib/vagrant-qemu/action/start_instance.rb#L48-L52

From the QEMU documentation, the syntax is:

hostfwd=[tcp|udp]:[hostaddr]:hostport-[guestaddr]:guestport

But this provider ends up calling qemu with an extra : character after hostaddr:

qemu-system-x86_64: -netdev user,id=net0,hostfwd=tcp::50022-:22,hostfwd=tcp:127.0.0.1::55985-:5985,hostfwd=tcp:127.0.0.1::55986-:5986: Invalid host forwarding rule 'tcp:127.0.0.1::55986-:5986' (Bad host port)
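
As a rough illustration of why the extra ':' produces "Bad host port", the sketch below reads a rule field by field in the documented order. It is a simplified reading of the grammar, assuming numeric ports only; parse_hostfwd is a hypothetical helper and is not QEMU's actual parser nor part of vagrant-qemu.

```ruby
# Hypothetical helper: splits a rule (without the leading "hostfwd=") following
# the documented order  proto ":" hostaddr ":" hostport "-" guestaddr ":" guestport
def parse_hostfwd(rule)
  proto, rest = rule.split(":", 2)            # text before the first ':'
  host_addr, rest = rest.to_s.split(":", 2)   # text before the next ':'
  host_port, rest = rest.to_s.split("-", 2)   # everything up to the '-'
  unless host_port.to_s.match?(/\A\d+\z/)
    raise ArgumentError, "Bad host port: #{host_port.inspect}"
  end
  guest_addr, guest_port = rest.to_s.split(":", 2)
  unless guest_port.to_s.match?(/\A\d+\z/)
    raise ArgumentError, "Bad guest port: #{guest_port.inspect}"
  end
  {
    proto: proto.to_s.empty? ? "tcp" : proto, # TCP is the documented default
    host_addr: host_addr,
    host_port: host_port.to_i,
    guest_addr: guest_addr,
    guest_port: guest_port.to_i
  }
end

# One ':' after the host address parses cleanly.
p parse_hostfwd("tcp:127.0.0.1:55986-:5986")

# The rule from the error message: after "127.0.0.1:" is consumed, the host
# port field is ":55986", which is not a number.
begin
  parse_hostfwd("tcp:127.0.0.1::55986-:5986")
rescue ArgumentError => e
  puts e.message
end
```

The same reading explains why tcp::50022-:22 is accepted: with an empty host address, the second ':' is the one that ends the address field.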

BTW, those networks/ports are there because vagrant creates them automatically at:

https://github.com/hashicorp/vagrant/blob/v2.3.7/plugins/kernel_v2/config/vm.rb#L575-L593

Until this is fixed, the following workaround should work:

Vagrant.configure(2) do |config|
  ...
  config.vm.provider 'qemu' do |qe, config|
    ...
    # XXX this is required to prevent the bug described at https://github.com/ppggff/vagrant-qemu/issues/39
    #     see https://github.com/ppggff/vagrant-qemu/blob/v0.3.4/lib/vagrant-qemu/action/start_instance.rb#L48-L52
    #     see https://github.com/hashicorp/vagrant/blob/v2.3.7/plugins/kernel_v2/config/vm.rb#L575-L593
    # NB "disabled: true" is used by the vagrant-qemu plugin, but will be
    #    ignored by vagrant itself, which means that network will still be
    #    used by the vagrant winrm communicator.
    config.vm.network 'forwarded_port', id: 'winrm', host_ip: '127.0.0.1', host: 55985, guest: 5985, auto_correct: true, disabled: true
    config.vm.network 'forwarded_port', id: 'winrm-ssl', host_ip: '127.0.0.1', host: 55986, guest: 5986, auto_correct: true, disabled: true
    qe.extra_netdev_args = 'hostfwd=tcp:127.0.0.1:55985-:5985'
...