hashicorp / vagrant

Vagrant is a tool for building and distributing development environments.
https://www.vagrantup.com

Port collision occurs when trying to create multiple VMs in parallel #10169

Open sharavanan2010 opened 6 years ago

sharavanan2010 commented 6 years ago

Vagrant version

2.0.2 python: 3.4.3

Host operating system

Windows 10 64bit

Guest operating system

Windows 10 32 or 64 bit

Vagrantfile

https://gist.github.com/sharavanan2010/f3473ce8b1e9b9f615b0aee5134f0840

Debug output

VM1: https://gist.github.com/sharavanan2010/71034d52c8710134d25c31f7f16eb8f5
VM2: https://gist.github.com/sharavanan2010/5a59b9e44fb6aaf68e8d6ce22add9e00

Expected behavior

Able to create multiple VMs in parallel

Actual behavior

Failed to create multiple VMs in parallel

Steps to reproduce

  1. Open two command prompts
  2. In each command prompt, initialize an image.
  3. Run the 'vagrant up' command in both command prompts in parallel.

This issue occurs when two or more VMs being created in parallel try to acquire ports at the same time, i.e. the same port is assigned to both machines. I saw 2203 and 2204 assigned to both VMs.
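The check-then-claim race described above can be illustrated with a small Ruby sketch. This is not Vagrant's or the VMware plugin's actual code: `first_free_port`, the port range, and the use of a `Mutex` are all assumptions made for illustration. A `Mutex` only serializes threads inside one process; for two separate `vagrant up` processes, a file lock would be the closer analogue.

```ruby
require "set"

# Hypothetical helper: first port in the range not yet recorded as used.
def first_free_port(used, range = 2200..2250)
  range.find { |port| !used.include?(port) }
end

# Unsafe: both callers pick a port before either records its claim,
# so the second caller reads the same stale state as the first.
def claim_without_lock(used)
  a = first_free_port(used)
  b = first_free_port(used)
  used << a << b
  [a, b]
end

# Safer: each caller picks AND records its port while holding the lock.
def claim_with_lock(used, lock)
  2.times.map do
    lock.synchronize do
      port = first_free_port(used)
      used << port
      port
    end
  end
end

racy = claim_without_lock(Set.new([2200, 2201, 2202]))
safe = claim_with_lock(Set.new([2200, 2201, 2202]), Mutex.new)

puts "without lock: #{racy.inspect}"
puts "with lock:    #{safe.inspect}"
```

Without the lock, both callers end up with the same port; with it, the second caller sees the first claim and moves on to the next free port.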

Is this a known bug? Is there any solution for it?

sharavanan2010 commented 6 years ago

Any thoughts?

briancain commented 6 years ago

Hi @sharavanan2010 - what version of the vmware plugin are you using?

sharavanan2010 commented 6 years ago

@briancain
Product: VMware® Workstation 14 Pro
Version: 14.1.1 build-7528167

vagrant plugin list
vagrant-omnibus (1.5.0)
vagrant-vmware-desktop (1.0.0)

This occurred randomly, not every time I ran the vagrant up command. Previously we used VMware 12.5.9, and at that time we faced port collisions when trying to run or keep more than one box in a running state.

Is this fixed in VMware 14.x?

briancain commented 6 years ago

@sharavanan2010 - I recommend upgrading your vagrant-vmware-desktop plugin and the utility service to the latest versions and trying again. I think this issue has since been resolved, but try that and let me know. Thanks!

sharavanan2010 commented 6 years ago

@briancain I tried the steps below with no luck. I am still facing the port collision issue when spinning up a VM.

STR_1:

  1. Installed Vagrant 2.0.2 + Vagrant VMware utility 1.0.4 + VMware 12.5.9.
  2. Created a VM and shut it down.
  3. Upgraded VMware to 14.0.0 build-6661328.
  4. Tried to create a VM; a port collision occurred. Restoring the Virtual Network Editor settings did not help. I can't create any image after this failure.

STR_2:

  1. Already had Vagrant 2.0.2 + Vagrant VMware utility 1.0.4 + VMware 12.5.9 installed.
  2. Uninstalled VMware 12.5.9 and installed VMware 14.0.0 build-6661328.
  3. Tried to create a VM; the same port collision error occurred.

Debug Logs: https://gist.github.com/sharavanan2010/457c1a7751518a5016b80820a8a70db5

In all scenarios, I don't have a single entry in the VMware C:\programData\vmware\vmnetnat.conf file.

STR_3:

  1. Installed Vagrant 2.0.2 + Vagrant VMware utility 1.0.0 + VMware 14.0.0 build-6661328.
  2. Created a VM and shut it down.
  3. Upgraded the vmware-desktop utility to 1.0.4.
  4. Powered on the previously created VM.
  5. Spun up a new VM; the same port collision error occurred.

Log1: https://gist.github.com/sharavanan2010/6eac6c801a9a415ccec87267f69d894a
Log2: https://gist.github.com/sharavanan2010/2c592e74fdf262e3ee1d7ebb59026a5e
Log3: https://gist.github.com/sharavanan2010/56f354a53534642d93dd8a5225400146

STR_4:

Current environment: Vagrant 2.0.2 + Vagrant VMware utility 1.0.0 + VMware 14.0.0 build-6661328

  1. Upgraded Vagrant to 2.1.4.
  2. Uninstalled and reinstalled the vagrant-omnibus and vagrant-vmware-desktop plugins, and applied the license.
  3. Created a VM and kept it running.
  4. Created another VM and kept it running.
  5. Creating a 3rd VM failed with a port collision error. The conflicting port "2204" was not present in the vmnetnat.conf file:

[incomingtcp]
55985 = 192.168.254.138:5985
55986 = 192.168.254.138:5986
2222 = 192.168.254.138:22
2200 = 192.168.254.139:5985
2201 = 192.168.254.139:5986
2202 = 192.168.254.139:22

After step 5, I was not able to create any VM; all attempts failed with the port collision error.

Log1: https://gist.github.com/sharavanan2010/4ee88eee4d8681da9df1cb2c9a07bd2a

C:\Windows\system32>vagrant -v
Vagrant 2.1.4

C:\Windows\system32>vagrant plugin list
vagrant-omnibus (1.5.0, global)
vagrant-vmware-desktop (1.0.4, global)
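To double-check whether a reported port is really absent from vmnetnat.conf, the `[incomingtcp]` entries quoted above can be parsed into a lookup table. The following is a minimal sketch, assuming the file uses the `host_port = ip:guest_port` layout shown in the excerpt. `incoming_tcp_forwards` is a hypothetical helper, not part of Vagrant or VMware.

```ruby
# Parse the [incomingtcp] section of a VMware NAT config into
# { host_port => "guest_ip:guest_port" }. Other sections are ignored.
def incoming_tcp_forwards(conf_text)
  forwards = {}
  section = nil
  conf_text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("#")
    if (m = line.match(/\A\[(.+)\]\z/))
      section = m[1]
    elsif section == "incomingtcp" && (m = line.match(/\A(\d+)\s*=\s*(\S+)\z/))
      forwards[m[1].to_i] = m[2]
    end
  end
  forwards
end

# Entries copied from the excerpt in this comment.
SAMPLE = <<~CONF
  [incomingtcp]
  55985 = 192.168.254.138:5985
  55986 = 192.168.254.138:5986
  2222 = 192.168.254.138:22
  2200 = 192.168.254.139:5985
  2201 = 192.168.254.139:5986
  2202 = 192.168.254.139:22
CONF

forwards = incoming_tcp_forwards(SAMPLE)
puts "forwarded host ports: #{forwards.keys.sort.inspect}"
puts "2204 present? #{forwards.key?(2204)}"
```

Against a real host, the same helper could be fed `File.read` of the vmnetnat.conf path mentioned earlier in this comment instead of the inline sample.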

Please let me know if you need more logs or anything else.

chrisroberts commented 6 years ago

Hi there,

The VMware provider does not support parallel actions (which is why the --parallel option is disabled for the provider). It should provide better locks to prevent separate processes from attempting to do things like claim specific host ports. If you have provisioners on each of the guests that you want to run in parallel, that can be accomplished by first bringing up all the guests with vagrant up --no-provision. After the guests are running, you can run vagrant provision GUEST_NAME in separate windows and have them run in parallel.
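The two-phase approach described above (serial boot, then parallel provisioning) could be scripted roughly as follows. This is only a sketch: the guest names `centos1` and `centos2` and the `two_phase_plan` / `run_plan` helpers are hypothetical, and the only Vagrant commands used are the two named in this comment. By default it performs a dry run that just prints the commands.

```ruby
# Phase 1 runs serially; phase 2 holds one provisioning command per guest.
def two_phase_plan(guests)
  {
    serial: [%w[vagrant up --no-provision]],
    parallel: guests.map { |name| ["vagrant", "provision", name] }
  }
end

# Execute the plan. With dry_run: true nothing is launched, only printed.
def run_plan(plan, dry_run: true)
  plan[:serial].each do |cmd|
    puts cmd.join(" ")
    next if dry_run
    system(*cmd) or raise("command failed: #{cmd.join(' ')}")
  end

  # Start every provisioning command without waiting, then wait for all.
  pids = plan[:parallel].map do |cmd|
    puts cmd.join(" ")
    dry_run ? nil : Process.spawn(*cmd)
  end
  pids.compact.each { |pid| Process.wait(pid) }
end

plan = two_phase_plan(%w[centos1 centos2])
run_plan(plan) # dry run: prints the three commands
```

Calling `run_plan(plan, dry_run: false)` from the directory containing the Vagrantfile would actually launch the commands; the boot phase stays serial, so guests are never asked to claim host ports at the same time.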

sharavanan2010 commented 6 years ago

@chrisroberts I was trying to spin up a VM without applying any provisioning. Provisioning is also one of the major scenarios, but first I tried to get a plain VM without provisioning. Otherwise we should wait for the VMware team to fix this.

DanHam commented 6 years ago

Hi all.

EDIT: Skip to comment below to see issue reproduced with Vagrant directly

I'm pretty certain I am running into the same issue. I'm using Vagrant as the driver for test-kitchen and have a CentOS and Debian box configured for testing.

Note that test-kitchen creates the boxes sequentially rather than in parallel.

Using VMware as the Vagrant Provider:

The same port (2201) is assigned to both the CentOS and Debian instances:

$ kitchen create
-----> Starting Kitchen (v1.23.2)
-----> Creating <default-centos-kitchen-ansible>...
       Bringing machine 'default' up with 'vmware_fusion' provider...
       ==> default: Cloning VMware VM: 'foosite/centos-kitchen-ansible'. This can take some time...
       ==> default: Checking if box 'foosite/centos-kitchen-ansible' is up to date...
       ==> default: Verifying vmnet devices are healthy...
       ==> default: Preparing network adapters...
       ==> default: Fixed port collision for 22 => 2222. Now on port 2201.
       ==> default: Starting the VMware VM...
       ==> default: Waiting for the VM to receive an address...
       ==> default: Forwarding ports...
           default: -- 22 => 2201
       ==> default: Waiting for machine to boot. This may take a few minutes...
           default: SSH address: 127.0.0.1:2201
           default: SSH username: coadmin
           default: SSH auth method: private key
       ==> default: Machine booted and ready!
       ==> default: Setting hostname...
       ==> default: Configuring network adapters within the VM...
           default: SSH address: 127.0.0.1:2201
           default: SSH username: coadmin
           default: SSH auth method: private key
       ==> default: Machine not provisioned because `--no-provision` is specified.
       [SSH] Established
       Vagrant instance <default-centos-kitchen-ansible> created.
       Finished creating <default-centos-kitchen-ansible> (0m33.32s).
-----> Creating <default-debian-kitchen-ansible>...
       Bringing machine 'default' up with 'vmware_fusion' provider...
       ==> default: Cloning VMware VM: 'foosite/debian-kitchen-ansible'. This can take some time...
       ==> default: Checking if box 'foosite/debian-kitchen-ansible' is up to date...
       ==> default: Verifying vmnet devices are healthy...
       ==> default: Preparing network adapters...
       ==> default: Fixed port collision for 22 => 2222. Now on port 2201.
       ==> default: Starting the VMware VM...
       ==> default: Waiting for the VM to receive an address...
       ==> default: Forwarding ports...
           default: -- 22 => 2201
       ==> default: Waiting for machine to boot. This may take a few minutes...
           default: SSH address: 127.0.0.1:2201
           default: SSH username: debadmin
           default: SSH auth method: private key
       ==> default: Machine booted and ready!
       ==> default: Setting hostname...
       ==> default: Configuring network adapters within the VM...
       ==> default: Machine not provisioned because `--no-provision` is specified.
       [SSH] Established
       Vagrant instance <default-debian-kitchen-ansible> created.
       Finished creating <default-debian-kitchen-ansible> (0m27.35s).
-----> Kitchen is finished. (1m2.87s)

If I attempt to perform a kitchen login, the command fails for the first instance that was brought up, but succeeds for the second:

CentOS:

$ kitchen login default-centos-kitchen-ansible -l debug
D      [Vagrant command] BEGIN (vagrant --version)
D      [Vagrant command] END (0m0.15s)
D      Login command: ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o LogLevel=VERBOSE -i /Users/dan/.ssh/id_rsa -p 2201 coadmin@127.0.0.1 (Options: {})
Warning: Permanently added '[127.0.0.1]:2201' (ECDSA) to the list of known hosts.
Permission denied (publickey).

Debian:

$ kitchen login default-debian-kitchen-ansible -l debug
D      [Vagrant command] BEGIN (vagrant --version)
D      [Vagrant command] END (0m0.13s)
D      Login command: ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o LogLevel=VERBOSE -i /Users/dan/.ssh/id_rsa -p 2201 debadmin@127.0.0.1 (Options: {})
Warning: Permanently added '[127.0.0.1]:2201' (ECDSA) to the list of known hosts.
Authenticated to 127.0.0.1 ([127.0.0.1]:2201).
Linux kitchen 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4 (2018-08-21) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
debadmin@kitchen:~$

Using Virtualbox as the Vagrant Provider:

The issue does not occur. Vagrant assigns each box a unique port - 2201 and 2202 in this case:

$ kitchen create
-----> Starting Kitchen (v1.23.2)
-----> Creating <default-centos-kitchen-ansible>...
       Bringing machine 'default' up with 'virtualbox' provider...
       ==> default: Importing base box 'foosite/centos-kitchen-ansible'...
==> default: Matching MAC address for NAT networking...
       ==> default: Checking if box 'foosite/centos-kitchen-ansible' is up to date...
       ==> default: Setting the name of the VM: kitchen-motd-default-centos-kitchen-ansible
       ==> default: Fixed port collision for 22 => 2222. Now on port 2201.
       ==> default: Clearing any previously set network interfaces...
       ==> default: Preparing network interfaces based on configuration...
           default: Adapter 1: nat
       ==> default: Forwarding ports...
           default: 22 (guest) => 2201 (host) (adapter 1)
       ==> default: Running 'pre-boot' VM customizations...
       ==> default: Booting VM...
       ==> default: Waiting for machine to boot. This may take a few minutes...
           default: SSH address: 127.0.0.1:2201
           default: SSH username: coadmin
           default: SSH auth method: private key
       ==> default: Machine booted and ready!
       ==> default: Checking for guest additions in VM...
       ==> default: Setting hostname...
       ==> default: Machine not provisioned because `--no-provision` is specified.
       [SSH] Established
       Vagrant instance <default-centos-kitchen-ansible> created.
       Finished creating <default-centos-kitchen-ansible> (0m44.05s).
-----> Creating <default-debian-kitchen-ansible>...
       Bringing machine 'default' up with 'virtualbox' provider...
       ==> default: Importing base box 'foosite/debian-kitchen-ansible'...
==> default: Matching MAC address for NAT networking...
       ==> default: Checking if box 'foosite/debian-kitchen-ansible' is up to date...
       ==> default: Setting the name of the VM: kitchen-motd-default-debian-kitchen-ansible
       ==> default: Fixed port collision for 22 => 2222. Now on port 2202.
       ==> default: Clearing any previously set network interfaces...
       ==> default: Preparing network interfaces based on configuration...
           default: Adapter 1: nat
       ==> default: Forwarding ports...
           default: 22 (guest) => 2202 (host) (adapter 1)
       ==> default: Running 'pre-boot' VM customizations...
       ==> default: Booting VM...
       ==> default: Waiting for machine to boot. This may take a few minutes...
           default: SSH address: 127.0.0.1:2202
           default: SSH username: debadmin
           default: SSH auth method: private key
       ==> default: Machine booted and ready!
       ==> default: Checking for guest additions in VM...
       ==> default: Setting hostname...
       ==> default: Machine not provisioned because `--no-provision` is specified.
       [SSH] Established
       Vagrant instance <default-debian-kitchen-ansible> created.
       Finished creating <default-debian-kitchen-ansible> (0m37.50s).
-----> Kitchen is finished. (1m23.80s)

Running kitchen login then works as expected for both instances:

CentOS:

$ kitchen login default-centos-kitchen-ansible -l debug
D      [Vagrant command] BEGIN (vagrant --version)
D      [Vagrant command] END (0m0.14s)
D      Login command: ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o LogLevel=VERBOSE -i /Users/dan/.ssh/id_rsa -p 2201 coadmin@127.0.0.1 (Options: {})
Warning: Permanently added '[127.0.0.1]:2201' (ECDSA) to the list of known hosts.
Authenticated to 127.0.0.1 ([127.0.0.1]:2201).
[coadmin@kitchen ~]$

Debian:

$ kitchen login default-debian-kitchen-ansible -l debug
D      [Vagrant command] BEGIN (vagrant --version)
D      [Vagrant command] END (0m0.14s)
D      Login command: ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o LogLevel=VERBOSE -i /Users/dan/.ssh/id_rsa -p 2202 debadmin@127.0.0.1 (Options: {})
Warning: Permanently added '[127.0.0.1]:2202' (ECDSA) to the list of known hosts.
Authenticated to 127.0.0.1 ([127.0.0.1]:2202).
Linux kitchen 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4 (2018-08-21) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
debadmin@kitchen:~$

Relevant Info

Vagrant: Vagrant 2.1.5

Vagrant Plugins:

vagrant-aws (0.7.2, global)
  - Version Constraint: > 0
vagrant-google (2.2.0, global)
  - Version Constraint: > 0
vagrant-share (1.1.9, global)
  - Version Constraint: > 0
vagrant-vmware-fusion (5.0.4, global)
  - Version Constraint: > 0

VMware: VMware Fusion 8.5.8 build-5824040 Release

Virtualbox: Virtualbox 5.2.18r124319

OS:

ProductName:    Mac OS X
ProductVersion: 10.11.6
BuildVersion:   15G22010

DanHam commented 6 years ago

Actually, I can reproduce with this Vagrantfile:

# -*- mode: ruby -*-
# vi: set ft=ruby :

# Number of CentOS nodes
num_nodes = 2

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"

  config.vm.provider 'vmware_fusion' do |vf|
    vf.whitelist_verified = true
  end

  (1..num_nodes).each do |n|
    config.vm.define vmname = "centos#{n}" do |node|
      node.vm.hostname = vmname + '.local'
    end
  end
end

Using VMware as the Vagrant Provider

Vagrant assigns port 2201 to both VMs:

$ vagrant up --provider vmware_fusion
Bringing machine 'centos1' up with 'vmware_fusion' provider...
Bringing machine 'centos2' up with 'vmware_fusion' provider...
==> centos1: Cloning VMware VM: 'centos/7'. This can take some time...
==> centos1: Checking if box 'centos/7' is up to date...
==> centos1: Verifying vmnet devices are healthy...
==> centos1: Preparing network adapters...
==> centos1: Fixed port collision for 22 => 2222. Now on port 2201.
==> centos1: Starting the VMware VM...
==> centos1: Waiting for the VM to receive an address...
==> centos1: Forwarding ports...
    centos1: -- 22 => 2201
==> centos1: Waiting for machine to boot. This may take a few minutes...
    centos1: SSH address: 127.0.0.1:2201
    centos1: SSH username: vagrant
    centos1: SSH auth method: private key
    centos1:
    centos1: Vagrant insecure key detected. Vagrant will automatically replace
    centos1: this with a newly generated keypair for better security.
    centos1:
    centos1: Inserting generated public key within guest...
    centos1: Removing insecure key from the guest if it's present...
    centos1: Key inserted! Disconnecting and reconnecting using new SSH key...
==> centos1: Machine booted and ready!
==> centos1: Setting hostname...
==> centos1: Configuring network adapters within the VM...
    centos1: SSH address: 127.0.0.1:2201
    centos1: SSH username: vagrant
    centos1: SSH auth method: private key
==> centos1: Rsyncing folder: /Users/dan/working/vagrant/foo/ => /vagrant
==> centos2: Cloning VMware VM: 'centos/7'. This can take some time...
==> centos2: Checking if box 'centos/7' is up to date...
==> centos2: Verifying vmnet devices are healthy...
==> centos2: Preparing network adapters...
==> centos2: Fixed port collision for 22 => 2222. Now on port 2201.
==> centos2: Starting the VMware VM...
==> centos2: Waiting for the VM to receive an address...
==> centos2: Forwarding ports...
    centos2: -- 22 => 2201
==> centos2: Waiting for machine to boot. This may take a few minutes...
    centos2: SSH address: 127.0.0.1:2201
    centos2: SSH username: vagrant
    centos2: SSH auth method: private key
    centos2:
    centos2: Vagrant insecure key detected. Vagrant will automatically replace
    centos2: this with a newly generated keypair for better security.
    centos2:
    centos2: Inserting generated public key within guest...
    centos2: Removing insecure key from the guest if it's present...
    centos2: Key inserted! Disconnecting and reconnecting using new SSH key...
==> centos2: Machine booted and ready!
==> centos2: Setting hostname...
==> centos2: Configuring network adapters within the VM...
    centos2: SSH address: 127.0.0.1:2201
    centos2: SSH username: vagrant
    centos2: SSH auth method: private key
==> centos2: Rsyncing folder: /Users/dan/working/vagrant/foo/ => /vagrant
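A collision like the one in the output above can be spotted mechanically by scanning the `vagrant up` log for forwarding lines and grouping machines by host port. The sketch below assumes the VMware provider's `-- guest => host` line format shown in this log (the VirtualBox provider prints a different format). `duplicate_host_ports` is a hypothetical helper, not a Vagrant feature.

```ruby
# Matches VMware-provider lines such as "    centos1: -- 22 => 2201".
FORWARD_LINE = /^\s*(\S+): -- (\d+) => (\d+)\s*$/

# Return { host_port => [machines] } for ports claimed by more than one machine.
def duplicate_host_ports(log_text)
  by_port = Hash.new { |hash, port| hash[port] = [] }
  log_text.each_line do |line|
    m = FORWARD_LINE.match(line)
    next unless m
    machine = m[1]
    host_port = m[3].to_i
    by_port[host_port] << machine unless by_port[host_port].include?(machine)
  end
  by_port.select { |_port, machines| machines.size > 1 }
end

# Lines excerpted from the output above.
SAMPLE_LOG = <<~TEXT
  ==> centos1: Forwarding ports...
      centos1: -- 22 => 2201
  ==> centos2: Forwarding ports...
      centos2: -- 22 => 2201
TEXT

collisions = duplicate_host_ports(SAMPLE_LOG)
collisions.each do |port, machines|
  puts "host port #{port} assigned to: #{machines.join(', ')}"
end
```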

Note that vagrant up sometimes fails when bringing up the second node with:

Guest-specific operations were attempted on a machine that is not
ready for guest communication. This should not happen and a bug
should be reported.

However, when the vagrant up succeeds, attempting to vagrant ssh to each instance gives:

$ vagrant ssh centos1
Permission denied (publickey).
$ vagrant ssh centos2
Last login: Wed Sep 19 18:04:18 2018 from 172.16.135.2
[vagrant@centos2 ~]$

Note that sometimes the failure occurs with the second node, not the first.

Info from vagrant ssh-config shows things aren't quite right:

Host centos1
  HostName 172.16.135.205
  User vagrant
  Port 22
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/dan/working/vagrant/foo/.vagrant/machines/centos1/vmware_fusion/private_key
  IdentitiesOnly yes
  LogLevel FATAL

Host centos2
  HostName 127.0.0.1
  User vagrant
  Port 2201
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/dan/working/vagrant/foo/.vagrant/machines/centos2/vmware_fusion/private_key
  IdentitiesOnly yes
  LogLevel FATAL

Using Virtualbox as the Vagrant Provider

With Virtualbox everything works as expected:

$ vagrant up --provider virtualbox
... <Blah...>

$ vagrant ssh centos1
[vagrant@centos1 ~]$ exit
logout
Connection to 127.0.0.1 closed.

$ vagrant ssh centos2
[vagrant@centos2 ~]$ exit
logout
Connection to 127.0.0.1 closed.

$ vagrant ssh-config
Host centos1
  HostName 127.0.0.1
  User vagrant
  Port 2201
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/dan/working/vagrant/foo/.vagrant/machines/centos1/virtualbox/private_key
  IdentitiesOnly yes
  LogLevel FATAL

Host centos2
  HostName 127.0.0.1
  User vagrant
  Port 2202
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/dan/working/vagrant/foo/.vagrant/machines/centos2/virtualbox/private_key
  IdentitiesOnly yes
  LogLevel FATAL

chrisroberts commented 6 years ago

@DanHam Would you please upgrade to the latest versions of vagrant-vmware-desktop and vagrant-vmware-utility and see if the behavior still persists? Thanks!

DanHam commented 6 years ago

@chrisroberts Thanks - that fixed it!

Sorry - I was not aware that the vagrant-vmware-fusion plugin had been deprecated/replaced!

Following my recent upgrade to Vagrant 2.1.5 I performed a vagrant plugin update immediately afterwards but was not warned that the vagrant-vmware-fusion plugin was deprecated... unless of course I missed it.

If there is a way you could warn users about the plugin being replaced when they do a plugin update that would be really helpful!

Just for completeness, this is now fixed for me with:

$ vagrant --version
Vagrant 2.1.5

$ vagrant plugin list
vagrant-aws (0.7.2, global)
vagrant-google (2.2.0, global)
vagrant-share (1.1.9, global)
vagrant-vmware-desktop (1.0.4, global)

Thanks again!