hashicorp / vagrant

Vagrant is a tool for building and distributing development environments.
https://www.vagrantup.com

Vagrant with vmware_desktop provider port forwarding issues #10881

Open vohi opened 5 years ago

vohi commented 5 years ago

Vagrant version

Vagrant 2.2.4 vagrant-vmware-desktop (2.0.3, global)

Host operating system

macOS 10.14 Mojave

Guest operating system

various, including

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu1804"
end

Debug output

Expected behavior

Machine comes up, port 22 is forwarded to 2222, ssh works

Actual behavior

Error message

Steps to reproduce

This started after a few days of things working. Reinstalling VMware Fusion or the Vagrant VMware provider plugin did not help.

I've looked into a bunch of things to see where state is saved and not correctly cleaned up:

Failed to setup Vagrant VMWare API service - failed to setup Vagrant VMware driver - open /opt/vagrant-vmware-desktop/settings/nat.json: permission denied

$ launchctl stop com.vagrant.vagrant-vmware-utility
$ launchctl start com.vagrant.vagrant-vmware-utility

Restarting the utility this way helps, especially after manually cleaning up the nat.json file, but not permanently: the file keeps deteriorating. Changing the permissions on /opt/vagrant-vmware-desktop/** also helps (i.e. the port forwarding rules are correctly removed), but again not permanently.

The VMware Fusion settings related to port forwarding are, in the current state when things are broken:

$ cat /Library/Preferences/VMware\ Fusion/networking
VERSION=1,0
answer VNET_1_HOSTONLY_NETMASK 255.255.255.0
answer VNET_1_HOSTONLY_SUBNET 192.168.188.0
answer VNET_1_VIRTUAL_ADAPTER yes
answer VNET_1_DHCP yes
answer VNET_1_NAT no
answer VNET_8_DHCP yes
answer VNET_8_NAT yes
answer VNET_8_HOSTONLY_NETMASK 255.255.255.0
answer VNET_8_HOSTONLY_SUBNET 172.16.103.0
answer VNET_8_VIRTUAL_ADAPTER yes
add_nat_portfwd 8 tcp 2222 172.16.103.153 22 vagrant: /users/vohi/dev/env/test/.vagrant/machines/default/vmware_desktop/9e0dbd82-00b8-4f55-8353-a9e00ba39b44/macos 10.13.vmx

The macOS 10.13 machine had already been destroyed with Vagrant. Removing that line manually helps, but stale entries keep coming back.
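
To make the stale-entry check above repeatable, here is a minimal Ruby sketch that reads `add_nat_portfwd` lines and flags those whose .vmx file is gone. The field layout is inferred from the single line in the excerpt above, and `parse_port_forwards` and `stale?` are hypothetical helper names, not part of Vagrant or the VMware utility. The sample path is made up.

```ruby
# Parse `add_nat_portfwd` lines from a VMware Fusion `networking` file.
PortFwd = Struct.new(:device, :protocol, :host_port, :guest_ip, :guest_port, :description)

# Assumed line layout (inferred from the excerpt in this issue):
#   add_nat_portfwd <device> <protocol> <hostport> <guestip> <guestport> <description...>
def parse_port_forwards(text)
  text.each_line.map do |line|
    # Limit the split to 7 fields so a description containing spaces stays intact.
    fields = line.strip.split(' ', 7)
    next unless fields.first == 'add_nat_portfwd' && fields.size >= 6
    _, device, protocol, host_port, guest_ip, guest_port, description = fields
    PortFwd.new(device, protocol, Integer(host_port), guest_ip, Integer(guest_port), description.to_s)
  end.compact
end

# Hypothetical check: treat everything after "vagrant: " as the .vmx path and
# call the entry stale when that file no longer exists.
def stale?(fwd, exists: File.method(:exist?))
  path = fwd.description[/\Avagrant:\s*(.+\.vmx)\z/, 1]
  !path.nil? && !exists.call(path)
end

sample = <<~CONF
  VERSION=1,0
  answer VNET_8_NAT yes
  add_nat_portfwd 8 tcp 2222 172.16.103.153 22 vagrant: /nonexistent/example/macos 10.13.vmx
CONF

parse_port_forwards(sample).each do |f|
  puts "#{f.host_port} -> #{f.guest_ip}:#{f.guest_port} stale=#{stale?(f)}"
end
```

This only reports candidates. It does not edit the file, since the thread suggests the utility rewrites it anyway.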

$ cat /Library/Preferences/VMware\ Fusion/vmnet8/nat.conf
# VMware NAT configuration file
# Manual editing of this file is not recommended. Using UI is preferred.

[host]

# NAT gateway address
ip = 172.16.103.2
netmask = 255.255.255.0

# VMnet device if not specified on command line
device = vmnet8

# Allow PORT/EPRT FTP commands (they need incoming TCP stream ...)
activeFTP = 1

# Allows the source to have any OUI.  Turn this on if you change the OUI
# in the MAC address of your virtual machines.
allowAnyOUI = 1

# Controls if (TCP) connections should be reset when the adapter they are
# bound to goes down
resetConnectionOnLinkDown = 1

# Controls if (TCP) connection should be reset when guest packet's destination
# is NAT's IP address
resetConnectionOnDestLocalHost = 1

# Controls if enable nat ipv6
natIp6Enable = 0

# Controls if enable nat ipv6
natIp6Prefix = fd15:4ba5:5a2b:1008::/64

[tcp]

# Value of timeout in TCP TIME_WAIT state, in seconds
timeWaitTimeout = 30

[udp]

# Timeout in seconds. Dynamically-created UDP mappings will purged if
# idle for this duration of time 0 = no timeout, default = 60; real
# value might be up to 100% longer
timeout = 60

[netbios]
# Timeout for NBNS queries.
nbnsTimeout = 2

# Number of retries for each NBNS query.
nbnsRetries = 3

# Timeout for NBDS queries.
nbdsTimeout = 3

[incomingtcp]

# Use these with care - anyone can enter into your VM through these...
# The format and example are as follows:
#<external port number> = <VM's IP address>:<VM's port number>
#8080 = 172.16.3.128:80

[incomingudp]

# UDP port forwarding example
#6000 = 172.16.3.0:6001

Using the vmrun tool to inspect the port forwarding rules gives me nothing:

$ vmrun listPortForwardings vmnet8
Total port forwardings: 0

References

This seems to have been a problem for a lot of people, f.ex

Disabling all port forwarding rules, and setting things up manually using vmrun setPortForwarding and vmrun deletePortForwarding, produces more reliable results, but doesn't integrate well with the Vagrant workflow.
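
For reference, the manual approach can be sketched in Ruby as a pair of helpers that only build the vmrun argument lists. The argument order is an assumption based on vmrun's usage text (host network, protocol, host port, guest IP, guest port, optional description), and the helper names are hypothetical. The IP and port values are the ones from the networking file above.

```ruby
require 'shellwords'

# Build argument lists for vmrun's port-forwarding subcommands without running them.
# Assumed argument order (from vmrun's usage text):
#   setPortForwarding    <host network> <protocol> <host port> <guest ip> <guest port> [description]
#   deletePortForwarding <host network> <protocol> <host port>
def set_forward_cmd(network, protocol, host_port, guest_ip, guest_port, description = nil)
  cmd = ['vmrun', 'setPortForwarding', network, protocol, host_port.to_s, guest_ip, guest_port.to_s]
  cmd << description if description
  cmd
end

def delete_forward_cmd(network, protocol, host_port)
  ['vmrun', 'deletePortForwarding', network, protocol, host_port.to_s]
end

# Print the commands instead of executing them.
puts Shellwords.join(set_forward_cmd('vmnet8', 'tcp', 2222, '172.16.103.153', 22))
puts Shellwords.join(delete_forward_cmd('vmnet8', 'tcp', 2222))
```

Passing one of the arrays to `system(*cmd)` would run it. Building an array rather than a string avoids shell quoting problems with descriptions that contain spaces.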

briancain commented 5 years ago

Hey there @vohi - Are you running the VMware utility as root on your machine? It needs to run as root, as you mentioned, to read those networking files properly (among other reasons).

punkrokk commented 5 years ago

@vohi ?

vohi commented 5 years ago

Hey, sorry for not responding, I haven't done much with this recently. Checking just now, I have

root 17602 0.0 0.0 558442080 4616 ?? Ss 29May19 1:09.59 /opt/vagrant-vmware-desktop/bin/vagrant-vmware-utility api -port=9922

which looks correct. Running a few cycles of launching and destroying boxes that use the VMware provider (a generic/ubuntu1804 box from the vagrant cloud, and a locally crafted macOS10.13 box), I can't see any problems right now.

I'm quite certain that I did not deliberately stop and restart the vmware-utility process when the problem started to occur, so it is unclear to me why the process would not have run as root.

1player commented 5 years ago

Same issue here. vagrant-vmware-utility is running as root. The virtual machine works on first start after creation, but subsequent starts (or starts after a reboot) fail with that error. The only solution is deleting all the VMware nat.conf files. No other machine is running or exists at all on VMware Fusion.

I won't be able to test any workaround as I've just requested a refund for the plugin.

opfpqgoon commented 4 years ago

I'm having related issues when attempting to run a multi-machine Vagrantfile. This worked fine for a few hours, but now I cannot get the 2nd machine to forward properly, as it's using the same IP for both forward rules.

I have verified with gui = true that the 2 guests are indeed getting different IP addresses

Fusion: Professional Version 11.5.1 (15018442) Vagrant: 2.2.7 Vagrant VMware Utility: 1.0.7

/Library/Preferences/VMware\ Fusion/vmnet8/nat.conf

[incomingtcp]

# Use these with care - anyone can enter into your VM through these...
# The format and example are as follows:
#<external port number> = <VM's IP address>:<VM's port number>
#8080 = 172.16.3.128:80
2222 = 192.168.126.134:22
2200 = 192.168.126.134:22

/opt/vagrant-vmware-desktop/settings/nat.json

{
  "fwds": [
    {
      "enable": false,
      "device": "8",
      "protocol": "tcp",
      "hostport": 2222,
      "guestip": "192.168.126.134",
      "guestport": 22,
      "description": "vagrant: 5edecc43-c18d-4e71-a248-a137c08c7cad/tumbleweedbasebox.vmx"
    },
    {
      "enable": false,
      "device": "8",
      "protocol": "tcp",
      "hostport": 2200,
      "guestip": "192.168.126.134",
      "guestport": 22,
      "description": "vagrant: d75bfab6-fb7c-4f5e-aa64-7b6e77e90947/tumbleweedbasebox.vmx"
    }
  ]
}

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.synced_folder '.', '/vagrant', disabled: true

  config.vm.define "test1" do |test1|
    test1.vm.box = "tumbleweed"
    test1.vm.hostname = "test1"
  end

  config.vm.define "test2" do |test2|
    test2.vm.box = "tumbleweed"
    test2.vm.hostname = "test2"
  end

end

Adding port_forward_network_pause doesn't work, as the default ssh port forward is set up before this pause.
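
The symptom in nat.json above (two different machines forwarding to one guest IP) can be detected with a short Ruby sketch. The field names `fwds`, `guestip`, `hostport` and `description` come from the excerpt. `shared_guest_ips` is a hypothetical helper, and treating the description as the machine identity is an assumption. The descriptions in the sample are shortened placeholders.

```ruby
require 'json'

# Return guest IPs that are targeted by forward rules belonging to more than
# one machine, mapped to the host ports involved.
# Assumption: the "description" field identifies the machine a rule belongs to.
def shared_guest_ips(nat_json)
  fwds = JSON.parse(nat_json)['fwds'] || []
  fwds.group_by { |f| f['guestip'] }
      .select { |_ip, rules| rules.map { |r| r['description'] }.uniq.size > 1 }
      .transform_values { |rules| rules.map { |r| r['hostport'] } }
end

sample = <<~JSON
  {
    "fwds": [
      { "hostport": 2222, "guestip": "192.168.126.134", "guestport": 22,
        "description": "vagrant: machine-a/tumbleweedbasebox.vmx" },
      { "hostport": 2200, "guestip": "192.168.126.134", "guestport": 22,
        "description": "vagrant: machine-b/tumbleweedbasebox.vmx" }
    ]
  }
JSON

p shared_guest_ips(sample)
```

Grouping by description rather than by guest IP alone avoids flagging a single machine that legitimately forwards several ports.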

opfpqgoon commented 4 years ago

I think I've found why! If you've destroyed and brought up an environment a lot, there are stale entries in /var/db/vmware/vmnet-dhcpd-vmnet8.leases

Deleting this file doesn't resolve the issue though. After restarting Fusion and running vagrant up, the 2nd machine still got the same IP as the first.

It appears that vagrant-vmware-utility uses the dhcp leases to determine the IP of the guest, and matches based on mac address. If you have stale lease info then it gets the wrong IP for the forward.

Guest OS: openSUSE tumbleweed with open-vm-tools installed

/var/db/vmware/vmnet-dhcpd-vmnet8.leases

lease 192.168.126.137 {
    starts 0 2020/02/02 22:33:53;
    ends 0 2020/02/02 23:03:53;
    hardware ethernet 00:0c:29:6f:f9:5f;
    uid ff:29:76:7d:19:00:01:00:01:25:c8:7d:c7:00:0c:29:76:7d:19;
}
lease 192.168.126.137 {
    starts 0 2020/02/02 22:33:53;
    ends 0 2020/02/02 22:43:02;
    hardware ethernet 00:0c:29:6f:f9:5f;
    uid ff:29:76:7d:19:00:01:00:01:25:c8:7d:c7:00:0c:29:76:7d:19;
}
lease 192.168.126.134 {
    starts 0 2020/02/02 22:43:02;
    ends 0 2020/02/02 23:13:02;
    hardware ethernet 00:0c:29:88:eb:0a;
    uid ff:29:76:7d:19:00:01:00:01:25:c8:7d:c7:00:0c:29:76:7d:19;
}
lease 192.168.126.134 {
    starts 0 2020/02/02 22:44:00;
    ends 0 2020/02/02 23:14:00;
    hardware ethernet 00:0c:29:a1:18:67;
    uid ff:29:76:7d:19:00:01:00:01:25:c8:7d:c7:00:0c:29:76:7d:19;
}
lease 192.168.126.134 {
    starts 0 2020/02/02 22:44:00;
    ends 0 2020/02/02 22:44:00;
    abandoned;
}
lease 192.168.126.137 {
    starts 0 2020/02/02 22:44:11;
    ends 0 2020/02/02 23:14:11;
    hardware ethernet 00:0c:29:a1:18:67;
    uid ff:29:76:7d:19:00:01:00:01:25:c8:7d:c7:00:0c:29:76:7d:19;
}
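
To illustrate how stale leases can yield the wrong IP, here is a minimal Ruby sketch that parses lease blocks like the ones above and picks the most recent lease for a MAC address. This is not the utility's actual lookup logic, which is not shown in this thread. `parse_leases` and `latest_ip_for` are hypothetical helpers, and the sample is a trimmed copy of the leases above.

```ruby
Lease = Struct.new(:ip, :starts, :mac)

# Parse dhcpd-style lease blocks. Blocks without a "hardware ethernet" line
# (for example abandoned leases) get mac == nil.
def parse_leases(text)
  text.scan(/lease\s+(\S+)\s*\{(.*?)\}/m).map do |ip, body|
    starts = body[/starts\s+\d+\s+([^;]+);/, 1]
    mac = body[/hardware ethernet\s+([0-9a-f:]+);/i, 1]
    Lease.new(ip, starts, mac&.downcase)
  end
end

# Pick the IP of the lease with the latest start time for a MAC address.
# Timestamps such as "2020/02/02 22:44:11" sort correctly as plain strings.
def latest_ip_for(leases, mac)
  leases.select { |l| l.mac == mac.downcase && l.starts }.max_by(&:starts)&.ip
end

sample = <<~LEASES
  lease 192.168.126.134 {
      starts 0 2020/02/02 22:43:02;
      hardware ethernet 00:0c:29:88:eb:0a;
  }
  lease 192.168.126.134 {
      starts 0 2020/02/02 22:44:00;
      hardware ethernet 00:0c:29:a1:18:67;
  }
  lease 192.168.126.137 {
      starts 0 2020/02/02 22:44:11;
      hardware ethernet 00:0c:29:a1:18:67;
  }
LEASES

leases = parse_leases(sample)
puts latest_ip_for(leases, '00:0c:29:a1:18:67')
puts leases.find { |l| l.mac == '00:0c:29:a1:18:67' }.ip
```

In this sample, taking the first lease that matches 00:0c:29:a1:18:67 yields the address that 00:0c:29:88:eb:0a also holds, while taking the latest lease yields a distinct one.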

opfpqgoon commented 4 years ago

Looking at vagrant up --debug it appears that the call to getGuestIPAddress never works so it's forced to look at the DHCP leases.

I've verified that the vm I used to create the box has open-vm-tools installed, and I can indeed get the guest IP with getGuestIPAddress

When I try to get the IP of a running Vagrant guest, I get the following: "Error: The VMware Tools are not running in the virtual machine:". However, checkToolsState reports "installed".

brian-farrell commented 4 years ago

I arrived here after reading through other closed issues here that essentially deal with port-forwarding problems in the vagrant-vmware-utility for either VMware Fusion or Desktop.

I am running macOS 10.14.6 (18G4032), with Vagrant 2.2.9, vagrant-vmware-utility 1.0.9, and VMware Fusion Professional Version 11.5.3 (15870345).

I'm just working through the "Getting Started" tutorial at https://www.vagrantup.com/intro/getting-started/index.html and I am continually running into one of two errors when I try to bring the box back up after halting it:

"Some of the defined forwarded ports would collide with existing forwarded ports on VMware network devices. This can be due to existing Vagrant-managed VMware machines, or due to manually configured port forwarding with VMware."

-OR-

"Vagrant failed to apply the requested port forward. The following error message was generated while attempting to apply the port forward rule: Port forward conflict on host port 2222"

I've tried different remedies, like deleting the offending entry in the [incomingtcp] section of the /Library/Preferences/VMware\ Fusion/vmnet8/nat.conf file, or deleting the /opt/vagrant-vmware-desktop/settings/nat.json file. Of course, both of these files are returned to their error-prone state after I successfully get the box to run again, using the vagrant up command.

It doesn't seem to matter, the problem just keeps coming back. I even switched to using the centos/7 image from the Vagrant Cloud, only to run into the same problems.

I did take the advice from https://github.com/hashicorp/vagrant/issues/9624#issuecomment-376621609 about opening the VMware Fusion GUI first, and this does seem to solve the problem, but that really shouldn't be the solution. If this does need to be the solution, then Vagrant or the vagrant-vmware-utility should check whether the VMware app is open, and open it before attempting to start up the box.
