hashicorp / vagrant

Vagrant is a tool for building and distributing development environments.
https://www.vagrantup.com

Vagrant times out connecting to VM after reboot for adding to domain #8930

Open chrisgannon6 opened 7 years ago

chrisgannon6 commented 7 years ago

Vagrant version

Vagrant 1.9.8

Vagrant plugins

vagrant-share (1.1.9, system) vagrant-triggers (0.5.3)

Vagrant provider

VMware Workstation 12.5.7

Host operating system

Windows 7 x64

Guest operating system

Windows Server 2016

Vagrantfile

# Plugin requirements:
# vagrant plugin install vagrant-windows-domain

# -*- mode: ruby -*-
# vi: set ft=ruby :

require 'fileutils'
require 'open3'
require 'json'

$config_node = (JSON.parse(File.read("Config.json")))
$config_box = $config_node["box"]
$config_box_winrm = $config_box["winrm"]
$config_domain = $config_node["domain"]

# The "2" in Vagrant.configure configures the configuration version.
Vagrant.configure(2) do |config|
    config.vm.box = $config_box["name"]
    config.vm.box_url = $config_box["url"]
    config.vm.synced_folder ".", "/vagrant", type: "smb", smb_username: "domain\\user", smb_password: "password"
    config.vm.guest = :windows
    config.vm.boot_timeout = 1000
    config.vm.communicator = "winrm"
    config.winrm.username = $config_box_winrm["username"]
    config.winrm.password = $config_box_winrm["password"]
    config.winrm.max_tries = 80
    config.winrm.timeout = 1000

    # Workstation Provider settings
    config.vm.provider "vmware_workstation" do |vmware, override|
        override.vm.box_url = $config_box["url"]
        vmware.gui = true
        vmware.vmx["memsize"] = "12288"
        vmware.vmx["numvcpus"] = "4"
        vmware.vmx["displayName"] = $config_domain["computername"]
        override.vm.network "public_network", type: "dhcp"
    end

    # ESX Provider settings
    config.vm.provider :vsphere do |vsphere, override|
        override.vm.box = "dummy"
        override.vm.box_url = "file:////fileserver/boxes/dummy.box"
        vsphere.data_center_name = 'Engineering'
        vsphere.host = 'vcenterserver.domain'
        vsphere.name = $config_domain["computername"] + "." + $config_domain["name"]
        vsphere.cpu_count = 4
        vsphere.memory_mb = 12288
        vsphere.clone_from_vm = true
        vsphere.vm_base_path = 'Folder/Path'
        vsphere.template_name = 'Templates/win2016packer'
        vsphere.user = 'DOMAIN\USERNAME'
        vsphere.password = 'YOURPASSWORD'
        vsphere.insecure = true
    end

    # Setting MaxShellsPerUser to 70
    config.vm.provision :shell, inline: 'Set-Item WSMan:\localhost\Shell\MaxShellsPerUser 70'

    # Disable UAC.
    config.vm.provision :shell, path: "DisableUac.ps1"

    # Disable Firewall.
    config.vm.provision :shell, path: "DisableFirewall.ps1"

    # Add the VM to the domain.

    config.vm.provision :windows_domain do |domain|
        domain.domain = $config_domain["name"]
        domain.computer_name = $config_domain["computername"]
        # This user has privilege to add machines to the domain.
        domain.username = $config_domain["username"]
        domain.password = $config_domain["password"]
    end

    # Check VM added to domain and renamed, if this fails it will shutdown the vm, and the script will fail provisioning
    config.vm.provision :shell, path: "validateHostnameUpdated.ps1", args: [$config_domain["computername"]]

    config.trigger.after :destroy do
        run 'powershell -noprofile -executionpolicy bypass -command ".\remove_from_ad.ps1"'
    end

end

Debug output

https://gist.github.com/chrisgannon6/eb8956196b35d83b4787408be6fba7fc

Expected behavior

Vagrant should find the correct IP address and connect to it via WinRM after the reboot. This worked in earlier versions of Vagrant; the last version I tested that worked was 1.8.1.

Actual behavior

After the VM is added to the domain, it reboots. Vagrant then attempts to reconnect to the VM but times out, at least 10 minutes after the VM has become available again.

Steps to reproduce

  1. Run vagrant up with the above Vagrantfile, using the VMware Workstation provider.
  2. Observe that on the first boot of the VM, Vagrant connects via WinRM without problems.
  3. Several provisioners run, followed by the domain provisioner, which adds the VM to the domain. A reboot is then triggered.
  4. The VM restarts. Vagrant tries to connect to the VM and eventually times out. While it is trying, it is possible to ping the server and connect to it via Remote Desktop.

Note: This only occurs with VMware Workstation as the provider. It does not occur when vsphere is the provider. I suspect this is related to the public network defined for the Workstation provider. However, the same configuration worked fine in Vagrant 1.8.1.

I have also tried a number of configuration options to resolve this, e.g. setting enable_vmrun_ip_lookup = false, without success.
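
For reference, a minimal sketch of where that option would sit in the Workstation provider block. It assumes enable_vmrun_ip_lookup is the setting of that name exposed by the HashiCorp VMware provider plugin; the rest of the Vagrantfile is omitted here.

```ruby
Vagrant.configure(2) do |config|
  config.vm.provider "vmware_workstation" do |vmware|
    # Assumption: option name as provided by the VMware provider plugin.
    # Setting it to false asks the provider not to use vmrun to look up
    # the guest IP address.
    vmware.enable_vmrun_ip_lookup = false
  end
end
```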

chrisgannon6 commented 7 years ago

Tested with 2.0.0. Result is the same.

clong commented 7 years ago

Interesting, I'm running into a similar issue with Fusion. Right after I issue the command to join the domain, it drops the winrm connection:

==> win10: VERBOSE: Performing the operation "Join in domain 'windomain.local'" on target "win10".
DEBUG winrmshell: [WinRM] Waiting for output...
DEBUG winrmshell: [WinRM] Processing output
 INFO interface: info: HasSucceeded : True
 INFO interface: info: ==> win10: HasSucceeded : True
==> win10: HasSucceeded : True
 INFO interface: info: ComputerName : win10
 INFO interface: info: ==> win10: ComputerName : win10
==> win10: ComputerName : win10
 INFO interface: info: WARNING: The changes will take effect after you restart the computer win10.
 INFO interface: info: ==> win10: WARNING: The changes will take effect after you restart the computer win10.
==> win10: WARNING: The changes will take effect after you restart the computer win10.
DEBUG winrmshell: [WinRM] cleaning up command_id: 063F4777-58BC-4F4A-86E7-C8F9F358202D on shell_id 6AF10D1E-FD75-4201-983C-5D047DAAE8C6
ERROR warden: Error occurred: Vagrant timed out while attempting to connect via WinRM. This usually
means that the VM booted, but there are issues with the WinRM configuration
or network connectivity issues. Please try to `vagrant reload` or
`vagrant up` again.

clong commented 6 years ago

Problem persists with 2.0.1 and vmware-fusion-plugin 5.0.4

santhoshm153 commented 6 years ago

Is there any update on this? Will this be fixed soon? It would help immensely if anyone could provide a workaround for joining the domain and then running provisioners as a domain user.

clong commented 6 years ago

This continues to be an issue with the vagrant-vmware-desktop plugin as well. Does anyone have any idea what the ultimate root cause of the problem is here or what debug steps can be taken to isolate the issue?
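
One possible debug step, sketched below under stated assumptions: run a small probe from the host while Vagrant is waiting, to see whether the WinRM port on the guest is reachable at all after the reboot. This is not part of Vagrant or any plugin. The helper names port_open? and probe are hypothetical, and the ports are the usual defaults (5985 for WinRM over HTTP, 3389 for Remote Desktop), which may differ in a given setup.

```ruby
require 'socket'

# Returns true if a TCP connection to host:port succeeds within
# `timeout` seconds, false on any connection error or timeout.
def port_open?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue StandardError
  # Refused, unreachable, timed out, or name resolution failed.
  false
end

# Checks each port `attempts` times, `interval` seconds apart, printing a
# timestamped line per round. Returns an array of { port => bool } hashes.
def probe(host, ports: [5985, 3389], attempts: 3, interval: 5)
  attempts.times.map do |i|
    sleep(interval) if i > 0
    status = ports.to_h { |port| [port, port_open?(host, port)] }
    puts "#{Time.now.strftime('%H:%M:%S')} #{host} #{status}"
    status
  end
end
```

Calling probe with the address shown in the guest's console (for example via ipconfig) and comparing it with the address Vagrant reports in its debug output may help tell apart two cases: WinRM itself not listening after the domain join, or Vagrant connecting to a stale or wrong IP address.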