oscar-stack / vagrant-hosts

Manage static DNS on vagrant guests

provisioning of hosts freezes #65

Open LuboVarga opened 8 years ago

LuboVarga commented 8 years ago

When I configure the hosts provisioner for all machines in the config file, before the machine definitions, Vagrant freezes on the hosts provisioning step. Here is a simplified Vagrantfile (which, unfortunately, does not reproduce my problem):

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
    config.vm.box = "vagrant-centos-65-x86_64-minimal"
    config.vm.box_url = "http://files.brianbirkinbine.com/vagrant-centos-65-x86_64-minimal.box"

    if Vagrant.has_plugin?("vagrant-cachier")
        config.cache.scope = :box
    end

    config.vm.provider "virtualbox" do |v|
        v.memory = 1024
        v.cpus = 1
    end

    # Provision hostname resolution between virtual machines. Uses the plugin https://github.com/oscar-stack/vagrant-hosts
    if Vagrant.has_plugin?("vagrant-hosts")
        config.vm.provision :hosts do |hostsprovisioner|
          hostsprovisioner.autoconfigure = true
          hostsprovisioner.sync_hosts = true
        end
    else
        config.vm.provision "shell", inline: "echo vagrant-hosts plugin is missing, hostname resolution of other virtual machines will not work."
    end

    # list of different machines starts here:
    config.vm.define "machine1", primary: true do |cfg|
        cfg.vm.hostname = "machine1"
        cfg.vm.network "private_network", ip: "192.168.101.10"
    end

    config.vm.define "machine2", autostart: false do |cfg|
        cfg.vm.hostname = "machine2"
        cfg.vm.network "private_network", ip: "192.168.101.11"
    end
end

Running vagrant up with --debug points to an SSH connection that never finishes:

 vagrant up --debug --provision vagrant-backend-lbamqp

.... normal startup and provisioning ....

GuestOSType="Linux26_64"
GuestAdditionsRunLevel=2
GuestAdditionsVersion="4.3.4 r91027"
GuestAdditionsFacility_VirtualBox Base Driver=50,1464966035858
GuestAdditionsFacility_VirtualBox System Service=50,1464966048788
GuestAdditionsFacility_Seamless Mode=0,1464966035858
GuestAdditionsFacility_Graphics Mode=0,1464966035858
DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 31999
DEBUG subprocess: Exit status: 0
 INFO interface: Machine: metadata ["provider", :virtualbox, {:target=>:"vagrant-backend"}]
DEBUG ssh: Uploading: /tmp/tmp-hosts20160603-29916-1fntcy7 to /tmp/hosts
DEBUG ssh: Re-using SSH connection.
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute: cat /tmp/hosts > /etc/hosts (sudo=true)
DEBUG ssh: pty obtained for connection
DEBUG ssh: stdout: [root@vagrant-backend-lbamqp ~]# 
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...

The simplified Vagrantfile above does not get stuck, unlike my original one (which also provisions Salt Stack and a few minor things). The freeze first appeared after I added the hosts provisioner. I am now unable to reproduce it in the simplified environment, but on the other hand I cannot get rid of the provisioning freeze in my work Vagrantfile. I have also tried deleting the .vagrant directory (it contained some old machine names that are no longer used) and the ~/.vagrant.d/data directory. Nothing has helped.

I hope the debug log is enough to diagnose this problem. If not, please tell me how to provide more information.

Sharpie commented 8 years ago

Is this behavior consistent? Or does it only occur some of the time?

Also, I'm curious about the last few lines of output:

 INFO interface: Machine: metadata ["provider", :virtualbox, {:target=>:"vagrant-backend"}]
DEBUG ssh: Uploading: /tmp/tmp-hosts20160603-29916-1fntcy7 to /tmp/hosts
DEBUG ssh: Re-using SSH connection.
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute: cat /tmp/hosts > /etc/hosts (sudo=true)
DEBUG ssh: pty obtained for connection
DEBUG ssh: stdout: [root@vagrant-backend-lbamqp ~]# 
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...

The cat /tmp/hosts > /etc/hosts command is the last action run by the hosts provisioner. But after that, we see a PTY being allocated for an SSH connection --- which should be the first action taken by the SSH connection, not the last.

Perhaps the root cause is that the Vagrant shell provider isn't enabling the synchronous mode of Net::SSH, which causes a race condition where the command is executed before the PTY is allocated. I've seen commands fail or hang in odd ways with PTY support before; perhaps this is the source of those issues.

Anyway, thanks for the bug report! Perhaps there's a better way to work around the requiretty defaults, such as letting host provisioning fail without halting Vagrant operations.
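If the PTY race described above is the culprit, one workaround to experiment with (a sketch only, not verified against this issue) is to avoid PTYs entirely: strip the requiretty default from sudoers and override config.ssh.pty, which CentOS base boxes often set to true precisely because of requiretty:

```ruby
# Sketch of a possible workaround, assuming the hang comes from the PTY
# allocated for sudo commands. Not verified against this specific issue.
Vagrant.configure("2") do |config|
  # Drop the requiretty default so sudo works over a plain SSH channel.
  # Note: this provisioner itself still runs under the old sudoers
  # settings, so it may need the PTY one last time.
  config.vm.provision "shell",
    inline: "sed -i '/^Defaults[[:space:]]*requiretty/d' /etc/sudoers"

  # With requiretty gone, Vagrant no longer needs to request a PTY.
  config.ssh.pty = false
end
```

Baking the sudoers change into the base box instead would avoid the chicken-and-egg problem entirely.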

liveder commented 7 years ago

Some additional information on this issue.

Reproducible every time with:

macOS: 10.12 (Sierra)
Vagrant: 1.8.5
VirtualBox: 5.0.28

DEBUG guest: Found cap: change_host_name in redhat
 INFO guest: Execute capability: change_host_name #<Vagrant::Machine: puppet (VagrantPlugins::ProviderVirtualBox::Provider)>, "puppet.sandbox.local"
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute: hostname -f | grep '^puppet.sandbox.local$' (sudo=false)
DEBUG ssh: stdout: puppet.sandbox.local
DEBUG ssh: Exit status: 0
DEBUG ssh: Uploading: /var/folders/fg/xwzzp0s95dbcbk0_f0r7lx0c0000gn/T/tmp-hosts20161028-6240-h0iyjm to /tmp/hosts
DEBUG ssh: Re-using SSH connection.
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute: cat /tmp/hosts > /etc/hosts (sudo=true)
DEBUG ssh: pty obtained for connection
DEBUG ssh: stdout: export TERM=vt100
stty raw -echo
export PS1=
export PS2=
export PROMPT_COMMAND=
printf bccbb768c119429488cfd109aacea6b5-pty
cat /tmp/hosts > /etc/hosts
exitcode=$?
printf bccbb768c119429488cfd109aacea6b5-pty
exit $exitcode
DEBUG ssh: stdout:
DEBUG ssh: stdout: [root@puppet ~]#
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...
DEBUG ssh: Sending SSH keep-alive...

Vagrantfile:

  if Vagrant.has_plugin?('vagrant-hosts')
    node_config.vm.provision :hosts do |hosts|
      hosts.add_localhost_hostnames = false
      puppet_nodes.each do |node|
        hosts.add_host node[:ip], ["#{node[:hostname]}.#{domain}", node[:hostname]]
      end
    end
  else
    puts 'ERROR: `vagrant-hosts` plugin for Vagrant is not installed. Read README.md for more details.'
    exit 1
  end

LuboVarga commented 5 years ago

The problem is still present. Here is some more information.

I ran vagrant up vmjb vmjava (bringing up two non-existent/destroyed machines at once).

After vmjb had been provisioned and was up and running, vmjava began to start up. It froze when (judging by the console output) vmjava tried to update vmjb:/etc/hosts.

While vagrant up is frozen, the console shows:

Unmounting Virtualbox Guest Additions ISO from: /mnt
Got different reports about installed GuestAdditions version:
Virtualbox on your host claims:   4.3.4
VBoxService inside the vm claims: 5.2.6
Going on, assuming VBoxService is correct...
Got different reports about installed GuestAdditions version:
Virtualbox on your host claims:   4.3.4
VBoxService inside the vm claims: 5.2.6
Going on, assuming VBoxService is correct...
Got different reports about installed GuestAdditions version:
Virtualbox on your host claims:   4.3.4
VBoxService inside the vm claims: 5.2.6
Going on, assuming VBoxService is correct...
Restarting VM to apply changes...
==> vmjava: Attempting graceful shutdown of VM...
==> vmjava: Booting VM...
==> vmjava: Waiting for machine to boot. This may take a few minutes...
    vmjava: SSH address: 127.0.0.1:2201
    vmjava: SSH username: root
    vmjava: SSH auth method: password
==> vmjava: Machine booted and ready!
==> vmjava: Checking for guest additions in VM...
==> vmjava: Setting hostname...
==> vmjava: Configuring and enabling network interfaces...
    vmjava: SSH address: 127.0.0.1:2201
    vmjava: SSH username: root
    vmjava: SSH auth method: password
==> vmjava: Mounting shared folders...
    vmjava: /vagrant => /home/luvar/work/nike-salt/vagrant
    vmjava: /srv/work => /home/luvar/work
    vmjava: /srv/salt => /home/luvar/work/nike-salt/salt
    vmjava: /srv/pillar => /home/luvar/work/nike-salt/pillar
==> vmjava: Running provisioner: hosts...
==> vmjb: Updating hosts on: vmjb

On the vmjb machine, /tmp/hosts is present but has not been swapped in yet:

➜  vagrant git:(master) ✗ vagrant ssh vmjb
Last login: Thu Nov 15 09:20:38 2018 from 10.0.2.2
[root@vmjb vagrant]# cat /tmp/hosts 
127.0.0.1 localhost
127.0.1.1 vmjb
192.168.100.66 vmjava
192.168.100.70 vmtel1.vagrant.nike.sk vmtel1
192.168.100.67 vmjb
[root@vmjb vagrant]# cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 vmjb
192.168.100.70 vmtel1.vagrant.nike.sk vmtel1
192.168.100.67 vmjb

On vmjava (the machine whose provisioning is currently frozen in the other console), /etc/hosts looks fine (though I would suggest deleting /tmp/hosts after a successful copy):

➜  vagrant git:(master) ✗ vagrant ssh vmjava
Last login: Thu Nov 15 03:20:36 2018 from 10.0.2.2
[root@vmjava ~]# cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 vmjava
192.168.100.66 vmjava
192.168.100.70 vmtel1.vagrant.nike.sk vmtel1
192.168.100.67 vmjb
[root@vmjava ~]# cat /tmp/hosts
127.0.0.1 localhost
127.0.1.1 vmjava
192.168.100.66 vmjava
192.168.100.70 vmtel1.vagrant.nike.sk vmtel1
192.168.100.67 vmjb
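The cleanup suggested above (removing /tmp/hosts once it has been copied into place) could be sketched as an extra shell provisioner in the Vagrantfile; this is hypothetical, not something vagrant-hosts does itself:

```ruby
# Hypothetical cleanup step: once the hosts provisioner has copied
# /tmp/hosts into /etc/hosts, remove the staged file so stale copies
# don't linger on the guests. Vagrant runs provisioners in declaration
# order, so declare this after the :hosts provisioner.
config.vm.provision "shell", inline: "rm -f /tmp/hosts"
```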