hashicorp / vagrant

Vagrant is a tool for building and distributing development environments.
https://www.vagrantup.com

bridged network failing #921

Closed jperry closed 11 years ago

jperry commented 12 years ago

Hi,

My vagrant box is having issues recently trying to connect through a bridged network. This is the output:

[centos] Matching MAC address for NAT networking...
[centos] Clearing any previously set forwarded ports...
[centos] Forwarding ports...
[centos] -- 22 => 2222 (adapter 1)
[centos] Creating shared folders metadata...
[centos] Clearing any previously set network interfaces...
[centos] Available bridged network interfaces:
1) en0: Ethernet 2
2) en1: AirPort
What interface should the network bridge to? 1
[centos] Preparing network interfaces based on configuration...
[centos] Booting VM...
[centos] Waiting for VM to boot. This can take a few minutes.
[centos] VM booted and ready for use!
[centos] Configuring and enabling network interfaces...
rake aborted!
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifup eth1 2> /dev/null

Tasks: TOP => test => vagrant:provision => vagrant:up
(See full trace by running task with --trace)
twirrim commented 12 years ago

I've been hitting the same problem with CentOS. The vagrant box was produced using the veewee centos 6.2 template, and actually works fine. The interface is up and running and bridged, but Vagrant seems a little confused about it.

I've put a copy of the box up here: http://paulgraydon.co.uk/master.box. The relevant Vagrantfile section:

  config.vm.define :master do |master|
    master.vm.box = "Centos6"
    master.vm.network :bridged
  end

Running that command manually without redirect:

[root@master ~]# ifup eth1

Determining IP information for eth1...dhclient(1250) is already running - exiting. 

This version of ISC DHCP is based on the release available
on ftp.isc.org.  Features have been added and other changes
have been made to the base software release in order to make
it work better with this distribution.

Please report for this software via the Red Hat Bugzilla site:
    http://bugzilla.redhat.com

exiting.
 failed.
nickpresta commented 12 years ago

I also sometimes hit the same problem running CentOS 6.0 x86_64 using the Veewee template.

gregburek commented 12 years ago

I'm getting the same on RHEL 6.2 and CentOS 6.2 guests, and I believe I've figured out why: on RHEL-like distros, ifup fails if the interface is already up and has dhclient listening on it.

@twirrim: I booted your box with no eth1 and saw a /etc/sysconfig/network-scripts/ifcfg-eth1 of:

#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO=dhcp
ONBOOT=yes
DEVICE=eth1
#VAGRANT-END

This means that before vagrant comes in to change out that file, dhclient has already requested an address and an ifup command will fail.

My RHEL 6.2 box's original /etc/sysconfig/network-scripts/ifcfg-eth1 doesn't exist. The first boot with an additional nic works fine. But on a reload, the VM boots with dhcp configured and dhclient will not allow for a clean ifup:

$ vagrant up
[node1] Importing base box 'rhel62'...
[node1] Matching MAC address for NAT networking...
[node1] Clearing any previously set forwarded ports...
[node1] Forwarding ports...
[node1] -- 22 => 2222 (adapter 1)
[node1] Creating shared folders metadata...
[node1] Clearing any previously set network interfaces...
[node1] Preparing network interfaces based on configuration...
[node1] Booting VM...
[node1] Waiting for VM to boot. This can take a few minutes.
[node1] VM booted and ready for use!
[node1] Configuring and enabling network interfaces...
[node1] Mounting shared folders...
[node1] -- v-root: /vagrant
$ vagrant reload
[node1] Attempting graceful shutdown of VM...
[node1] Clearing any previously set forwarded ports...
[node1] Forwarding ports...
[node1] -- 22 => 2222 (adapter 1)
[node1] Creating shared folders metadata...
[node1] Clearing any previously set network interfaces...
[node1] Preparing network interfaces based on configuration...
[node1] Booting VM...
[node1] Waiting for VM to boot. This can take a few minutes.
[node1] VM booted and ready for use!
[node1] Configuring and enabling network interfaces...
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifup eth1 2> /dev/null

Looking at ./plugins/guests/redhat/guest.rb it seems that there is an ifdown command before the ifup, but the network interface seems to stay up. I'll try running vm.channel.sudo("/sbin/ifdown eth#{interface} 2> /dev/null", :error_check => true) to see if it succeeds.

dcrosta commented 12 years ago

I can confirm this behavior on a (different) CentOS 6.2 box, also built from the veewee template. I've confirmed as well that setting ONBOOT=no in /etc/sysconfig/network-scripts/ifcfg-eth1 allows the halt/up cycle to succeed. This is just a work-around, obviously, not a solution to the root problem.

blalor commented 11 years ago

I'm also having this problem with config.vm.network :hostonly, :dhcp and a CentOS 6.3 box I built myself with Veewee.

leifmadsen commented 11 years ago

Yep same issue here. Makes working with :bridged interfaces frustrating.

jphalip commented 11 years ago

This is probably related to #997 (maybe a duplicate). FWIW, the error went away for me after restarting VirtualBox...

mitchellh commented 11 years ago

Has anyone made progress on figuring out a fix for this? It seems that the RedHat guest is "broken" when configuring network interfaces. I'm not a big RedHat or CentOS user, so I'm unsure where the root issue is and would love community help here.

blalor commented 11 years ago

I think the key is in @gregburek's comment above: https://github.com/mitchellh/vagrant/issues/921#issuecomment-7750018

ellisio commented 11 years ago

It appears if you get this error, doing the following works:

vagrant ssh
sudo su
vi /etc/sysconfig/network-scripts/ifcfg-eth1

Change ONBOOT=yes to ONBOOT=no then run vagrant reload.

I literally just did this and it worked.
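
For anyone who would rather not edit the file by hand in vi, here is a minimal sketch of the same ONBOOT change done in Ruby. The helper name disable_onboot is made up for illustration and is not part of Vagrant; the sample text is the ifcfg-eth1 content quoted earlier in this thread.

```ruby
# Hypothetical helper (not part of Vagrant): return the ifcfg text with
# ONBOOT forced to "no". All other lines are left untouched.
def disable_onboot(ifcfg_text)
  ifcfg_text.gsub(/^ONBOOT=.*$/, "ONBOOT=no")
end

# Sample content, copied from the ifcfg-eth1 shown earlier in the thread.
sample = <<~IFCFG
  #VAGRANT-BEGIN
  # The contents below are automatically generated by Vagrant. Do not modify.
  BOOTPROTO=dhcp
  ONBOOT=yes
  DEVICE=eth1
  #VAGRANT-END
IFCFG

updated = disable_onboot(sample)
puts updated
```

On a real guest you would read and write /etc/sysconfig/network-scripts/ifcfg-eth1 as root; that part is left out here so the sketch runs anywhere.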

I also have this in my Vagrantfile: config.vm.provision :shell, :path => "developer/networking.sh"

networking.sh:

#!/bin/bash
rm -f /etc/udev/rules.d/70-persistent-net.rules
rm -f /etc/sysconfig/network-scripts/ifcfg-eth1
/etc/init.d/network restart

This is for CentOS 6.4 (Final).

chilicat commented 11 years ago

My solution was simply to ignore the exit code of the ifup command. The advantage is that if a user reboots the machine (without Vagrant), the network connection will be re-established.

vm.communicate.sudo("/sbin/ifup eth#{interface} 2> /dev/null", :error_check => false)
leifmadsen commented 11 years ago

@chilicat I don't quite follow how you're using that, as 'communicate' seems to be an undefined method.

alexandrem commented 11 years ago

I have this issue with Centos63, Vagrant 1.1.4, VirtualBox 4.2.10 on OSX Lion.

I am unable to have the private_network working on my guest.

The real problem is that the networking isn't being set up correctly on the guest system. Even if you try to ignore the error, by adding the line:

vm.communicate.sudo("/sbin/ifup eth#{interface} 2> /dev/null", :error_check => false)

to the Vagrant code (plugins/guests/redhat/guest.rb), the network bridge isn't brought up, so your VM isn't reachable by the IP that you try to assign it.

I tried with latest Vagrant 1.2 and it was even worse, I had some "network_scripts_dir" capability error for the redhat guest. I saw that there was a huge refactoring in the network code for all guests, maybe something is broken there, or then it's my system.

Anyway, a real fix would be greatly appreciated!

mitchellh commented 11 years ago

Alright, so the main issue I'm confused about is that RedHat guest does an ifdown prior to the ifup. Why is it failing even though the prior interface went down? Sorry for the somewhat simple questions, I don't have an easily accessible RH box lying around.

ellisio commented 11 years ago

My guess (off the top of my head) is that there is something in the network initialization section that reverts the bridged connection on boot, CentOS boots, then Vagrant tries to replace the network configuration and restarts networking. In that time CentOS freaks out and doesn't know where to get its networking information from.

Using the steps I posted above seems to resolve the issue (6.4 w/ Vagrant 1.1.4).

mitchellh commented 11 years ago

Your approach does work, but I'm afraid of changing things to ONBOOT=no by default for people who might reboot their machines outside of Vagrant and lose their networks... I'd love a solution that didn't make that compromise.

ellisio commented 11 years ago

I can test when I get into the office tomorrow, but I'm pretty sure that change still holds when running "shutdown -(r,h) now".

The reason we need this is because we have a cluster of VMs for each dev to simulate load balancing and MySQL replication so we need IPs to stick after the VMs come online. (Curse you Percona!)


blalor commented 11 years ago

Steps to reproduce:

  1. Use this Vagrantfile:
Vagrant.configure("2") do |config|
  config.vm.box = "CentOS-6.3-x86_64-reallyminimal"
  config.vm.box_url = "https://s3.amazonaws.com/1412126a-vagrant/CentOS-6.3-x86_64-reallyminimal.box"
  config.vm.network :private_network, type: :dhcp
end
  2. vagrant up
  3. vagrant ssh
  4. sudo halt
  5. vagrant up fails; result:
Bringing machine 'default' up with 'virtualbox' provider...
[default] Setting the name of the VM...
[default] Clearing any previously set forwarded ports...
[default] Creating shared folders metadata...
[default] Clearing any previously set network interfaces...
[default] Preparing network interfaces based on configuration...
[default] Forwarding ports...
[default] -- 22 => 2222 (adapter 1)
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.
[default] VM booted and ready for use!
[default] Configuring and enabling network interfaces...
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifup eth1 2> /dev/null
mitchellh commented 11 years ago

FIXED! It ended up being a race condition with booting. So I was able to keep the ONBOOT, and just fixed this by adding a retry loop around it. Heh.

78d4d0a7902b6eb20b4353bb2b158f00e6d36a8a
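
For readers following along, here is a minimal sketch of what a retry loop around the ifup step could look like. This is not the code from the commit above. The with_retries helper, its parameters, and the simulated failure are assumptions made purely for illustration.

```ruby
# Hypothetical helper, not Vagrant's actual implementation:
# run the block and retry up to `tries` times if it raises.
def with_retries(tries: 3, sleep_seconds: 0)
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    # Give up and re-raise the last error once the attempts are used up.
    raise if attempt >= tries
    sleep(sleep_seconds)
    retry
  end
end

# Simulate an ifup that fails twice (interface not ready during boot)
# and succeeds on the third attempt.
calls = 0
result = with_retries(tries: 3) do |attempt|
  calls += 1
  raise "ifup eth1 exited non-zero" if attempt < 3
  :up
end

puts "interface state: #{result} after #{calls} attempts"
```

The point is only that a command which loses a race during boot gets another chance, so ONBOOT can stay enabled.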

blalor commented 11 years ago

Woo hoo! Thanks!

ellisio commented 11 years ago

Sweeeeeet!!


leifmadsen commented 11 years ago

:+1:

xorti commented 11 years ago

Awesome! Today I've run into that issue, glad that it's now fixed!

chiefy commented 11 years ago

Cheetos.

ellisio commented 11 years ago

Any ETA on 1.1.6 being released? I'm now seeing this behavior with 1.1.5 on Ubuntu as well...

chiefy commented 11 years ago

@awellis13 I just installed the gem from source and it seems to work well, you can do that until an official release is made.

ellisio commented 11 years ago

Eh, I'll do that for my personal dev. I need an official release for work so we can go through our Nazi process of pushing new software to our developers. haha


ellisio commented 11 years ago

@chiefy How did you install the 1.1.x via gem? gem install vagrant installs 1.0.7. I even ran gem update --system to make sure I'm on the latest RubyGems 2.0.3 version, still 1.0.7 comes down...

chiefy commented 11 years ago

I'd be sure to uninstall whatever version you have currently, then...

gem install specific_install
gem specific_install -l https://github.com/mitchellh/vagrant.git
STLMikey commented 11 years ago

I am still encountering this error with host-only networking, vagrant version 1.2.2

owain68 commented 11 years ago

Same issue here:

virtual box 4.2.12 vagrant 1.2.2 and this box http://puppet-vagrant-boxes.puppetlabs.com/centos-64-x64-vbox4210.box

and the following vagrant file bits

  config.vm.box = "puppetlabs"
  config.vm.guest = :linux
  config.vm.synced_folder "..", "/project"

I appear to be going around in circles on this.

bjwschaap commented 11 years ago

I can also reproduce. Using VirtualBox 4.2.12, Vagrant 1.2.2 (from deb package) and this box http://puppet-vagrant-boxes.puppetlabs.com/centos-64-x64-vbox4210-nocm.box which gets new guest tools using vagrant-vbguest plugin.

My Vagrantfile:

Vagrant.configure("2") do |config|
   config.vm.box = "centos-64-minimal"
   config.vm.guest = :linux
   config.vm.hostname = "puppet"
   config.vm.network :private_network, ip: "10.10.10.1"  
   config.vm.synced_folder "transfer", "/tmp/transfer"
   config.vm.provider :virtualbox do |vb|
      vb.gui = true
      vb.customize ["modifyvm", :id, "--memory", 1024]
   end
end

I can't seem to get rid of this..

owain68 commented 11 years ago

@bjwschaap I think I have worked around it. Try this. Go into Virtualbox directly, delete all the files associated with the VM and then re-import the vagrant box. It seems to work after that until you do another vagrant box add with the same name. Give it a try.

mmohiudd commented 11 years ago

@bjwschaap try with an ip: 10.10.10.2, I was having the same issue with CentOS 6.3. Initially I had

config.vm.network :private_network, ip: "192.168.100.1" 

But changed the IP to '192.168.100.101' and it worked! Also you may try with auto_config => false. This will not configure the network. You would need to configure your network manually. You can use provision for this - probably shell, haven't tried with puppet or chef yet.

I have:

Vagrant 1.2.2
VirtualBox 4.2.12
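
A Vagrantfile fragment showing the auto_config option mentioned above, as a sketch only. The script name networking.sh is a placeholder; with auto_config off, Vagrant leaves the guest's network configuration alone, so that script would have to do it.

```ruby
Vagrant.configure("2") do |config|
  # Attach the host-only adapter but skip Vagrant's guest-side configuration.
  config.vm.network :private_network, ip: "192.168.100.101", auto_config: false

  # Placeholder script: it must configure eth1 inside the guest itself.
  config.vm.provision :shell, path: "networking.sh"
end
```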
gsvicky commented 11 years ago

Same issue here as well. Virtual box 4.2.12, Vagrant 1.2.2, and box is https://dl.dropbox.com/u/7225008/Vagrant/CentOS-6.3-x86_64-minimal.box

Vagrant File:

Vagrant.configure("2") do |config|
   config.vm.define :node1 do |node1|
    node1.vm.box = "centosbox"
    node1.vm.provision :shell, :path => "bootstrap-node.sh"
    node1.vm.network :public_network
    node1.vm.hostname = "node1"
  end
end

Running vagrant up node1 gives the following error:

[node1] Configuring and enabling network interfaces...
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifup eth1 2> /dev/null

Tried all the solutions listed above without results. Any help to get around this would be greatly appreciated.

garlandkr commented 11 years ago

OSX 10.8.3 Vagrant 1.2.2 VirtualBox 4.2.12

Vagrant.configure("2") do |config|
  config.vm.define :myvm do |myvm_conf|
    myvm_conf.vm.box = "myvm"
    myvm_conf.vm.network :private_network, type: "
    myvm_conf.vm.network :forwarded_port, guest:80, host:8080
    myvm_conf.vm.box_url = "my.box"
    myvm_conf.vm.synced_folder ".", "/opt/blah"
    myvm_conf.vm.provider :virtualbox do |vb|
      vb.customize ["modifyvm", :id, "--memory", "1024"]
      vb.customize ["modifyvm", :id, "--cpus", "2"]
      vb.customize ["modifyvm", :id, "--name", "VM"]
      vb.customize ["setextradata", :id, "VBoxInternal2/SharedFoldersEnableSymlinksCreate/v-root", "1"]
    end
  end
end
michael-harrison commented 11 years ago

I've experienced this problem too. Following is how I ended up getting to it:

  1. I created a box using veewee with the centos-64-x64-vbox4210-nocm template provided by PuppetLabs (see https://github.com/puppetlabs/puppet-vagrant-boxes).
  2. Based on the created VM I packaged it using the following:

    vagrant package --base centos-64-x64-vbox4210-nocm --output centos-64-x64-vbox4210-nocm.box

  3. Once I had the packaged VM I used it to create a new box for my own setup of Puppet called puppet_metal using a number of bash scripts to install only the necessary items. This box was to be used as a base for a number of different machines so once it was created I packaged the resulting VM.
  4. I used puppet_metal to make the first new machine with the following config:

The new box

Vagrant.configure('2') do |config|
  config.vm.box = 'puppet_metal'
  config.vm.box_url = 'https://s3-ap-southeast-2.amazonaws.com/ntech-boxes/puppet_metal.box'
  config.vm.network :forwarded_port, guest: 22, host: 8022
  config.vm.network :private_network, ip: '10.202.202.10'
  config.vm.hostname = 'thebox.example.com'
end

At this point I started to experience the error:

[default] Configuring and enabling network interfaces...
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifup eth1 2> /dev/null

I found it odd that everything had worked fine up until this point, so I did some digging and found that eth1 was not being brought up. I had a look at its config file and found the following:

/etc/sysconfig/network-scripts/ifcfg-eth1

#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO=none
IPADDR=10.202.202.10
NETMASK=255.255.255.0
DEVICE=eth1
PEERDNS=no
#VAGRANT-END
#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO=none
IPADDR=10.202.202.10
NETMASK=255.255.255.0
DEVICE=eth1
PEERDNS=no
#VAGRANT-END
#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO=none
IPADDR=10.202.202.10
NETMASK=255.255.255.0
DEVICE=eth1
PEERDNS=no
#VAGRANT-END

For some reason the configuration had been repeated 3 times. I'm not sure of the significance of this but hope it helps in identifying the problem. Following are the details of my setup:

OSX 10.8.3 Vagrant 1.2.2 VirtualBox 4.2.12

jprosevear commented 11 years ago

I get the repeated entries as well; they are caused by the 3 retries in the exception handler. The actual command failing is something like:

/sbin/arping -c 2 -w 3 -D -I eth1 10.10.0.1

in /etc/sysconfig/network-scripts/ifup-eth. Output if run manually is:

ARPING 10.10.0.1 from 0.0.0.0 eth1
Unicast reply from 10.10.0.1 [0A:00:27:00:00:00] 1.285ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)

This is the same output you get on precise64, except arping isn't installed there by default, so it's not cutting short the ifup process.

The MAC that claims the address is my host adapter (OSX 10.8.4); clipped output from ifconfig:

vboxnet0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        ether 0a:00:27:00:00:00
        inet 10.10.0.1 netmask 0xffffff00 broadcast 10.10.0.255
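
On the repeated entries: one way to keep retries from stacking up copies is to strip any existing Vagrant-marked section before appending the new one. The sketch below is only an illustration of that idea, not the code Vagrant uses. The helper name replace_vagrant_section is made up; the markers are the ones visible in the ifcfg-eth1 listing above.

```ruby
# Markers as they appear in the generated ifcfg-eth1 shown in this thread.
BEGIN_MARK = "#VAGRANT-BEGIN"
END_MARK   = "#VAGRANT-END"

# Hypothetical helper: drop every marked section from `existing`, then
# append exactly one copy of `entry`. Calling it repeatedly is idempotent.
def replace_vagrant_section(existing, entry)
  section = /^#{Regexp.escape(BEGIN_MARK)}\n.*?^#{Regexp.escape(END_MARK)}\n?/m
  existing.gsub(section, "") + entry
end

entry = <<~IFCFG
  #VAGRANT-BEGIN
  # The contents below are automatically generated by Vagrant. Do not modify.
  BOOTPROTO=none
  IPADDR=10.202.202.10
  NETMASK=255.255.255.0
  DEVICE=eth1
  PEERDNS=no
  #VAGRANT-END
IFCFG

# Simulate the file after three retries each appended a copy.
tripled = entry * 3
cleaned = replace_vagrant_section(tripled, entry)
puts cleaned
```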

jprosevear commented 11 years ago

Setting ARPCHECK="no" as an environment variable when doing the ifup gets rid of the error.

jprosevear commented 11 years ago

I posted https://github.com/mitchellh/vagrant/pull/1815 for this.

michael-harrison commented 11 years ago

I've given this a test on my local using 1.2.3.dev but I'm still getting an error :(

[default] Configuring and enabling network interfaces...
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

ARPCHECK=no /sbin/ifup eth1 2> /dev/null

I've tried it with both private and public networks:

config.vm.network :public_network
config.vm.network :private_network, ip: '10.202.202.10'
jprosevear commented 11 years ago

Turn on vagrant logging and see what the exact error coming from the ifup command is (or ssh into the vagrant box and run the command manually with sudo). The error I posted the fix for should look something like:

[vagrant@localhost ~]$ sudo /sbin/ifup eth1
Error, some other host already uses address 10.10.0.1.

michael-harrison commented 11 years ago

@jprosevear I've turned on the logging and created a gist for your review: https://gist.github.com/michael-harrison/5746092

Following is the Vagrant file in full:

Vagrant.configure('2') do |config|
  config.vm.box = 'puppet_metal'
  config.vm.box_url = 'https://s3-ap-southeast-2.amazonaws.com/ntech-boxes/puppet_metal.box'
#  config.vm.network :forwarded_port, guest: 22, host: 8022
  config.vm.network :private_network, ip: '10.202.202.11'
  config.vm.hostname = 'duster.sp11.ntechhosting.com'

  config.vm.provision :puppet do |puppet|
    puppet.module_path = 'modules'
    puppet.options = '--verbose --debug'
  end

  config.vm.provider :virtualbox do |vb|
    vb.customize ["modifyvm", :id, "--memory", "2048"]
  end
end
jprosevear commented 11 years ago

Need VAGRANT_LOG=DEBUG

michael-harrison commented 11 years ago

@jprosevear sorry, the gist has been updated with the debug log. I did note the following error:

INFO ssh: Execute: /sbin/ifdown eth1 2> /dev/null (sudo=true)
DEBUG ssh: stdout: ERROR    : [ipv6_test_device_status] Missing parameter 'device' (arg 1)
DEBUG ssh: Exit status: 0
michael-harrison commented 11 years ago

@jprosevear Quick FYI for the interim I've commented out the network configuration to continue the work that I'm doing.

jprosevear commented 11 years ago

You have a different error than I fixed: DEBUG ssh: stdout: Device eth1 does not seem to be present, delaying initialization.

michael-harrison commented 11 years ago

@jprosevear I agree based on the debug logs. Do you think it's worthwhile raising another issue?

michael-harrison commented 11 years ago

@Aigeruth The environment variable is already being set in master (which I did my testing with), but it made no real difference to what I'm seeing. Based on my logging, the error isn't happening on the ifup, it's on the ifdown (https://gist.github.com/michael-harrison/5746092):

7178 INFO ssh: Execute: /sbin/ifdown eth1 2> /dev/null (sudo=true)
7179 DEBUG ssh: stdout: ERROR : [ipv6_test_device_status] Missing parameter 'device' (arg 1)

ellisio commented 11 years ago

Call it a hunch, but if you're getting a device-not-found error on ifdown, I suspect the base box you're using was not built correctly.


— Reply to this email directly or view it on GitHub.