hashicorp / vagrant

Vagrant is a tool for building and distributing development environments.
https://www.vagrantup.com

Slow networking (due to IPv6?) on CentOS 6.x #1172

Closed dcrosta closed 11 years ago

dcrosta commented 11 years ago

I've been struggling with this issue for a while, and I think I've finally narrowed it down to something Vagrant is doing (though I am not sure what). Here's the situation:

  1. I create a CentOS 6.3 box by way of manually installing the CentOS 6.3 minimal ISO.
  2. I configure the box until it passes Veewee validation.
  3. I disable IPv6.

    This deserves a little explanation: I've found that in VirtualBox (on my machines at least), a CentOS guest with IPv6 enabled has approximately a 5 second delay while it attempts to make network calls. Presumably there is a 5 second timeout trying to use an IPv6 address, after which it falls back to the IPv4 address.

    In any event, disabling IPv6 makes using the box much more pleasant than with it enabled, and nothing I am doing requires IPv6.

  4. I package the box using vagrant basebox export centos-6.3-x86_64 and add it to Vagrant with vagrant box add centos-6.3 centos-6.3-x86_64.box
  5. I vagrant init and vagrant up
  6. Once inside the VM, I run time curl http://www.google.com/. Inside the Vagrant-launched box, I get a time just over 5 seconds, strongly suggesting the same behavior I attempted to disable in step 3.
  7. If I manually import the ~/.vagrant.d/centos-6.3-x86_64/box.ovf file into VirtualBox and run it through the VirtualBox GUI, I do not experience the delay described in step 6.

I don't have any reason to believe that IPv6 is being re-enabled by Virtualbox -- the changes made in step 3 are present in the Vagrant-launched VM -- but the fact that the delay is the same suggests that at least something similar, if not identical, is going on.

I'm a little bit at a loss as to where to look next. Does anyone have any experience solving or working around this issue, or have any suggestions on where to look next?
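
For anyone trying to reproduce the measurement in step 6, here is a minimal sketch of a timing helper. The function name elapsed_seconds is made up for illustration and is not part of Vagrant or CentOS; it only wraps date and reports whole seconds, which is enough to tell a ~5 second stall from a normal lookup.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): run a command, discard its
# output, and print how many whole seconds it took.
elapsed_seconds() {
  local start end
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $(( end - start ))
}

# Example calls to try inside the guest (host name is only an illustration):
#   elapsed_seconds getent ahosts www.google.com   # name lookup only
#   elapsed_seconds curl -s http://www.google.com/ # lookup plus HTTP fetch
```

Comparing the getent result with the curl result should help separate a slow name lookup from a slow connection.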

dcrosta commented 11 years ago

@miketheman points out this could also be related to this stackoverflow question -- indeed, if I manually update /etc/resolv.conf to use my DNS server as configured on the host, then the 5 second delay vanishes.

cnf commented 11 years ago

try adding options single-request-reopen to your /etc/resolv.conf

Red Hat's resolver sends both the IPv4 and IPv6 lookup queries over the same socket, and the IPv6 one can take some time to time out. The above option makes your system close that socket and open a new one before sending the second request.
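
A minimal sketch of applying this suggestion by hand, assuming a shell on the guest with root access. The function name add_single_request_reopen is hypothetical; it only appends the option when it is not already there, so running it twice is harmless.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "options single-request-reopen" to a
# resolv.conf-style file unless the option is already present.
# The file defaults to /etc/resolv.conf; pass another path to experiment.
add_single_request_reopen() {
  local file=${1:-/etc/resolv.conf}
  if ! grep -q 'single-request-reopen' "$file" 2>/dev/null; then
    echo 'options single-request-reopen' >> "$file"
  fi
}

# Example (as root inside the guest):
#   add_single_request_reopen
```

Note that a direct edit like this may be overwritten the next time the DHCP client rewrites /etc/resolv.conf, so it is mostly useful for confirming that the option removes the delay.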

dcrosta commented 11 years ago

Thanks, that helped.

I also tried disabling Virtualbox's DNS proxy with

config.vm.customize ["modifyvm", :id, "--natdnsproxy1", "off"]
config.vm.customize ["modifyvm", :id, "--natdnshostresolver1", "off"]

which also resolved the issue. /etc/resolv.conf now has the two DNS servers configured on my host machine directly in it.

Anyone have any idea which of these solutions, if any, are most appropriate? I'll be happy to provide a patch, but I'd appreciate some guidance on where and how to do it.
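
The same two settings can presumably be applied from the host shell with VBoxManage instead of the Vagrantfile. The sketch below only prints the commands so they can be reviewed first; the function name natdns_off_commands and the VM name in the example are assumptions for illustration, and VBoxManage modifyvm expects the VM to be powered off.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the VBoxManage commands that turn off
# the NAT DNS proxy and the NAT DNS host resolver for one adapter of a VM.
# $1 = VM name or UUID, $2 = NAT adapter number (defaults to 1).
natdns_off_commands() {
  local vm=$1 adapter=${2:-1}
  printf 'VBoxManage modifyvm "%s" --natdnsproxy%s off\n' "$vm" "$adapter"
  printf 'VBoxManage modifyvm "%s" --natdnshostresolver%s off\n' "$vm" "$adapter"
}

# Example (VM name is made up):
#   natdns_off_commands my-centos-vm
```

Printing rather than executing keeps the sketch safe to run anywhere; pipe the output to sh once the VM name looks right.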

blalor commented 11 years ago

There is an option in one of the config files on the guest (not sure if it's dhcpd's conf or something in /etc/sysconfig) that will add this option to /etc/resolv.conf in such a way that it isn't overwritten when an IP address is obtained. That, coupled with #1313, should give us enough flexibility to work around these networking issues.

mstyne commented 11 years ago

@dcrosta Thanks for your notes on this issue, it seems to have done the trick for me.

jamesmoriarty commented 11 years ago

You're my heroes @dcrosta and @cnf. Thanks.

mitchellh commented 11 years ago

Because this issue has more to do with how boxes are made than with anything Vagrant can really do itself, I'm going to close this.

jamesmoriarty commented 11 years ago

Just for the record, you can do something like this in your Vagrantfile to fix it:

config.vm.provision :shell, inline: "if ! grep -q single-request-reopen /etc/resolv.conf; then echo 'options single-request-reopen' >> /etc/resolv.conf && service network restart; fi"

driskell commented 10 years ago

Hi all,

I'm late to the party, but for me the best way seems to be to pop this into the default ifcfg-eth0:

RES_OPTIONS=single-request-reopen

dhclient-script takes this into account and places "options single-request-reopen" in resolv.conf for you.
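
A minimal sketch of adding that line from a provisioning script, assuming the usual /etc/sysconfig/network-scripts/ifcfg-eth0 location. The function name set_res_options is hypothetical; it leaves the file alone if any RES_OPTIONS line already exists rather than guessing how to merge values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add RES_OPTIONS=single-request-reopen to a
# sysconfig-style file, unless a RES_OPTIONS line is already present.
set_res_options() {
  local file=${1:-/etc/sysconfig/network-scripts/ifcfg-eth0}
  if grep -q '^RES_OPTIONS=' "$file" 2>/dev/null; then
    return 0  # already configured; keep the existing value
  fi
  echo 'RES_OPTIONS=single-request-reopen' >> "$file"
}

# Example (as root inside the guest), followed by a network restart:
#   set_res_options && service network restart
```

Passing /etc/sysconfig/network as the argument would apply the same change at the global level instead of per interface.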

rjocoleman commented 10 years ago

@driskell It's better to put it into /etc/sysconfig/network so it will take effect on all interfaces regardless of name.

To summarise this problem - slow DNS resolution leading to timeouts when using Vagrant is an issue specific to VirtualBox with CentOS 6 or Fedora 19+ guests.

Here's how Chef's Bento boxes (provisionerless Vagrant boxes) handle this for CentOS 6 and Fedora. (As an aside, Chef's base boxes are quite nice and worth a look, and they don't actually include Chef.)

If you're using VirtualBox and must use a box that doesn't already have the appropriate workaround in place, you could adapt the techniques above for use in a Vagrant shell provisioner located above any other provisioners in your Vagrantfile so it runs first, e.g.

Centos:

config.vm.provision :shell, inline: "if [ ! $(grep single-request-reopen /etc/sysconfig/network) ]; then echo RES_OPTIONS=single-request-reopen >> /etc/sysconfig/network && service network restart; fi"

Fedora 19+:

$script = <<SCRIPT
# fix bug to enable nm-dispatcher on Fedora 19 only https://bugzilla.redhat.com/show_bug.cgi?id=974811
if [[ $(rpm -q --queryformat '%{VERSION}\n' fedora-release) == 19 ]]; then
  yum -y upgrade NetworkManager
  systemctl enable NetworkManager-dispatcher.service
fi

cat > /etc/NetworkManager/dispatcher.d/fix-slow-dns <<EOF
#!/bin/bash
echo "options single-request-reopen" >> /etc/resolv.conf
EOF
chmod +x /etc/NetworkManager/dispatcher.d/fix-slow-dns
service NetworkManager restart
SCRIPT

config.vm.provision 'shell', inline: $script

driskell commented 10 years ago

@rjocoleman Thanks! Might look at those.

Either ifcfg or network will work. Ifcfg is my preference though, as the option fixes an issue with DNS servers obtained over DHCP, and ifcfg is where DHCP is configured. If the interface were made static the issue would probably go away, so the option would become unnecessary. For a public box I can see the benefit though, as it ensures the option is on no matter how the end user configures the interface.

apple-corps commented 9 years ago

@rjocoleman your script broke the network on my vagrant box.

  config.vm.provision :shell, inline: "if [ ! $(grep single-request-reopen /etc/sysconfig/network) ]; then echo RES_OPTIONS=single-request-reopen >> /etc/sysconfig/network && service network restart; fi"

It looks like it started trying to resolve hosts via IPv6:

==> nn02: 14: PYCURL ERROR 7 - "Failed to connect to 2a02:2498:1:3d:5054:ff:fed3:e91a: Network is unreachable"

simono commented 9 years ago

Here's an Ansible example to fix this on Ubuntu/Debian. https://gist.github.com/simono/73e37d8c3a45664c7045

minorOffense commented 9 years ago

This may be a dumb question but does this still apply to CentOS 7? I mean the issue primarily, I realize the script fix might not work since it's a whole different OS version.

mconigliaro commented 9 years ago

@minorOffense yes, I encountered this in CentOS 7 too. I fixed it by disabling NetworkManager, then adding the following:

# /etc/sysconfig/network
RES_OPTIONS="single-request-reopen"
# /etc/sysconfig/network-scripts/ifcfg-*
NM_CONTROLLED=no
PEERDNS=yes
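
A minimal sketch of applying settings like these from a script, assuming plain KEY=value files. The helper name set_kv is made up for illustration; unlike a plain append, it replaces any existing assignment of the key so the file never ends up with two conflicting lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set KEY=value in a sysconfig-style file, replacing an
# existing assignment of the same key or appending one if none exists.
set_kv() {
  local file=$1 key=$2 value=$3 tmp
  tmp=$(mktemp)
  # Copy every line except an existing assignment of this key.
  grep -v "^${key}=" "$file" 2>/dev/null > "$tmp" || true
  # Append the new assignment, then write the result back in place.
  echo "${key}=${value}" >> "$tmp"
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example (as root inside the guest):
#   set_kv /etc/sysconfig/network RES_OPTIONS '"single-request-reopen"'
#   for f in /etc/sysconfig/network-scripts/ifcfg-*; do
#     set_kv "$f" NM_CONTROLLED no
#     set_kv "$f" PEERDNS yes
#   done
```

Writing back with cat rather than mv keeps the original file's ownership and permissions, which seems the safer choice for files under /etc.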

seeafish commented 9 years ago

If anyone needs to work around this issue in test-kitchen, the vbox settings are mapped like this:

---
driver:
  name: vagrant
  customize:
    natdnsproxy1: "off"
    natdnshostresolver1: "off"

filex commented 9 years ago

I have used the RES_OPTIONS-workaround on CentOS 7. But it stopped working with CentOS 7.1. (To be precise: The box was fine after initial creation, but the resolv.conf was broken after a vagrant reload).

This is due to a bug in RHEL/CentOS that I have filed today: http://bugs.centos.org/view.php?id=8490

apple-corps commented 9 years ago

CentOS 7 uses a systemd service to set the hostname; you might want to have a look at that. I don't know why they bother including /etc/sysconfig/network anymore. It's not populated by a manual install.