hashicorp / vagrant

Vagrant is a tool for building and distributing development environments.
https://www.vagrantup.com

Vagrant manipulating localhost entry in /etc/hosts #7263

Closed johnoooo closed 8 years ago

johnoooo commented 8 years ago

Hi all,

I just came across this recently. For many Unix guest systems, there are plugins that manipulate /etc/hosts when setting the machine's hostname. This is true at least for RHEL/CentOS, Fedora, Debian, Ubuntu.

There are two flavours (as I read it from the code):

  1. add the hostname to 127.0.0.1 as first entry
  2. add the hostname to 127.0.1.1

I doubt that this is a good idea at all, and I suggest reviewing this mechanism and possibly removing it or revising it significantly.

Binding a global DNS name to a loopback interface is a contradiction in itself. Someone with a single network interface behind a NAT box (the default setup) should use localhost to refer to the machine and doesn't need to set the VM hostname at all. In all other cases, the FQDN must not be assigned to the loopback interface.

Why is this so bad? Consider multiple machines in an isolated network, or a machine with a private (default) and a public network interface. Harm occurs when an application uses the hostname to determine the IP address it binds to. In that case, the application binds to the loopback address and is not reachable on the public network interface.
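To make the failure mode concrete, here is a minimal sketch of a first-match lookup over hosts-file text. The helper name `resolve_from_hosts`, the hostname `app.example` and both addresses are illustrative assumptions; real resolvers are more involved, but the first-match behaviour is what hides the public address.

```ruby
# Hypothetical helper: returns the IP of the first hosts-file line that lists `name`.
def resolve_from_hosts(hosts_text, name)
  hosts_text.each_line do |line|
    entry = line.sub(/#.*/, "").split   # drop comments, split on whitespace
    next if entry.empty?
    ip, *names = entry
    return ip if names.include?(name)
  end
  nil
end

# Illustrative hosts file: the loopback line comes first and also lists the FQDN.
hosts = <<~HOSTS
  127.0.0.1      app.example app localhost
  192.168.99.99  app.example app
HOSTS

puts resolve_from_hosts(hosts, "app.example")  # prints 127.0.0.1
```

A service that binds to whatever its own hostname resolves to would therefore listen on 127.0.0.1 only.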

johnoooo commented 8 years ago

This is a minimal example, done with Vagrant 1.8.1. It is for the case with a private and a public network interface, e.g., in a multi-machine setup

Vagrant.configure(2) do |config|
  config.vm.box = "centos/7"
  config.vm.network :public_network, ip: "192.168.99.99", netmask: "255.255.255.0", bridge: "eno1"
  config.vm.hostname = "test.mydomain"
end

This results in a /etc/hosts file of

127.0.0.1   test.mydomain test localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

andyfeller commented 8 years ago

I've gotta say the vagrant-1.8.1/plugins/guests/redhat/cap/change_host_name.rb has really wasted a lot of my time debugging vagrant vms all because of the FQDN being bound to localhost. As far as I understand it, there is only a small handful of cases where you want this scenario.

andyfeller commented 8 years ago

Found someone else who was having this issue and had a hack to deal with it, but we should see about addressing this in Vagrant itself.

johnoooo commented 8 years ago

It also took me an hour to track down why my services were not communicating.

Another pretty simple workaround is to completely rewrite the /etc/hosts file in one of your bootstrap scripts. The vagrant-modifications only happen once when provisioning the box. However, @afelle1, this doesn't solve your problem.

andyfeller commented 8 years ago

Yeah, I have a shell provisioner that rewrites the localhost line after the box has been provisioned but before the ansible provisioner runs. I just don't want to copy pasta this in every Vagrantfile because that's just fail sauce.
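A rough sketch of the "rewrite /etc/hosts from a provisioner" workaround follows. The helper `hosts_file` is hypothetical and the localhost aliases are an assumption; the hostname and address are taken from the minimal example earlier in the thread.

```ruby
# Hypothetical helper: builds a complete /etc/hosts in which the FQDN maps
# to the machine's routable address instead of a loopback address.
def hosts_file(fqdn, ip)
  short = fqdn.split(".", 2)[0]
  <<~HOSTS
    127.0.0.1   localhost localhost.localdomain
    ::1         localhost localhost.localdomain
    #{ip}   #{fqdn} #{short}
  HOSTS
end

# In a Vagrantfile the result could feed a shell provisioner that is defined
# before any other provisioner (assumed usage, adjust to your setup):
#   config.vm.provision "shell",
#     inline: "cat > /etc/hosts <<'EOF'\n#{hosts_file('test.mydomain', '192.168.99.99')}EOF\n"
puts hosts_file("test.mydomain", "192.168.99.99")
```

Keeping the helper in a shared Ruby file that each Vagrantfile loads is one way to avoid pasting the same script everywhere.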

sethvargo commented 8 years ago

Hi all,

We just did a fairly large refactor of guest capabilities (one of which is setting the hostname). While I understand the arguments here, almost every online guide for setting the hostname recommends adding the hostname to the /etc/hosts. In the past, Vagrant has overwritten the localhost entry, but this will no longer happen in the next release of Vagrant.

This has not traditionally been a problem, and we try to optimize for the ease-of-use in Vagrant. Not every Vagrant user is a sysadmin or has the need for a really complex networked environment; they just want something to work out of the box. Due to the possibility that removing the /etc/hosts modification would introduce incompatible breaking changes, we are going to leave the behavior as-is. There are existing work-arounds for users who do not want to have the local hostname in the /etc/hosts as identified above, and the modification of /etc/hosts only occurs when the user customizes the hostname using config.vm.hostname = "...". A potentially easier solution is to omit that configuration and set the hostname via a custom shell script instead. Thank you for understanding! 😄

johnoooo commented 8 years ago

Hmm ... okay, it is your tool. Leaving it as it is is not a good option, as afelle1, I, and several other people asking about this on Stack Exchange have found. It is really a waste of time to track down this issue when it happens, and it happens more often than you assume (I would say).

Just to give an example, how not-rare this case is: https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide

In their quick-start guide they use a small network of some head nodes and some worker nodes to test/explore a local Hadoop setup. The only reason they (intentionally or by accident) did not run into this trap is that they simply overwrite the whole /etc/hosts file from outside after creating each box (which is one of the mentioned workarounds once you are aware of the problem).

As soon as you use any smoother update mechanism for /etc/hosts, e.g., a manipulation by a script just adding more entries or (which I did not test) a configuration management system, you will be trapped.

robinbowes commented 8 years ago

This really is a braindead vagrant "feature". It means the guest OS does not behave as expected.

At the very least, there should be an option to disable the behaviour.

Grrrr.

FlorianHeigl commented 8 years ago

it's true that not every vagrant user is a sysadmin, but we have to deal with the fallout everywhere, for example when those users try to replicate this on real servers. devs try to build their cloud images by the vagrant example. etc. so, if not every user knows how to get this right, then it is your problem, or their problem, but you're making it other people's problem.

you make things easy at the first look, but you're burning productivity elsewhere. only distinction is that people who break things aren't the ones who notice and need to debug it.

so, if they come here and tell you about a problem, please understand the actual issue is a few 100 times bigger. Anyway. in my case it doesn't even work.

I wish I had never ever looked into why.

Echnaton70 commented 7 years ago

I can't believe it is not possible to deactivate this feature with a simple option.

In my opinion, Vagrant is moving away from professional use.

lmf-mx commented 7 years ago

@sethvargo Apologies if I'm missing some context in your reply earlier.

The modification of /etc/hosts happens whether or not config.vm.hostname is set. Even worse when set.

On the xenial64 box I tried, you end up with two 127.0.0.1 lines, one for localhost and one for the Vagrantfile setting, and one for 127.0.1.1 which looks to be pulled from the directory name.

olehermanse commented 6 years ago

I was annoyed by this "feature" today. My workaround is to add this to provisioning script (bootstrap.sh) :

# Remove entries created by vagrant:
sed -i "/$(hostname)/d" /etc/hosts

# Optionally add what you want:
echo "" >> /etc/hosts
grep 'buildslave' /etc/hosts || echo '192.168.100.100 buildslave' >> /etc/hosts
echo "" >> /etc/hosts

Removing the hostname entry created by vagrant, and then adding what I want. (buildslave is the hostname of one of my VMs; I want this added to /etc/hosts on all machines.)

Having a built in option to disable vagrant editing /etc/hosts would be better.

GastonGonzalez commented 6 years ago

I ran into this issue as well. I am using Vagrant with Ansible as the provisioner. Under my common tasks I added the following Ansible task. This simply removes loopback address entry that matches the Ansible hostname that is being provisioned.

- name: prevent hostname from binding to the loopback address
  command: sed -i '/127.0.0.1\t{{ansible_hostname}}\t{{ansible_hostname}}/d' /etc/hosts
  ignore_errors: true
  changed_when: true

andyfeller commented 6 years ago

So I think the major issue here is that the change_host_name capability assumes the IP (in other words, 127.0.0.1) we want the hostname associated with. The ideal would be to expand the capability to accept an optional, desired IP specified in the Vagrantfile:

module VagrantPlugins
  module GuestDebian
    module Cap 
      class ChangeHostName
        def self.change_host_name(machine, name)
          comm = machine.communicate

          if !comm.test("hostname -f | grep '^#{name}$'", sudo: false)
            basename = name.split(".", 2)[0]
            comm.sudo <<-EOH.gsub(/^ {14}/, '') 
              # Set the hostname
              echo '#{basename}' > /etc/hostname
              hostname -F /etc/hostname

              if command -v hostnamectl; then
                hostnamectl set-hostname '#{basename}'
              fi

              # Prepend ourselves to /etc/hosts
              grep -w '#{name}' /etc/hosts || {
                sed -i'' '1i 127.0.0.1\\t#{name}\\t#{basename}' /etc/hosts
              }

              # Update mailname
              echo '#{name}' > /etc/mailname

              # Restart hostname services
              if test -f /etc/init.d/hostname; then
                /etc/init.d/hostname start || true
              fi

              if test -f /etc/init.d/hostname.sh; then
                /etc/init.d/hostname.sh start || true
              fi
            EOH
          end 
        end 
      end 
    end 
  end
end
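As a sketch of what such an extension might look like, the hosts-file step above can be factored into a function that takes the address as a parameter. `hosts_entry_script` and its `ip` argument are hypothetical; Vagrant has no such option, and the 127.0.0.1 default only mirrors the behaviour shown above.

```ruby
# Hypothetical helper, not part of Vagrant: builds the shell fragment that
# adds the hostname to /etc/hosts, with the address passed in by the caller.
def hosts_entry_script(name, ip = "127.0.0.1")
  basename = name.split(".", 2)[0]
  <<~EOH
    # Add the entry only if the name is not already present
    grep -w '#{name}' /etc/hosts || {
      sed -i'' '1i #{ip}\\t#{name}\\t#{basename}' /etc/hosts
    }
  EOH
end

# Address taken from the minimal example earlier in the thread
puts hosts_entry_script("test.mydomain", "192.168.99.99")
```

A capability built this way could keep the current default while letting a Vagrantfile supply the address it actually wants the name bound to.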

brianjmurrell commented 6 years ago

I have to agree also that this behaviour of adding the VM's hostname to the loopback address is very nasty behaviour.

While I don't disagree with the general sentiment of "ease of use", that cannot come at the expense of doing things that are just wrong, which this business of adding the hostname to the loopback address most definitely is.

Please find another way to provide your "ease of use" without making very braindead configuration hacks that simply don't work in the real world.

taqtiqa-mark commented 6 years ago

I think the canonical Vagrant approach is to use a plugin to handle special features, either the current behavior or the desired behavior raised by people in this thread.

Given Vagrant won't change.... it seems that the desired "new" behavior should be encapsulated in a plugin, such as vagrant-hostmanager.

Has anyone above used the vagrant-hostmanager and found that it falls short?

brianjmurrell commented 6 years ago

@taqtiqa-mark But Vagrant, as it is, making this invalid change to the hosts file must change. It's a broken configuration that is coming out of the box (NPI) by default.

If this broken configuration is desired for some "ease of use" use-cases, then this broken configuration should be moved into the/a plugin and chosen by users who want it.

I shouldn't have to go undoing broken configuration settings.

taqtiqa-mark commented 6 years ago

@brianjmurrell, I believe the Vagrant devs acknowledge that you are technically right: the issue is tagged as a bug. I can accept that some bugs live on because the technically "right-thing" would break too many "more-important-things". It's a standard cost-benefit trade-off. Again, the issue has been acknowledged as a bug.

So.... vagrant-hostmanager works? If so maybe make a documentation pull request to bring this issue to the surface. If there are other "more-important-things" that would benefit from the "right-thing" being default behavior you are likely to swing the cost-benefit calculus holding back a change in the next major version release.

Personally vagrant-hostmanager works for me, but I don't have sufficient expertise to say that what it does should become the default behavior - maybe it too is not doing the "right-thing" ;)

Please link this issue to any documentation pull requests - I like to see what ends up being considered best practice.

robinbowes commented 6 years ago

The problem is not the default behaviour, the problem is that the behaviour can not be turned off.

As I said on 2nd Sep, 2016:

This really is a braindead vagrant "feature". It means the guest OS does not behave as expected. At the very least, there should be an option to disable the behaviour.

taqtiqa-mark commented 6 years ago

@robinbowes, my understanding is that the whole point of the plugin architecture is so that various default behaviors can be turned off, and even something else turned on in its place.

Again, I'll ask:

robinbowes commented 6 years ago

vagrant-hostmanager does a different thing.

It may be possible, by reciting the correct incantations with the wind behind you and the moon in the correct phase, to overwrite the braindead vagrant "feature" of writing the hostname to the loopback address. But, as @brianjmurrell has already stated, twice, we shouldn't have to use plugins to fix broken stuff.

If hashicorp want vagrant to add the hostname to loopback address then add that capability as a plugin. Hell, you could even enable it by default, but give us some way to easily turn it off if we don't want that behaviour.

meshy commented 6 years ago

To see some (mostly unsuccessful) attempts to work around this issue, see

meshy commented 6 years ago

@sethvargo in light of the comments that have been made here since your last comment, would you be amenable to re-opening this issue?

thomasbratt commented 5 years ago

A few people have hit this in my company as well.

We set up vagrant to use a non-loopback network interface and set the hostname... and then wonder why any service configured to bind to the hostname is inaccessible from outside the guest.

So vagrant must be failing a large class of users - those that want to bring up a VM installed with software that exposes a network port outside of the guest.

Please consider making this not broken by default :)

IceKickr commented 5 years ago

This issue is making Vagrant very difficult to work with.

MBRHSastec commented 5 years ago

What if I have an IP address with a corresponding domain to add to the hosts file?

adamelliotfields commented 4 years ago

I understand this is an old (and closed) issue, but it is a top Google search result so I'll add my solution in the hopes that it might help somebody else.

I'm using Vagrant to run Rancher Kubernetes. In your cluster.yml there is a hostname_override property to identify your nodes by hostname instead of IP address. On Vagrant VirtualBox, these host names will end up getting NAT'ed to 10.0.2.15.

NAME             STATUS   ROLES               AGE   VERSION   INTERNAL-IP
rancher-master   Ready    controlplane,etcd   55s   v1.16.2   10.0.2.15
rancher-worker   Ready    worker              53s   v1.16.2   10.0.2.15

In my case, I want the internal IP to be on the eth1 interface (private network with static IP).

To fix it, I added the following lines to my provision script:

ip_address=$(ip addr show eth1 | grep -w inet | awk '{ sub("/.*", "", $2); print $2 }')
sed -i "/$HOSTNAME/c\\$ip_address\t$HOSTNAME" /etc/hosts

When I wanted to remove the entry from the hosts file, I used this:

# Delete matching lines in place (avoids reading and writing /etc/hosts in one pipeline)
sed -i "/$HOSTNAME/d" /etc/hosts

I'm using the bento/debian-10.1 box, so you might need to alter these commands if you're using a different Linux distribution.
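For readers less familiar with sed's `c` command, the replacement step amounts to roughly the following. This is a sketch only: `repoint_hostname` is a hypothetical name, the address 172.16.0.10 is made up, and plain substring matching stands in for sed's regex match.

```ruby
# Hypothetical helper: replaces every hosts-file line that mentions `hostname`
# with a single "<ip>\t<hostname>" line and leaves all other lines untouched.
def repoint_hostname(hosts_text, hostname, ip_address)
  hosts_text.each_line.map do |line|
    line.include?(hostname) ? "#{ip_address}\t#{hostname}\n" : line
  end.join
end

before = "127.0.0.1\tlocalhost\n127.0.1.1\trancher-master\trancher-master\n"
puts repoint_hostname(before, "rancher-master", "172.16.0.10")
```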

cvquesty commented 4 years ago

The RFC-stated purpose of the loopback address space (127.0.0.0/8) is for on-host communications only:

127.0.0.0/8 - This block is assigned for use as the Internet host loopback address. A datagram sent by a higher-level protocol to an address anywhere within this block loops back inside the host. This is ordinarily implemented using only 127.0.0.1/32 for loopback. As described in [RFC1122], Section 3.2.1.3, addresses within the entire 127.0.0.0/8 block do not legitimately appear on any network anywhere.

As such, for those who set an IP in their Vagrant host intended for the FQDN of that host: Vagrant itself is already parsing the Vagrantfile. It should look to see whether the FQDN or $(HOSTNAME) is present, or whether the user is using vagrant-hosts, and if so DO NOTHING. If there is no reference to the FQDN or an IP, then feel free to amend the loopback entry, in the full knowledge that you're essentially breaking IETF recommendations and can have no confidence that something won't go horribly awry at any moment.
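The rule proposed here could be sketched as a small predicate. Everything in it is hypothetical (the name `loopback_entry_needed?`, the `static_ips` keyword, the sample data); it only illustrates the decision, not anything Vagrant implements.

```ruby
# Hypothetical decision helper: should a loopback entry for `fqdn` be added?
# `static_ips` stands for addresses the user assigned in the Vagrantfile.
def loopback_entry_needed?(hosts_text, fqdn, static_ips: [])
  return false unless static_ips.empty?   # user chose an address: do nothing

  already_listed = hosts_text.each_line.any? do |line|
    # Ignore comments and the address column, then look for the name
    line.sub(/#.*/, "").split.drop(1).include?(fqdn)
  end
  !already_listed
end

hosts = "127.0.0.1 localhost\n"
puts loopback_entry_needed?(hosts, "test.mydomain", static_ips: ["192.168.99.99"])
puts loopback_entry_needed?(hosts, "test.mydomain")
```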

Follow the RFCs and best practices, folks. A LOT of PhDs with a lot more knowledge than any of us made that library for our use for a reason. Let's not break the internet in the name of "ease of use".

ulidtko commented 4 years ago

This breaks tools: jconsole, visualvm

At the risk of beating a dead horse: tools such as jconsole and visualvm can work remotely over TCP, using a kind of FTP-style protocol: the server listens on a well-known port and, upon client connection, sends the client a

here's another port on my IP, connect there

message. Sure thing, the "my IP" part is computed in getaddrinfo(gethostname()) fashion... which, with the braindead logic that Vagrant applies, turns into

here's another port on 127.0.0.1, connect there

message being sent to the client. How the client (a tool running on dev machine) fails next, is obvious in retrospect — but infuriatingly confusing to figure out from a misleading "Connection refused" error.
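A quick way to check whether a guest is affected is to resolve the machine's own hostname and see whether only loopback addresses come back. This is a minimal sketch using Ruby's standard socket library; the helper names `loopback?` and `advertised_addresses` are made up, and it approximates rather than reproduces what the JVM tools do.

```ruby
require "socket"

# True when `address` lies in the IPv4 loopback block 127.0.0.0/8.
def loopback?(address)
  Addrinfo.ip(address).ipv4_loopback?
end

# IPv4 addresses that the given hostname (default: this machine's) resolves to.
def advertised_addresses(hostname = Socket.gethostname)
  Addrinfo.getaddrinfo(hostname, nil, :INET, :STREAM).map(&:ip_address).uniq
rescue SocketError
  []   # the name does not resolve at all
end

advertised_addresses.each do |addr|
  note = loopback?(addr) ? " (loopback, unreachable for remote clients)" : ""
  puts "#{addr}#{note}"
end
```

If every printed address is flagged as loopback, a server that advertises "my IP" this way will point remote clients at their own machine.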

Consider that both those tools are used primarily for debugging — so at this point one almost always becomes busy troubleshooting somebody else's Vagrantfile while troubleshooting a network problem while troubleshooting a JVM problem while debugging some Real Issue™.

The claimed ease of use makes nasty stuff much worse for those of us who have to get it sorted out.


Our current workaround is this:

-  config.vm.hostname = vm_fqdn
+  config.vm.provision "shell", inline: "hostnamectl set-hostname #{vm_fqdn}"

Works great. But I feel absolutely obliged to pile on a concrete, real case of Vagrant-caused breakage of debug tools. Please, don't break my debug tools! Please?..

ghost commented 4 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.