hashicorp / vagrant

Vagrant is a tool for building and distributing development environments.
https://www.vagrantup.com

`vagrant up` fails if the hostname needs to be changed on Alpine #10584

Closed: maxbrunet closed this issue 5 years ago

maxbrunet commented 5 years ago

Vagrant version

2.2.3

Host operating system

Linux (openSUSE Tumbleweed)

Guest operating system

Alpine Linux 3.8

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.box = "generic/alpine38"
  config.vm.hostname = "myhostname"
end

Debug output

https://gist.github.com/maxbrunet/7872a0c2f7ad87ce2a08b7e0438f83a0

Expected behavior

Alpine VM is up and requested hostname is set.

Actual behavior

vagrant up fails on Setting hostname....

==> default: Setting hostname...
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

# Save current hostname saved in /etc/hosts
CURRENT_HOSTNAME_FULL="$(hostname -f)"
CURRENT_HOSTNAME_SHORT="$(hostname -s)"

# New hostname to be saved in /etc/hosts
NEW_HOSTNAME_FULL='myhostname'
NEW_HOSTNAME_SHORT="${NEW_HOSTNAME_FULL%%.*}"

# Update sysconfig
sed -i 's/\(HOSTNAME=\).*/\1myhostname/' /etc/sysconfig/network

# Set the hostname - use hostnamectl if available
if command -v hostnamectl; then
  hostnamectl set-hostname --static 'myhostname'
  hostnamectl set-hostname --transient 'myhostname'
else
  hostname 'myhostname'
fi

# Update ourselves in /etc/hosts
if grep -w "$CURRENT_HOSTNAME_FULL" /etc/hosts; then
  sed -i -e "s/( )$CURRENT_HOSTNAME_FULL( )/$NEW_HOSTNAME_FULL/g" -e "s/( )$CURRENT_HOSTNAME_FULL$/$NEW_HOSTNAME_FULL/g" /etc/hosts
fi
if grep -w "$CURRENT_HOSTNAME_SHORT" /etc/hosts; then
  sed -i -e "s/( )$CURRENT_HOSTNAME_SHORT( )/$NEW_HOSTNAME_SHORT/g" -e "s/( )$CURRENT_HOSTNAME_SHORT$/$NEW_HOSTNAME_SHORT/g" /etc/hosts
fi

# Restart network
service network restart

Stdout from the command:

127.0.0.1   localhost.localdomain localhost localhost.localdomain localhost
::1     localhost localhost.localdomain
127.0.0.1   localhost.localdomain localhost localhost.localdomain localhost
::1     localhost localhost.localdomain

Stderr from the command:

sed: /etc/sysconfig/network: No such file or directory
 * service: service `network' does not exist

Steps to reproduce

  1. Run vagrant up with the provided Vagrantfile

References

There's this comment mentioning a similar error: https://github.com/hashicorp/vagrant/issues/2745#issuecomment-50225776. But there are other issues too; it seems that changing the hostname is generally problematic.

The network service is called networking on Alpine, and it looks like Vagrant is using the ChangeHostName class of GuestALT. Is that the right guest class? Maybe it should use GuestLinux, which doesn't support changing the hostname, and either skip the task or report it as not supported.

https://github.com/hashicorp/vagrant/blob/37dc3dc6489e2a0ecc7b20ca73719e8c1ce2a4e2/plugins/guests/alt/cap/change_host_name.rb#L39

Also, the script doesn't seem to fail on the first error (sed on /etc/sysconfig/network). Maybe it should test whether the file exists first and/or use something like set -e so that it can fail partway through.

ladar commented 5 years ago

An Alpine guest plugin sure would be nice.

colonelpopcorn commented 5 years ago

There is one, but it seems to have trouble setting the hostname anyway. You can probably submit a fix for the network service not existing here.

This also could be an issue with this particular box.

ladar commented 5 years ago

Not sure where to patch, and I'm not much of a Ruby programmer, but I thought some pseudo code might help someone else fix this issue. The above looks like shell logic, and if it is, I can update it directly if someone points me at the right file. In terms of how to fix this issue:

First, vagrant should check for the existence of the /etc/sysconfig/network file before trying to modify it via sed. That is the correct file on most systems (Red Hat and its children, for example), but not the correct file for Alpine.

If the file is missing (or perhaps it should work regardless of the previous step), vagrant should check for the existence of the /etc/network/ directory, and if it exists, check for the existence of the /etc/network/interfaces file. If the file exists, run the sed find/replace operation on that file.

Finally, it should check for the existence of the /etc/init.d/networking file. If it exists, it should run service networking restart instead of the default service network restart command. It could also run the script directly via /etc/init.d/networking restart and avoid any service command dependency.

P.S. I noticed the logic checks for the existence of the hostnamectl before defaulting to hostname when it does the update, but skips this step at the top, where it always relies on the hostname -f and hostname -s commands to work.
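
The first and last checks above could be sketched roughly as follows. This is only an illustration, not Vagrant's actual implementation: `hostname_script` is a hypothetical helper that returns the shell script as a string, the hostname validation is an added assumption, and the /etc/network/interfaces step is left out because the thread does not say which substitution it would need.

```ruby
# Hypothetical helper (not part of Vagrant): builds a shell script that
# guards each step with an existence check instead of assuming Red Hat paths.
def hostname_script(name)
  # Assumption: only hostname-safe characters are accepted, so the value can
  # be embedded in the script without further quoting.
  unless name.match?(/\A[A-Za-z0-9][A-Za-z0-9.-]*\z/)
    raise ArgumentError, "invalid hostname: #{name.inspect}"
  end

  <<~SCRIPT
    # Only edit the Red Hat style file when it exists.
    if [ -f /etc/sysconfig/network ]; then
      sed -i 's/\\(HOSTNAME=\\).*/\\1#{name}/' /etc/sysconfig/network
    fi

    # Set the running hostname.
    hostname '#{name}'

    # Prefer the init script that exists on Alpine, calling it directly.
    if [ -x /etc/init.d/networking ]; then
      /etc/init.d/networking restart
    else
      service network restart
    fi
  SCRIPT
end

puts hostname_script("myhostname")
```

Calling the init script directly avoids depending on the service command, as suggested above; the hostnamectl branch and the /etc/hosts rewrite from the original script are omitted to keep the sketch short.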

chrisroberts commented 5 years ago

@ladar Hi! I had asked about the explicit guest identification as ALT in a previous issue, but I can't seem to find it at the moment. Anyway, yes, there is an alpine plugin to provide guest caps, and this issue seems to be a box issue. This line:

https://github.com/lavabit/robox/blob/master/tpl/generic-alpine35.rb#L35

results in the guest being identified incorrectly as ALT linux. With the vagrant-alpine plugin installed, overriding the guest to alpine with config.vm.guest = :alpine should get things working.

Cheers!

ladar commented 5 years ago

@chrisroberts much obliged for the fix. I didn't realize vagrant started bundling the Alpine guest agent. I will make the change so it gets used in the next box build.

In the future, you're welcome to submit a pull request, you know, so you get credit for the legwork!

chrisroberts commented 5 years ago

@ladar Vagrant itself doesn't ship alpine guest support currently (just want to make sure there's no confusion here). The vagrant-alpine plugin provides the guest capabilities and needs to be installed for it to work correctly. And depending on how you want your boxes set up, you can require the plugin to be installed locally via the included Vagrantfile if you want (https://www.vagrantup.com/docs/vagrantfile/vagrant_settings.html#config-vagrant-plugins).
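
As a rough sketch of that setup, a Vagrantfile could list the plugin under config.vagrant.plugins and set the guest explicitly. The box name and hostname are taken from the original report; the defined?(Vagrant) guard is an addition for illustration so the file also loads under plain ruby, and is not needed in a real Vagrantfile.

```ruby
# Guarded so the file also loads under plain `ruby`, where Vagrant is undefined.
if defined?(Vagrant)
  Vagrant.configure("2") do |config|
    # Ask Vagrant to make sure the plugin is installed locally for this project.
    config.vagrant.plugins = ["vagrant-alpine"]

    config.vm.box = "generic/alpine38"
    # Use the guest capabilities provided by the plugin instead of ALT's.
    config.vm.guest = :alpine
    config.vm.hostname = "myhostname"
  end
end
```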

ladar commented 5 years ago

@chrisroberts thank you for the clarification. I wasn't aware you could trigger the auto installation of a plugin via an embedded Vagrantfile. I ran an experiment and noticed that if I require the plugin, the setup process gets halted with a prompt, which is a good thing for security. Is it possible for me to check whether the vagrant-alpine plugin is installed and then use config.vm.guest = :alpine if it is, and config.vm.guest = :alt if it isn't?
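
One possible way to express that check, as a sketch: Vagrant.has_plugin? reports whether a plugin is installed, so the guest type can be chosen from its result. `preferred_guest` is a hypothetical helper name introduced only for this example, and the defined?(Vagrant) guard is there so the file also loads under plain ruby.

```ruby
# Hypothetical helper (not part of Vagrant): pick the guest type depending on
# whether the vagrant-alpine plugin is available.
def preferred_guest(alpine_plugin_installed)
  alpine_plugin_installed ? :alpine : :alt
end

# Guarded so the file also loads under plain `ruby`, where Vagrant is undefined.
if defined?(Vagrant)
  Vagrant.configure("2") do |config|
    config.vm.box = "generic/alpine38"
    config.vm.hostname = "myhostname"
    # Falls back to :alt when the plugin is missing, so nothing is required
    # of users who have not installed it.
    config.vm.guest = preferred_guest(Vagrant.has_plugin?("vagrant-alpine"))
  end
end
```

With this shape the box never prompts for a plugin install; it only uses the Alpine guest capabilities when they happen to be present.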

ladar commented 5 years ago

I should say, good for security, but sadly, it probably isn't the right solution for a base box, as it will break automated systems which rely on the image.

chrisroberts commented 5 years ago

Is there a reason why it would fallback to ALT linux and not just the default linux? And for automated systems, they can use this: https://www.vagrantup.com/docs/other/environmental-variables.html#vagrant_install_local_plugins

ladar commented 5 years ago

> Is there a reason why it would fallback to ALT linux and not just the default linux?

It seems I was only dictating :alt for the libvirt version of the box. I don't recall precisely why (it was way back in 2017). I searched my git commit history (for Robox and the repo I use to test the boxes) and found a reference which indicated :linux was causing vagrant to report an error.

https://github.com/lavabit/robox/commit/58dad25386a0a892a01a6f6ed8d0664e9873a380

and

https://github.com/lavabit/robox/commit/791189e85dadf1ab743899e566dc300018edf655

> And for automated systems, they can use this: https://www.vagrantup.com/docs/other/environmental-variables.html#vagrant_install_local_plugins

What I meant is that if I were to release an Alpine robox which required the vagrant-alpine plugin, it would break all of the automated systems already in place, which aren't expecting this dependency (including my own).

Even if that weren't the case, I try not to release robox builds which fail, error, or complain instead of working "out of the box", if I can help it (i.e., when I know about an issue and can find a workaround). But setting the guest agent type to :alpine only if the plugin was available would be just fine, as it fits the "graceful degradation" paradigm.

chrisroberts commented 5 years ago

Just a side note on Vagrantfile usage:

https://github.com/lavabit/robox/blob/master/tpl/generic-alpine35.rb#L35

This isn't setting the guest to :alt for the libvirt provider only. It is setting it globally due to the use of config. To have it isolated to only the libvirt provider, you would want to use the override variable you have defined which would look like:

override.vm.guest = :alt
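
In context, a provider-scoped override might look like the sketch below. `scope_guest_to_libvirt` is a hypothetical helper name used only to keep the example self-contained, and the defined?(Vagrant) guard is an addition so the file also loads under plain ruby.

```ruby
# Hypothetical helper (not part of Vagrant): applies the guest override for
# the libvirt provider only.
def scope_guest_to_libvirt(config)
  config.vm.provider :libvirt do |_libvirt, override|
    # `override` only takes effect when the libvirt provider is in use.
    # Writing `config.vm.guest = :alt` here would still apply globally.
    override.vm.guest = :alt
  end
end

# Guarded so the file also loads under plain `ruby`, where Vagrant is undefined.
if defined?(Vagrant)
  Vagrant.configure("2") do |config|
    config.vm.box = "generic/alpine38"
    scope_guest_to_libvirt(config)
  end
end
```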

ladar commented 5 years ago

@chrisroberts thank you! My Ruby syntax skills are embarrassingly primitive. I corrected the Alpine templates and pushed an updated template.

Unfortunately I'm already in the middle of a robox build (ala v1.9.10) so it's too late for me to incorporate the change this time. Sadly, it takes 2+ days to build all 450+ images, so restarting is rather painful. But it should make it into the next release.

With any luck, I'll remember to actually test it out before I release the images! I'd like to also run a test and see if the vagrant-libvirt plugin even requires the :alt agent with Alpine. It's possible the issue was fixed already.

ladar commented 5 years ago

@chrisroberts I've also been wondering... is there a reason vagrant doesn't bundle an Alpine guest agent? It's one of my more popular boxes, and fits the vagrant ethos perfectly...

chrisroberts commented 5 years ago

@ladar Only because support was introduced as a plugin. I've been noticing an uptick in usage as well and will likely inquire with the author soon about the possibility of migrating it into vagrant proper.

pzlds commented 5 years ago

EDIT: Sorry for the rant, that wasn't appropriate. I'll rephrase my question to be a bit nicer (although it should probably go into a robox-specific issue anyways).

@ladar Would it be possible to include a warning for the cases where the OS type is overwritten with alt?

ladar commented 5 years ago

@pzlds because at some point not setting it to alt caused the provisioning process, aka vagrant up, to fail entirely on some of the platforms I support. As I recall, with the default setting, the plugin went looking for files that didn't exist/work on Alpine. It might have been lsb_release (which my boxes now fake, but Alpine doesn't include), or some other etc config file.

Setting the type as alt fixed the issue. Perhaps with improvements to the vagrant provisioning process, it might be possible to switch that config option, without an issue. An Alpine plugin would be ideal, but it isn't installed by default... and I didn't want to release boxes that require external dependencies.

A new type might work, but because I'm travelling, I can't easily build all 6 Alpine variants, for all 5 providers, and then subsequently test all 30 images to ensure the bug is resolved, so I've left the existing config as-is for the time being.

Perhaps adding box-specific notes on this issue and the workarounds to the robox repo wiki could suffice for now.

ladar commented 5 years ago

I created a vagrant pull request ... https://github.com/hashicorp/vagrant/pull/11000 ... which should fix this issue (at least it will for the robox Alpine releases).

Can anyone test it?
