bedrocklinux / bedrocklinux-userland

This tracks development for the things such as scripts and (defaults for) config files for Bedrock Linux
https://bedrocklinux.org
GNU General Public License v2.0

Networking is disabled in Alpine #113

Open caydenlund opened 5 years ago

caydenlund commented 5 years ago

Bedrock Linux seems to break the networking in Alpine. This manifests itself in both a hijacked Alpine install (with no other strata) and a fetched Alpine stratum on an existing Bedrock install. Additionally, I've confirmed Alpine's networking is stable across reboots before installation of Bedrock, but I can't seem to make it work after Bedrock is installed. Note that networking still works on other existing inits on one install.

I've tested this behavior on several different virtual machines, both in QEMU and in VirtualBox. Always x86_64 architecture. Never on a physical device.

Release 0.7.3.

paradigm commented 5 years ago

After:

a freshly fetched Alpine seems to work fine for me in QEMU/KVM. You said you tried it with a fetched Alpine. Can you give me a concrete list of steps to reproduce the issue with that workflow? Was it missing one of the items I've listed above? I do not recall why Bedrock did not set up networking on boot on fetch. That might be an oversight I need to fix.

When I get the chance, I will install Alpine in a VM and hijack it to see if I run into issues there.

caydenlund commented 5 years ago

Here are the steps that I took.

  1. Boot Alpine ISO in a new QEMU/KVM instance.
  2. Run setup-alpine with most "expected" values (dhcp, install to disk, etc.).
  3. Reboot without the ISO attached.
  4. Install Bedrock 0.7.3.
  5. Reboot.
  6. At this point, networking is still working just fine. Make no changes.
  7. Reboot again.
  8. No longer attached to internet. Run rc-update add networking sysinit.
  9. Reboot again.
  10. Still no internet access.

Note: /etc/network/interfaces is unchanged from the initial install:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp
    hostname ___

paradigm commented 5 years ago

Before handing the control off to the specified init, Bedrock makes several directories bind mounts, including /run. This is done to ensure things like sockets are global (i.e., visible to all strata) so that processes from one stratum can communicate with a daemon from another.

Some /etc/fstab management software only creates mounts if the given file path is not already a mount point, presumably to remain idempotent. Upon seeing Bedrock's bind mounts, they skip mounting the corresponding entries. To ensure things like a separate /home partition are mounted, Bedrock will mount /etc/fstab itself before handing control off to the specified init.

It seems Alpine's OpenRC (and possibly others, I've not checked) is hard-coded to mount tmpfs at /run in /lib/rc/sh/init.sh. Like /etc/fstab mounting software, this sees Bedrock's bind mount and skips mounting /run. However, unlike /etc/fstab, Bedrock does not know about it, and so it does not create the mount point itself. As a result, /run is not a tmpfs and not cleared on reboot. Upon rebooting, Alpine's networking stack sees a file in /run indicating the previous session's networking was enabled, and so it does not enable networking again.
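One way to check the situation described above is to look at what /run is actually mounted as. The following is a minimal sketch, not something Bedrock or OpenRC ships: `is_tmpfs_mount` is a hypothetical helper name, and it assumes a mounts table in the usual /proc/mounts layout (device, mount point, filesystem type, options). A bind mount of a directory on the root filesystem typically reports the root filesystem's type rather than tmpfs.

```shell
# Hypothetical helper: succeed if PATH is mounted as tmpfs according to a
# mounts table in /proc/mounts format (device mountpoint fstype options ...).
# Usage: is_tmpfs_mount <path> [mounts_file]
is_tmpfs_mount() {
    path=$1
    mounts_file=${2:-/proc/mounts}
    # Keep the last matching line, since later mounts shadow earlier ones.
    fstype=$(awk -v p="$path" '$2 == p { t = $3 } END { print t }' "$mounts_file")
    [ "$fstype" = "tmpfs" ]
}
```

For example, `is_tmpfs_mount /run || echo "/run persists across reboots"` would flag the state in which stale files from the previous session can survive.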

A workaround you may apply for the time being is to create an /etc/fstab entry to the effect of:

tmpfs /run tmpfs rw,nodev,nosuid 0 0

(edited typo'd /tmp above to /run)

Upon next reboot, Bedrock will create the mount point and Alpine's networking software will realize it is a fresh session and initialize networking.
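If the workaround needs to be applied on several test machines, it can be scripted so that it is safe to run more than once. This is only a sketch: `ensure_run_tmpfs_entry` is a hypothetical helper name, the entry is the one quoted above, and it assumes the fstab file already ends with a newline.

```shell
# Hypothetical helper: append the /run tmpfs entry to an fstab-format file
# unless an uncommented entry already uses /run as its mount point.
# Usage: ensure_run_tmpfs_entry [fstab_file]
ensure_run_tmpfs_entry() {
    fstab=${1:-/etc/fstab}
    # awk exits 0 if an existing non-comment line mounts something at /run.
    if awk '$1 !~ /^#/ && $2 == "/run" { found = 1 } END { exit !found }' "$fstab"; then
        return 0
    fi
    printf '%s\n' 'tmpfs /run tmpfs rw,nodev,nosuid 0 0' >> "$fstab"
}
```

Running it against a copy of the file first (for example `ensure_run_tmpfs_entry /tmp/fstab.copy`) is a cheap way to review the result before touching /etc/fstab.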

I will read through /lib/rc/sh/init.sh to familiarize myself with all of the mount points Alpine's OpenRC is hard-coded to create. Provided I do not find any potential concerns, I currently plan to have Bedrock create these mount points itself.

caydenlund commented 5 years ago

I follow you there. You're absolutely right; it seems that the persistence of /run was causing the issue. Even simply clearing that directory upon system shutdown removes the issue I described. Assuming you meant /run instead of /tmp, the additional fstab entry does the job. On a hijacked Alpine install, nothing more was needed.

On a newly fetched Alpine stratum, however, I found that a couple more manual workarounds were necessary for me. Note that I took these steps on multiple virtual machines to ensure that it wasn't a one-time fluke. First of all, it seems that Alpine looks for /etc/network/ on boot in the location that we know as /bedrock/strata/alpine/etc/network/. I ended up creating a symbolic link to make it easier to work with, but I could have modified the file directly from that location. After that, init logged that it wasn't able to start the networking service because it could not find the udhcpc command. However, once I logged into the machine, udhcpc was in $PATH. I ended up creating a symbolic link from /usr/bin/udhcpc to /bedrock/cross/bin/udhcpc to solve the issue, and that worked fine, but it seems a bit messy. What am I missing here? It seems that in your case, you simply fetched the stratum, populated the interfaces, and added the networking service to init. Is there some kind of configuration step that I'm missing?

paradigm commented 5 years ago

> I follow you there. You're absolutely right; it seems that the persistence of /run was causing the issue. Even simply clearing that directory upon system shutdown removes the issue I described. Assuming you meant /run instead of /tmp, the additional fstab entry does the job. On a hijacked Alpine install, nothing more was needed.

Yes, I meant run. Happy to hear that's working for you.

> On a newly fetched Alpine stratum, however, I found that a couple more manual workarounds were necessary for me. Note that I took these steps on multiple virtual machines to ensure that it wasn't a one-time fluke. First of all, it seems that Alpine looks for /etc/network/ on boot in the location that we know as /bedrock/strata/alpine/etc/network/. I ended up creating a symbolic link to make it easier to work with, but I could have modified the file directly from that location. After that, init logged that it wasn't able to start the networking service because it could not find the udhcpc command. However, once I logged into the machine, udhcpc was in $PATH. I ended up creating a symbolic link from /usr/bin/udhcpc to /bedrock/cross/bin/udhcpc to solve the issue, and that worked fine, but it seems a bit messy. What am I missing here? It seems that in your case, you simply fetched the stratum, populated the interfaces, and added the networking service to init. Is there some kind of configuration step that I'm missing?

I figured it out.

The difference between my working setup and yours was that my root user's login shell came from alpine, whereas yours probably came from another stratum that did not have udhcpc available locally. For example, maybe you hijacked a distro that uses bash as the root user's default login shell, but your fetched alpine does not provide bash, so Bedrock gets it from another stratum.

A workaround you can apply for now is to either change your root user's shell to sh or install whatever shell you want your root user to have in the alpine stratum. Maybe also set priority = init in bedrock.conf.
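To see whether a given install is in the situation described above, one could compare root's configured login shell against what the alpine stratum provides locally. The sketch below is illustrative only: both function names are hypothetical, the /bedrock/strata/alpine path follows the layout mentioned earlier in this thread, and the stratum name "alpine" is an assumption about how it was fetched.

```shell
# Hypothetical helper: print the login shell recorded for root in a
# passwd-format file (the seventh colon-separated field).
root_login_shell() {
    passwd_file=${1:-/etc/passwd}
    awk -F: '$1 == "root" { print $7; exit }' "$passwd_file"
}

# Hypothetical helper: succeed if the shell path is present under the given
# stratum root. Absolute symlinks inside a stratum cannot be resolved
# reliably from outside it, so the mere presence of a symlink is treated as
# sufficient. This is a heuristic, not a guarantee.
shell_in_stratum() {
    shell_path=$1
    stratum_root=${2:-/bedrock/strata/alpine}
    [ -x "${stratum_root}${shell_path}" ] || [ -L "${stratum_root}${shell_path}" ]
}
```

A check such as `shell_in_stratum "$(root_login_shell)" || echo "root's shell is not provided by alpine"` would then point at which of the two workarounds is needed: switching root's shell to sh, or installing the configured shell from Alpine's repositories.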

I'm working on a new feature which will remove the need to set priority = init for this scenario at some point in the future. However, I do not currently have any ideas to make this situation just work without manual intervention to ensure the configured shell is available.

When you get the chance, please test the above-mentioned workaround to see if it resolves the issue for you. Assuming it does, please leave this issue open as a reminder for me to document the need for a workaround here for Alpine's networking.

caydenlund commented 5 years ago

In a new virtual machine, changing root's shell to sh resulted in successful network initialization. After changing it back to bash, installing bash from Alpine's repositories resulted in successful networking. Note that all that worked without changing bedrock.conf.

caydenlund commented 5 years ago

Also, I ended up adding network to the etc list in bedrock.conf to make that directory global, instead of creating a symlink. No need for further intervention.

paradigm commented 5 years ago

> Also, I ended up adding network to the etc list in bedrock.conf to make that directory global, instead of creating a symlink. No need for further intervention.

I wasn't sufficiently familiar with that file to know if that was safe. Do you know that other distros won't have conflicting expectations there?

For example, a relatively fresh netinst install of Debian contains a /etc/network/interfaces containing:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo
iface lo inet loopback

and an empty /etc/network/interfaces.d.

I copied these to Alpine's local paths, and it errored, seemingly unhappy that it was being asked to source a glob that does not expand.

It could be worth experimenting here. Maybe I can have Bedrock enforce that directory exists and touch an empty file in it, for example. If there's an easy/clean way to make this file global so a fetched alpine can inherit it from another distro and remove the need for it to be manually set up, that would certainly be preferable.
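The "enforce that directory exists and touch an empty file in it" idea could look roughly like the following. This is a sketch of the experiment, not something Bedrock does: `ensure_glob_expands` and the default file name `placeholder` are hypothetical, and whether Alpine's networking tooling accepts an empty sourced file is untested here.

```shell
# Hypothetical helper: make sure a directory exists and holds at least one
# file, so that a glob such as "<dir>/*" always expands to something.
# Usage: ensure_glob_expands <dir> [placeholder_name]
ensure_glob_expands() {
    dir=$1
    placeholder=${2:-placeholder}   # hypothetical file name
    mkdir -p "$dir" || return 1
    # Only create the empty placeholder when the directory has no visible
    # entries; existing configuration files are left untouched.
    if [ -z "$(ls "$dir")" ]; then
        : > "$dir/$placeholder"
    fi
}
```

For the Debian-style file quoted above, the call would be `ensure_glob_expands /etc/network/interfaces.d`.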

paradigm commented 5 years ago

0.7.4 mounts /run, resolving part of the issue here.

The need for the root shell to be installed in Alpine stands. I have yet to find a good way to resolve that situation automatically. While I usually prefer to have Bedrock creatively work around issues rather than require upstream changes, the specific combination of requirements here may be difficult to work around cleanly. I might try to dig into its source and upstream something.