iximiuz / docker-to-linux

Make bootable Linux disk image (ab)using Docker
https://iximiuz.com/en/posts/from-docker-container-to-bootable-linux-disk-image/

Make filesystem writable before booting #19

Open ljleb opened 3 years ago

ljleb commented 3 years ago

I've been trying to install docker in the alpine dockerfile, on my local copy of the repository:

FROM alpine:latest
RUN apk add --no-cache linux-virt openrc docker
RUN rc-update add docker boot
RUN echo "root:root" | chpasswd

The problem I have currently is that, when the VM boots, the docker service cannot start. It seems to be unable to create mandatory files:

[Screenshot: boot output in which the docker service fails to start]

I know that I can make the filesystem writeable and successfully start the service with this script:

mount -o remount,rw /
service docker start

However, I simply cannot afford to be forced to run this every time I want to use the image! Simply rebooting makes the filesystem read-only again...

Is there a way to tell qemu to mount the file system as rw, or update the image configuration from the builder container to automate this process?
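To confirm how / is currently mounted inside the booted VM, the options field of its entry in /proc/mounts can be inspected. A minimal sketch follows; the function name root_mount_mode is hypothetical and not part of this repository.

```shell
#!/bin/sh
# Print "rw" or "ro" for the root file system, based on a mounts table.
# Usage: root_mount_mode [mounts-file]   (defaults to /proc/mounts)
# NOTE: root_mount_mode is an illustrative helper, not something this repo ships.
root_mount_mode() {
    # Fields in /proc/mounts: device, mount point, fs type, options, dump, pass.
    # Keep the last entry whose mount point is "/", since later mounts shadow earlier ones.
    awk '$2 == "/" { opts = $4 }
         END {
             n = split(opts, o, ",")
             for (i = 1; i <= n; i++)
                 if (o[i] == "ro" || o[i] == "rw") { print o[i]; exit }
         }' "${1:-/proc/mounts}"
}
```

Called with no argument on the booted image, it should print ro before the manual remount and rw after mount -o remount,rw /.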

ljleb commented 3 years ago

By the way, I tried to put:

RUN mount -o remount,rw /

at the end of the dockerfile, to no avail (I guess it was to be expected):

Step 4/6 : RUN rc-update add docker boot
 ---> Using cache
 ---> 82c2f7ed6fa5
Step 5/6 : RUN echo "root:root" | chpasswd
 ---> Using cache
 ---> 2da68b62cc55
Step 6/6 : RUN mount -o remount,rw /
 ---> Running in 986f633c3cfb
mount: can't find / in /proc/mounts
The command '/bin/sh -c mount -o remount,rw /' returned a non-zero code: 1
Makefile:19: recipe for target 'linux.tar' failed

ljleb commented 3 years ago

I had encountered this problem a while ago actually. I just remembered that I had been looking on Google/Stack Overflow and had eventually come up with this solution:

echo -e "\n[Make filesystem writable]"
FS_ROW="$(blkid | awk -F\" "/$(basename "${LOOPDEVICE}")/ {print \"UUID=\"\$2\"\011/\011ext4\011defaults\0110\0110\"}")"
echo -e "fstab entry is '${FS_ROW}'"
echo -e "${FS_ROW}" > /os/mnt/etc/fstab

Putting this between the "[Configure grub]" and the "[Unmount]" sections basically adds an entry in /etc/fstab for the device UUID related to $LOOPDEVICE.
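The same fstab row can be produced without the escaped awk program. This is only a sketch: the helper name fstab_root_row is hypothetical, and the usage lines assume the builder container ships util-linux blkid (for the -s and -o options) and that the file system sits directly on ${LOOPDEVICE}.

```shell
#!/bin/sh
# Build one fstab line for the root file system from a UUID.
# Fields: device, mount point, fs type, options, dump, fsck pass.
# NOTE: fstab_root_row is an illustrative helper, not part of the repository.
fstab_root_row() {
    printf 'UUID=%s\t/\text4\tdefaults\t0\t0\n' "$1"
}

# Hypothetical usage inside the builder script (assumes util-linux blkid
# and that the ext4 file system is directly on the loop device):
#   uuid="$(blkid -s UUID -o value "${LOOPDEVICE}")"
#   fstab_root_row "$uuid" >> /os/mnt/etc/fstab
```

Separating the formatting from the blkid lookup makes it easy to check the generated row by eye before it is appended to the image.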

However, I can't recall if this script ever worked properly. I just happened to find it in an old clone of this repository.

In any case, whether this was working or not: it does not work right now. If this script ever worked before, then I believe what caused it to stop working is the switch from grub to syslinux.

[Edit:]

I also tried to use /dev/sda1, because it seems to be hardcoded in syslinux.cfg:

echo -e '/dev/sda1\011/\011ext4\011defaults,rw\011\060\011\060' >> /os/mnt/etc/fstab

It didn't work with /dev/sda1 nor with /dev/sda.

I also found that 2 entries already existed in /mnt/etc/fstab:

/dev/cdrom      /media/cdrom    iso9660 noauto,ro 0 0
/dev/usbdisk    /media/usb      vfat    noauto,ro 0 0

I also tried replacing ro with rw using sed -i 's/,ro/,rw/' /os/mnt/etc/fstab, but the filesystem stayed read-only. I guess /dev/cdrom and /dev/usbdisk don't have anything to do with /.

ljleb commented 3 years ago

Another thing I just realized is that syslinux.cfg contains this line:

  APPEND ro root=/dev/sda1 rootfstype=ext4 initrd=/boot/initramfs-virt

Should replacing ro with rw fix the issue? It doesn't seem to change anything (I tried it, still getting Read-only filesystem errors).

ljleb commented 3 years ago

I tried with the debian and ubuntu images, and now the file system is mounted as rw in each case! My first approach was the right one:

echo -e "\n[Make filesystem writable]"
blkid | awk -F\" "/$(basename "${LOOPDEVICE}")/ {print \"UUID=\"\$2\" / ext4 defaults 0 0\"}" >> /os/mnt/etc/fstab

Only, it doesn't seem to work on alpine the same way it does on debian. Any clue as to why /etc/fstab entries would be ignored in alpine?

ljleb commented 3 years ago

I found a workaround for alpine!

I simply created a script remount.start to be run by the local service of OpenRC:

#!/bin/sh
mount -o remount,rw /

Then I let docker copy the script over and add the local service to the boot runlevel:

COPY alpine/remount.start /etc/local.d/remount.start
RUN chmod +x /etc/local.d/remount.start
RUN rc-update add local boot

With this configuration, when OpenRC starts, docker fails its initial startup. But then, after the local service has started, docker starts successfully (and upon rebooting, it never fails to start again because the files it tries to create in its initial startup are already there):

[Screenshot: boot output in which docker fails at first, then starts after the local service]

The issue with this workaround is that the file system is mounted read-write way too late in the boot process. Remounting the root file system should not be necessary: it should be mounted read-write to begin with.

suntong commented 3 years ago

then I believe what caused it to stop working is the switch from grub to syslinux.

Yeah, that's what I suspect in #17 too.

The problem I have currently is that, when the VM boots, the docker service cannot start

Hmm... Let me get 100% clear on this, your whole system is in a VM? And when your VM boots, its docker service doesn't start automatically each time?

However, I simply cannot afford to be forced to run this every time I want to use the image! Simply rebooting makes the filesystem read-only again...

By "force run this" you meant force start your docker service each time when your VM boots? Please summarize your whole setup for the next person to easily understand. thx.

it doesn't seem to work on alpine the same way it does on debian. Any clue as to why /etc/fstab entries would be ignored in alpine?

See my question in #17, in which I think the UUID is wrong.

Should replacing ro with rw fix the issue?

Don't do that even if it could, because ro is the standard practice here.

ljleb commented 3 years ago

Yeah, that's what I suspect in #17 too.

Actually, I was wrong. What was happening was alpine didn't want to use the configuration in /etc/fstab for some reason when booting the root file system. That solution (using UUIDs) worked flawlessly with debian and ubuntu. (I'll come back to you on the UUID matter in #17)

Hmm... Let me get 100% clear on this, your whole system is in a VM? And when your VM boots, its docker service doesn't start automatically each time?

What I meant by "VM" is actually qemu. I've been trying to start a qemu VM of the alpine image (into which I pre-installed docker from within the alpine Dockerfile of this repository using apk add docker). In the screen capture of the OP, you can see that the docker service is failing to start because the file system is read-only.

By "force run this" you meant force start your docker service each time when your VM boots? Please summarize your whole setup for the next person to easily understand. thx.

Sorry if I haven't been super specific in my explanations, I'll try my best to clear things up here.

Every time I started my custom alpine image with qemu, the docker openRC service failed to start. The docker service appears to need to create configuration files for it to work properly, and it couldn't because the file system was mounted read-only.

Because of this, I had to manually remount the file system read-write and restart openRC's failed init scripts by issuing service docker start every time I booted / rebooted the image (whether with qemu or on bare metal). I did not want to have to write these commands periodically (and I still don't want that), so I created this issue and started trying to fix it.

I hope I've been clearer now!

ro is the standard practice here.

I should have said this way earlier I believe, but I actually have no clue what I'm doing :) I've been messing around with small details and hoped for the best when booting an image.

TL;DR: I don't know what this value is for. I switched it back to ro locally when I recognized changing it wasn't having any effect on the "writableness" of the root file system during the docker openRC's init script.

suntong commented 3 years ago

Oh thanks for the detailed explanation.

I don't know alpine well enough to comment, but in Debian's terms, that openRC service is happening too early in the boot up sequence.

Debian uses systemd, in which you can specify when a certain service can be started, e.g., after the network is ready or DNS is ready etc. Again, I don't know how to relate that to alpine, just hoping that it might ring a bell for you.

ljleb commented 3 years ago

I finally discovered why the root file system is not getting remounted as read-write upon booting.

OpenRC init scripts are located under /etc/init.d. The service responsible for remounting the root file system, the root service, cannot start automatically. The key lies in these lines, from the OP logs:

Service `hwdrivers' needs non existent service `dev'
Service `machine-id' needs non existent service `dev'

The root service depends on the dev service, which curiously does not exist under /etc/init.d. However, starting root manually:

/etc/init.d/root start

remounts the root file system as read-write flawlessly. Note that I could not make the root service start automatically when OpenRC enters the boot runlevel.

I have no idea whether the dev service is supposed to be generated or installed with an alpine package. I don't know what it is supposed to do. I don't even know where to find the sources for this, as looking on any search engine doesn't yield a single useful result (whoever thought that "dev" was a good name for a common service file were probably drunk or something...).

that openRC service is happening too early in the boot up sequence.

I believe you are right in saying that the docker service is starting too early in the boot sequence. I found online that using the "default" runlevel instead of the "boot" runlevel was more standard for user services, so I changed that. I still couldn't get the root service to start by itself however.

ljleb commented 3 years ago

I found that installing the udev package removes the warnings about 'dev' not being found, even though the file system stays read-only and still has to be remounted manually.

suntong commented 3 years ago

The root service depends on the dev service

I don't know how the root service depends on the dev service in alpine, but try to do the same thing and make your OpenRC service depend on the root service.
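One way to express such a dependency without editing the init script is OpenRC's rc_need variable in the service's conf.d file. The sketch below is an assumption about how that could be wired up, not something tested in this thread: the helper name add_rc_need is hypothetical, and whether a need on root is enough on this image is not established here.

```shell
#!/bin/sh
# Append rc_need="<service>" to an OpenRC conf.d file unless it is already there.
# Usage: add_rc_need <conf-file> <needed-service>
# NOTE: add_rc_need is an illustrative helper, not part of OpenRC or this repo.
add_rc_need() {
    line="rc_need=\"$2\""
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF "$line" "$1" 2>/dev/null || printf '%s\n' "$line" >> "$1"
}

# Hypothetical usage against the mounted image or inside the Dockerfile:
#   add_rc_need /etc/conf.d/docker root
```

The existence check keeps the call idempotent, so rebuilding the image repeatedly does not pile up duplicate lines.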

Adphi commented 2 years ago

I also struggled with this problem...

Comparing with an alpine vm, I found that the dev service in use is not udev but mdev. It is provided by this package: busybox-initscripts which also provides syslog among other busybox based services.

But the real problem comes from the fact that init scripts use keywords that filter the steps according to environments. In this case, docker:

$ grep -r 'docker' /etc/init.d/

/etc/init.d/sysfs:      keyword -docker -lxc -prefix -systemd-nspawn -vserver
/etc/init.d/root:       keyword -docker -jail -lxc -openvz -prefix -systemd-nspawn -vserver
/etc/init.d/binfmt:     keyword -docker -lxc -openvz -prefix -systemd-nspawn -vserver
/etc/init.d/devfs:      keyword -docker -prefix -systemd-nspawn -vserver
/etc/init.d/save-keymaps:       keyword -docker -lxc -openvz -prefix -systemd-nspawn -uml -vserver -xenu
/etc/init.d/hwclock:    keyword -docker -lxc -openvz -prefix -systemd-nspawn -uml -vserver -xenu
/etc/init.d/procfs:     keyword -docker -lxc -openvz -prefix -systemd-nspawn -vserver
/etc/init.d/localmount: keyword -docker -jail -lxc -prefix -systemd-nspawn -vserver
/etc/init.d/hostname:   keyword -prefix -lxc -docker
/etc/init.d/termencoding:       keyword -docker -lxc -openvz -prefix -systemd-nspawn -uml -vserver -xenu
/etc/init.d/consolefont:        keyword -docker -lxc -openvz -prefix -systemd-nspawn -uml -vserver -xenu
/etc/init.d/numlock:    keyword -docker -lxc -openvz -prefix -systemd-nspawn -vserver
/etc/init.d/dmesg:      keyword -docker -lxc -prefix -systemd-nspawn -vserver
/etc/init.d/networking: keyword -jail -prefix -vserver -docker
/etc/init.d/fsck:       keyword -docker -jail -lxc -openvz -prefix -systemd-nspawn -timeout -vserver -uml
/etc/init.d/swap:       keyword -docker -jail -lxc -openvz -prefix -systemd-nspawn -vserver
/etc/init.d/netmount:   keyword -docker -jail -lxc -prefix -systemd-nspawn -vserver
/etc/init.d/mount-ro:   keyword -docker -lxc -openvz -prefix -systemd-nspawn -vserver
/etc/init.d/save-termencoding:  keyword -docker -lxc -openvz -prefix -systemd-nspawn -uml -vserver -xenu
/etc/init.d/net-online: keyword -docker -jail -lxc -openvz -prefix -systemd-nspawn -uml -vserver
/etc/init.d/swclock:    keyword -docker -lxc -openvz -prefix -systemd-nspawn -uml -vserver -xenu
/etc/init.d/urandom:    keyword -docker -jail -lxc -openvz -prefix -systemd-nspawn
/etc/init.d/cgroups:    keyword -docker -prefix -systemd-nspawn -vserver

So the solution is actually pretty simple: delete /.dockerenv 😁

To get alpine's default configuration, install alpine-base instead of openrc.

And one last thing, if you want the network service to start, it needs an empty file at /etc/network/interfaces.
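The two file-level fixes above could be applied to the mounted image just before it is unmounted. This is a sketch under assumptions: the helper name prepare_rootfs is hypothetical, /os/mnt is the mount point used elsewhere in this thread, and it is assumed that /.dockerenv appears when the container is created, so removing it in a Dockerfile RUN step would presumably not be enough.

```shell
#!/bin/sh
# Apply the two file-level fixes to an unpacked/mounted root file system.
# Usage: prepare_rootfs <rootfs-dir>
# NOTE: prepare_rootfs is an illustrative helper, not part of the repository.
prepare_rootfs() {
    root="$1"
    # Remove the marker file so the "-docker" keyword no longer filters init scripts.
    rm -f "$root/.dockerenv"
    # Make sure the networking service finds an interfaces file (an empty one is created
    # only if none exists, so an existing configuration is left untouched).
    mkdir -p "$root/etc/network"
    [ -e "$root/etc/network/interfaces" ] || : > "$root/etc/network/interfaces"
}

# Hypothetical usage in the builder script, before the "[Unmount]" step:
#   prepare_rootfs /os/mnt
```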

Adphi commented 2 years ago

I forgot one thing, you have to enable the services needed for the boot process. So you may want to add this to the Dockerfile:

RUN for s in bootmisc hostname hwclock modules networking swap sysctl urandom syslog; do rc-update add $s boot; done
RUN for s in devfs dmesg hwdrivers mdev; do rc-update add $s sysinit; done
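After building, it can be useful to check that the services really ended up in the expected runlevels. rc-update records an enabled service as an entry under /etc/runlevels/<runlevel>/, so a small check is possible from outside the image. A minimal sketch, with a hypothetical helper name missing_services:

```shell
#!/bin/sh
# Print the services from the argument list that are not enabled in a runlevel.
# Usage: missing_services <rootfs-dir> <runlevel> <service>...
# NOTE: missing_services is an illustrative helper, not part of OpenRC or this repo.
missing_services() {
    root="$1"; level="$2"; shift 2
    for s in "$@"; do
        # -e fails on symlinks whose target does not resolve from outside the image,
        # so -L is checked as well.
        [ -e "$root/etc/runlevels/$level/$s" ] || [ -L "$root/etc/runlevels/$level/$s" ] || echo "$s"
    done
}

# Hypothetical usage against the mounted image:
#   missing_services /os/mnt sysinit devfs dmesg hwdrivers mdev
```

An empty output means every listed service is present in that runlevel.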

ljleb commented 2 years ago

Thanks for the help @Adphi, I want to test your proposed fix. I'll try to come back to this thread when I find some free time.

iximiuz commented 2 years ago

Thanks, @Adphi! This is super helpful!

Adphi commented 2 years ago

@iximiuz many thanks for all your work on this. I wrote a small Go program based on this project: d2vm. The Dockerfile used for alpine-based images is here

iximiuz commented 2 years ago

Nice! I've been planning to turn this project into more or less production-ready Go binary but have had very little success finding time so far. Kudos for making it happen!

Adphi commented 2 years ago

A few days ago I was playing with the Alpine's mkimage and found that in the init script:

https://github.com/alpinelinux/mkinitfs/blob/224826dcee28425a81bae099ade87fad797a5674/initramfs-init.in#L642-L669

if [ -f "$sysroot/etc/.default_boot_services" -o ! -f "$ovl" ]; then
    # add some boot services by default
    rc_add devfs sysinit
    rc_add dmesg sysinit
    rc_add mdev sysinit
    rc_add hwdrivers sysinit
    rc_add modloop sysinit

    rc_add modules boot
    rc_add sysctl boot
    rc_add hostname boot
    rc_add bootmisc boot
    rc_add syslog boot

    rc_add mount-ro shutdown
    rc_add killprocs shutdown
    rc_add savecache shutdown

    rc_add firstboot default

    # add openssh
    if [ -n "$KOPT_ssh_key" ]; then
        pkgs="$pkgs openssh"
        rc_add sshd default
    fi

    rm -f "$sysroot/etc/.default_boot_services"
fi

So, in order to have a fully working OpenRC configuration, all it needs is:

iximiuz commented 2 years ago

That's helpful, @Adphi! Thanks for sharing!

Tythos commented 1 month ago

<3 everything in this thread, especially @ljleb for following up on his own discoveries.