kubernetes / minikube

Run Kubernetes locally
https://minikube.sigs.k8s.io/
Apache License 2.0

Make custom image for KIC similar to minikube ISO #6942

Open afbjorklund opened 4 years ago

afbjorklund commented 4 years ago

Currently we are using the KIND base image for the KIC driver:

https://github.com/kubernetes/minikube/blob/master/hack/images/kicbase.Dockerfile

# which is an ubuntu 19.10 with an entry-point that helps running systemd
# could be changed to any debian that can run systemd
FROM kindest/base:v20200122-2dfe64b2 as base

This image (docker.io/kindest/base) is in turn based on Ubuntu:

https://github.com/kubernetes-sigs/kind/blob/master/images/base/Dockerfile

# start from ubuntu 19.10, this image is reasonably small as a starting point
# for a kubernetes node image, it doesn't contain much we don't need
FROM ubuntu:19.10

As with the other base images, this one starts from a rootfs tarball:

https://hub.docker.com/_/ubuntu

FROM scratch
ADD ubuntu-eoan-core-cloudimg-amd64-root.tar.gz /

See https://docs.docker.com/develop/develop-images/baseimages/


We might want to investigate using the same Linux that we use for minikube.iso: a custom distribution built with Buildroot, which also includes systemd and the container runtimes.

https://github.com/kubernetes/minikube/tree/master/deploy/iso/minikube-iso

The main difference would be that the regular ISO also includes a kernel (currently 4.19), while the container image uses the host kernel (so it doesn't need to waste all that space).
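
As a quick illustration of that point (a minimal check, assuming a local Docker installation):

# a container reports the host's kernel version, since there is
# no separate kernel inside the image
docker run --rm ubuntu:19.10 uname -r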

There are lots of other small tricks needed in order to make a running container image, including a lot of workarounds and hacks to be able to run systemd in a container...

As noted above, you can see several of these in the original Ubuntu image (vs. the Ubuntu ISO), as well as in the KIND and KIC projects respectively. Some of these will need to be added.

https://github.com/kubernetes-sigs/kind/blob/master/images/base/files/usr/local/bin/entrypoint

afbjorklund commented 4 years ago

ping @medyagh

medyagh commented 4 years ago

This sounds like a good idea to do.

I support having an option to run KIC drivers with a Buildroot image in Docker, at least as an alternative that users could choose. The same way that users choose their base ISO_URL, we should be able to support different base images: either Ubuntu (kind base) or Buildroot.
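
To sketch the idea (the flag name and image tags below are hypothetical, by analogy with the existing --iso-url option):

# today's Ubuntu/kind-derived base image
minikube start --driver=docker --base-image=gcr.io/k8s-minikube/kicbase:v0.0.x

# a possible Buildroot-based alternative
minikube start --driver=docker --base-image=gcr.io/k8s-minikube/kicbase-buildroot:v0.0.x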

medyagh commented 4 years ago

@afbjorklund would you be interested in taking this? Could I add this to the milestone?

afbjorklund commented 4 years ago

@medyagh: I will investigate, but I don't know yet if it is reasonable to do in two weeks.

afbjorklund commented 4 years ago

I would like to postpone this to the minikube 2.0 release, making it a target for Buildroot 2020.02.

afbjorklund commented 4 years ago

One thing that needs to be addressed here is settling on a kernel version to support. Our current virtual machines use either 4.19 or 5.4 kernels, which are a bit "too new".

So if you just create an image from the same rootfs, it will not run on Ubuntu 16.04 or 18.04:

FATAL: kernel too old

This is because the default setting in the glibc build is to run with the current kernel only:

--enable-kernel=$(call qstrip,$(BR2_TOOLCHAIN_HEADERS_AT_LEAST)) \

BR2_TOOLCHAIN_HEADERS_AT_LEAST="4.19"

Besides glibc, systemd now also requires Stack Smashing Protection:

BR2_SSP_NONE=y
# BR2_SSP_REGULAR is not set
# BR2_SSP_STRONG is not set
# BR2_SSP_ALL is not set
BR2_TOOLCHAIN_HAS_SSP=y

We should be able to lower this to something like 3.10 (or perhaps 4.0); systemd requires BR2_TOOLCHAIN_HEADERS_AT_LEAST_3_10.

http://www.linuxfromscratch.org/lfs/view/systemd/chapter06/glibc.html

             --enable-kernel=3.2                    \
             --enable-stack-protector=strong        \

Not sure if we miss any major features by doing so, but we can check.
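
One way to check what a given glibc build actually requires (a sketch, assuming the usual Debian/Ubuntu x86-64 path for libc):

# the ELF ABI note records the minimum kernel version the binary accepts
readelf -n /lib/x86_64-linux-gnu/libc.so.6 | grep -A 1 NT_GNU_ABI_TAG

# 'file' prints the same thing more tersely, e.g. "for GNU/Linux 3.2.0"
file /lib/x86_64-linux-gnu/libc.so.6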

LTS Ubuntu              Kernel
20.04 Focal Fossa       5.4
18.04 Bionic Beaver     4.15
16.04 Xenial Xerus      4.4
14.04 Trusty Tahr       3.13
12.04 Precise Pangolin  3.2+
10.04 Lucid Lynx        2.6.32

https://en.wikipedia.org/wiki/Ubuntu_version_history#Table_of_versions

Not sure what LinuxKit (the Docker Desktop VM) has, but it's 4.9 or 4.14. So it would not run an image built for 4.19 either, much less 5.4 LTS...

medyagh commented 4 years ago

We don't have to support Ubuntu; we could just use whatever kernel version makes sense for us. The goal would be to use the same version of everything in the ISO and the base image.

afbjorklund commented 4 years ago

We don't have to support Ubuntu; we could just use whatever kernel version makes sense for us.

This is the user's laptop we are talking about... It would be perfectly fine with an arbitrary 4.0 as well.

afbjorklund commented 4 years ago

The goal would be to use the same version of everything in the ISO and the base image.

There should be no major side effects of keeping the ISO glibc more compatible with old kernels.

afbjorklund commented 4 years ago

Note: systemd requires 3.10 rather than 3.2:

BR2_PACKAGE_SYSTEMD_ARCH_SUPPORTS
BR2_TOOLCHAIN_HEADERS_AT_LEAST_3_10

It also requires glibc with stack-protector:

BR2_TOOLCHAIN_USES_GLIBC
BR2_TOOLCHAIN_HAS_SSP


We have to bump it to 3.12, up from 3.10 (which was up from LFS 3.2):

WARNING: unmet direct dependencies detected for BR2_PACKAGE_LIBSECCOMP
  Depends on [n]: BR2_PACKAGE_LIBSECCOMP_ARCH_SUPPORTS [=y] && BR2_TOOLCHAIN_HEADERS_AT_LEAST_3_12 [=n]

It's needed by containerd, podman and crio (and also wanted by runc):

select BR2_PACKAGE_LIBSECCOMP

afbjorklund commented 4 years ago

There are some minor tweaks needed for compatibility with busybox and other things (like update-alternatives not being available, nor needed, or installing the getent binary).
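
For example, where something expects update-alternatives, a no-op shim is usually enough (a hypothetical sketch, not the actual change made here):

# the Buildroot image sets up its symlinks at build time,
# so update-alternatives can safely be a stub
printf '#!/bin/sh\nexit 0\n' > /usr/bin/update-alternatives
chmod +x /usr/bin/update-alternatives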

But other than that, it seems to be "booting":

INFO: ensuring we can execute /bin/mount even with userns-remap
INFO: remounting /sys read-only
INFO: making mounts shared
INFO: fix cgroup mounts for all subsystems
INFO: clearing and regenerating /etc/machine-id
Initializing machine ID from random generator.
INFO: faking /sys/class/dmi/id/product_name to be "kind"
INFO: faking /sys/class/dmi/id/product_uuid to be random
INFO: faking /sys/devices/virtual/dmi/id/product_uuid as well
INFO: setting iptables to detected mode: legacy
[FAILED] Failed to listen on Docker Socket for the API.
See 'systemctl status docker.socket' for details.
[  OK  ] Listening on RPCbind Server Activation Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
[FAILED] Failed to start CRI-O Auto Update Script.
See 'systemctl status crio-wipe.service' for details.
[DEPEND] Dependency failed for Container Runtime Interface for OCI (CRI-O).
         Starting Shutdown CRI-O containers before shutting down the system...
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started Hyper-V FCOPY Daemon.
         Stopping Hyper-V FCOPY Daemon...
[  OK  ] Started Hyper-V Key Value Pair Daemon.
         Stopping Hyper-V Key Value Pair Daemon...
[  OK  ] Started Hyper-V VSS Daemon.
         Stopping Hyper-V VSS Daemon...
         Starting minikube automount...
         Starting Login Service...
[  OK  ] Started Shutdown CRI-O containers before shutting down the system.
[  OK  ] Stopped Hyper-V FCOPY Daemon.
[  OK  ] Stopped Hyper-V Key Value Pair Daemon.
[  OK  ] Stopped Hyper-V VSS Daemon.
         Starting Cleanup of Temporary Directories...
[  OK  ] Started Cleanup of Temporary Directories.
[  OK  ] Started minikube automount.
[  OK  ] Started Network Name Resolution.
[  OK  ] Reached target Network.
[  OK  ] Reached target Host and Network Name Lookups.
         Starting OpenSSH server daemon...
[FAILED] Failed to start Login Service.
See 'systemctl status systemd-logind.service' for details.
[  OK  ] Stopped Login Service.
         Starting Login Service...
[  OK  ] Started OpenSSH server daemon.
[  OK  ] Started Login Service.
[FAILED] Failed to start Wait for Network to be Configured.
See 'systemctl status systemd-networkd-wait-online.service' for details.
[  OK  ] Reached target Network is Online.
         Starting containerd container runtime...
         Starting NFS Mount Daemon...
         Starting NFS status monitor for NFSv2/3 locking....
[FAILED] Failed to start containerd container runtime.
See 'systemctl status containerd.service' for details.
[  OK  ] Reached target Multi-User System.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.
         Starting RPC bind service...
[  OK  ] Started RPC bind service.
[  OK  ] Started NFS Mount Daemon.
[  OK  ] Started NFS status monitor for NFSv2/3 locking..
         Starting NFS server and services...
[  OK  ] Started NFS server and services.
         Starting Notify NFS peers of a restart...
[  OK  ] Started Notify NFS peers of a restart.

The container runtimes failing to start is expected, because they ship unconfigured on the ISO.

That "systemd-networkd-wait-online" fails is not good (2 min timeout), but was also to be expected...

Systemd fails to understand that lo and the docker eth0 are "online". So it is stuck in "pending":

IDX LINK TYPE     OPERATIONAL SETUP
  1 lo   loopback carrier     pending
147 eth0 ether    routable    pending

2 links listed.
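
A link shows up as "pending" when no .network file matches it, so systemd-networkd never considers it configured and wait-online blocks. A minimal matching unit would look like this (file name and DHCP setting are assumptions, not the actual fix used):

# /etc/systemd/network/eth0.network
[Match]
Name=eth0

[Network]
DHCP=yes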

As usual with systemd, it also fails to understand that our terminal doesn't have any color support.
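
If the color output becomes a nuisance, systemd honors an environment override (a sketch; assumes a systemd version that supports SYSTEMD_COLORS, and the kicbase-buildroot image name from the run sketch further down):

# disable ANSI color output from systemd, regardless of TERM
docker run -d --privileged -e SYSTEMD_COLORS=false kicbase-buildroot /usr/local/bin/entrypoint /sbin/init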

afbjorklund commented 4 years ago

This was another problem, not sure why though:

-- A start job for unit systemd-logind.service has begun execution.
May 21 20:29:03 412e878b8a55 systemd[222]: systemd-logind.service: Failed to set up mount namespacing: /run/systemd/unit-root/var/tmp: No such file or directory
May 21 20:29:03 412e878b8a55 systemd[222]: systemd-logind.service: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-logind: No such file or directory
-- Subject: Process /usr/lib/systemd/systemd-logind could not be executed

Nor is it clear what implications this particular unit failure has.
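
The error text suggests systemd's mount namespacing tripped over a directory missing from the rootfs. If so, one plausible fix is simply creating it in the image (an assumption based on the message, not a confirmed root cause):

# systemd-logind runs with PrivateTmp=yes, which bind-mounts /tmp and
# /var/tmp into a private namespace; that fails if /var/tmp is absent
mkdir -p /var/tmp && chmod 1777 /var/tmp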

afbjorklund commented 4 years ago

For some reason the initial output does not show on the console:

systemd 244 running in system mode. (-PAM -AUDIT -SELINUX -IMA -APPARMOR -SMACK +SYSVINIT +UTMP -LIBCRYPTSETUP -GCRYPT -GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 -IDN -PCRE2 default-hierarchy=hybrid) Detected architecture x86-64.

And it seems like setting $container makes the network happier:

Detected virtualization docker.

[  OK  ] Started Network Service.
         Starting Wait for Network to be Configured...
         Starting Network Name Resolution...
[FAILED] Failed to start Login Service.
See 'systemctl status systemd-logind.service' for details.
[  OK  ] Stopped Login Service.
         Starting Login Service...
[  OK  ] Started Network Name Resolution.
[  OK  ] Reached target Network.
[  OK  ] Reached target Host and Network Name Lookups.
         Starting OpenSSH server daemon...
[  OK  ] Started OpenSSH server daemon.
[  OK  ] Started Login Service.

Welcome to minikube

afbjorklund commented 4 years ago

docker import - buildroot < out/buildroot/output/images/rootfs.tar

Here is the current minikube.Dockerfile (adapted from kindbase and kicbase):

FROM buildroot

COPY entrypoint /usr/local/bin/entrypoint
RUN chmod +x /usr/local/bin/entrypoint

# After installing packages we cleanup by:
# - removing unwanted systemd services
# - disabling kmsg in journald (these log entries would be confusing)
#
# Next we ensure the /etc/kubernetes/manifests directory exists. Normally
# a kubeadm debian / rpm package would ensure that this exists but we install
# freshly built binaries directly when we build the node image.
#
# Finally we adjust tmpfiles cleanup to be 1 minute after "boot" instead of 15m
# This is plenty after we've done initial setup for a node, but before we are
# likely to try to export logs etc.
RUN echo "Ensuring scripts are executable ..." \
    && chmod +x /usr/local/bin/entrypoint \
 && echo "Installing Packages ..." \
    && find /lib/systemd/system/sysinit.target.wants/ -name "systemd-tmpfiles-setup.service" -delete \
    && rm -f /lib/systemd/system/multi-user.target.wants/* \
    && rm -f /etc/systemd/system/*.wants/* \
    && rm -f /lib/systemd/system/local-fs.target.wants/* \
    && rm -f /lib/systemd/system/sockets.target.wants/*udev* \
    && rm -f /lib/systemd/system/sockets.target.wants/*initctl* \
    && rm -f /lib/systemd/system/basic.target.wants/* \
    && echo "ReadKMsg=no" >> /etc/systemd/journald.conf \
 && echo "Ensuring /etc/kubernetes/manifests" \
    && mkdir -p /etc/kubernetes/manifests \
 && echo "Adjusting systemd-tmpfiles timer" \
    && sed -i /usr/lib/systemd/system/systemd-tmpfiles-clean.timer -e 's#OnBootSec=.*#OnBootSec=1min#'

# systemd exits on SIGRTMIN+3, not SIGTERM (which re-executes it)
# https://bugzilla.redhat.com/show_bug.cgi?id=1201657
STOPSIGNAL SIGRTMIN+3
# NOTE: this is *only* for documentation, the entrypoint is overridden later
ENTRYPOINT [ "/usr/local/bin/entrypoint", "/sbin/init" ]

USER docker
RUN mkdir /home/docker/.ssh
USER root
# kind base-image entry-point expects a "kind" folder for product_name,product_uuid
# https://github.com/kubernetes-sigs/kind/blob/master/images/base/files/usr/local/bin/entrypoint
RUN mkdir -p /kind

Note that minikube-automount will currently put it on the /boot partition (!).
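
Putting the pieces together, trying the image locally looks roughly like this (a sketch only; the image names and the exact set of run flags are assumptions, loosely following how KIND starts its node containers):

# import the Buildroot rootfs as the base layer
docker import - buildroot < out/buildroot/output/images/rootfs.tar

# apply the minikube.Dockerfile customizations on top
docker build -t kicbase-buildroot -f minikube.Dockerfile .

# boot systemd inside the container; --privileged plus the tmpfs and
# cgroup mounts are what allow systemd to manage services
docker run -d --name minikube --privileged \
    --tmpfs /run --tmpfs /tmp \
    -v /sys/fs/cgroup:/sys/fs/cgroup \
    kicbase-buildroot /usr/local/bin/entrypoint /sbin/init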

afbjorklund commented 4 years ago

Some additional cleanup:

RUN rm -f /usr/sbin/minikube-automount \
    && echo '#!/bin/sh' > /usr/sbin/minikube-automount \
    && chmod +x /usr/sbin/minikube-automount
# Remove kernel modules
RUN rm -r /lib/modules/*
RUN systemctl enable sshd
afbjorklund commented 4 years ago

There are still a lot of assumptions about VM==Buildroot and KIC==Ubuntu in the code base :-(

// fastDetectProvisioner provides a shortcut for provisioner detection
func fastDetectProvisioner(h *host.Host) (libprovision.Provisioner, error) {
        d := h.Driver.DriverName()
        switch {
        case driver.IsKIC(d):
                return provision.NewUbuntuProvisioner(h.Driver), nil
        case driver.BareMetal(d):
                return libprovision.DetectProvisioner(h.Driver)
        default:
                return provision.NewBuildrootProvisioner(h.Driver), nil
        }
}

Maybe we should even make an Ubuntu ISO variant, just to try to iron some more of them out?

I'm not sure how hard it will be, but we can probably reuse a lot of the packaging and some of boot2docker:

https://github.com/tianon/boot2docker-debian

afbjorklund commented 4 years ago

I can make this available for early testing, but it is not ready for public beta testing.

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale

fejta-bot commented 3 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle rotten

fejta-bot commented 3 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/close

k8s-ci-robot commented 3 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes/minikube/issues/6942#issuecomment-721841381):

>Rotten issues close after 30d of inactivity.
>Reopen the issue with `/reopen`.
>Mark the issue as fresh with `/remove-lifecycle rotten`.
>
>Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
>/close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
afbjorklund commented 3 years ago

/remove-lifecycle rotten

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten