kubernetes-sigs / kind

Kubernetes IN Docker - local clusters for testing Kubernetes
https://kind.sigs.k8s.io/
Apache License 2.0

support remote podman #2233

Open afrittoli opened 3 years ago

afrittoli commented 3 years ago

What happened:

Kind checks for cgroup v2 support on my local host - where cgroup v2 is not supported:

$ KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster -v7
using podman due to KIND_EXPERIMENTAL_PROVIDER
enabling experimental podman provider
ERROR: failed to create cluster: running kind with rootless provider requires cgroup v2, see https://kind.sigs.k8s.io/docs/user/rootless/
Stack Trace:
sigs.k8s.io/kind/pkg/errors.New
    sigs.k8s.io/kind/pkg/errors/errors.go:28
sigs.k8s.io/kind/pkg/cluster/internal/create.validateProvider
    sigs.k8s.io/kind/pkg/cluster/internal/create/create.go:250
sigs.k8s.io/kind/pkg/cluster/internal/create.Cluster
    sigs.k8s.io/kind/pkg/cluster/internal/create/create.go:70
sigs.k8s.io/kind/pkg/cluster.(*Provider).Create
    sigs.k8s.io/kind/pkg/cluster/provider.go:183
sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.runE
    sigs.k8s.io/kind/pkg/cmd/kind/create/cluster/createcluster.go:80
sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.NewCommand.func1
    sigs.k8s.io/kind/pkg/cmd/kind/create/cluster/createcluster.go:55
github.com/spf13/cobra.(*Command).execute
    github.com/spf13/cobra@v1.1.1/command.go:850
github.com/spf13/cobra.(*Command).ExecuteC
    github.com/spf13/cobra@v1.1.1/command.go:958
github.com/spf13/cobra.(*Command).Execute
    github.com/spf13/cobra@v1.1.1/command.go:895
sigs.k8s.io/kind/cmd/kind/app.Run
    sigs.k8s.io/kind/cmd/kind/app/main.go:53
sigs.k8s.io/kind/cmd/kind/app.Main
    sigs.k8s.io/kind/cmd/kind/app/main.go:35
main.main
    sigs.k8s.io/kind/main.go:25
runtime.main
    runtime/proc.go:225
runtime.goexit
    runtime/asm_amd64.s:1371

What you expected to happen:

I expected to be able to create a kind cluster on my remote host via the podman provider.

How to reproduce it (as minimally and precisely as possible):

Run a fedora33 host via vagrant - this is my vagrant file.

Set these env variables:

export CONTAINER_HOST=ssh://vagrant@127.0.0.1:2222/run/podman/podman.sock
export CONTAINER_SSHKEY=/Users/andreafrittoli/tools/podman/.vagrant/machines/default/virtualbox/private_key

Build kind:

make
make install

Create a cluster

KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster -v9
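
A check that runs on the local machine cannot know that the container engine lives elsewhere unless it looks at the connection settings. The following is a minimal sketch, not kind's code, of how a tool could classify the `CONTAINER_HOST` value set above. The helper name `isRemoteHost` is hypothetical, and treating every `ssh://` or `tcp://` URL as remote is a simplifying assumption (such a URL can also point at a VM on the same machine, as it does here).

```go
package main

import (
	"fmt"
	"net/url"
)

// isRemoteHost is a hypothetical helper: it reports whether a CONTAINER_HOST
// style URL points at an engine reached over the network rather than a local
// unix socket. An empty value means "use the local default".
func isRemoteHost(containerHost string) (bool, error) {
	if containerHost == "" {
		return false, nil
	}
	u, err := url.Parse(containerHost)
	if err != nil {
		return false, err
	}
	switch u.Scheme {
	case "ssh", "tcp":
		// Assumption: any network transport counts as remote.
		return true, nil
	case "unix":
		return false, nil
	default:
		return false, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
}

func main() {
	remote, err := isRemoteHost("ssh://vagrant@127.0.0.1:2222/run/podman/podman.sock")
	fmt.Println(remote, err)
}
```

With such a signal available, host-side validation (like the cgroup check) could be skipped or redirected to the engine's own report when the host is remote.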

Anything else we need to know?:

As far as I can tell the cgroup validation was introduced in https://github.com/kubernetes-sigs/kind/pull/2129#discussion_r599190220 . The validation looks at the local host, possibly because podman info does not provide enough information?

This is what podman info returns for me:

host:
  arch: amd64
  buildahVersion: 1.20.1
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.27-2.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.27, commit: '
  cpus: 8
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.8.15-301.fc33.x86_64
  linkmode: dynamic
  memFree: 3197255680
  memTotal: 4117811200
  ociRuntime:
    name: crun
    package: crun-0.19.1-2.fc33.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.19.1
      commit: 1535fedf0b83fb898d449f9680000f729ba719f5
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 0
  swapTotal: 0
  uptime: 2h 48m 40.84s (Approximately 0.08 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 1
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.1.2
  Built: 1619097693
  BuiltTime: Thu Apr 22 13:21:33 2021
  GitCommit: ""
  GoVersion: go1.15.8
  OsArch: linux/amd64
  Version: 3.1.2

Environment:

$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.15.7
BuildVersion:   19H1030

$ uname -a
Darwin <hostname> 19.6.0 Darwin Kernel Version 19.6.0: Mon Apr 12 20:57:45 PDT 2021; root:xnu-6153.141.28.1~1/RELEASE_X86_64 x86_64

Server:
Version:      3.1.2
API Version:  3.1.2
Go Version:   go1.15.8
Built:        Thu Apr 22 14:21:33 2021
OS/Arch:      linux/amd64

afrittoli commented 3 years ago

/area provider/podman

afrittoli commented 3 years ago

It looks like podman (at least in v3.1.2) provides cgroup information similar to what docker provides:

$ docker info | grep -i cgroup
 Cgroup Driver: cgroupfs
 Cgroup Version: 1

$ podman info | grep -i cgroup
  cgroupManager: systemd
  cgroupVersion: v2

afrittoli commented 3 years ago

This is what we expect from docker:

type dockerInfo struct {
    CgroupDriver    string   `json:"CgroupDriver"`  // "systemd", "cgroupfs", "none"
    CgroupVersion   string   `json:"CgroupVersion"` // e.g. "2"
    MemoryLimit     bool     `json:"MemoryLimit"`
    PidsLimit       bool     `json:"PidsLimit"`
    CPUShares       bool     `json:"CPUShares"`
    SecurityOptions []string `json:"SecurityOptions"`
}

For podman, we do get cgroup driver and version, as well as security options:

    "cgroupManager": "systemd",
    "cgroupVersion": "v2",
(...)
    "security": {
      "apparmorEnabled": false,
      "capabilities": "CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT",
      "rootless": false,
      "seccompEnabled": true,
      "selinuxEnabled": true
    },

but I still don't see any info about memory limit, pids limit, and cpu shares support. The only place that info is used is in the validate code, so I wonder if we could print a warning when it is not available and go on with the cluster creation?
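
The idea above could look roughly like the sketch below. It is not kind's implementation: the `podmanInfo` type and the `checkCgroups` helper are hypothetical, the struct covers only the fields visible in the `podman info` output quoted in this thread, and the input is assumed to be the JSON form of that output. The cgroup requirement stays a hard error for rootless engines, while the missing limit information only produces a warning.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// podmanInfo mirrors only the fields shown in the podman info output above.
type podmanInfo struct {
	Host struct {
		CgroupManager string `json:"cgroupManager"`
		CgroupVersion string `json:"cgroupVersion"`
		Security      struct {
			Rootless bool `json:"rootless"`
		} `json:"security"`
	} `json:"host"`
}

// parsePodmanInfo decodes podman info JSON into the struct above.
func parsePodmanInfo(raw []byte) (*podmanInfo, error) {
	var info podmanInfo
	if err := json.Unmarshal(raw, &info); err != nil {
		return nil, err
	}
	return &info, nil
}

// checkCgroups is a hypothetical validation step: it fails only when a
// rootless engine lacks cgroup v2, and returns a warning for the limits
// that this podman version does not report.
func checkCgroups(info *podmanInfo) ([]string, error) {
	version := strings.TrimPrefix(info.Host.CgroupVersion, "v")
	if info.Host.Security.Rootless && version != "2" {
		return nil, fmt.Errorf("rootless provider requires cgroup v2, got %q", info.Host.CgroupVersion)
	}
	warnings := []string{
		"memory limit, pids limit and cpu shares support not reported; continuing",
	}
	return warnings, nil
}

func main() {
	raw := []byte(`{"host":{"cgroupManager":"systemd","cgroupVersion":"v2","security":{"rootless":false}}}`)
	info, err := parsePodmanInfo(raw)
	if err != nil {
		panic(err)
	}
	warnings, err := checkCgroups(info)
	fmt.Println(warnings, err)
}
```

Because the data comes from the engine's own report rather than from the machine running kind, the same check would apply whether the engine is local or remote.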

BenTheElder commented 3 years ago

We've never supported remote podman. At least in older versions none of that info was present.

Supporting remote podman is a feature. Podman support is also still experimental because it still regularly breaks due to breaking changes in the underlying tool.

BenTheElder commented 3 years ago

Remote hosts are also pretty low priority for us. KIND is meant to run clusters locally, and you could always run the kind command on the remote host directly.

There are other things we can't do reliably remotely, such as pick out a fixed random port available on the host.

see also #1778
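
To illustrate the port problem mentioned above: a common way to pick a free port is to ask the local kernel for one by listening on port 0. The sketch below shows that general technique only. The function name `freeLocalPort` is hypothetical and this is not claimed to be kind's exact code.

```go
package main

import (
	"fmt"
	"net"
)

// freeLocalPort asks the local kernel for an unused TCP port by binding to
// port 0 on loopback, then releases it and returns the number.
func freeLocalPort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freeLocalPort()
	if err != nil {
		panic(err)
	}
	fmt.Println("free port on this machine:", port)
}
```

The result only describes the machine where the code runs. With a remote container engine, the port may already be taken on the remote host, which is why this step is not reliable remotely.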

afrittoli commented 3 years ago

> Remote hosts are also pretty low priority for us. KIND is meant to run clusters locally, and you could always run the kind command on the remote host directly.

The reason I care about remote execution is to give a seamless experience to mac and windows users. kind works fine with docker + remote execution, so I will give docker rootless a try as an alternative to podman. Running kind on the remote host is an option of course, but it would not integrate as smoothly into the development workflow.

> There are other things we can't do reliably remotely, such as pick out a fixed random port available on the host.
>
> see also #1778

Thanks, good to know.

afrittoli commented 3 years ago

> We've never supported remote podman. At least in older versions none of that info was present.
>
> Supporting remote podman is a feature. Podman support is also still experimental because it still regularly breaks due to breaking changes in the underlying tool.

Heh, it sounds like I'm not going to start using it for CI then 😅

aojea commented 3 years ago

/assign

afrittoli commented 3 years ago

This is what I hacked together to get past the failing check: https://github.com/kubernetes-sigs/kind/pull/2235 . With this patch in, kind now tries to create a cluster, but it gets stuck on ⠈⡱ Writing configuration 📜 :)

aojea commented 3 years ago

ah lovely, thanks, let me take a look

afbjorklund commented 3 years ago

For minikube, these were two different scenarios: whether it was "really remote" or "locally remote".

We have the regular case when running on Linux, where it is really running on the same machine, and that is fine. Then there is Docker Desktop, which tries its hardest to pretend that it is running locally (storage, networking).

This case was more about running on macOS and Windows, but without the full integration being present. So it was more like Docker Machine, where you also have a VM running but it is less transparent to the user?


The new podman has a new feature called "podman machine" (not to be confused with my old "podman-machine")

It is starting a CoreOS VM locally, and is trying to get closer to the Docker Desktop experience... It's not really there yet, but the goal is that it would look like it is running locally - same as docker.

I'm not sure yet if we want to support it, at least not before it has storage and networking. https://github.com/kubernetes/minikube/issues/8003

But at least it has taken some steps toward running Podman "on" macOS, I suppose? So maybe it will eventually deserve a special case, different from the true remote case, like with Vagrant.

$ podman machine --help
Manage a virtual machine

Description:
  Manage a virtual machine. Virtual machines are used to run Podman.

Usage:
  podman machine [command]

Available Commands:
  init        Initialize a virtual machine
  list        List machines
  rm          Remove an existing machine
  ssh         SSH into an existing machine
  start       Start an existing machine
  stop        Stop an existing machine
$ podman machine init
Downloading VM image: fedora-coreos-34.20210503.1.0-qemu.x86_64.qcow2.xz: done  
Extracting compressed file
$ podman machine start
Waiting for VM ...
$ podman machine list
NAME                     VM TYPE     CREATED         LAST UP
podman-machine-default*  qemu        50 seconds ago  Currently running

afbjorklund commented 3 years ago

In case someone is interested in running Podman on macOS, here are some more links:

Linking to the original information at the deprecated upstream project ("boot2podman"):

aojea commented 3 years ago

Thanks to @ncdc for continuing to investigate. It seems this will be fixed by https://github.com/containers/podman/issues/11528#issuecomment-967205856