k3d-io / k3d

Little helper to run CNCF's k3s in Docker
https://k3d.io/
MIT License

[FEATURE] Podman support #84

Closed · kkimdev closed this 2 years ago

kkimdev commented 5 years ago

Podman (https://podman.io/) is a drop-in alternative to Docker that fixes some of Docker's architectural issues: it runs without a daemon and supports rootless containers.

More info: https://developers.redhat.com/articles/podman-next-generation-linux-container-tools/
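
For context, k3d drives a Docker-compatible API socket, so pointing it at Podman generally means enabling Podman's system service and exporting DOCKER_HOST. A minimal sketch, assuming a systemd host and Podman's documented socket paths:

# rootful: enable the Docker-compatible API socket and point clients at it
sudo systemctl enable --now podman.socket
export DOCKER_HOST=unix:///run/podman/podman.sock

# rootless alternative: the socket lives in the user runtime directory
systemctl --user enable --now podman.socket
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock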

serverwentdown commented 2 years ago
[jdanek@fedora ~]$ podman --version
podman version 3.4.4

You'll need to upgrade to Podman v4. You can use the COPR if using Fedora: https://podman.io/blogs/2022/03/06/why_no_podman4_f35.html
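
On Fedora 35 that would look roughly like the following (a sketch assuming the rhcontainerbot/podman4 COPR named in the linked post; repository names may have changed since):

# enable the COPR and upgrade in place
sudo dnf copr enable rhcontainerbot/podman4
sudo dnf upgrade podman
podman --version   # should now report 4.x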

jiridanek commented 2 years ago

@jegger Can you try with Podman 4? When I try, it still does not work:

[root@fedora jdanek]# ~jdanek/Downloads/k3d-linux-amd64 cluster create
INFO[0000] Prep: Network                                
INFO[0000] Created network 'k3d-k3s-default'            
INFO[0000] Created image volume k3d-k3s-default-images  
INFO[0000] Starting new tools node...                   
INFO[0000] Starting Node 'k3d-k3s-default-tools'        
INFO[0001] Creating node 'k3d-k3s-default-server-0'     
INFO[0001] Creating LoadBalancer 'k3d-k3s-default-serverlb' 
INFO[0001] Using the k3d-tools node to gather environment information 
INFO[0001] HostIP: using network gateway 10.89.0.1 address 
INFO[0001] Starting cluster 'k3s-default'               
INFO[0001] Starting servers...                          
INFO[0001] Starting Node 'k3d-k3s-default-server-0'     
INFO[0005] All agents already running.                  
INFO[0005] Starting helpers...                          
INFO[0005] Starting Node 'k3d-k3s-default-serverlb'     
INFO[0012] Injecting records for hostAliases (incl. host.k3d.internal) and for 2 network members into CoreDNS configmap... 
INFO[0014] Cluster 'k3s-default' created successfully!  
INFO[0014] You can now use it like this:                
kubectl cluster-info
[root@fedora jdanek]# kubectl cluster-info
Kubernetes master is running at https://0.0.0.0:37659

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Unable to connect to the server: net/http: TLS handshake timeout
jiridanek commented 2 years ago
$ sudo podman logs k3d-k3s-default-server-0 |& grep luks
W0417 08:04:02.831384       2 fs.go:214] stat failed on /dev/mapper/luks-63cca6c4-98e1-467a-b8ee-acfac51b19ca with error: no such file or directory

I have btrfs on LVM on LUKS, so I suspect the issues in redhat-et/microshift#629 and kubernetes-sigs/kind#2411 could affect k3s as well.
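
A quick way to test that theory, reusing the device name from the stat error above: the LUKS mapper device should exist on the host but not inside the k3s node container, which is exactly what would make cAdvisor's stat call fail.

# on the host: the device exists
stat /dev/mapper/luks-63cca6c4-98e1-467a-b8ee-acfac51b19ca
# inside the node container: expected to fail with "No such file or directory"
sudo podman exec k3d-k3s-default-server-0 stat /dev/mapper/luks-63cca6c4-98e1-467a-b8ee-acfac51b19ca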

jiridanek commented 2 years ago

Also,

$ sudo podman exec -it k3d-k3s-default-server-0 kubectl logs pod/svclb-traefik-dj9dk lb-port-80 -n kube-system
+ trap exit TERM INT
+ echo 10.43.161.171
+ grep -Eq :
+ cat /proc/sys/net/ipv4/ip_forward
+ '[' 1 '!=' 1 ]
+ iptables -t nat -I PREROUTING '!' -s 10.43.161.171/32 -p TCP --dport 80 -j DNAT --to 10.43.161.171:80
iptables v1.8.4 (legacy): can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
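
That error typically means the iptable_nat kernel module is not loaded on the host, which is what the next comment addresses (note that modprobe treats - and _ as interchangeable). A quick check:

lsmod | grep iptable_nat || echo "iptable_nat not loaded"
sudo modprobe iptable_nat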
jiridanek commented 2 years ago

Following the instructions from the kind issue (and loading the iptables module), I now get further:

# modprobe iptable-nat
# ~jdanek/Downloads/k3d-linux-amd64 cluster create --volume '/dev/mapper/luks-63cca6c4-98e1-467a-b8ee-acfac51b19ca:/dev/mapper/luks-63cca6c4-98e1-467a-b8ee-acfac51b19ca@server:0' --volume '/dev/dm-0:/dev/dm-0@server:0'

This allows k3s to start inside the containers, and I can use it via podman exec:

$ sudo podman exec -it k3d-k3s-default-server-0 kubectl get nodes

but I cannot use the kubeconfig produced by k3d kubeconfig get from my host machine:

[jdanek@fedora ~]$ sudo netstat -nlpt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      2149/cupsd          
tcp       10      0 0.0.0.0:42451           0.0.0.0:*               LISTEN      133312/conmon       
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      1631/systemd-resolv 
tcp        0      0 127.0.0.54:53           0.0.0.0:*               LISTEN      1631/systemd-resolv 
tcp        0      0 0.0.0.0:5355            0.0.0.0:*               LISTEN      1631/systemd-resolv 
tcp6       0      0 ::1:631                 :::*                    LISTEN      2149/cupsd          
tcp6       0      0 :::5355                 :::*                    LISTEN      1631/systemd-resolv 
[jdanek@fedora ~]$ sudo podman exec -it k3d-k3s-default-server-0 kubectl get nodes
NAME                       STATUS   ROLES                  AGE   VERSION
k3d-k3s-default-server-0   Ready    control-plane,master   11m   v1.22.7+k3s1
[jdanek@fedora ~]$ sudo ~/Downloads/k3d-linux-amd64 kubeconfig get --all > k3s.conf
[jdanek@fedora ~]$ KUBECONFIG=k3s.conf kubectl get nodes
^C
[jdanek@fedora ~]$ KUBECONFIG=k3s.conf kubectl cluster-info
Kubernetes master is running at https://0.0.0.0:42451

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Unable to connect to the server: net/http: TLS handshake timeout
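
As an aside, the modprobe above is a one-shot fix. A sketch of making the module load persist across reboots, assuming Fedora's standard systemd-modules-load mechanism (not something verified in this thread):

echo iptable_nat | sudo tee /etc/modules-load.d/iptable_nat.conf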
serverwentdown commented 2 years ago

@jiridanek Thanks for the debugging work! It seems your cluster has already started. Can you also confirm that the generated kubeconfig is correct (#1045)? You can paste it here (with credentials redacted).

jiridanek commented 2 years ago

@serverwentdown

---
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: 
    server: https://0.0.0.0:42451
  name: k3d-k3s-default
contexts:
- context:
    cluster: k3d-k3s-default
    user: admin@k3d-k3s-default
  name: k3d-k3s-default
current-context: k3d-k3s-default
kind: Config
preferences: {}
users:
- name: admin@k3d-k3s-default
  user:
    client-certificate-data: 
    client-key-data: 
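
One sanity check worth running here, assuming the default k3d container names: confirm that the port in the kubeconfig matches what Podman actually published for the load balancer.

sudo podman port k3d-k3s-default-serverlb
# expected, matching the kubeconfig above:
# 6443/tcp -> 0.0.0.0:42451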
serverwentdown commented 2 years ago

@jiridanek I'll have to create a fresh VM to reproduce this, but I can only do that on Saturday. In the meantime, I'd suggest trying a few things that might fix the connection problem:

jiridanek commented 2 years ago

I pretty much did the steps above as part of the Fedora 35 -> 36 upgrade, so I guess I'll have to wait for you to investigate. One thing I suspected was the https://0.0.0.0 address: I know some tools don't accept 0.0.0.0 as meaning "localhost" when connecting (they are happy to listen on 0.0.0.0 but refuse to connect to it). But I am not positive this is actually causing any problems here.
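
That theory is cheap to test without regenerating anything, using kubectl's global --server flag (a hypothetical check, not something run in this thread):

KUBECONFIG=k3s.conf kubectl --server https://127.0.0.1:42451 cluster-info
# or rewrite the address in the kubeconfig itself
kubectl config set-cluster k3d-k3s-default --kubeconfig k3s.conf --server https://127.0.0.1:42451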

manics commented 2 years ago

I can reproduce the problem on a clean Fedora 35 system using Vagrant: https://github.com/manics/k3s-rootless/tree/main/k3d-podman-root

[root@fedora ~]# podman ps
CONTAINER ID  IMAGE                               COMMAND               CREATED        STATUS            PORTS                    NAMES
721101e65215  docker.io/rancher/k3s:v1.22.7-k3s1  server --tls-san ...  7 minutes ago  Up 7 minutes ago                           k3d-k3s-default-server-0
b8d49b05a75e  ghcr.io/k3d-io/k3d-proxy:5.4.1                            7 minutes ago  Up 7 minutes ago  0.0.0.0:46543->6443/tcp  k3d-k3s-default-serverlb

[root@fedora ~]# curl https://localhost:46543 -m 5 
curl: (28) Operation timed out after 5001 milliseconds with 0 out of 0 bytes received

[root@fedora ~]# podman exec k3d-k3s-default-serverlb curl -sk https://localhost:6443
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {

  },
  "status": "Failure",
  "message": "Unauthorized",
  "reason": "Unauthorized",
  "code": 401
}

Note port-forwarding works fine outside k3d:

[root@fedora ~]# podman run -d --name nginx -P docker.io/library/nginx
0c3e864e37cf0bf92ebe5634915f82d227b3a666f9b6038f07ba3bb4813ad240

[root@fedora ~]# podman ps
CONTAINER ID  IMAGE                               COMMAND               CREATED         STATUS             PORTS                    NAMES
721101e65215  docker.io/rancher/k3s:v1.22.7-k3s1  server --tls-san ...  10 minutes ago  Up 10 minutes ago                           k3d-k3s-default-server-0
b8d49b05a75e  ghcr.io/k3d-io/k3d-proxy:5.4.1                            10 minutes ago  Up 10 minutes ago  0.0.0.0:46543->6443/tcp  k3d-k3s-default-serverlb
0c3e864e37cf  docker.io/library/nginx:latest      nginx -g daemon o...  1 second ago    Up 2 seconds ago   0.0.0.0:34653->80/tcp    nginx

[root@fedora ~]# curl http://localhost:34653
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...
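
A follow-up probe that would help localize the failure (a hedged suggestion; the Go template below assumes the cluster network is named k3d-k3s-default, as in the logs): curl the load balancer on its container IP, bypassing the published host port. If this succeeds while curl to localhost:46543 times out, the broken hop is the host port publishing, not the proxy or k3s.

LB_IP=$(podman inspect k3d-k3s-default-serverlb -f '{{(index .NetworkSettings.Networks "k3d-k3s-default").IPAddress}}')
curl -skm5 https://$LB_IP:6443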
manics commented 2 years ago

It looks like the k3d-k3s-default-serverlb container doesn't have an IP address in the output of podman inspect (the top-level "IPAddress" field is empty), and there's also a warning about a missing /var/run:

[root@fedora ~]# podman inspect k3d-k3s-default-serverlb -f '{{json .NetworkSettings}}'
WARN[0000] Could not find mount at destination "/var/run" when parsing user volumes for container 0c8ae183c8e3fca92a1a07930a0aa5cb5bd9dba0228eb645393a0c418f95a4a1
{"EndpointID":"","Gateway":"","IPAddress":"","IPPrefixLen":0,"IPv6Gateway":"","GlobalIPv6Address":"","GlobalIPv6PrefixLen":0,"MacAddress":"","Bridge":"","SandboxID":"","HairpinMode":false,"LinkLocalIPv6Address":"","LinkLocalIPv6PrefixLen":0,"Ports":{"6443/tcp":[{"HostIp":"0.0.0.0","HostPort":"42791"}],"80/tcp":null},"SandboxKey":"/run/netns/netns-42e8574e-ffba-7388-cdcd-6c771a24ad79","Networks":{"k3d-k3s-default":{"EndpointID":"","Gateway":"10.89.0.1","IPAddress":"10.89.0.7","IPPrefixLen":24,"IPv6Gateway":"","GlobalIPv6Address":"","GlobalIPv6PrefixLen":0,"MacAddress":"da:c9:28:93:5a:c7","NetworkID":"k3d-k3s-default","DriverOpts":null,"IPAMConfig":null,"Links":null,"Aliases":["0c8ae183c8e3"]}}}

An IP address is seen inside the container:

[root@fedora ~]# podman exec k3d-k3s-default-serverlb ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether da:c9:28:93:5a:c7 brd ff:ff:ff:ff:ff:ff
    inet 10.89.0.7/24 brd 10.89.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::d8c9:28ff:fe93:5ac7/64 scope link
       valid_lft forever preferred_lft forever

For comparison, my nginx container has an IP address visible to podman inspect, and no warning:


[root@fedora ~]# podman inspect nginx -f '{{json .NetworkSettings}}' 
{"EndpointID":"","Gateway":"10.88.0.1","IPAddress":"10.88.0.4","IPPrefixLen":16,"IPv6Gateway":"","GlobalIPv6Address":"","GlobalIPv6PrefixLen":0,"MacAddress":"e6:55:1f:76:5b:bf","Bridge":""
,"SandboxID":"","HairpinMode":false,"LinkLocalIPv6Address":"","LinkLocalIPv6PrefixLen":0,"Ports":{"80/tcp":[{"HostIp":"","HostPort":"34653"}]},"SandboxKey":"/run/netns/netns-0bcf8132-131f-
85d0-a046-3dcd64aa492e","Networks":{"podman":{"EndpointID":"","Gateway":"10.88.0.1","IPAddress":"10.88.0.4","IPPrefixLen":16,"IPv6Gateway":"","GlobalIPv6Address":"","GlobalIPv6PrefixLen":0
,"MacAddress":"e6:55:1f:76:5b:bf","NetworkID":"podman","DriverOpts":null,"IPAMConfig":null,"Links":null,"Aliases":["0c3e864e37cf"]}}}
archseer commented 1 year ago

This seems to be a netavark issue. I hit the same problem as @manics after NixOS switched its networking stack over to netavark, and everything started working again once I switched back to CNI. Fedora made the switch to netavark much earlier, last year.
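
For anyone wanting to check which backend they are on, or to switch back, a sketch based on Podman 4's documented settings (changing the backend is only safe after removing existing containers and networks):

podman info --format '{{.Host.NetworkBackend}}'   # prints "netavark" or "cni"

# in /etc/containers/containers.conf (or ~/.config/containers/containers.conf):
# [network]
# network_backend = "cni"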

archseer commented 1 year ago

Since this is a closed issue, we should probably open a separate ticket.