k8snetworkplumbingwg / multi-networkpolicy-iptables

MultiNetworkPolicy iptable based implementation
Apache License 2.0

multi-networkpolicy containerd #11

Closed elgamal2020 closed 3 years ago

elgamal2020 commented 3 years ago

Hello,

I tried to deploy multi-networkpolicy. First, the pods could not be scheduled because of an error in the deployment.

See line 110 in deploy.yml:
  add: ["SYS_ADMIN", "SYS_NET_ADMIN"]

It should be NET_ADMIN. After correcting that, the pods were scheduled, but then they go into an error state:

kubectl -n kube-system logs multi-networkpolicy-ds-amd64-tccgv
I0916 01:25:36.063146       1 server.go:174] Neither kubeconfig file nor master URL was specified. Falling back to in-cluster config.
E0916 01:25:46.160911       1 pod.go:448] failed to get cri client: failed to connect: failed to connect to unix:///host/run/crio/crio.sock, make sure you are running as root and the runtime has been started: context deadline exceeded
F0916 01:25:46.161146       1 main.go:62] cannot create pod change tracker

I also tried to run the DaemonSet as root.

Then I changed the container runtime to docker, and also tried adding the containerd socket; I am not sure whether that is supported.

Below is the modified deploy.yml:

cat deploy.yml
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: multi-networkpolicy
rules:
  - apiGroups: ["k8s.cni.cncf.io"]
    resources:
      - '*'
    verbs:
      - '*'
  - apiGroups:
      - ""
    resources:
      - pods
      - namespaces
    verbs:
      - list
      - watch
      - get
  # Watch for changes to Kubernetes NetworkPolicies.
  - apiGroups: ["networking.k8s.io"]
    resources:
      - networkpolicies
    verbs:
      - watch
      - list
  - apiGroups:
      - ""
      - events.k8s.io
    resources:
      - events
    verbs:
      - create
      - patch
      - update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: multi-networkpolicy
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: multi-networkpolicy
subjects:
- kind: ServiceAccount
  name: multi-networkpolicy
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: multi-networkpolicy
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: multi-networkpolicy-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: multi-networkpolicy
    name: multi-networkpolicy
spec:
  selector:
    matchLabels:
      name: multi-networkpolicy
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        tier: node
        app: multi-networkpolicy
        name: multi-networkpolicy
    spec:
      hostNetwork: true
      nodeSelector:
        kubernetes.io/arch: amd64
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: multi-networkpolicy
      containers:
      - name: multi-networkpolicy
        # crio support requires multus:latest for now. support 3.3 or later.
        image: ghcr.io/k8snetworkplumbingwg/multi-networkpolicy-iptables:snapshot-amd64
        imagePullPolicy: Always
        command: ["/usr/bin/multi-networkpolicy-iptables"]
        args:
        - "--host-prefix=/host"
        # uncomment this if runtime is docker
        - "--container-runtime=docker"
        # change this if runtime is different that crio default
        - "--container-runtime-endpoint=unix:///run/containerd/containerd.sock"
        # uncomment this if you want to store iptables rules
        - "--pod-iptables=/var/lib/multi-networkpolicy/iptables"
        resources:
          requests:
            cpu: "100m"
            memory: "80Mi"
          limits:
            cpu: "100m"
            memory: "150Mi"
        securityContext:
          privileged: true
          runAsUser: 0
          capabilities:
            add: ["SYS_ADMIN", "NET_ADMIN"]
        volumeMounts:
        - name: host
          mountPath: /host
        - name: var-lib-multinetworkpolicy
          mountPath: /var/lib/multi-networkpolicy
      volumes:
        - name: host
          hostPath:
            path: /
        - name: var-lib-multinetworkpolicy
          hostPath:
            path: /var/lib/multi-networkpolicy

And the logs:

kubectl -n kube-system logs multi-networkpolicy-ds-amd64-6jx6t
I0916 02:01:43.685347       1 server.go:174] Neither kubeconfig file nor master URL was specified. Falling back to in-cluster config.
I0916 02:01:43.888556       1 options.go:73] hostname: sysarch-k8s-master-nf-1
I0916 02:01:43.888587       1 options.go:74] container-runtime: docker
I0916 02:01:43.889247       1 namespace.go:79] Starting ns config controller
I0916 02:01:43.889581       1 shared_informer.go:223] Waiting for caches to sync for ns config
I0916 02:01:43.889297       1 server.go:160] Starting network-policy-node
I0916 02:01:43.982086       1 networkpolicy.go:81] Starting policy config controller
I0916 02:01:43.982131       1 shared_informer.go:223] Waiting for caches to sync for policy config
I0916 02:01:43.990857       1 net-attach-def.go:84] Starting net-attach-def config controller
I0916 02:01:43.990892       1 shared_informer.go:223] Waiting for caches to sync for net-attach-def config
I0916 02:01:44.183237       1 shared_informer.go:230] Caches are synced for policy config
I0916 02:01:44.183753       1 server.go:355] OnPolicySynced
I0916 02:01:44.183788       1 shared_informer.go:230] Caches are synced for net-attach-def config
I0916 02:01:44.183800       1 server.go:388] OnNetDefSynced
I0916 02:01:44.282163       1 shared_informer.go:230] Caches are synced for ns config
I0916 02:01:44.282200       1 server.go:421] OnNamespaceSynced
I0916 02:01:44.282207       1 server.go:100] Starting pod config
I0916 02:01:44.283253       1 pod.go:123] Starting pod config controller
I0916 02:01:44.283277       1 shared_informer.go:223] Waiting for caches to sync for pod config
E0916 02:01:45.190935       1 pod.go:368] failed to get pod(nti-mesh/linkerd-sp-validator-85bc85dcb4-7ltr2) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
I0916 02:01:45.380851       1 shared_informer.go:230] Caches are synced for pod config
I0916 02:01:45.380917       1 server.go:324] OnPodSynced
E0916 02:01:45.384701       1 pod.go:368] failed to get pod(nti-mesh/linkerd-destination-658f496bcd-8fxww) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0916 02:01:45.385515       1 server.go:453] cannot get nti-security/ncms-cainjector-66d4c5db74-g6pt2 podInfo: not found
E0916 02:01:45.385574       1 server.go:453] cannot get nti-security/ncms-controller-9d47b9bfc-lkkp5 podInfo: not found
E0916 02:01:45.385937       1 server.go:453] cannot get nti-monitoring/ntim-prometheus-68769b565d-w9dnb podInfo: not found
E0916 02:01:45.385986       1 server.go:453] cannot get ns1/nf0-backend-f9ccf678d-qc5k8 podInfo: not found
E0916 02:01:45.386255       1 server.go:453] cannot get ns0/nf0-tgen-7687f9456d-wt4fw podInfo: not found
E0916 02:01:45.386630       1 server.go:453] cannot get nti-security/ncms-cainjector-66d4c5db74-g6pt2 podInfo: not found
E0916 02:01:45.386678       1 server.go:453] cannot get nti-security/ncms-controller-9d47b9bfc-lkkp5 podInfo: not found
E0916 02:01:45.483181       1 server.go:453] cannot get nti-monitoring/ntim-prometheus-68769b565d-w9dnb podInfo: not found
E0916 02:01:45.483237       1 server.go:453] cannot get ns1/nf0-backend-f9ccf678d-qc5k8 podInfo: not found
E0916 02:01:45.483255       1 server.go:453] cannot get ns0/nf0-tgen-7687f9456d-wt4fw podInfo: not found
E0916 02:01:45.483382       1 server.go:453] cannot get nti-monitoring/ntim-prometheus-68769b565d-w9dnb podInfo: not found
E0916 02:01:45.483393       1 server.go:453] cannot get ns1/nf0-backend-f9ccf678d-qc5k8 podInfo: not found
E0916 02:01:45.483409       1 server.go:453] cannot get ns0/nf0-tgen-7687f9456d-wt4fw podInfo: not found
E0916 02:01:45.483471       1 server.go:453] cannot get nti-security/ncms-cainjector-66d4c5db74-g6pt2 podInfo: not found
E0916 02:01:45.483505       1 server.go:453] cannot get nti-security/ncms-controller-9d47b9bfc-lkkp5 podInfo: not found
E0916 02:01:45.487501       1 pod.go:368] failed to get pod(nti-monitoring/ntim-prometheus-68769b565d-w9dnb) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0916 02:01:45.487733       1 server.go:453] cannot get nti-security/ncms-cainjector-66d4c5db74-g6pt2 podInfo: not found
E0916 02:01:45.487762       1 server.go:453] cannot get nti-security/ncms-controller-9d47b9bfc-lkkp5 podInfo: not found
E0916 02:01:45.487793       1 server.go:453] cannot get ns1/nf0-backend-f9ccf678d-qc5k8 podInfo: not found
E0916 02:01:45.487820       1 server.go:453] cannot get ns0/nf0-tgen-7687f9456d-wt4fw podInfo: not found
E0916 02:01:45.487902       1 server.go:453] cannot get nti-security/ncms-cainjector-66d4c5db74-g6pt2 podInfo: not found
E0916 02:01:45.487927       1 server.go:453] cannot get nti-security/ncms-controller-9d47b9bfc-lkkp5 podInfo: not found
E0916 02:01:45.487954       1 server.go:453] cannot get ns1/nf0-backend-f9ccf678d-qc5k8 podInfo: not found
E0916 02:01:45.487976       1 server.go:453] cannot get ns0/nf0-tgen-7687f9456d-wt4fw podInfo: not found
E0916 02:01:45.488269       1 pod.go:368] failed to get pod(nti-security/ncms-cainjector-66d4c5db74-g6pt2) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0916 02:01:45.488530       1 pod.go:368] failed to get pod(ns1/nf0-backend-f9ccf678d-qc5k8) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0916 02:01:45.488980       1 pod.go:368] failed to get pod(ns0/nf0-tgen-7687f9456d-wt4fw) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0916 02:01:45.489225       1 server.go:453] cannot get nti-security/ncms-controller-9d47b9bfc-lkkp5 podInfo: not found
E0916 02:01:45.489374       1 server.go:453] cannot get nti-security/ncms-controller-9d47b9bfc-lkkp5 podInfo: not found
E0916 02:01:45.489622       1 pod.go:368] failed to get pod(nti-security/ncms-controller-9d47b9bfc-lkkp5) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
s1061123 commented 3 years ago

Thank you for the comment, @elgamal2020

multi-networkpolicy requires not only NET_ADMIN but also privileges for entering the pod network namespace. The following error message shows that multi-networkpolicy cannot reach crio.sock (the unix socket for the container runtime).

E0916 01:25:46.160911 1 pod.go:448] failed to get cri client: failed to connect: failed to connect to unix:///host/run/crio/crio.sock, make sure you are running as root and the runtime has been started: context deadline exceeded

This might come from several reasons. Please check the following:
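
For example, a quick way to see which runtime socket actually exists on the node (a sketch using only the common default paths; your distribution may differ):

# run directly on the node; only the sockets that exist will be listed
ls -l /run/crio/crio.sock /run/containerd/containerd.sock /var/run/dockershim.sock /var/run/docker.sock 2>/dev/null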

elgamal2020 commented 3 years ago

Thanks @s1061123 for your answer. Actually, what is wrong there is that "SYS_NET_ADMIN" does not exist as a capability.

I changed line 110 from

add: ["SYS_ADMIN", "SYS_NET_ADMIN"]

to

        add: ["SYS_ADMIN", "NET_ADMIN"]
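
If it helps, a quick way to confirm the valid capability name on a node (a sketch; it assumes libcap's capsh is installed):

# prints cap_net_admin; there is no cap_sys_net_admin in the kernel's capability set
capsh --print | tr ',' '\n' | grep -i net_admin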

I also added running as root:

runAsUser: 0

But then I don't have cri; my setup uses the docker engine with the containerd runtime.

I would like to ask what the correct arguments would be to try with nsenter:

 nsenter -h

Usage:
 nsenter [options] <program> [<argument>...]

Run a program with namespaces of other processes.

Options:
 -t, --target <pid>     target process to get namespaces from
 -m, --mount[=<file>]   enter mount namespace
 -u, --uts[=<file>]     enter UTS namespace (hostname etc)
 -i, --ipc[=<file>]     enter System V IPC namespace
 -n, --net[=<file>]     enter network namespace
 -p, --pid[=<file>]     enter pid namespace
 -U, --user[=<file>]    enter user namespace
 -S, --setuid <uid>     set uid in entered namespace
 -G, --setgid <gid>     set gid in entered namespace
     --preserve-credentials do not touch uids or gids
 -r, --root[=<dir>]     set the root directory
 -w, --wd[=<dir>]       set the working directory
 -F, --no-fork          do not fork before exec'ing <program>
 -Z, --follow-context   set SELinux context according to --target PID

 -h, --help     display this help and exit
 -V, --version  output version information and exit
s1061123 commented 3 years ago

As I mentioned, currently the root cause of your issue is the following message:

E0916 01:25:46.160911 1 pod.go:448] failed to get cri client: failed to connect: failed to connect to unix:///host/run/crio/crio.sock, make sure you are running as root and the runtime has been started: context deadline exceeded

So you need to point to the right container runtime socket. This is not related to changing the privileges. In the docker case, you may need to change --container-runtime.

[tohayash@tohayash-srv multi-networkpolicy-iptables]$ ./multi-networkpolicy-iptables -h
TBD

Usage:
  multi-networkpolicy-node [flags]

Flags:
      --container-runtime RuntimeKind       Container runtime using for the cluster. Possible values: 'cri', 'docker'.  (default cri)
      --container-runtime-endpoint string   Path to cri socket.
      --kubeconfig string                   Path to kubeconfig file with authorization information (the master location is set by the master flag).
      --master string                       The address of the Kubernetes API server (overrides any value in kubeconfig)
      --hostname-override string            If non-empty, will use this string as identification instead of the actual hostname.
      --host-prefix string                  If non-empty, will use this string as prefix for host filesystem.
      --network-plugins strings             List of network plugins to be considered for network policies. (default [macvlan])
      --pod-iptables string                 If non-empty, will use this path to store pod's iptables for troubleshooting helper.
      --add_dir_header                      If true, adds the file directory to the header
      --alsologtostderr                     log to standard error as well as files
      --log_backtrace_at traceLocation      when logging hits line file:N, emit a stack trace (default :0)
      --log_dir string                      If non-empty, write log files in this directory
      --log_file string                     If non-empty, use this log file
      --log_file_max_size uint              Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
      --logtostderr                         log to standard error instead of files (default true)
      --skip_headers                        If true, avoid header prefixes in the log messages
      --skip_log_headers                    If true, avoid headers when opening log files
      --stderrthreshold severity            logs at or above this threshold go to stderr (default 2)
  -v, --v Level                             number for the log level verbosity
      --vmodule moduleSpec                  comma-separated list of pattern=N settings for file-filtered logging
      --log-flush-frequency duration        Maximum number of seconds between log flushes (default 5s)
  -h, --help                                help for multi-networkpolicy-node

Regarding nsenter, you don't use it directly; multi-networkpolicy-iptables uses it for you. If you are interested, you can find various blog posts about it, for example: https://www.redhat.com/sysadmin/container-namespaces-nsenter
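
For reference, the general pattern those posts describe looks roughly like this on a docker node (a sketch; <container-id> is a placeholder):

# resolve the container's init PID, then enter only its network namespace
PID=$(docker inspect -f '{{.State.Pid}}' <container-id>)
nsenter -t "$PID" -n ip addr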

Hence I recommend reverting your privilege changes and just changing --container-runtime first.
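
One way to check whether a given socket is actually reachable from inside the DaemonSet pod (a sketch based on the deploy.yml above, which mounts the host / at /host and labels the pods name=multi-networkpolicy; it assumes the image ships a shell and ls, which may not be the case):

# pick one multi-networkpolicy pod and list the candidate sockets through the /host mount
POD=$(kubectl -n kube-system get pod -l name=multi-networkpolicy -o name | head -n 1)
kubectl -n kube-system exec "$POD" -- ls -l /host/run/crio/crio.sock /host/run/containerd/containerd.sock /host/var/run/docker.sock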

elgamal2020 commented 3 years ago

Hello,

I have the docker engine on top of the containerd runtime, which is the default setup by kubespray.

I have --container-runtime correctly set to use docker; what I don't know is how to correctly configure --container-runtime-endpoint.

So I have tried this:

  - "--container-runtime=docker"
        # change this if runtime is different that crio default
        - "--container-runtime-endpoint=unix:///run/containerd/containerd.sock"

I also tried:

   args:
        - "--host-prefix=/host"
        # uncomment this if runtime is docker
        - "--container-runtime=docker"
        # change this if runtime is different that crio default
        - "--container-runtime-endpoint=/var/run/dockershim.sock"
        # uncomment this if you want to store iptables rules
        - "--pod-iptables=/var/lib/multi-networkpolicy/iptables"

In the logs, I always get:

kubectl  -n kube-system logs multi-networkpolicy-ds-amd64-s78fl
I0917 02:00:34.383246       1 server.go:174] Neither kubeconfig file nor master URL was specified. Falling back to in-cluster config.
I0917 02:00:34.687996       1 options.go:73] hostname: sysarch-k8s-master-nf-1
I0917 02:00:34.688040       1 options.go:74] container-runtime: docker
I0917 02:00:34.688676       1 server.go:160] Starting network-policy-node
I0917 02:00:34.688798       1 namespace.go:79] Starting ns config controller
I0917 02:00:34.688831       1 shared_informer.go:223] Waiting for caches to sync for ns config
I0917 02:00:34.780745       1 net-attach-def.go:84] Starting net-attach-def config controller
I0917 02:00:34.780847       1 shared_informer.go:223] Waiting for caches to sync for net-attach-def config
I0917 02:00:34.780963       1 networkpolicy.go:81] Starting policy config controller
I0917 02:00:34.780980       1 shared_informer.go:223] Waiting for caches to sync for policy config
I0917 02:00:34.983442       1 shared_informer.go:230] Caches are synced for net-attach-def config
I0917 02:00:34.983506       1 server.go:388] OnNetDefSynced
I0917 02:00:35.089476       1 shared_informer.go:230] Caches are synced for ns config
I0917 02:00:35.089527       1 server.go:421] OnNamespaceSynced
I0917 02:00:35.181582       1 shared_informer.go:230] Caches are synced for policy config
I0917 02:00:35.181807       1 server.go:355] OnPolicySynced
I0917 02:00:35.181842       1 server.go:100] Starting pod config
I0917 02:00:35.183570       1 pod.go:123] Starting pod config controller
I0917 02:00:35.183628       1 shared_informer.go:223] Waiting for caches to sync for pod config
E0917 02:00:36.290191       1 pod.go:368] failed to get pod(nti-monitoring/ntim-prometheus-68769b565d-w9dnb) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0917 02:00:36.290685       1 pod.go:368] failed to get pod(nti-security/ncms-cainjector-66d4c5db74-g6pt2) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0917 02:00:36.381430       1 pod.go:368] failed to get pod(ns1/nf0-backend-65f69877d7-vkb9t) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0917 02:00:36.381760       1 pod.go:368] failed to get pod(nti-security/ncms-controller-9d47b9bfc-lkkp5) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0917 02:00:36.382243       1 pod.go:368] failed to get pod(nti-mesh/linkerd-sp-validator-85bc85dcb4-7ltr2) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
E0917 02:00:36.382865       1 pod.go:368] failed to get pod(nti-mesh/linkerd-destination-658f496bcd-8fxww) network namespace: failed to get container info: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
I0917 02:00:36.384240       1 shared_informer.go:230] Caches are synced for pod config
I0917 02:00:36.384271       1 server.go:324] OnPodSynced

Then see my containers; they are all using /run/containerd/containerd.sock:


ps -aux | grep docker
root         762  2.8  1.0 2850952 128600 ?      Ssl  Apr14 6452:06 /usr/bin/dockerd --iptables=false --exec-opt native.cgroupdriver=systemd --data-root=/var/lib/docker --log-opt max-size=50m --log-opt max-file=5 --dns 10.233.0.3 --dns 127.0.0.53 --dns-search default.svc.sysarch --dns-search svc.sysarch --dns-search openstacklocal --dns-opt ndots:2 --dns-opt timeout:2 --dns-opt attempts:2
root        1672  0.0  0.4 988424 59632 ?        Sl   Apr14  33:44 /usr/bin/docker run --restart=on-failure:5 --env-file=/etc/etcd.env --net=host -v /etc/ssl/certs:/etc/ssl/certs:ro -v /etc/ssl/etcd/ssl:/etc/ssl/etcd/ssl:ro -v /var/lib/etcd:/var/lib/etcd:rw --memory=0 --blkio-weight=1000 --name=etcd1 quay.io/coreos/etcd:v3.4.13 /usr/local/bin/etcd
root        1837  0.0  0.0 108720  6432 ?        Sl   Apr14  18:04 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/fa4fe09a0ba76e7ee63781a8544b3e7ff6f84a8000c60f7db5341c9b7c69564c -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc -systemd-cgroup
root        2621  0.0  0.0 108720  5312 ?        Sl   Apr14  11:42 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/fbd07334f115e114aec1b6ba07f954e21c925a03d9d383cde2e483d568164e9b -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc -systemd-cgroup
root        2666  0.0  0.0 108720  5312 ?        Sl   Apr14  11:17 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/a1422e4b08e44fc061673c874f1dff4f00ebb252bd6e3a55cfd1b6faca542827 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc -systemd-cgroup
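
A quick way to confirm whether the daemon also answers on its default socket on the host (a sketch assuming the standard /var/run/docker.sock path):

# if this lists containers, dockerd is reachable at the default unix socket
docker -H unix:///var/run/docker.sock ps --format '{{.Names}}' | head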
s1061123 commented 3 years ago

We have only tested with kubeadm, but multi-networkpolicy does not use any kubeadm-specific features or configuration. I don't know what the root cause is yet.