sylabs / singularity-cri

The Singularity implementation of the Kubernetes Container Runtime Interface

Loopback (lo) network interface not present in container #351

Closed: jjhursey closed this issue 5 years ago

jjhursey commented 5 years ago

What are the steps to reproduce this issue?

  1. Create a Singularity container with ifconfig installed.
  2. kubectl create -f ./tiny.yaml (see tiny.yaml; a hypothetical sketch follows this list)
  3. kubectl exec PODNAME ifconfig to view the interfaces inside the container.
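
A minimal sketch of what tiny.yaml might look like (the actual manifest is attached to the issue; the name prefix and image here are placeholder assumptions, chosen only to make the example self-contained):

# tiny.yaml (hypothetical sketch, not the attached file)
apiVersion: v1
kind: Pod
metadata:
  generateName: test-sycri-    # pods in this thread are named test-sycri-*
spec:
  containers:
    - name: tiny
      image: busybox           # any image that ships ifconfig
      command: ["sleep", "3600"]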

What happens?

I have two worker nodes: one running Docker and one running Singularity-CRI. I have noticed that the Singularity worker starts containers without the lo loopback network interface.

With Docker I see two interfaces:

shell$ kubectl exec test-sycri-s2dtx ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1430
        inet 10.9.69.34  netmask 255.255.255.255  broadcast 0.0.0.0
        ether be:37:dc:35:9e:95  txqueuelen 0  (Ethernet)
        RX packets 8  bytes 656 (656.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

With Singularity-CRI I see only eth0:

shell$ kubectl exec test-sycri-tmznc ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1430
        inet 10.9.106.212  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::1cd9:61ff:fe50:f44b  prefixlen 64  scopeid 0x20<link>
        ether 1e:d9:61:50:f4:4b  txqueuelen 0  (Ethernet)
        RX packets 8  bytes 656 (656.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 656 (656.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

What were you expecting to happen?

I expect to see both the eth0 and lo network devices inside the running container.

Any logs, error output, comments, etc?

I monitored the systemd journal (journalctl -f) while sycri was creating the container and did not notice any error messages of note.

This is an IBM Cloud Private 3.1.2 environment running Kubernetes v1.12.4 with Calico.

Environment?

  * OS distribution and version: RHEL 7.6 (ppc64le)
  * go version: 1.11.5
  * Singularity-CRI version: v1.0.0-beta.5
  * Singularity version: 3.2.1-1.el7
  * Kubernetes version: v1.12.4+icp-ee

sashayakovtseva commented 5 years ago

Hi @jjhursey,

Singularity-CRI uses the CNI configs located in /etc/cni/net.d by default (this can be changed in the config file). Make sure your default CNI config (the first one in the config dir) lists the loopback plugin first, e.g.:

{
    "cniVersion": "0.4.0",
    "name": "bridge",
    "plugins": [
        {
            "type": "loopback"
        },
        {
            "type": "bridge",
            "bridge": "sbr0",
            "isGateway": true,
            "ipMasq": true,
            "ipam": {
                "type": "host-local",
                "subnet": "10.22.0.0/16",
                "routes": [
                    { "dst": "0.0.0.0/0" }
                ]
            }
        },
        {
            "type": "firewall"
        },
        {
            "type": "portmap",
            "capabilities": {"portMappings": true},
            "snat": true
        }
    ]
}

Let me know if that works for you.

jjhursey commented 5 years ago

The configuration file is pushed to the worker nodes from the ICP/k8s management node when the kubelet starts up. The pushed configuration is below:

shell$ cat /etc/cni/net.d/10-calico.conflist
{
    "name": "k8s-pod-network",
    "cniVersion": "0.3.0",
    "plugins": [
      {
        "type": "calico",
        "etcd_endpoints": "https://9.1.2.3:4001",
        "etcd_key_file": "/etc/cni/net.d/calico-tls/etcd-key",
        "etcd_cert_file": "/etc/cni/net.d/calico-tls/etcd-cert",
        "etcd_ca_cert_file": "/etc/cni/net.d/calico-tls/etcd-ca",
        "mtu": 1430,
        "log_level": "info",
        "ipam": {
          "type": "calico-ipam"
        },
        "policy": {
          "type": "k8s"
        },
        "kubernetes": {
            "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
        }
      },
      {
        "type": "portmap",
        "snat": true,
        "capabilities": {"portMappings": true}
      }
    ]
}

Adding the loopback plugin to the top of the plugins list in 10-calico.conflist works after the next restart of sycri/kubelet, allowing lo to appear in new pods on that host:

{
    "name": "k8s-pod-network",
    "cniVersion": "0.3.0",
    "plugins": [
        {
            "type": "loopback"
        },
        {
            "type": "calico",
...

But that file is then overwritten when the kubelet starts up. I assume it still works because sycri read the edited version before it was overwritten.

So that does work; see below.

shell$ kubectl exec test-sycri-v66gl ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1430
        inet 10.9.106.240  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::1055:e5ff:fe48:ed5a  prefixlen 64  scopeid 0x20<link>
        ether 12:55:e5:48:ed:5a  txqueuelen 0  (Ethernet)
        RX packets 8  bytes 656 (656.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 656 (656.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

However, if I need to restart sycri/kubelet then I need to make sure to edit that file again. The workaround is to create a copy of 10-calico.conflist in /etc/cni/net.d called 00-my-calico.conflist that adds the loopback plugin. That seems to make the fix stick; however, any updates pushed from the ICP/k8s master would be lost.
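
A sketch of that workaround as a one-liner, assuming jq is available on the node (run as root, and re-run whenever a push from the master overwrites 10-calico.conflist):

# Prepend the loopback plugin and write a copy that sorts first in /etc/cni/net.d
jq '.plugins = [{"type": "loopback"}] + .plugins' \
    /etc/cni/net.d/10-calico.conflist \
    > /etc/cni/net.d/00-my-calico.conflist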

The Docker k8s worker has the same original 10-calico.conflist file. I'm wondering why I need to explicitly specify the loopback plugin when using sycri but not with Docker. I don't know enough about the CNI protocol to say for sure whether this is a sycri or an ICP issue. I wonder if it is an implicit requirement (maybe just for Calico) that the lo interface be present in every pod. The Calico conf examples I'm turning up (looking here) also don't have the loopback plugin listed.

sashayakovtseva commented 5 years ago

@jjhursey Good question about making loopback a default plugin and always running it. I will discuss it with @cclerget when he is back, but it looks like we don't want to be too smart here and would prefer to follow the CNI configuration as-is. If I understand correctly, you can include loopback in the Calico conflist (see the installation guide). I'm not sure where the dynamic conflist update comes from. Are you using Calico controllers?

jjhursey commented 5 years ago

The calico-kube-controllers deployment is running, and the loopback plugin is installed in /opt/cni/bin/. I can search around to see if there is a way to update the master copy of the file that is being pushed out.

It is this sentence in the Calico config documentation that makes me think loopback should be a default, though it is unclear:

In addition to the CNI plugin specified by the CNI config file, Kubernetes requires the standard CNI loopback plugin.

Let me know what you find out.

I haven't tried any other CRI implementations (e.g., CRI-O, containerd) to see how they react to this config file, just the docker(shim) CRI. Having the additional loopback plugin listed does not seem to have a negative impact on the Docker worker node.