Mellanox / k8s-rdma-sriov-dev-plugin

Kubernetes Rdma SRIOV device plugin
Apache License 2.0

Restart the plugin failed on some nodes. #7

Closed flymark2010 closed 6 years ago

flymark2010 commented 6 years ago

Hi, I started a new node and created the SR-IOV device plugin, and everything went fine. Then I deleted the plugin and tried to create it again, but it failed.

Here is the Pod log on that node:

# kubectl logs rdma-sriov-dp-ds-p8gvx -n kube-system
2018/07/20 06:26:27 Starting K8s RDMA SRIOV Device Plugin version= 0.2
2018/07/20 06:26:27 Starting FS watcher.
2018/07/20 06:26:27 Starting OS watcher.
2018/07/20 06:26:27 Reading /k8s-rdma-sriov-dev-plugin/config.json
2018/07/20 06:26:27 loaded config:  {"mode":"sriov","pfNetdevices":["ens5f0"]}
2018/07/20 06:26:27 sriov device mode
Configuring SRIOV on ndev= ens5f0 6
max_vfs =  9
cur_vfs =  9
vf = &{0 virtfn0 false false}
Fail to config vfs for ndev = ens5f0
Fail to configure sriov; error =  Link not found
2018/07/20 06:26:27 Starting to serve on /var/lib/kubelet/device-plugins/rdma-sriov-dp.sock
2018/07/20 06:26:27 Registered device plugin with Kubelet
exposing devices:  []

But the network interface ens5f0 actually exists. Here is the ifconfig output:

ens5f0    Link encap:Ethernet  HWaddr 50:6b:4b:2f:1a:44  
          inet addr:10.128.1.17  Bcast:10.128.1.255  Mask:255.255.255.0
          inet6 addr: fe80::526b:4bff:fe2f:1a44/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2448992848 errors:0 dropped:61776 overruns:0 frame:0
          TX packets:1031347927 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:2935127310369 (2.9 TB)  TX bytes:590907008963 (590.9 GB)

ens5f1    Link encap:Ethernet  HWaddr 50:6b:4b:2f:1a:45  
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

ens5f4    Link encap:Ethernet  HWaddr 3a:2f:bb:84:34:85  
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:554052 errors:0 dropped:36597 overruns:0 frame:0
          TX packets:32977 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:117147687 (117.1 MB)  TX bytes:5181028 (5.1 MB)

......

Besides ens5f0 and ens5f1, the other network interfaces were created by the plugin, and they still exist even after I deleted the plugin.
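
One way to cross-check the plugin log against the kernel's view is to read the SR-IOV entries the PF exposes in sysfs. The sketch below is only an illustration, not the plugin's own code: it assumes the standard Linux layout (`sriov_totalvfs`, `sriov_numvfs`, and `virtfnN` entries under `/sys/class/net/<pf>/device`), and the helper name `sriov_state` and its `sysfs_root` parameter are hypothetical. Whether `max_vfs` and `cur_vfs` in the log are read from these same files is an assumption.

```python
import os


def sriov_state(pf, sysfs_root="/sys"):
    """Return SR-IOV counters and per-VF netdev names for a PF netdev.

    `sysfs_root` is a hypothetical parameter so the function can be
    pointed at a copy of the tree instead of the live /sys.
    """
    dev = os.path.join(sysfs_root, "class", "net", pf, "device")

    def read_int(name):
        # Each sysfs attribute holds a single decimal number.
        with open(os.path.join(dev, name)) as f:
            return int(f.read().strip())

    state = {
        "total_vfs": read_int("sriov_totalvfs"),
        "num_vfs": read_int("sriov_numvfs"),
        "vfs": {},
    }
    for i in range(state["num_vfs"]):
        vf = "virtfn%d" % i
        net_dir = os.path.join(dev, vf, "net")
        # A VF without a net/ directory currently has no netdev attached.
        names = sorted(os.listdir(net_dir)) if os.path.isdir(net_dir) else []
        state["vfs"][vf] = names
    return state
```

Calling `sriov_state("ens5f0")` on the affected node would show which `virtfnN` entries have no netdev. An empty list for a VF would be consistent with a "Link not found" error, although that connection is a guess rather than something confirmed in this thread.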

paravmellanox commented 6 years ago

@flymark2010, when the plugin is deleted, those vhca interfaces will still exist. However, when you restart the plugin, you shouldn't see this error.

It is likely a driver issue in which the VFs failed to initialize. Can you please share your driver log: /var/log/messages on CentOS, or /var/log/kern.log on Ubuntu?

flymark2010 commented 6 years ago

I restarted the plugin Pod and captured the following log from /var/log/kern.log:

Jul 23 09:42:20 CQ-YJY-10-128-2-17 kernel: [345967.237844] aufs au_opts_verify:1597:dockerd[2537]: dirperm1 breaks the protection by the permission bits on the lower branch
Jul 23 09:42:20 CQ-YJY-10-128-2-17 kernel: [345967.283161] aufs au_opts_verify:1597:dockerd[11730]: dirperm1 breaks the protection by the permission bits on the lower branch
Jul 23 09:42:20 CQ-YJY-10-128-2-17 kernel: [345967.333785] aufs au_opts_verify:1597:dockerd[2442]: dirperm1 breaks the protection by the permission bits on the lower branch
Jul 23 09:42:20 CQ-YJY-10-128-2-17 kernel: [345967.507524] aufs au_opts_verify:1597:dockerd[11730]: dirperm1 breaks the protection by the permission bits on the lower branch
Jul 23 09:42:20 CQ-YJY-10-128-2-17 kernel: [345967.544930] aufs au_opts_verify:1597:dockerd[11730]: dirperm1 breaks the protection by the permission bits on the lower branch
Jul 23 09:42:20 CQ-YJY-10-128-2-17 kernel: [345967.573501] aufs au_opts_verify:1597:dockerd[11730]: dirperm1 breaks the protection by the permission bits on the lower branch

The Pod log is the same as before.
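
The excerpt above contains only aufs messages from dockerd, so any driver output has to be picked out of the noise. A minimal sketch of such a filter follows. The pattern is an assumption: it guesses at Mellanox driver module names (`mlx4_*`, `mlx5_*`) and a few related keywords, and should be adjusted to whatever driver is actually loaded on the node. The names `DRIVER_PATTERN` and `driver_lines` are hypothetical.

```python
import re

# Assumed keywords: Mellanox driver module names plus SR-IOV/IOMMU terms.
# Adjust this to match the driver actually in use.
DRIVER_PATTERN = re.compile(r"mlx[45]_(core|ib|en)|sriov|iommu|DMAR", re.IGNORECASE)


def driver_lines(log_lines, pattern=DRIVER_PATTERN):
    """Keep only the log lines that match the driver-related pattern."""
    return [line.rstrip("\n") for line in log_lines if pattern.search(line)]
```

For example, `driver_lines(open("/var/log/kern.log"))` returns only the matching lines; an empty result for the restart window would suggest the driver logged nothing at that point.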

flymark2010 commented 6 years ago

The OS and kernel versions are:

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:    16.04
Codename:   xenial
# uname -r
4.4.0-87-generic

flymark2010 commented 6 years ago

I think this is related to the intel_iommu kernel parameter. I set intel_iommu=off, then rebooted and restarted the plugin. The plugin now runs normally and, surprisingly, GPU P2P communication works too (with intel_iommu=on, GPU P2P communication hangs). The OFED driver tutorial says that we should set intel_iommu=on, so I wonder whether setting intel_iommu=off will affect use of the RDMA SR-IOV plugin?

paravmellanox commented 6 years ago

@flymark2010 if you are using VMs, then you likely need intel_iommu=on. Otherwise, for using VFs in containers, you don't need it on, because a single kernel allocates and programs the hardware addresses in the NIC/HCA. GPU P2P might have some issues with the IOMMU; I don't recall the details anymore, as PCIe addressing gets complicated with the IOMMU enabled. So intel_iommu=off is fine for the plugin; it won't affect it.
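
To confirm which setting a node actually booted with, the kernel command line can be inspected. The sketch below reads `/proc/cmdline` and reports the `intel_iommu=` value. The function names are hypothetical, and reporting the last occurrence when the parameter appears more than once is an assumption about which value matters.

```python
def intel_iommu_setting(cmdline):
    """Return the value of the last intel_iommu= token, or None if absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("intel_iommu="):
            # Keep overwriting so the last occurrence is the one reported.
            value = token.split("=", 1)[1]
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

On a node, `intel_iommu_setting(read_cmdline())` would return `"off"` after the change described above, and `None` if the parameter was never set.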

flymark2010 commented 6 years ago

Ok, thanks!