Mellanox / docker-sriov-plugin

Docker networking plugin for SRIOV and passthrough interfaces

Infiniband Card inside the container #6

Open psaini79 opened 5 years ago

psaini79 commented 5 years ago

Hi,

I have an mlx4 IB card configured on my compute node:

ibdev2netdev
mlx4_0 port 1 ==> ib0 (Up)
mlx4_0 port 2 ==> ib1 (Up)

I have bonded both of them and assigned an IP in the 192.168.208.0/24 subnet. I want to use the SRIOV plugin to assign a VF interface to a container, but I am unable to do so because I get the following error:

docker network create -d sriov --subnet=194.168.1.0/24 -o netdevice=bondib0 mynet
Error response from daemon: NetworkDriver.CreateNetwork: Fail to enable sriov: device bondib0 not found

This is expected, since bondib0 is not an Ethernet card, but I am wondering how I can expose an IB VF of the card to a container.

Please provide the steps to configure an IB VF inside a container using the SRIOV plugin.

paravmellanox commented 5 years ago

@psaini79 please pass the PF netdevice. Is the PF netdevice named bondib0? Is that what you specified when you created the mynet network?

Please share the output of the commands below.

ip link show
docker network inspect mynet
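
If bondib0 is a bond over the IB ports, the PF netdevices behind it can be found with something like this (a sketch assuming the standard bonding and Mellanox sysfs layout; device names are the ones from this thread):

```
# list the slave netdevs behind the bond (should show ib0 ib1)
cat /sys/class/net/bondib0/bonding/slaves

# map IB devices to netdevs; the netdevs reported here are the PF netdevices
ibdev2netdev

# check that the PF's PCI device is SR-IOV capable
cat /sys/class/net/ib0/device/sriov_totalvfs
```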

psaini79 commented 5 years ago

Please find the output below:

ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 90:e2:ba:9e:b4:c8 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 90:e2:ba:9e:b4:c9 brd ff:ff:ff:ff:ff:ff
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:10:e0:84:de:2a brd ff:ff:ff:ff:ff:ff
5: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bondeth0 state UP mode DEFAULT group default qlen 1000
    link/ether 00:10:e0:84:de:2b brd ff:ff:ff:ff:ff:ff
6: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bondeth0 state UP mode DEFAULT group default qlen 1000
    link/ether 00:10:e0:84:de:2b brd ff:ff:ff:ff:ff:ff
7: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:10:e0:84:de:2d brd ff:ff:ff:ff:ff:ff
8: ib0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast master bondib0 state UP mode DEFAULT group default qlen 4096
    link/infiniband 80:00:02:08:fe:80:00:00:00:00:00:00:00:10:e0:00:01:74:f6:c1 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
9: ib1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast master bondib0 state UP mode DEFAULT group default qlen 4096
    link/infiniband 80:00:02:09:fe:80:00:00:00:00:00:00:00:10:e0:00:01:74:f6:c2 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
10: bond100: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN mode DEFAULT group default
    link/ether f6:7c:6f:b0:81:4f brd ff:ff:ff:ff:ff:ff
11: bondib0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 65520 qdisc noqueue state UP mode DEFAULT group default
    link/infiniband 80:00:02:08:fe:80:00:00:00:00:00:00:00:10:e0:00:01:74:f6:c1 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
12: bondeth0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 00:10:e0:84:de:2b brd ff:ff:ff:ff:ff:ff
13: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:15:4f:4a:e8 brd ff:ff:ff:ff:ff:ff
27: vethdb0c9b6@if26: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
    link/ether 36:3e:fc:9d:d9:d0 brd ff:ff:ff:ff:ff:ff link-netnsid 0
30: br-6876037b9ce0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 02:42:3f:4e:ef:db brd ff:ff:ff:ff:ff:ff
31: br-ca4e56e5e90e: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:5d:d8:6f:45 brd ff:ff:ff:ff:ff:ff
37: veth2a4e4cd@if36: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-ca4e56e5e90e state UP mode DEFAULT group default
    link/ether 3e:91:f8:ce:48:11 brd ff:ff:ff:ff:ff:ff link-netnsid 1
38: ib0.8002@ib0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2044 qdisc pfifo_fast state LOWERLAYERDOWN mode DEFAULT group default qlen 4096
    link/infiniband 80:00:02:10:fe:80:00:00:00:00:00:00:00:10:e0:00:01:74:f6:c1 brd 00:ff:ff:ff:ff:12:40:1b:80:02:00:00:00:00:00:00:ff:ff:ff:ff
45: ib0.fffa@ib0: <BROADCAST,MULTICAST> mtu 2044 qdisc noop state DOWN mode DEFAULT group default qlen 4096
    link/infiniband 80:00:02:17:fe:80:00:00:00:00:00:00:00:10:e0:00:01:74:f6:c1 brd 00:ff:ff:ff:ff:12:40:1b:ff:fa:00:00:00:00:00:00:ff:ff:ff:ff

I was unable to create the mynet network on bondib0. I tried both ib0 and ib1 and got the same error.

paravmellanox commented 5 years ago

@psaini79 what is the bondib0 device? Is it a bond device created on top of ib0 and ib1?

You should be able to create a docker network using the docker network create command on ib0 (the netdev of the PF). If it is failing, please share the plugin logs from the start.

On a side note, creating a docker network with the sriov plugin over a bonded netdevice is currently unsupported.
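
For example, something like this (a sketch; it assumes the plugin container was started from the rdma/sriov-plugin image as described in this repo's README, and the container name shown is illustrative):

```
# create the network on the PF netdev instead of the bond
docker network create -d sriov --subnet=194.168.1.0/24 -o netdevice=ib0 mynet

# find the plugin container and collect its logs from startup
docker ps | grep sriov-plugin
docker logs <sriov-plugin-container-id>
```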

psaini79 commented 5 years ago

Thanks for the quick reply. I will create the network on ib0 and ib1. However, I have a question:

Do I need to execute any other steps on ib0 to create a netdev of the PF? Is there any link that provides details on creating the netdev of the PF on ib0 or ib1?

paravmellanox commented 5 years ago

@psaini79 ib0 and ib1 are the netdevs of the PF. Use only one of them, since both share the same PCI function; ConnectX-3 has only one PCI function.
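
You can confirm that both netdevs sit on the same PCI function with something like this (standard sysfs paths):

```
# both should resolve to the same PCI address, e.g. 0000:03:00.0
readlink /sys/class/net/ib0/device
readlink /sys/class/net/ib1/device
```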

There is no special configuration to be done on the PF or ib0; this plugin does the necessary configuration.

Also, I recommend that you switch to ConnectX-5, which has a better PCIe representation.

psaini79 commented 5 years ago

Hi,

I tried to create the network on ib0 but it failed with the following error:

docker network create -d sriov --subnet=194.168.1.0/24 -o netdevice=ib0 mynet
Error response from daemon: NetworkDriver.CreateNetwork: Fail to find physical port

I found the following in the plugin logs:

2019/01/15 22:50:31 Mellanox sriov plugin started version=0.1.0
2019/01/15 22:50:31 Ready to accept commands.
2019/01/15 22:50:46 Entering go-plugins-helpers createnetwork
2019/01/15 22:50:46 CreateNetwork() : [ &{NetworkID:95fe3b356d440ae87c1d47d36a1c5c7a27f25e8ae267a167aa8294fd6388b4d1 Options:map[com.docker.network.generic:map[netdevice:ib0] com.docker.network.enable_ipv6:false] IPv4Data:[0xc42036a640] IPv6Data:[]} ]
2019/01/15 22:50:46 CreateNetwork IPv4Data len : [ 1 ]
2019/01/15 22:50:46 parseNetworkGenericOptions map[netdevice:ib0]
2019/01/15 22:50:46 Multiport driver for device: ib0 max_vfs = 64
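
In case it helps with debugging, here is how the two ports of this card can be told apart from sysfs (a sketch using standard netdev attributes; I do not know exactly which of these the plugin looks up when it reports "Fail to find physical port"):

```
# dev_id / dev_port distinguish the two ports of the single mlx4 PCI function
cat /sys/class/net/ib0/dev_id
cat /sys/class/net/ib0/dev_port
cat /sys/class/net/ib1/dev_id
cat /sys/class/net/ib1/dev_port
```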

paravmellanox commented 5 years ago

@psaini79 I think ConnectX-3 support got broken in version 0.1.0. I need some time to reproduce and resolve this. I also need to know how you enabled SR-IOV on ConnectX-3 (if you did), and whether you are using MOFED, UEK3/4/5, or an upstream kernel, because each of them likely has a different way to enable SR-IOV on ConnectX-3.

In the meantime you should consider ConnectX-5.
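
For reference, on ConnectX-3 (mlx4) SR-IOV is typically enabled through the firmware plus mlx4_core module parameters rather than the sysfs sriov_numvfs interface. A minimal sketch (values are illustrative, adjust to your setup):

```
# adapter firmware must have SR-IOV enabled (BIOS/UEFI setting or mlxconfig, e.g.):
#   mlxconfig -d <mst device> set SRIOV_EN=1 NUM_OF_VFS=64

# ask mlx4_core to create the VFs at driver load time
cat > /etc/modprobe.d/mlx4_core.conf <<'EOF'
options mlx4_core num_vfs=64 probe_vf=0
EOF

# reload the driver (or rebuild the initramfs and reboot), then verify the VFs appear
lspci | grep -i mellanox
```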

psaini79 commented 5 years ago

Thanks. At this moment it is tough to switch to ConnectX-5 on this server. However, I will make sure other servers have ConnectX-5.

I am using the following kernel:

uname -a
Linux 4.1.12-124.23.1.el7uek.x86_64 #2 SMP Tue Dec 4 19:27:42 PST 2018 x86_64 x86_64 x86_64 GNU/Linux

Linux version: Oracle Linux Server release 7.6

I will get you the details of how we enabled SR-IOV ASAP.

psaini2018 commented 5 years ago

I have looked at the configuration: we have enabled SR-IOV at the BIOS level and loaded the mlx4 core driver. Please find the details below:

lsmod | grep mlx
mlx4_ib               167936  2
ib_sa                  36864  5 rdma_cm,ib_cm,mlx4_ib,rdma_ucm,ib_ipoib
ib_mad                 49152  4 ib_cm,ib_sa,mlx4_ib,ib_umad
ib_core               106496  13 rdma_cm,ib_cm,ib_sa,iw_cm,mlx4_ib,ib_mad,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib,rds_rdma
mlx4_core             299008  2 mlx4_ib

Also, all the VFs are activated:

lspci | grep Mel | wc -l
64

lspci | grep Mel
03:00.0 InfiniBand: Mellanox Technologies MT27500 Family [ConnectX-3]
03:00.1 InfiniBand: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
03:00.2 InfiniBand: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

I am unable to make any progress; please look into this.
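
For completeness, here are a few more checks I can run on the VF side if that helps (a sketch; I am not sure which of these the plugin inspects):

```
# number of VFs currently enabled on the PF
cat /sys/class/net/ib0/device/sriov_numvfs

# VF state as seen by the PF netdev
ip link show ib0 | grep "vf "

# IB devices registered by mlx4_ib (PF plus any probed VFs)
ls /sys/class/infiniband/
```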