Mellanox / docker-sriov-plugin

Docker networking plugin for SRIOV and passthrough interfaces
Apache License 2.0

passthrough-plugin doesn't work anymore after container restarted #1

Closed sihara closed 6 years ago

sihara commented 6 years ago

This is a great plugin. Thank you so much for sharing the code. I played with this plugin a bit, but I found a problem.

[root@mon ~]# docker pull mellanox/passthrough-plugin
[root@mon ~]# docker run -it -d -v /run/docker/plugins:/run/docker/plugins --net=host --privileged mellanox/passthrough-plugin
[root@mon ~]# docker network create -d passthrough --subnet=10.128.8.0/21 --gateway=10.128.9.254 -o netdevice=eno1 -o mode=sriov mynet
[root@mon ~]# docker run -it -d --privileged --name centos7 --ip=10.128.8.211 --net=mynet --hostname centos7 centos:centos7 /sbin/init

[root@mon ~]# docker exec -it centos7 /bin/bash
[root@centos7 /]# yum install -y net-tools iproute
[root@centos7 /]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
10: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 22:70:40:94:7b:1d brd ff:ff:ff:ff:ff:ff
[root@centos7 /]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.128.8.211 netmask 255.255.248.0 broadcast 0.0.0.0
        inet6 fe80::2070:40ff:fe94:7b1d prefixlen 64 scopeid 0x20
        ether 22:70:40:94:7b:1d txqueuelen 1000 (Ethernet)
        RX packets 8180 bytes 11290853 (10.7 MiB)
        RX errors 0 dropped 63 overruns 0 frame 0
        TX packets 1427 bytes 99147 (96.8 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10
        loop txqueuelen 1 (Local Loopback)
        RX packets 98 bytes 7742 (7.5 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 98 bytes 7742 (7.5 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Up to this point it works very well, and this is what I expected. However, the problem shows up after stopping the docker containers and restarting them. Please see below.

[root@mon ~]# docker stop f081905f303c
f081905f303c
[root@mon ~]# docker stop 29aca98350df
29aca98350df

[root@mon ~]# docker ps -a
CONTAINER ID   IMAGE                         COMMAND                  CREATED         STATUS                       PORTS   NAMES
f081905f303c   centos:centos7                "/sbin/init"             3 minutes ago   Exited (137) 8 seconds ago           centos7
29aca98350df   mellanox/passthrough-plugin   "docker-passthrough-p"   4 minutes ago   Exited (2) 3 seconds ago             dreamy_hypatia

The "passthrough-plugin" container works well after the restart, but the other container, which relies on the "passthrough-plugin" container, doesn't boot up:

[root@mon ~]# docker start 29aca98350df
29aca98350df
[root@ddnmon ~]# docker start f081905f303c
Error response from daemon: failed to create endpoint centos7 on network mynet: NetworkDriver.CreateEndpoint: Plugin can not find network [ b42e8c53c8a721d1b0d9180314249b41296ac25d694f96e6cb5713b009d98dae ].
Error: failed to start containers: f081905f303c

[root@mon ~]# docker ps -a
CONTAINER ID   IMAGE                         COMMAND                  CREATED         STATUS                       PORTS   NAMES
f081905f303c   centos:centos7                "/sbin/init"             6 minutes ago   Exited (137) 2 minutes ago           centos7
29aca98350df   mellanox/passthrough-plugin   "docker-passthrough-p"   7 minutes ago   Up 2 minutes                         dreamy_hypatia

paravmellanox commented 6 years ago

I will debug this further later this week. In the meantime, you might want to check whether it works without restarting the plugin container.

paravmellanox commented 6 years ago

It appears that the docker engine does not replay the network creation sequence when the passthrough plugin is restarted. I am not sure whether this is taken care of in the new plugin model. It is going to take me some time to sort out this issue. Is this a blocker for you? If you don't restart the passthrough plugin, it should work.
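The failure mode described here can be pictured with a toy model. This is a minimal sketch, not the plugin's actual code: `ToyPlugin` and its methods are hypothetical names, and the network ID is a shortened form of the one in the error message above. It assumes the plugin keeps its network table only in memory and that the engine calls "create network" once, never again after a plugin restart.

```python
class ToyPlugin:
    """Hypothetical stand-in for a network plugin that keeps state only in memory."""

    def __init__(self):
        # network ID -> options; this table is lost whenever the process restarts
        self.networks = {}

    def create_network(self, network_id, options):
        self.networks[network_id] = options

    def create_endpoint(self, network_id):
        if network_id not in self.networks:
            raise LookupError(f"Plugin can not find network [ {network_id} ]")
        return f"endpoint-on-{network_id}"


# The engine creates the network once, and endpoints work afterwards.
plugin = ToyPlugin()
plugin.create_network("b42e8c53", {"netdevice": "eno1", "mode": "sriov"})
first_endpoint = plugin.create_endpoint("b42e8c53")

# Simulate restarting the plugin container: a fresh process with an empty table.
# The engine does not call create_network again, so the lookup fails.
plugin = ToyPlugin()
try:
    plugin.create_endpoint("b42e8c53")
    restart_error = None
except LookupError as exc:
    restart_error = str(exc)
```

In this model the error after the restart has the same shape as the "Plugin can not find network" message in the report, which is consistent with the observation that everything works as long as the plugin itself is not restarted.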

drwatson32 commented 6 years ago

I have a similar issue, and in my case it's a blocker. If you create a passthrough NIC, attach it to another container, and reboot the hypervisor, you end up with a broken container and no way to restore it. If you take a look at https://github.com/yunify/docker-plugin-hostnic, they implemented NIC persistence via a host volume. But they are tied to the MAC address, which is not suitable for my case. Your approach with the eth name is better for the NFV case.

paravmellanox commented 6 years ago

OK, so I understand that both use cases have a persistence requirement. I will provide this support. Since this plugin works based on netdevice names, the persistence will also be based on netdevice names.

This will avoid the need for the orchestration tool to remember the settings.
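As a rough illustration of what persistence keyed on netdevice names could look like, here is a minimal sketch. It is an assumption for illustration only, not the plugin's implementation: the helper names, the JSON layout, and the `networks.json` file name are all hypothetical. Each record stores the netdevice name (`eno1` from the report) rather than a MAC address, and the table is reloaded from disk on start.

```python
import json
import os
import tempfile


def save_networks(path, networks):
    """Write the network table atomically so a crash cannot leave a half-written file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(networks, handle)
    os.replace(tmp_path, path)  # atomic rename over the old file


def load_networks(path):
    """Return the saved table, or an empty one on first start."""
    if not os.path.exists(path):
        return {}
    with open(path) as handle:
        return json.load(handle)


# A temporary directory stands in for a host directory mounted into the plugin.
state_dir = tempfile.mkdtemp()
state_file = os.path.join(state_dir, "networks.json")  # hypothetical file name

# Saved when the network is created ...
save_networks(state_file, {"mynet": {"netdevice": "eno1", "mode": "sriov"}})

# ... and read back after a plugin or host restart.
restored = load_networks(state_file)
```

Because the state lives in a file on the host, it survives both a plugin restart and a host reboot, and the orchestration tool does not have to recreate the network.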

paravmellanox commented 6 years ago

Hi @sihara, @drwatson32, I have added support for persisting the configuration across host restarts and plugin restarts.

Please make sure to pass the additional option -v /etc/docker:/etc/docker when starting the plugin. Let me know how it goes. Please close the issue if it works for you. A new release is available at https://hub.docker.com/r/mellanox/passthrough-plugin/

paravmellanox commented 6 years ago

Hi @sihara, @drwatson32, I am closing the issue as this functionality is now supported. If you happen to see this issue again, please reopen it.