Closed (elvizlai closed this issue 2 years ago)
Thanks a lot for your feedback. Could you attach the error or failure message in the issue description? @elvizlai
@allencloud I have updated the issue and appended the log.
journalctl
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08+08:00" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." module=containerd type=io.containerd.grpc.v1
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08+08:00" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." module=containerd type=io.containerd.grpc.v1
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08+08:00" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." module=containerd type=io.containerd.grpc.v1
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08+08:00" level=info msg=serving... address="/run/containerd/debug.sock" module="containerd/debug"
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08+08:00" level=info msg=serving... address="/var/run/containerd.sock" module="containerd/grpc"
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08+08:00" level=info msg="containerd successfully booted in 0.012541s" module=containerd
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08.763391573+08:00" level=info msg="success to start containerd" containerd-pid=3333 module=ctrd-supervisord
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08.768286594+08:00" level=info msg="success to create 5 containerd clients, connect to: /var/run/containerd.sock"
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08.76905849+08:00" level=info msg="Snapshotter is set to be overlayfs"
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08.769276734+08:00" level=info msg="invoke pre-start hook in plugin"
12月 26 17:34:08 host.localdomain pouchd[3326]: time="2018-12-26T17:34:08.854821156+08:00" level=warning msg="could not create bridge network for id 462d39135a6c114e13119f5874995dfc1e6cd505fd6abaee4e597c510c67fc51 bridge name p
12月 26 17:34:09 host.localdomain pouchd[3326]: time="2018-12-26T17:34:09.144878279+08:00" level=error msg="getEndpointFromStore for eid 1f76dc0ce9f8b2dd2d7be0a102e29d0e332228a409aba0f94bceba8c8efdd8a1 failed while trying to bu
12月 26 17:34:09 host.localdomain pouchd[3326]: time="2018-12-26T17:34:09.144940644+08:00" level=info msg="Removing stale sandbox 8e6085e6c56397fc030250618e0790b149047d61620b177738b9d6a7fbd33eac (84c27e996704fbfb5bc21c23e600d05
12月 26 17:34:09 host.localdomain pouchd[3326]: time="2018-12-26T17:34:09.145171058+08:00" level=warning msg="Failed deleting endpoint 1f76dc0ce9f8b2dd2d7be0a102e29d0e332228a409aba0f94bceba8c8efdd8a1: failed to get endpoint fro
12月 26 17:34:09 host.localdomain pouchd[3326]: "
12月 26 17:34:09 host.localdomain kernel: IPv6: ADDRCONF(NETDEV_UP): p0: link is not ready
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.1835] manager: (p0): new Bridge device (/org/freedesktop/NetworkManager/Devices/5)
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2494] device (p0): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2558] ifcfg-rh: add connection in-memory (6e4554af-2497-4a60-b54c-32841523857e,"p0")
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2579] device (p0): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'external')
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2590] device (p0): Activation: starting connection 'p0' (6e4554af-2497-4a60-b54c-32841523857e)
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2613] device (p0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2618] device (p0): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2621] device (p0): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2652] device (p0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2660] device (p0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2663] device (p0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
12月 26 17:34:09 host.localdomain NetworkManager[2536]: <info> [1545816849.2732] device (p0): Activation: successful, device activated.
12月 26 17:34:09 host.localdomain dbus[2512]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
12月 26 17:34:09 host.localdomain systemd[1]: Starting Network Manager Script Dispatcher Service...
12月 26 17:34:09 host.localdomain pouchd[3326]: time="2018-12-26T17:34:09.301777137+08:00" level=info msg="start to listen to: unix:///var/run/pouchd.sock"
12月 26 17:34:09 host.localdomain polkitd[2539]: Unregistered Authentication Agent for unix-process:3320:25319 (system bus name :1.21, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale zh_CN.UTF-8) (disconnect
12月 26 17:34:09 host.localdomain systemd[1]: Started pouch.
12月 26 17:34:09 host.localdomain dbus[2512]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
12月 26 17:34:09 host.localdomain systemd[1]: Started Network Manager Script Dispatcher Service.
12月 26 17:34:09 host.localdomain nm-dispatcher[3433]: req:1 'up' [p0]: new request (3 scripts)
12月 26 17:34:09 host.localdomain nm-dispatcher[3433]: req:1 'up' [p0]: start running ordered scripts...
@elvizlai Can you provide all the network information, e.g. the output of ifconfig?
@rudyfly Below is the ifconfig result after the first init (the public inet address is masked with XXX).
After a reboot, p0 and vetha49ec6b (created by pouch run) are gone.
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 67.XXX.XXX.XXX netmask 255.255.240.0 broadcast 67.230.191.255
        inet6 fe80::a8aa:ff:fe12:9bdc prefixlen 64 scopeid 0x20<link>
        ether aa:aa:00:12:9b:dc txqueuelen 1000 (Ethernet)
        RX packets 97729 bytes 101161122 (96.4 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 110256 bytes 58737804 (56.0 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1000 (Local Loopback)
        RX packets 64 bytes 5184 (5.0 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 64 bytes 5184 (5.0 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.5.1 netmask 255.255.255.0 broadcast 192.168.5.255
        inet6 fe80::42:c0ff:fea8:501 prefixlen 64 scopeid 0x20<link>
        ether 02:42:c0:a8:05:01 txqueuelen 1000 (Ethernet)
        RX packets 97060 bytes 55013917 (52.4 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 79219 bytes 55202494 (52.6 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vetha49ec6b: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet6 fe80::f4a2:ccff:fecf:d063 prefixlen 64 scopeid 0x20<link>
        ether f6:a2:cc:cf:d0:63 txqueuelen 0 (Ethernet)
        RX packets 97060 bytes 56372757 (53.7 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 79240 bytes 55203964 (52.6 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
When a container is set with restart=always, it is started as the daemon recovers. It is then put into activeSandbox, which prevents the network from being initialized, so bridge p0 is not set up. Without bridge p0 the container network cannot be configured, which causes the problem.
Do we have a solution, @rudyfly? And can we include the fix in the next release of PouchContainer, @fuweid?
I faced the same problem
@rudyfly
Thanks for your report, @elvizlai 😱 This is a priority/P1 issue, which is the highest priority, and it seems severe enough. Ping @alibaba/pouch, PTAL.
[root@csv-slave13 ~]# pouch run -d -p 8099:80 dockerhub.io/hjc-image-nginx:v1.0
Error: failed to run container f1d418: {"message":"failed to create endpoint f1d41862 on network bridge: adding interface veth99f8b71 to bridge p0 failed: could not find bridge p0: route ip+net: no such network interface"}
pouch network create -n pouchnet -d bridge --gateway 192.168.1.1 --subnet 192.168.1.0/24
After the test finished:
pouch network remove pouchnet
[root@csv-slave13 ~]# pouch run -d -p 8099:80 dockerhub.io/hjc-image-nginx:v1.0
Error: failed to run container f1d418: {"message":"failed to create endpoint f1d41862 on network bridge: adding interface veth99f8b71 to bridge p0 failed: could not find bridge p0: route ip+net: no such network interface"}
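The error above names the bridge that could not be found. As a rough diagnostic sketch, assuming a POSIX shell with sed and the iproute2 ip tool available, the bridge name can be pulled out of such a message and checked against the interfaces that currently exist. The helper names bridge_from_error and iface_exists are hypothetical and not part of pouch.

```shell
#!/bin/sh
# Hypothetical helpers for diagnosis; not part of pouch.

# Print the bridge name found in a "could not find bridge NAME:" message.
bridge_from_error() {
    printf '%s\n' "$1" | sed -n 's/.*could not find bridge \([^:]*\):.*/\1/p'
}

# Succeed if the named network interface currently exists on this host.
iface_exists() {
    ip link show "$1" >/dev/null 2>&1
}

# Sample text copied from the error message reported above.
err='adding interface veth99f8b71 to bridge p0 failed: could not find bridge p0: route ip+net: no such network interface'
bridge=$(bridge_from_error "$err")

if iface_exists "$bridge"; then
    echo "$bridge exists"
else
    echo "$bridge is missing"
fi
```

If the script reports the bridge as missing, that matches the situation described in this issue, where p0 is gone after a reboot.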
Ⅰ. Issue Description
Ⅱ. Describe what happened
Root VPC: some containers (not all) do not start because the pouch p0 net interface is missing.
After a reboot, you MUST run systemctl restart pouch to recreate p0, then run pouch start container manually.
If any container can start on its own (--restart always), p0 is not created.
example:
I think p0 MUST be created before vethXXXX.
ifconfig
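The manual recovery described above (restart pouch to recreate p0, then start the containers by hand) could be scripted roughly as follows. This is only a sketch: it assumes a Linux host where interfaces show up under /sys/class/net, and the function name recover_bridge plus the SYSFS_NET_DIR and RUN variables are hypothetical conveniences for dry runs, not pouch features. The systemctl restart pouch and pouch start commands are the ones quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper; not part of pouch.
#
# recover_bridge BRIDGE [CONTAINER...]
# If BRIDGE is missing, restart pouch so the bridge is recreated,
# then start each listed container.
recover_bridge() {
    bridge=$1
    shift
    # Overridable so the check can be exercised against a scratch directory.
    net_dir=${SYSFS_NET_DIR:-/sys/class/net}

    if [ -e "$net_dir/$bridge" ]; then
        echo "bridge $bridge present, nothing to do"
        return 0
    fi

    echo "bridge $bridge missing, restarting pouch"
    # With RUN=echo the commands are printed instead of executed (dry run).
    ${RUN:-} systemctl restart pouch || return 1

    for c in "$@"; do
        ${RUN:-} pouch start "$c" || echo "failed to start $c" >&2
    done
}
```

For example, RUN=echo recover_bridge p0 mycontainer prints the commands it would run, where mycontainer is a placeholder name. This only automates the workaround; it does not address the ordering problem in the daemon itself.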
Ⅲ. Describe what you expected to happen
Ⅳ. How to reproduce it (as minimally and precisely as possible)
the container is not started as expected.
Ⅴ. Anything else we need to know?
systemctl status pouch -l
Ⅵ. Environment:
pouch version: latest
uname -a: 4.20