Closed molian6 closed 5 years ago
i've got a fresh AWS instance and am having the same problem
i had to add DOCKER_OPTS="unix://"
to /etc/default/docker
first, though.
@molian6 after a bit of fiddling around, a reboot did the trick. doesn't make any sense to me, though.
Same problem here. On Debian 9. The reboot did the trick on one of my VMs, not for the others. I had to install bridge-utils and it worked once, but not always. I had to hard reboot multiple times and it finally magically worked but it's weird. It's been doing that since about 3-4 days.
/cc @ijc @kolyshkin
This sort of failure to load a module is usually because the kernel package has been updated but the machine has not yet rebooted into that new kernel, so the modprobe
/insmod
is trying to load the module for the newer kernel onto the older kernel, which isn't always guaranteed to work (and fails with exactly these Unknown symbol
type messages in dmesg.
Sounds like there may be other issues being reported here, such as not having the necessary bridge utils package installed.
@Kexkey your "it worked once, but not always" sounds like something different to either of those issues, please open a fresh issue filing in the template etc.
More logs for you. First, I had the same error than the OP:
Error starting daemon: Error initializing network controller: Error creating default "bridge" network: package not installed
I tried a lot of stuff, installed bridge-utils, rebooted, and then...
time="2019-02-22T01:20:12.297264820Z" level=error msg="'overlay' not found as a supported filesystem on this host. Please ensure kernel is new enough and has overlay support loaded." storage-driver=overlay2
time="2019-02-22T01:20:12.297313857Z" level=error msg="[graphdriver] prior storage driver overlay2 failed: driver not supported"
Error starting daemon: error initializing graphdriver: driver not supported
I removed everything docker, purged, rebooted, danced a little bit and got this:
Feb 22 03:29:26 docker-tests-06 systemd[1]: Starting Docker Application Container Engine...
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.750060714Z" level=info msg="parsed scheme: \"unix\"" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.750662510Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.751040941Z" level=info msg="parsed scheme: \"unix\"" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.751357237Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.753217609Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///run/containerd/containerd.sock 0 <nil>}]" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.753292988Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.753370816Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4208d3460, CONNECTING" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.753511937Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4208d3460, READY" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.753562133Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///run/containerd/containerd.sock 0 <nil>}]" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.753578749Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.753611326Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4206021f0, CONNECTING" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.753716897Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4206021f0, READY" module=grpc
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.755527290Z" level=error msg="'overlay' not found as a supported filesystem on this host. Please ensure kernel is new enough and has overlay support loaded." storage-driver=overlay2
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.771600267Z" level=error msg="AUFS was not found in /proc/filesystems" storage-driver=aufs
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.773187317Z" level=error msg="'overlay' not found as a supported filesystem on this host. Please ensure kernel is new enough and has overlay support loaded." storage-driver=overlay
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.773619425Z" level=error msg="Failed to built-in GetDriver graph devicemapper /var/lib/docker"
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.787984795Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.788715924Z" level=warning msg="Your kernel does not support swap memory limit"
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.789159419Z" level=warning msg="Your kernel does not support cgroup rt period"
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.789598231Z" level=warning msg="Your kernel does not support cgroup rt runtime"
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.790423780Z" level=info msg="Loading containers: start."
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.805490585Z" level=warning msg="Running modprobe bridge br_netfilter failed with message: modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.9.0-8-amd64/modules.dep.bin'\nmodprobe: WARNING: Module bridge not found in directory /lib/modules/4.9.0-8-amd64\nmodprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.9.0-8-amd64/modules.dep.bin'\nmodprobe: WARNING: Module br_netfilter not found in directory /lib/modules/4.9.0-8-amd64\n, error: exit status 1"
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.807261642Z" level=warning msg="Running modprobe nf_nat failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.9.0-8-amd64/modules.dep.bin'\nmodprobe: WARNING: Module nf_nat not found in directory /lib/modules/4.9.0-8-amd64`, error: exit status 1"
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: time="2019-02-22T03:29:26.808402590Z" level=warning msg="Running modprobe xt_conntrack failed with message: `modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.9.0-8-amd64/modules.dep.bin'\nmodprobe: WARNING: Module xt_conntrack not found in directory /lib/modules/4.9.0-8-amd64`, error: exit status 1"
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: Error starting daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: iptables failed: iptables -t nat -N DOCKER: modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.9.0-8-amd64/modules.dep.bin'
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: modprobe: FATAL: Module ip_tables not found in directory /lib/modules/4.9.0-8-amd64
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: iptables v1.6.0: can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: Perhaps iptables or your kernel needs to be upgraded.
Feb 22 03:29:26 docker-tests-06 dockerd[4538]: (exit status 3)
Feb 22 03:29:27 docker-tests-06 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Feb 22 03:29:27 docker-tests-06 systemd[1]: Failed to start Docker Application Container Engine.
Feb 22 03:29:27 docker-tests-06 systemd[1]: docker.service: Unit entered failed state.
Feb 22 03:29:27 docker-tests-06 systemd[1]: docker.service: Failed with result 'exit-code'.
Feb 22 03:29:29 docker-tests-06 systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Feb 22 03:29:29 docker-tests-06 systemd[1]: Stopped Docker Application Container Engine.
But iptables was working fine and the module was there. I rebooted once again and docker is working.
Not sure everything is related. What I know is that I never had those problems before with the same Debian 9 image. I hope those log snippets can help.
Thanks!
Yes, reboot is a work-around.
llc: Unknown symbol pskb_trim_rcsum_slow
Could it be that a module cannot be successfully loaded runtime??
Yes, reboot is a work-around.
Following @ijc's comment https://github.com/docker/for-linux/issues/598#issuecomment-466429627, it looks like it's not a "work-around", but the correct way to deal with this after a kernel upgrade.
This sort of failure to load a module is usually because the kernel package has been updated but the machine has not yet rebooted into that new kernel, so the
modprobe
/insmod
is trying to load the module for the newer kernel onto the older kernel, which isn't always guaranteed to work (and fails with exactly theseUnknown symbol
type messages in dmesg.I removed everything docker, purged, rebooted, danced a little bit and got this:
I'm confused; you mention you removed docker, but the log after rebooting shows it's trying to start docker.service
Sorry for the confusion, I meant I removed, rebooted and reinstalled before getting those errors.
I didn't do a kernel upgrade. Perhaps the Debian 9 image from the hosting company changed and made all that to happen. They say no though.
Anyway, I did some more testing and here are more details. During docker installation (sudo apt-get install docker-ce docker-ce-cli containerd.io), I get the "bridge network package not found" when it tries to start the daemon. I reboot. The mistake I was making was after the reboot, I was trying to install docker again and always got the same error... but...
...after the installation + reboot, docker is actually installed and working. Looks like I get the error only after the apt-get install. Sometimes the simplest solutions are the last tried.
All that said, it wasn't behaving like that until last week.
Thanks for your support!
i see this in my /var/log/dpkg.log
:
2019-02-21 10:42:55 upgrade linux-image-4.9.0-8-amd64:amd64 4.9.110-3+deb9u5 4.9.144-3
so i guess for me, there was a breaking kernel-module change in this update and a reboot was needed. would assume debian was suppsed to bump the package to linux-image-4.9.0-9-amd64
but did not.
I'm having a similar issue in Ubuntu, what I did was (failing to)install, reboot(without removing anything), reinstall(it will complete the installation).
Have a fresh Debian 9.8 virtual cloud server from Hetzner, same error.
Rebooting helped.
a fresh Debian from a hosting provider may not be the latest release
Debian 9.8 was released February 16th, 2019.
I have the same problem in aws using a fresh installation of debian 9.8. Not happens always. I got the problem in 4 nodes but two days before everything works fine.
If you upgrade the system before install docker works well.
On my end I can confirm that this issue is setup-dependant.
I have 2 AWS EC2 instances using the same official Debian AMI (stretch).
One of them is using puppet to setup a buch of things (java, jenkins, oss nexus) and then ansible to setup docker. It doesn't work, I need to reboot to load the llc, aufs and bridge kernel modules
The other one is using only ansible to setup docker. It works perfectly, llc, aufs and bridge are loaded using dkms without rebooting.
Ansible playbook are the same on both machines (same git repository).
Digging into it but I don't have a better fix than rebooting right now...
It's actually solved in debian-9.8.3. The issue was on the llc module. (Looking for the bug info)
This just failed for me on a fresh Amazon instance
# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 9.9 (stretch)
Release: 9.9
Codename: stretch
The solution was to reboot and install again.
Have the same problem too...
Debian 9.9 Linux host_name_here 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u4 (2019-07-19) x86_64 GNU/Linux
Error at install:
"Job for docker.service failed because the control process exited with error code."
"See \"systemctl status docker.service\" and \"journalctl -xe\" for details."
"invoke-rc.d: initscript docker, action \"start\" failed."
journalctl:
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.970865033Z" level=info msg="Starting up"
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.972257456Z" level=info msg="parsed scheme: \"unix\"" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.972715163Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.973111487Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] }" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.973518405Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.974489110Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc0006f77c0, CONNECTING" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.978683138Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc0006f77c0, READY" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.981144061Z" level=info msg="parsed scheme: \"unix\"" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.981168252Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.981190386Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] }" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.981202672Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.981269924Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc0005c0a50, CONNECTING" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.981283292Z" level=info msg="blockingPicker: the picked transport is not ready, loop back to repick" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.986678995Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc0005c0a50, READY" module=grpc
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.991934075Z" level=info msg="[graphdriver] using prior storage driver: overlay2"
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.998447940Z" level=warning msg="Your kernel does not support swap memory limit"
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.998470267Z" level=warning msg="Your kernel does not support cgroup rt period"
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.998478752Z" level=warning msg="Your kernel does not support cgroup rt runtime"
Aug 13 10:16:38 host_name_here dockerd[347]: time="2019-08-13T10:16:38.999072721Z" level=info msg="Loading containers: start."
Aug 13 10:16:39 host_name_here dockerd[347]: time="2019-08-13T10:16:39.010517966Z" level=warning msg="Running modprobe nf_nat failed with message: `modprobe: ERROR: could not insert 'nf_nat': Unknown symbol in module, or unknown parameter (see dmesg)\ninsmod /lib/modules/4.9.0-9-amd64/kernel/net/netfilter/nf_conntrack.ko`, error: exit status 1"
Aug 13 10:16:39 host_name_here dockerd[347]: time="2019-08-13T10:16:39.030530972Z" level=warning msg="Running modprobe xt_conntrack failed with message: `modprobe: ERROR: could not insert 'xt_conntrack': Unknown symbol in module, or unknown parameter (see dmesg)\ninsmod /lib/modules/4.9.0-9-amd64/kernel/net/netfilter/nf_conntrack.ko`, error: exit status 1"
Aug 13 10:16:39 host_name_here dockerd[347]: failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: iptables failed: iptables --wait -t nat -N DOCKER: iptables v1.6.0: can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
modprobe iptable_nat:
modprobe: ERROR: could not insert 'iptable_nat': Unknown symbol in module, or unknown parameter (see dmesg)
dmesg:
[ 394.996205] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[ 394.997047] Bridge firewalling registered
[ 395.000755] nf_conntrack: Unknown symbol __siphash_aligned (err 0)
[ 395.000762] nf_conntrack: Unknown symbol siphash_4u64 (err 0)
[ 395.015965] nf_conntrack: Unknown symbol __siphash_aligned (err 0)
lsmod before reboot:
ip6table_filter 16384 0
ip6_tables 28672 1 ip6table_filter
iptable_filter 16384 0
dm_mod 118784 0
edac_core 57344 0
crct10dif_pclmul 16384 0
crc32_pclmul 16384 0
ghash_clmulni_intel 16384 0
evdev 24576 2
ppdev 20480 0
intel_rapl_perf 16384 0
parport_pc 28672 0
serio_raw 16384 0
parport 49152 2 parport_pc,ppdev
button 16384 0
ip_tables 24576 1 iptable_filter
x_tables 36864 4 ip_tables,iptable_filter,ip6table_filter,ip6_tables
autofs4 40960 2
ext4 589824 1
crc16 16384 1 ext4
jbd2 106496 1 ext4
crc32c_generic 16384 0
fscrypto 28672 1 ext4
ecb 16384 0
mbcache 16384 2 ext4
crc32c_intel 24576 2
aesni_intel 167936 0
aes_x86_64 20480 1 aesni_intel
glue_helper 16384 1 aesni_intel
lrw 16384 1 aesni_intel
gf128mul 16384 1 lrw
ablk_helper 16384 1 aesni_intel
cryptd 24576 3 ablk_helper,ghash_clmulni_intel,aesni_intel
nvme 28672 1
nvme_core 40960 3 nvme
i2c_piix4 24576 0
ena 94208 0
lsmod after reboot:
ipt_MASQUERADE 16384 1
nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE
nf_conntrack_netlink 40960 0
nfnetlink 16384 2 nf_conntrack_netlink
xfrm_user 36864 1
xfrm_algo 16384 1 xfrm_user
iptable_nat 16384 1
nf_conntrack_ipv4 16384 2
nf_defrag_ipv4 16384 1 nf_conntrack_ipv4
nf_nat_ipv4 16384 1 iptable_nat
xt_addrtype 16384 2
xt_conntrack 16384 1
nf_nat 24576 2 nf_nat_masquerade_ipv4,nf_nat_ipv4
nf_conntrack 114688 6 nf_conntrack_ipv4,nf_conntrack_netlink,nf_nat_masquerade_ipv4,xt_conntrack,nf_nat_ipv4,nf_nat
br_netfilter 24576 0
bridge 135168 1 br_netfilter
stp 16384 1 bridge
llc 16384 2 bridge,stp
aufs 360448 0
overlay 49152 0
edac_core 57344 0
crct10dif_pclmul 16384 0
crc32_pclmul 16384 0
ppdev 20480 0
ghash_clmulni_intel 16384 0
evdev 24576 2
intel_rapl_perf 16384 0
parport_pc 28672 0
parport 49152 2 parport_pc,ppdev
serio_raw 16384 0
button 16384 0
ip6table_filter 16384 0
ip6_tables 28672 1 ip6table_filter
iptable_filter 16384 1
ip_tables 24576 2 iptable_filter,iptable_nat
x_tables 36864 7 ip_tables,iptable_filter,ipt_MASQUERADE,ip6table_filter,xt_addrtype,xt_conntrack,ip6_tables
autofs4 40960 2
ext4 589824 1
crc16 16384 1 ext4
jbd2 106496 1 ext4
crc32c_generic 16384 0
fscrypto 28672 1 ext4
ecb 16384 0
mbcache 16384 2 ext4
crc32c_intel 24576 2
aesni_intel 167936 0
aes_x86_64 20480 1 aesni_intel
glue_helper 16384 1 aesni_intel
lrw 16384 1 aesni_intel
gf128mul 16384 1 lrw
ablk_helper 16384 1 aesni_intel
cryptd 24576 3 ablk_helper,ghash_clmulni_intel,aesni_intel
nvme 28672 1
nvme_core 40960 3 nvme
i2c_piix4 24576 0
ena 94208 0
After reboot - docker works normally.
UPD: Machine rebooting before installing docker - works too :)
Hello
the machine reboot works for me but I don't why :(
@supareno In my case - that's a necessary system reboot after post-initial upgrading(kernel itself, its modules etc). In common, I think, this is not a Docker issue. This is a general setup flow problem.
so as @Pshellvon already noticed, the root cause is how you setup your automation steps. in our case, moving everything that pulls in updates (especially kernel ones) to trigger AFTER the docker installation is done mitigated the problem and allowed our automated image-creation to succeed once again.
@Pshellvon , yes I know that it is not a Docker issue. It comes from the Debian :(
let me close this issue because (as mentioned https://github.com/docker/for-linux/issues/598#issuecomment-466429627) it's not a docker bug
If you have this problem double check your running kernel version. It might be that your current kernel is just ancient.
Expected behavior
service docker start
work successfullyActual behavior
Failed to start Docker Application Container Engine.
Meanwhile, I saw error message from kernel:
Steps to reproduce the behavior
apt-get dist-upgrade -q -y --no-install-recommends
update to debian 9.8Output of
docker version
:Output of
docker info
:Additional environment details (AWS, VirtualBox, physical, etc.) Google Compute Engine Instance - Debian 9.5 -> updated to 9.8