scaleway / kernel-tools

:penguin: Kernels on Scaleway
http://devhub.scaleway.com/#/bootscripts
MIT License

IP_VS_NFCT not enabled in latest docker image. Causes docker swarm issues? #343

Closed snorremd closed 6 years ago

snorremd commented 7 years ago

The x86_64/4.8.14-docker kernel enables IP_VS_NFCT. The x86_64/4.10.8-docker version does not enable the IP_VS_NFCT kernel option.

For example, when I run nginx as a Docker swarm service on a Scaleway Docker server and expose port 80 with --publish 80:80, it should be available at http://localhost on the server, but it is not. Running nginx as a normal container and publishing the port works as expected.

I think the missing IP_VS_NFCT option might be what is causing my issues when running Docker services in swarm mode. Among other things, the Docker engine logs:

Apr 14 20:39:40 <hostname> docker[15787]: time="2017-04-14T20:39:40Z" level=error msg="Failed to write to /proc/sys/net/ipv4/vs/conntrack: open /proc/sys/net/ipv4/vs/conntrack: no such file or directory"
Apr 14 20:39:40 <hostname> docker[15787]: time="2017-04-14T20:39:40.154260407Z" level=error msg="Failed to add firewall mark rule in sbox ingress (ingress): reexec failed: exit status 8"
Apr 14 20:41:17 <hostname> docker[15787]: time="2017-04-14T20:41:17.432619182Z" level=error msg="Failed to delete real server 10.255.0.3 for vip 10.255.0.2 fwmark 259 in sbox ingress (ingress): no such process"
Apr 14 20:41:17 <hostname> docker[15787]: time="2017-04-14T20:41:17.432762944Z" level=error msg="Failed to delete service for vip 10.255.0.2 fwmark 259 in sbox ingress (ingress): no such process"

$ uname -r
4.10.8-docker-1

$ ./check-config.sh 
info: reading kernel config from /proc/config.gz ...

Generally Necessary:
- cgroup hierarchy: properly mounted [/sys/fs/cgroup]
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: enabled
- CONFIG_KEYS: enabled
- CONFIG_VETH: enabled
- CONFIG_BRIDGE: enabled
- CONFIG_BRIDGE_NETFILTER: enabled (as module)
- CONFIG_NF_NAT_IPV4: enabled (as module)
- CONFIG_IP_NF_FILTER: enabled (as module)
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_IPVS: enabled
- CONFIG_IP_NF_NAT: enabled (as module)
- CONFIG_NF_NAT: enabled (as module)
- CONFIG_NF_NAT_NEEDED: enabled
- CONFIG_POSIX_MQUEUE: enabled

Optional Features:
- CONFIG_USER_NS: enabled
- CONFIG_SECCOMP: enabled
- CONFIG_CGROUP_PIDS: enabled
- CONFIG_MEMCG_SWAP: enabled
- CONFIG_MEMCG_SWAP_ENABLED: enabled
    (cgroup swap accounting is currently enabled)
- CONFIG_LEGACY_VSYSCALL_EMULATE: enabled
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_IOSCHED_CFQ: enabled
- CONFIG_CFQ_GROUP_IOSCHED: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: missing
- CONFIG_NET_CLS_CGROUP: enabled (as module)
- CONFIG_CGROUP_NET_PRIO: enabled
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: enabled
- CONFIG_IP_VS: enabled
- CONFIG_IP_VS_NFCT: missing
- CONFIG_IP_VS_RR: enabled (as module)
- CONFIG_EXT4_FS: enabled
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
  - "overlay":
    - CONFIG_VXLAN: enabled (as module)
      Optional (for encrypted networks):
      - CONFIG_CRYPTO: enabled
      - CONFIG_CRYPTO_AEAD: enabled
      - CONFIG_CRYPTO_GCM: enabled (as module)
      - CONFIG_CRYPTO_SEQIV: enabled
      - CONFIG_CRYPTO_GHASH: enabled (as module)
      - CONFIG_XFRM: enabled
      - CONFIG_XFRM_USER: enabled (as module)
      - CONFIG_XFRM_ALGO: enabled
      - CONFIG_INET_ESP: enabled (as module)
      - CONFIG_INET_XFRM_MODE_TRANSPORT: enabled
  - "ipvlan":
    - CONFIG_IPVLAN: enabled (as module)
  - "macvlan":
    - CONFIG_MACVLAN: enabled (as module)
    - CONFIG_DUMMY: enabled (as module)
  - "ftp,tftp client in container":
    - CONFIG_NF_NAT_FTP: enabled (as module)
    - CONFIG_NF_CONNTRACK_FTP: enabled (as module)
    - CONFIG_NF_NAT_TFTP: enabled (as module)
    - CONFIG_NF_CONNTRACK_TFTP: enabled (as module)
- Storage Drivers:
  - "aufs":
    - CONFIG_AUFS_FS: enabled (as module)
  - "btrfs":
    - CONFIG_BTRFS_FS: enabled (as module)
    - CONFIG_BTRFS_FS_POSIX_ACL: enabled
  - "devicemapper":
    - CONFIG_BLK_DEV_DM: enabled (as module)
    - CONFIG_DM_THIN_PROVISIONING: enabled (as module)
  - "overlay":
    - CONFIG_OVERLAY_FS: enabled (as module)
  - "zfs":
    - /dev/zfs: missing
    - zfs command: missing
    - zpool command: missing

Limits:
- /proc/sys/kernel/keys/root_maxkeys: 1000000

rmacfie commented 7 years ago

I believe I have this problem too. Is there any way to override the version when we create a new server?

The situation right now is that I cannot create any new servers that can publish ports from Docker swarm.

agiUnderground commented 7 years ago

Same issue with Scaleway kernel 4.10.8-docker-1. I changed the Docker version from 17.05.0-ce to 17.03.0-ce, but it did not help. The swarm is not accessible from outside, or even from the host.

Same errors:

level=error msg="Failed to write to /proc/sys/net/ipv4/vs/conntrack:

ghost commented 7 years ago

Same for me, waiting for new bootscript, please add new bootscript!

fredix commented 7 years ago

Same issue for me. Kernel 4.8.x was working before.

agiUnderground commented 7 years ago

Please bring back the old kernel, 4.8.14-docker-2, as a choice if you can't add a new one. This is critical.

Glukozavr commented 7 years ago

Same thing.

sebastianistoblame commented 7 years ago

+1

mathev19 commented 7 years ago

:+1:

raarts commented 7 years ago

Please add the NFCT option! Docker Swarm will not run without it!

tsunammis commented 7 years ago

👍 I'm also waiting for this fix :)

JesusPerez commented 7 years ago

I am trying to use Docker swarm and it looks like I cannot get IPVS to work.

docker.log has: level=warning msg="Running modprobe ip_vs failed with message: modprobe: module ip_vs not found in modules.dep, error: exit status 1" level=error msg="Failed to write to /proc/sys/net/ipv4/vs/conntrack: open /proc/sys/net/ipv4/vs/conntrack

I am using bootscript x86_64 4.10.8 docker #1 (is this the one I should use?) with Alpine 3.6.1 as the OS.

Where is the problem if IP_VS_NFCT is enabled? Is it possible to run Docker swarm on your bare-metal servers? Should I use private or public IPs for the swarm cluster, considering that private IPs could be reassigned at boot time?

After more testing: if I use the kernel option x86_64 4.10.8 docker #1 and run zcat /proc/config.gz | fgrep IP_VS, I get CONFIG_IP_VS=y, but lsmod | fgrep ip_vs does not list an ip_vs module.
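A note on the CONFIG_IP_VS=y result: in a kernel config, =y means the feature is compiled into the kernel image and =m means it is built as a loadable module, so a built-in ip_vs would not be expected to show up in lsmod. A minimal sketch for telling the cases apart follows. The function name kconfig_state and its optional file argument are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# kconfig_state OPTION [CONFIG_FILE]
# Prints "built-in", "module", or "missing" for OPTION, which is given
# without the CONFIG_ prefix. Reads an uncompressed config from CONFIG_FILE
# if one is given, otherwise decompresses /proc/config.gz.
# (Hypothetical helper, not part of any Scaleway or Docker tooling.)
kconfig_state() {
    opt="CONFIG_$1"
    if [ -n "$2" ]; then
        line=$(grep -E "^${opt}=" "$2")
    else
        line=$(zcat /proc/config.gz | grep -E "^${opt}=")
    fi
    case "$line" in
        "${opt}=y") echo "built-in" ;;  # compiled in: never listed by lsmod
        "${opt}=m") echo "module" ;;    # loadable: listed by lsmod once loaded
        *)          echo "missing" ;;   # "# ... is not set" or absent entirely
    esac
}

# Example on a running server:
#   kconfig_state IP_VS
#   kconfig_state IP_VS_NFCT
```

The anchored pattern (^CONFIG_IP_VS=) is deliberate: a plain fgrep IP_VS also matches CONFIG_IP_VS_RR, CONFIG_IP_VS_NFCT and similar lines, which makes the output harder to read.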

Please check https://github.com/moby/moby/issues/26930 as well. It looks like we need at least CONFIG_IP_VS_RR and CONFIG_IP_VS_NFCT to be available, and they are off in the kernel config.

Looking at https://github.com/rstub/kernel-tools/blob/8bd75cc3ef2d81d067c68cdadbf46a1a6172d4bc/x86_64/4.10.8-docker/.config line 1031, it is activated, but in the currently available bootscript it is not.

tbillon commented 7 years ago

The x86_64 4.4.70 std #1 bootscript, which addresses another issue, should run Docker fine and has these options enabled. I tried to merge most of the -docker and -apparmor kernels into the -std one.

JesusPerez commented 7 years ago

Many thanks for your answer... I almost gave up! Swarm finally works. I still do not know whether it is better to use private or public IPs for the nodes and swarm, but both seem to work from outside, and the ip_vs module is loaded. It would be nice to know what is different in each bootscript, or at least what your criteria are for setting those options. Thanks for your answer and work.

jthomaschewski commented 7 years ago

@tbillon The bootscript you mention is not available in AMS1. Any fix for AMS servers? Thanks.

tbillon commented 7 years ago

I think it is but unfortunately it's only available for x86.

rmacfie commented 7 years ago

In my case, I need this for when I use the docker-machine driver. Is it possible to adjust which bootscript is used there?

labe-me commented 7 years ago

Just lost 2 hours trying to find why docker swarm wasn't publishing my ports to the outside :)

Worked with x86_64 4.4.70 std #1 bootscript

mtrense commented 6 years ago

@tbillon The bootscript "x86_64 4.4.70 std #1" you mentioned no longer seems to be available. Is there any bootscript currently available that supports Docker swarm?

insertjokehere commented 6 years ago

The 4.11 mainline kernel works for me

mtrense commented 6 years ago

I've just tested x86_64 mainline 4.11.12 rev1 on two VC1S instances. Running docker service create -p 9090:80 --replicas 2 --name test1 nginx:alpine creates the specified service, but the networking between swarm's load balancer and the service does not work. The required ip_vs_nfct module is also not available for that kernel.

@insertjokehere How did you get swarm to work?

tbillon commented 6 years ago

Try one of the current 4.4 or 4.9 bootscripts. Both have CONFIG_IP_VS_NFCT enabled.
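After switching bootscripts, one way to confirm the change took effect is to look for the file that Docker failed to write in the logs earlier in this thread, /proc/sys/net/ipv4/vs/conntrack. The sketch below assumes that this file is only present when the running kernel provides IPVS connection tracking support. Both function names are hypothetical.

```shell
#!/bin/sh
# ipvs_conntrack_available [PATH]
# Succeeds (exit status 0) if the IPVS conntrack sysctl file exists.
# PATH defaults to the file named in the Docker error message; the
# argument exists only so the check can be tried against another path.
# (Hypothetical helper for illustration.)
ipvs_conntrack_available() {
    path="${1:-/proc/sys/net/ipv4/vs/conntrack}"
    [ -e "$path" ]
}

# ipvs_conntrack_report [PATH]
# Prints a one-line, human-readable result of the check above.
ipvs_conntrack_report() {
    if ipvs_conntrack_available "$1"; then
        echo "ok: IPVS conntrack sysctl present"
    else
        echo "missing: IPVS conntrack sysctl not found"
    fi
}

# Example on a running server:
#   ipvs_conntrack_report
```

This checks the running kernel rather than its build configuration, so it complements a look at /proc/config.gz instead of replacing it.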

raarts commented 6 years ago

This also holds for the latest mainline scripts (I use x86_64 mainline 4.13.5 rev1)

develar commented 6 years ago

4.10.8 docker doesn't work. Why closed?

develar commented 6 years ago

x86_64 mainline 4.9.64 rev1 works. It seems I can finally run my server on Scaleway.

tboerger commented 6 years ago

Latest bootscripts got it disabled again :(

Edit: Was referring to the docker bootscript.

tbillon commented 6 years ago

No, it's still here. Look into the /proc/config.gz file.

# uname -a
Linux git 4.9.64-mainline-rev1 #1 SMP Tue Nov 21 10:00:26 UTC 2017 x86_64 GNU/Linux

# zcat /proc/config.gz | grep -i nfct
CONFIG_IP_VS_NFCT=y

tboerger commented 6 years ago

The mainline kernel works as expected.

souhaiebtar commented 5 years ago

As of June 2018, on Ubuntu 16.04 with the default config (only choosing the server range, in Amsterdam by the way), Docker swarm does not work. To make it work you have to change the bootscript; for me "x86_64 mainline 4.9.93 rev1" worked. To be able to change the bootscript, you have to disable the "enable boot mode" setting under Advanced. After disabling it and creating the server, the bootscript option will show up and you can change it.