chris-short / rak8s

Stand up a Raspberry Pi based Kubernetes cluster with Ansible
MIT License

ignore sysctl bridge-nf-call-iptables error #31

Closed tedsluis closed 6 years ago

tedsluis commented 6 years ago

Prevents the playbook from failing when bridge-nf-call-iptables is set.

Description

When the cluster.yml playbook is launched for the first time on a fresh Raspbian image, it always fails on TASK [common : Pass bridged IPv4 traffic to iptables' chains] because the path /proc/sys/net/bridge/bridge-nf-call-iptables does not exist yet, as you can see below:

pi@ansible-host ~/git/rak8s $ ansible-playbook cluster.yml 

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [node1]
ok: [node2]
ok: [master]

TASK [common : Enabling cgroup options at boot] ********************************
changed: [node1]
changed: [master]
changed: [node2]

TASK [common : Pass bridged IPv4 traffic to iptables' chains] ******************
fatal: [node1]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to reload sysctl: sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory\n"}
fatal: [master]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to reload sysctl: sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory\n"}
fatal: [node2]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to reload sysctl: sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory\n"}

PLAY RECAP *********************************************************************
master                     : ok=2    changed=1    unreachable=0    failed=1   
node1                      : ok=2    changed=1    unreachable=0    failed=1   
node2                      : ok=2    changed=1    unreachable=0    failed=1  

Simply adding the option "ignoreerrors: yes" to the Ansible sysctl task fixes the issue.
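
As a rough sketch, the task with the option added might look like the following. The task name and the key come from the output above; the `value` and `state` settings are assumptions, since the actual task body in the common role is not shown in this thread.

```yaml
# Sketch only: the real task in the common role may differ.
- name: Pass bridged IPv4 traffic to iptables' chains
  sysctl:
    name: net.bridge.bridge-nf-call-iptables
    value: '1'          # assumed value
    state: present      # assumed
    ignoreerrors: yes   # the proposed fix: ignore errors about an unknown key
```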

Without this fix, users worked around the issue by re-running the playbook. On the second run the path /proc/sys/net/bridge/bridge-nf-call-iptables already exists, so the error no longer occurs.

So this fix is only a minor improvement, but it would be nice if first-time users no longer ran into this error.

Testing

I tested TASK [common : Pass bridged IPv4 traffic to iptables' chains] with the fix on a fresh node:

pi@testnode ~/git/rak8s $ sudo ls -l /proc/sys/net/bridge/bridge-nf-call-iptables
ls: cannot access /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory

pi@testnode ~/git/rak8s $ ansible-playbook cluster.yml --tags "test-bridge-nf-call-iptables"

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [testnode]

TASK [common : Pass bridged IPv4 traffic to iptables' chains] ******************
changed: [testnode]

PLAY [all:!master] *************************************************************

PLAY RECAP *********************************************************************
testnode                  : ok=2    changed=1    unreachable=0    failed=0  

The task finished without errors.

Issue Number

My change fixes issue #30

chris-short commented 6 years ago

The ideal fix would be to run this task when it will actually work. Please consider figuring out what order this will work in.

tedsluis commented 6 years ago
quote chris-short: The ideal fix would be to run this task when it will actually work. 
Please consider figuring out what order this will work in. 

Fair question. /proc/sys/net/bridge/bridge-nf-call-iptables is a key provided by the kernel module br_netfilter, which applies netfilter (iptables) filtering to bridged traffic.

/proc/sys/net/bridge/bridge-nf-call-iptables = 1 is required for Weave Net, but on a fresh Raspbian image the path /proc/sys/net/bridge/bridge-nf-call-iptables doesn't exist.

On a fresh image the br_netfilter module is available on disk, as modinfo shows:

$ sudo modinfo br_netfilter
filename:       /lib/modules/4.14.34-v7+/kernel/net/bridge/br_netfilter.ko
description:    Linux ethernet netfilter firewall bridge
author:         Bart De Schuymer <bdschuym@pandora.be>
author:         Lennert Buytenhek <buytenh@gnu.org>
license:        GPL
srcversion:     709D5E7978D06335992E47B
depends:        bridge
intree:         Y
name:           br_netfilter
vermagic:       4.14.34-v7+ SMP mod_unload modversions ARMv7 p2v8 

Although it doesn't show any keys yet:

$ sudo sysctl -a -r bridge

$ ls -l /proc/sys/net/bridge/*

The first time you run the cluster.yml playbook the path will be created, but because the path doesn't exist yet when TASK [common : Pass bridged IPv4 traffic to iptables' chains] runs, that task exits with the error: Failed to reload sysctl: sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory

According to the sysctl module documentation (http://docs.ansible.com/ansible/latest/modules/sysctl_module.html), this is normal behavior: an error occurs when the key doesn't exist yet, and the ignoreerrors option can be used to ignore errors about unknown keys.

Later on in the playbook, the TASK [kubeadm : Run Docker Install Script] will create the br_netfilter keys:

$ sudo sysctl -a -r bridge
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-filter-pppoe-tagged = 0
net.bridge.bridge-nf-filter-vlan-tagged = 0
net.bridge.bridge-nf-pass-vlan-input-dev = 0

$ ls -l /proc/sys/net/bridge/*
-rw-r--r-- 1 root root 0 May 14 21:51 /proc/sys/net/bridge/bridge-nf-call-arptables
-rw-r--r-- 1 root root 0 May 14 21:51 /proc/sys/net/bridge/bridge-nf-call-ip6tables
-rw-r--r-- 1 root root 0 May 14 21:51 /proc/sys/net/bridge/bridge-nf-call-iptables
-rw-r--r-- 1 root root 0 May 14 21:51 /proc/sys/net/bridge/bridge-nf-filter-pppoe-tagged
-rw-r--r-- 1 root root 0 May 14 21:51 /proc/sys/net/bridge/bridge-nf-filter-vlan-tagged
-rw-r--r-- 1 root root 0 May 14 21:51 /proc/sys/net/bridge/bridge-nf-pass-vlan-input-dev

If TASK [common : Pass bridged IPv4 traffic to iptables' chains] has not yet set net.bridge.bridge-nf-call-iptables to 1, its value is 0.

So if we move TASK [common : Pass bridged IPv4 traffic to iptables' chains] from the common role to the kubeadm role and execute it after TASK [kubeadm : Run Docker Install Script], the ignoreerrors option can be left out of the sysctl task.
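
A minimal sketch of that reordering, assuming the kubeadm role keeps its tasks in a single task list. The Docker install task is left as a placeholder comment because its body is not shown in this thread, and the `value` and `state` settings are assumptions.

```yaml
# kubeadm role task list (layout assumed for illustration).

# ... existing task "Run Docker Install Script" stays here, unchanged ...

# Moved here from the common role. By this point the net.bridge.* keys
# exist, so ignoreerrors is not needed.
- name: Pass bridged IPv4 traffic to iptables' chains
  sysctl:
    name: net.bridge.bridge-nf-call-iptables
    value: '1'        # assumed value
    state: present    # assumed
```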

Below is the output of the cluster.yml playbook (run on a fresh image), where I moved TASK [common : Pass bridged IPv4 traffic to iptables' chains] after TASK [kubeadm : Run Docker Install Script]:

$ ansible-playbook cluster.yml 

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [testnode]

TASK [common : Enabling cgroup options at boot] ********************************
ok: [testnode]

TASK [common : apt-get update] *************************************************
ok: [testnode]

TASK [common : apt-get upgrade] ************************************************
ok: [testnode]

TASK [common : Reboot] *********************************************************
skipping: [testnode]

TASK [common : Wait for Reboot] ************************************************
skipping: [testnode]

TASK [kubeadm : Disable Swap] **************************************************
changed: [testnode]

TASK [kubeadm : Determine if docker is installed] ******************************
ok: [testnode]

TASK [kubeadm : Run Docker Install Script] *************************************
changed: [testnode]

TASK [kubeadm : Pass bridged IPv4 traffic to iptables' chains] *****************
changed: [testnode]

TASK [kubeadm : Install apt-transport-https] ***********************************
ok: [testnode]

TASK [kubeadm : Add Google Cloud Repo Key] *************************************
changed: [testnode]
 [WARNING]: Consider using get_url or uri module rather than running curl

TASK [kubeadm : Add Kubernetes to Available apt Sources] ***********************
changed: [testnode]

TASK [kubeadm : apt-get update] ************************************************
changed: [testnode]

TASK [kubeadm : Install k8s Y'all] *********************************************
changed: [testnode] => (item=[u'kubelet', u'kubeadm', u'kubectl'])
etc....

Note: TASK [kubeadm : Pass bridged IPv4 traffic to iptables' chains] now runs without the ignoreerrors option. This time it doesn't result in an error.
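
A different option, not tried in this PR, would be to load br_netfilter explicitly before setting the key, so the task does not rely on the Docker install script creating the keys as a side effect. This is only a sketch: it assumes Ansible's modprobe module is usable on these nodes, and the first task name is made up for illustration.

```yaml
# Untested alternative: load the module first, then set the key.
- name: Load br_netfilter kernel module   # hypothetical task name
  modprobe:
    name: br_netfilter
    state: present

- name: Pass bridged IPv4 traffic to iptables' chains
  sysctl:
    name: net.bridge.bridge-nf-call-iptables
    value: '1'        # assumed value
    state: present    # assumed
```

Loading the module this way does not by itself persist across a reboot, so this variant would still need something that loads br_netfilter at boot.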

Shall I close this pull request and create a new one?

chris-short commented 6 years ago

Yes. Submitting a new PR would be perfect! Thanks for diagnosing this so thoroughly!

Chris Short https://chrisshort.net https://devopsish.com
