hassio-addons / addon-tailscale

Tailscale - Home Assistant Community Add-ons
MIT License

[0.14.0] After disable userspace networking mode addon crashed #316

Closed shirou93 closed 8 months ago

shirou93 commented 8 months ago

Hi,

When userspace networking mode is enabled and then disabled, the add-on crashes with these logs:

[20:04:22] WARNING: Altering the MSS is not supported due to missing kernel module,
[20:04:22] WARNING: skip clamping the MSS to the MTU for all advertised subnet's interface

Thanks in advance for your work.

sinclairpaul commented 8 months ago

Can you confirm which OS you run?

shirou93 commented 8 months ago

> Can you confirm which OS you run?

Debian 12 Bookworm

sinclairpaul commented 8 months ago

Then you likely need to install the support in your host OS. hint hint wireguard.....

shirou93 commented 8 months ago

> Then you likely need to install the support in your host OS. hint hint wireguard.....

The add-on crashes after this warning. Reverting the setting and restarting does not fix the issue; the add-on keeps crashing until I reinstall it.

Are you trying to help, or just writing placeholder comments?

lmagyar commented 8 months ago

Aren't there some ERROR lines earlier in the log? See #311

These are only warnings. Failing to set MSS clamping does not crash the add-on; I changed that months ago.

lmagyar commented 8 months ago

There are multiple internal services in the add-on. Another service fails at the same moment this warning is printed, and the error in that other service is the cause of the crash.

shirou93 commented 8 months ago

> There are multiple internal services in the add-on. Another service fails at the same moment this warning is printed, and the error in that other service is the cause of the crash.

[23:57:52] WARNING: Altering the MSS is not supported due to missing kernel module,
[23:57:52] WARNING: skip clamping the MSS to the MTU for all advertised subnet's interface
s6-rc: info: service mss-clamping successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service mss-clamping: stopping
s6-rc: info: service taildrop: stopping
s6-rc: info: service nginx: stopping
s6-rc: info: service taildrop successfully stopped
s6-rc: info: service mss-clamping successfully stopped
s6-rc: info: service post-tailscaled: stopping
s6-rc: info: service post-tailscaled successfully stopped
s6-rc: info: service tailscaled: stopping
s6-rc: info: service protect-subnets: stopping
s6-rc: info: service protect-subnets successfully stopped
s6-rc: info: service nginx successfully stopped
s6-rc: info: service init-nginx: stopping
s6-rc: info: service web: stopping
s6-rc: info: service init-nginx successfully stopped
s6-rc: info: service tailscaled successfully stopped
s6-rc: info: service web successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service base-addon-log-level: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service base-addon-log-level successfully stopped
s6-rc: info: service base-addon-banner: stopping
s6-rc: info: service base-addon-banner successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped

Is it a host OS issue?

With accept_routes set to true it still crashes.

lmagyar commented 8 months ago

There should be something like this before the warnings you copied:

[23:42:40] INFO: Adding local subnets to ip rules with higher priority than Tailscale's routing,
[23:42:40] INFO: to prevent routing local subnets if the same subnet is routed within your tailnet.
[23:42:40] WARNING:   IPv4 multiple routing tables are not enabled, skip adding route 192.168.0.0/16 to ip rules
[23:42:40] WARNING:   IPv6 multiple routing tables are not enabled, skip adding route 2001:b07:add:e868::/64 to ip rules
[23:42:40] ERROR: Can't protect any subnets
RTNETLINK answers: Not supported
Dump terminated
[23:42:40] INFO: Service protect-subnets exited with code 1 (by signal 0)

Run this in an SSH command line. What does it show?

zcat /proc/config.gz | grep -E 'CONFIG_IP(V6)?_MULTIPLE_TABLES'

accept_routes=false should "repair" it. This is a workaround that turns off the failing service, which fails because kernel features are missing. They can be enabled with these kernel config options:

CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IPV6_MULTIPLE_TABLES=y

Once those are in place, accept_routes=true should work.
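The check above depends on /proc/config.gz, which not every kernel provides. A minimal sketch of the same check with a fallback, assuming the distribution ships its kernel build config as /boot/config-<release> (common on Debian-style systems, but an assumption here). The function name and its two optional path arguments are hypothetical and exist only for illustration:

```shell
#!/bin/sh
# Print the kernel config lines for the multiple-routing-table options.
# Tries /proc/config.gz first, then falls back to /boot/config-<release>.
# Both paths can be overridden by arguments (illustrative, not part of the add-on).
show_routing_options() {
    proc_cfg="${1:-/proc/config.gz}"
    boot_cfg="${2:-/boot/config-$(uname -r)}"
    pattern='CONFIG_IP(V6)?_MULTIPLE_TABLES'

    if [ -r "$proc_cfg" ]; then
        # Compressed config exposed by kernels built with CONFIG_IKCONFIG_PROC
        zcat "$proc_cfg" | grep -E "$pattern"
    elif [ -r "$boot_cfg" ]; then
        # Plain-text config installed alongside the kernel image
        grep -E "$pattern" "$boot_cfg"
    else
        echo "no kernel config found at $proc_cfg or $boot_cfg" >&2
        return 2
    fi
}
```

Calling `show_routing_options` with no arguments uses the default paths. A line ending in `=y` means the option is built in, while `# ... is not set` means it is missing.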

It's midnight here, I will continue only in the morning...

shirou93 commented 8 months ago


Does this config go in /etc/default/grub?

CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IPV6_MULTIPLE_TABLES=y

The zcat command returns: gzip: /proc/config.gz: No such file or directory

Good night :)

sinclairpaul commented 8 months ago

Not being funny here, but you are running a supervised install, my honest suggestion would be to switch to HAOS.

shirou93 commented 8 months ago

I checked CONFIG_IP:

[image]

The warning and the crash still exist.

lmagyar commented 8 months ago

I repeat: the warning does not cause the crash; there has to be an error before the warnings.

Please copy the whole log, from the beginning.
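Once the whole log is saved to a file, the lines that matter can be pulled out with a filter like the one below. This is only a sketch: the function name is hypothetical, and the patterns are based on the `ERROR:` and `exited with code` lines quoted earlier in this thread, so other failure messages may need other patterns.

```shell
#!/bin/sh
# List ERROR lines and non-zero service exits from a saved add-on log,
# prefixed with their line numbers so the surrounding context is easy to find.
# $1 is the path to the saved log file.
find_failures() {
    grep -nE 'ERROR:|exited with code [1-9]' "$1"
}
```

For example, `find_failures addon.log` (with `addon.log` being whatever name the log was saved under) prints nothing and returns non-zero when no such lines exist.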

shirou93 commented 8 months ago

I removed my email and external IP from the logs:

s6-rc: info: service nginx: starting
s6-rc: info: service nginx successfully started
2024/01/15 10:30:59 wgengine.NewUserspaceEngine(tun "tailscale0") ...
2024/01/15 10:30:59 dns: [resolved-ping=yes rc=unknown ret=direct]
2024/01/15 10:30:59 dns: using "direct" mode
2024/01/15 10:30:59 dns: using *dns.directManager
2024/01/15 10:30:59 link state: interfaces.State{defaultRoute=enp1s0 ifs={docker0:[172.17.0.1/16 llu6] enp1s0:[192.168.0.114/24 fdf9:9a78:1b22::d63/128 fdf9:9a78:1b22:0:9c8d:3ed5:e4b8:62c6/64 llu6] hassio:[172.30.32.1/23 llu6]} v4=true v6=true}
2024/01/15 10:30:59 onPortUpdate(port=42163, network=udp6)
2024/01/15 10:30:59 router: using firewall mode pref
2024/01/15 10:30:59 router: default choosing iptables
2024/01/15 10:30:59 router: v6nat = true
2024/01/15 10:30:59 onPortUpdate(port=36002, network=udp4)
2024/01/15 10:30:59 magicsock: disco key = d:ed64675eef96f52d
2024/01/15 10:30:59 Creating WireGuard device...
2024/01/15 10:30:59 Bringing WireGuard device up...
2024/01/15 10:30:59 Bringing router up...
2024/01/15 10:30:59 external route: up
2024/01/15 10:30:59 Clearing router settings...
2024/01/15 10:30:59 Starting network monitor...
2024/01/15 10:30:59 Engine created.
2024/01/15 10:30:59 pm: using backend prefs for "profile-60ba": Prefs{ra=true dns=true want=true webclient=true routes=[0.0.0.0/0 ::/0 192.168.0.0/24] snat=true nf=on host="wyse" update=check Persist{lm=, o=, n=[v8dn2] u="mail@gmail.com"}}
2024/01/15 10:30:59 envknob: TS_NO_LOGS_NO_SUPPORT="true"
2024/01/15 10:30:59 logpolicy: using system state directory "/var/lib/tailscale"
2024/01/15 10:30:59 got LocalBackend in 54ms
2024/01/15 10:30:59 Start
2024/01/15 10:30:59 Backend: logs: be:e42fc50225a64916be106ee0cd2b789a2dbaf1ce0f862d8280de7e2ac2dde039 fe:
2024/01/15 10:30:59 control: client.Login(false, 0)
2024/01/15 10:30:59 control: doLogin(regen=false, hasUrl=false)
2024/01/15 10:30:59 web server running on: http://127.0.0.1:25899
2024/01/15 10:30:59 health("overall"): error: not in map poll
2024/01/15 10:31:00 control: control server key from https://controlplane.tailscale.com: ts2021=[fSeS+], legacy=[nlFWp]
2024/01/15 10:31:00 control: RegisterReq: onode= node=[v8dn2] fup=false nks=false
zcat: /proc/config.gz: No such file or directory
2024/01/15 10:31:00 control: RegisterReq: got response; nodeKeyExpired=false, machineAuthorized=true; authURL=false
2024/01/15 10:31:01 control: netmap: got new dial plan from control
2024/01/15 10:31:01 active login: _MYEMAIL____
2024/01/15 10:31:01 Switching ipn state NoState -> Starting (WantRunning=true, nm=true)
2024/01/15 10:31:01 magicsock: SetPrivateKey called (init)
2024/01/15 10:31:01 wgengine: Reconfig: configuring userspace WireGuard config (with 0/3 peers)
2024/01/15 10:31:01 wgengine: Reconfig: configuring router
2024/01/15 10:31:01 monitor: gateway and self IP changed: gw=192.168.0.1 self=192.168.0.114
2024/01/15 10:31:01 wgengine: Reconfig: configuring DNS
2024/01/15 10:31:01 dns: Set: {DefaultResolvers:[] Routes:{tail93030.ts.net.:[] ts.net.:[199.247.155.53 2620:111:8007::53]}+65arpa SearchDomains:[tail93030.ts.net.] Hosts:4}
2024/01/15 10:31:01 dns: Resolvercfg: {Routes:{.:[172.30.32.3] ts.net.:[199.247.155.53 2620:111:8007::53]} Hosts:4 LocalDomains:[tail93030.ts.net.]+65arpa}
2024/01/15 10:31:01 dns: OScfg: {Nameservers:[100.100.100.100] SearchDomains:[tail93030.ts.net. local.hass.io.] }
2024/01/15 10:31:01 rename of "/etc/resolv.conf" to "/etc/resolv.pre-tailscale-backup.conf" failed (rename /etc/resolv.conf /etc/resolv.pre-tailscale-backup.conf: device or resource busy), falling back to copy+delete
2024/01/15 10:31:01 peerapi: serving on http://100.120.253.33:61268
2024/01/15 10:31:01 peerapi: serving on http://[fd7a:115c:a1e0::ab8:fd21]:63764
2024/01/15 10:31:01 listening on [fd7a:115c:a1e0::ab8:fd21]:5252
2024/01/15 10:31:01 listening on 100.120.253.33:5252
2024/01/15 10:31:01 magicsock: home is now derp-4 (fra)
2024/01/15 10:31:01 magicsock: adding connection to derp-4 for home-keep-alive
2024/01/15 10:31:01 magicsock: 1 active derp conns: derp-4=cr0s,wr0s
2024/01/15 10:31:01 control: NetInfo: NetInfo{varies=false hairpin=false ipv6=false ipv6os=true udp=true icmpv4=false derp=#4 portmap= link="" firewallmode="ipt-default"}
2024/01/15 10:31:01 derphttp.Client.Connect: connecting to derp-4 (fra)
2024/01/15 10:31:01 Switching ipn state Starting -> Running (WantRunning=true, nm=true)
2024/01/15 10:31:01 magicsock: endpoints changed: _____MY_IP___ (stun), 172.17.0.1:36002 (local), 172.30.32.1:36002 (local), 192.168.0.114:36002 (local)
2024/01/15 10:31:01 magicsock: derp-4 connected; connGen=1
Warning: IPv6 forwarding is disabled.
Subnet routes and exit nodes may not work correctly.
See https://tailscale.com/s/ip-forwarding
zcat: /proc/config.gz: No such file or directory
s6-rc: info: service post-tailscaled successfully started
s6-rc: info: service mss-clamping: starting
s6-rc: info: service taildrop: starting
s6-rc: info: service taildrop successfully started
zcat: /proc/config.gz: No such file or directory
[10:31:02] WARNING: Altering the MSS is not supported due to missing kernel module,
[10:31:02] WARNING: skip clamping the MSS to the MTU for all advertised subnet's interface
s6-rc: info: service mss-clamping successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service mss-clamping: stopping
s6-rc: info: service taildrop: stopping
s6-rc: info: service nginx: stopping
s6-rc: info: service taildrop successfully stopped
s6-rc: info: service mss-clamping successfully stopped
s6-rc: info: service post-tailscaled: stopping
s6-rc: info: service post-tailscaled successfully stopped
s6-rc: info: service tailscaled: stopping
s6-rc: info: service protect-subnets: stopping
s6-rc: info: service protect-subnets successfully stopped
s6-rc: info: service nginx successfully stopped
s6-rc: info: service init-nginx: stopping
s6-rc: info: service web: stopping
s6-rc: info: service tailscaled successfully stopped
s6-rc: info: service init-nginx successfully stopped
s6-rc: info: service web successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service base-addon-log-level: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service base-addon-log-level successfully stopped
s6-rc: info: service base-addon-banner: stopping
s6-rc: info: service base-addon-banner successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped

lmagyar commented 8 months ago

Oh, I see, /proc/config.gz: No such file or directory, your kernel needs this:

CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y

But who knows what else is different on your OS? E.g. Tailscale itself reports other problems, see:

Warning: IPv6 forwarding is disabled.
Subnet routes and exit nodes may not work correctly.
See https://tailscale.com/s/ip-forwarding

I second sinclairpaul, you are better off with HAOS (optionally in a VM).

shirou93 commented 8 months ago

Debian does not use config.gz.

I solved the issue by adding:

net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1

to /etc/sysctl.conf
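To confirm that the two forwarding settings are actually active, their current values can be read back from procfs. A minimal sketch, not part of the add-on: the function name is hypothetical, and the optional argument that overrides the procfs root exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Report whether IPv4 and IPv6 forwarding are enabled by reading procfs.
# $1 optionally overrides the root directory (default: /proc/sys).
# Returns 0 only if both values are 1.
check_forwarding() {
    root="${1:-/proc/sys}"
    rc=0
    for key in net/ipv4/ip_forward net/ipv6/conf/all/forwarding; do
        # Fall back to the word "missing" if the file cannot be read
        value=$(cat "$root/$key" 2>/dev/null || echo missing)
        echo "$key = $value"
        [ "$value" = "1" ] || rc=1
    done
    return $rc
}
```

After editing /etc/sysctl.conf, running `sysctl -p` as root loads the file without a reboot, and `check_forwarding` should then report 1 for both keys.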

Thanks for your time :)