opnsense / core

OPNsense GUI, API and systems backend
https://opnsense.org/
BSD 2-Clause "Simplified" License

radvd does not start up with 22.1 #5453

Closed brad0 closed 2 years ago

brad0 commented 2 years ago

I upgraded to 22.1 and radvd no longer starts up.

The web UI reported an error:

PHP Warning: Invalid argument supplied for foreach() in /usr/local/www/services_router_advertisements.php on line 334

OPNsense 22.1.b_141-amd64

OPNsense-bot commented 2 years ago

Thank you for creating an issue. Since the ticket doesn't seem to be using one of our templates, we're marking this issue as low priority until further notice.

For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.

The easiest option to gain traction is to close this ticket and open a new one using one of our templates.

fichtner commented 2 years ago

@brad0 the error should be in the system log:

# opnsense-log | grep radvd

I don't think the PHP warning is related just yet.

Cheers, Franco

brad0 commented 2 years ago

It doesn't show anything. Anything else to check?

agh1467 commented 2 years ago

@brad0 Can you try: opnsense-log routing

brad0 commented 2 years ago

version 2.19 started
warning: AdvRDNSSLifetime <= 2*MaxRtrAdvInterval would allow stale DNS servers to be deleted faster
warning: (/usr/local/etc/radvd.conf:111) AdvRDNSSLifetime <= 2*MaxRtrAdvInterval would allow stale DNS servers to be deleted faster
warning: AdvDNSSLLifetime <= 2*MaxRtrAdvInterval would allow stale DNS suffixes to be deleted faster
lo not found: Device not configured
lo not found: Device not configured
exiting, 1 sigterm(s) received
sending stop adverts
lo not found: Device not configured
removing /var/run/radvd.pid
returning from radvd main

fichtner commented 2 years ago

Tracking WAN on a manual loopback device? I've seen before this doesn't work.

brad0 commented 2 years ago

> Tracking WAN on a manual loopback device? I've seen before this doesn't work.

I have not knowingly configured anything regarding the loopback interface. All I did configure was the VLAN interfaces and that's it.

fichtner commented 2 years ago

Well, in any case this is not normal so at least for community support I’d take one more look at the following:

# ifconfig
# cat /var/etc/radvd.conf

Cheers, Franco

brad0 commented 2 years ago

root@inet-fw:~ # ifconfig -a
em0: flags=8822<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=481249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER,NOMAP>
        ether 18:03:73:31:54:49
        media: Ethernet autoselect
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ix0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: IX
        options=4e538bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether a0:36:9f:b3:2a:14
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ix1: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4e53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether a0:36:9f:b3:2a:15
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
enc0: flags=41<UP,RUNNING> metric 0 mtu 1536
        groups: enc
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
pflog0: flags=20100<PROMISC,PPROMISC> metric 0 mtu 33160
        groups: pflog
pfsync0: flags=0<> metric 0 mtu 1500
        groups: pfsync
ix1_vlan3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: MAIN
        options=4600703<RXCSUM,TXCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether a0:36:9f:b3:2a:15
        inet 192.168.3.2 netmask 0xffffff00 broadcast 192.168.3.255
        inet 192.168.3.1 netmask 0xffffffff broadcast 192.168.3.1 vhid 3
        inet6 2001:470:b050:3::2 prefixlen 64
        inet6 fe80::a236:9fff:feb3:2a15%ix1_vlan3 prefixlen 64 scopeid 0x8
        groups: vlan
        carp: MASTER vhid 3 advbase 1 advskew 0
        vlan: 3 vlanproto: 802.1q vlanpcp: 0 parent interface: ix1
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ix1_vlan4: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: JUMBO
        options=4600703<RXCSUM,TXCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether a0:36:9f:b3:2a:15
        inet 192.168.4.2 netmask 0xffffff00 broadcast 192.168.4.255
        inet 192.168.4.1 netmask 0xffffffff broadcast 192.168.4.1 vhid 4
        inet6 2001:470:b050:4::2 prefixlen 64
        inet6 fe80::a236:9fff:feb3:2a15%ix1_vlan4 prefixlen 64 scopeid 0x9
        groups: vlan
        carp: MASTER vhid 4 advbase 1 advskew 0
        vlan: 4 vlanproto: 802.1q vlanpcp: 0 parent interface: ix1
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ix1_vlan5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: PUBLICWIFI
        options=4600703<RXCSUM,TXCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether a0:36:9f:b3:2a:15
        inet 192.168.5.2 netmask 0xffffff00 broadcast 192.168.5.255
        inet 192.168.5.1 netmask 0xffffffff broadcast 192.168.5.1 vhid 5
        inet6 2001:470:b050:5::2 prefixlen 64
        inet6 fe80::a236:9fff:feb3:2a15%ix1_vlan5 prefixlen 64 scopeid 0xa
        groups: vlan
        carp: MASTER vhid 5 advbase 1 advskew 0
        vlan: 5 vlanproto: 802.1q vlanpcp: 0 parent interface: ix1
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ix1_vlan6: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: DYNACERT
        options=4600703<RXCSUM,TXCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether a0:36:9f:b3:2a:15
        inet 192.168.6.2 netmask 0xffffff00 broadcast 192.168.6.255
        inet 192.168.6.1 netmask 0xffffffff broadcast 192.168.6.1 vhid 6
        inet6 2001:470:b050:6::2 prefixlen 64
        inet6 fe80::a236:9fff:feb3:2a15%ix1_vlan6 prefixlen 64 scopeid 0xb
        groups: vlan
        carp: MASTER vhid 6 advbase 1 advskew 0
        vlan: 6 vlanproto: 802.1q vlanpcp: 0 parent interface: ix1
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ovpns2: flags=8010<POINTOPOINT,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        groups: tun openvpn
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
ovpnc1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet 192.168.180.2 --> 192.168.180.1 netmask 0xffffff00
        inet6 2001:470:b0db:180::1000 prefixlen 64
        inet6 fe80::1a03:73ff:fe31:5449%ovpnc1 prefixlen 64 scopeid 0xd
        groups: tun openvpn
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 63830
pppoe0: flags=88d1<UP,POINTOPOINT,RUNNING,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1492
        description: WAN
        inet 142.114.5.252 --> 10.11.5.161 netmask 0xffffffff
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
gif0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1440
        description: HENETV6
        options=80000<LINKSTATE>
        tunnel inet 142.114.5.252 --> 216.66.38.58
        inet6 2001:470:1c:70::2 --> 2001:470:1c:70::1 prefixlen 128
        inet6 fe80::1a03:73ff:fe31:5449%gif0 prefixlen 64 scopeid 0xf
        groups: gif
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
root@inet-fw:~ # cat /var/etc/radvd.conf
# Automatically generated, do not edit

fichtner commented 2 years ago

Not sure what is going on, but radvd.conf is empty, so maybe no tracking is enabled? That should have been mentioned in the initial post.
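
A quick way to confirm that the generated file really has no interface blocks is to test for any line that is neither blank nor a comment. This is only a sketch; the helper name conf_is_effectively_empty is made up for illustration and is not part of OPNsense:

```shell
#!/bin/sh
# Hypothetical helper (not an OPNsense command).
# Returns 0 if the file is missing, or contains only blank lines and
# '#' comments; returns 1 as soon as any real configuration line exists.
conf_is_effectively_empty() {
    conf_file=$1
    # A missing file is treated the same as an empty one.
    [ -f "$conf_file" ] || return 0
    # Match any line whose first non-blank character is not '#'.
    if grep -Eq '^[[:space:]]*[^#[:space:]]' "$conf_file"; then
        return 1
    fi
    return 0
}
```

For example: conf_is_effectively_empty /var/etc/radvd.conf && echo "no interface blocks were generated"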

agh1467 commented 2 years ago

I find it strange that the log messages reference the example configuration file with line number: /usr/local/etc/radvd.conf:111

That file contains only a block for the loopback interface:

interface lo
{
        AdvSendAdvert on;

I'm not sure why radvd would be using that file over the one in /var/etc/ though, as it's hard coded to use the /var/etc conf:

https://github.com/opnsense/core/blob/master/src/etc/inc/plugins.inc.d/dhcpd.inc#L567

mwexec('/usr/local/sbin/radvd -p /var/run/radvd.pid -C /var/etc/radvd.conf -m syslog');

Maybe a red herring, unless there is another means of starting it somewhere else.
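
To see which configuration file radvd actually parsed, the "(path:line)" markers in its warnings can be pulled out of the log. A minimal sketch, assuming the warning format shown in the log output above; radvd_conf_paths_from_log is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints each distinct
# file path that appears in a "(/some/path:LINE)" marker.
radvd_conf_paths_from_log() {
    # 1. Keep only the "(/path:NNN)" fragments.
    # 2. Strip the parentheses and the trailing ":NNN".
    # 3. Collapse duplicates.
    grep -Eo '\(/[^:() ]+:[0-9]+\)' \
        | sed -E 's/^\((.*):[0-9]+\)$/\1/' \
        | sort -u
}
```

For example: opnsense-log routing | radvd_conf_paths_from_log

If that prints /usr/local/etc/radvd.conf rather than /var/etc/radvd.conf, the daemon was not started with the -C argument shown above.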

fichtner commented 2 years ago

If it falls back to the default when the given config is empty, that could explain it, but it’s really not that relevant in the grand scheme of things. I can’t see what is broken that should work (likely a configuration issue).

agh1467 commented 2 years ago

I was able to reproduce those messages by manually starting radvd with /usr/local/etc/rc.d/radvd onestart

radvd[5628]: version 2.19 started 
radvd[5628]: warning: AdvRDNSSLifetime <= 2*MaxRtrAdvInterval would allow stale DNS servers to be deleted faster 
radvd[5628]: warning: (/usr/local/etc/radvd.conf:111) AdvRDNSSLifetime <= 2*MaxRtrAdvInterval would allow stale DNS servers to be deleted faster 
radvd[5628]: warning: AdvDNSSLLifetime <= 2*MaxRtrAdvInterval would allow stale DNS suffixes to be deleted faster 
radvd[19904]: lo not found: Device not configured 

/usr/local/etc/rc.d/radvd references that config file:

load_rc_config $name
: ${radvd_enable="NO"}
: ${radvd_config="/usr/local/etc/${name}.conf"}

I think radvd_enable="YES" somehow got added to the rc configuration, and radvd is trying to start through rc.

I did a quick grep through the source and I don't see a provision for setting that to "YES", and on my system where radvd is in use, radvd_enable isn't put into the rc.conf system.
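
To double-check, something like the following could report the last value assigned to an rc variable. It is a rough sketch with a made-up helper name (rc_var_value) and a deliberately simplified parser; which rc files matter on a given install is an assumption, and on FreeBSD sysrc -n radvd_enable should give the authoritative answer:

```shell
#!/bin/sh
# Hypothetical helper: prints the last value assigned to an rc variable
# across the given files (later files and later lines win).
# Usage: rc_var_value VAR FILE...
rc_var_value() {
    var_name=$1
    shift
    for rc_file in "$@"; do
        # Skip files that do not exist instead of failing.
        if [ -f "$rc_file" ]; then
            cat "$rc_file"
        fi
    done \
        | sed -n "s/^[[:space:]]*${var_name}=[\"']\{0,1\}\([^\"'#[:space:]]*\).*/\1/p" \
        | tail -n 1
}
```

For example: rc_var_value radvd_enable /etc/rc.conf /etc/rc.conf.local

An empty result means the variable is not set in those files, so the rc script's default of "NO" would apply.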

fichtner commented 2 years ago

Yep, legacy service integration does not use rc.d… we only started using it after forking.

brad0 commented 2 years ago

So what do I need to do to fix this?

fichtner commented 2 years ago

I’m unsure what the goalpost is. You keep giving no further information on what you actually expect other than radvd starting, which is irrelevant without configuration and ISP considerations.

brad0 commented 2 years ago

radvd starts and runs, like it did (without changing anything) before updating from 21.7 to 22.1.

My setup is very simple and straightforward. It's using a 6in4 tunnel with static IPs on each VLAN interface.

[Attached screenshots: Screenshot_4, Screenshot_5]

root@inet-fw:~ # pluginctl -s radvd start
Service `radvd' has been started.
root@inet-fw:~ # ps -auxwww | grep radvd
root    25618   0.0  0.1  12740  2204  0  S+   03:37       0:00.00 grep radvd

I don't see a verbose flag for pluginctl. How do you see what is going on?
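
In the meantime, a more direct check than grepping ps would be to test the PID file that radvd mentions in its log (/var/run/radvd.pid). A small sketch; pidfile_is_alive is a hypothetical helper, not an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the PID file exists and names a
# process that can be signalled, 1 otherwise.
# Note: kill -0 also fails for a live process owned by another user
# when run without sufficient privileges, so run this as root.
pidfile_is_alive() {
    pid_file=$1
    [ -r "$pid_file" ] || return 1
    # Take the first line and keep only digits.
    pid=$(head -n 1 "$pid_file" | tr -cd '0-9')
    [ -n "$pid" ] || return 1
    # Signal 0 checks for existence without affecting the process.
    kill -0 "$pid" 2>/dev/null
}
```

For example: pidfile_is_alive /var/run/radvd.pid && echo "radvd is running" || echo "radvd is not running"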

fichtner commented 2 years ago

Radvd only starts when you configure tracking interfaces, but then you need a prefix larger than /64 on your tunnel anyway. Apart from radvd not starting, what functionality have you actually lost?

brad0 commented 2 years ago

That makes absolutely no sense. Who broke things?

A network that doesn't work. Even if I didn't care about SLAAC (which I do) RA is required.

fichtner commented 2 years ago

I’m sorry to say this is a waste of both of our time.

megmug commented 2 years ago

I had the same issue of radvd not starting after the update from 21.7.8 to 22.1. I have multiple interfaces configured with static IPv6 + DHCP6 + managed RA, so in my understanding it definitely needs to run. On the old version radvd was running; after the update it was not. On the old version DHCP6 was working; after the update DHCP6, and therefore IPv6 on these managed networks, was completely broken (apart from the statically configured IP6s). I can't say anything about SLAAC since we don't use it on any network. Trying to start radvd manually didn't help. Unfortunately, log entries were nowhere to be found. Curiously, /var/etc/radvd.conf was completely empty, which can't be correct.

I've found a workaround though, which I wanted to share here: If you go to Services -> Router Advertisements -> Some Interface and just click on save, the config will be rewritten and radvd will start up again. In my case, this fixed the DHCP6 problems. I did the save thing for any interface that was shown in the RA menu there to be sure. The fix seems to persist across reboots as well, so I think it only needs to be applied once after the update to fix things.

So I guess this is some kind of problem where the config gets lost during the upgrade and doesn't get rewritten afterwards. And btw:

> Radvd only starts when you configure tracking Interfaces

Either I'm misunderstanding something or this claim is false: I don't have any tracking interfaces (or does a DHCP6-only managed interface also count as tracking?) and radvd is still needed, so it needs to be started (it did start in prior versions, and now starts again after applying the workaround).

kasper93 commented 2 years ago

> I’m sorry to say this is a waste of both of our time.

@fichtner: I’m sorry to say this is a waste of my time... and that of many more people who have broken IPv6 after the update, because you released a broken version while this had been a known issue for a month. Sorry, but I simply don't understand your dismissive attitude towards @brad0. It is not his job to debug and fix the issue.

Anyway, back to the issue. /var/etc/radvd.conf is cleared during the update even though the configuration looks OK in the web GUI; one needs to change something, save, and change it back to regenerate radvd.conf. @megmug seems to confirm that. Once you start digging it is easy to fix, but I guess you can agree that desynchronized settings between the GUI and the system are not something that immediately points to a solution. And not everyone, like @brad0, has to know how to fix that.

> That makes absolutely no sense. Who broke things?

@brad0: They did... You did nothing wrong. Upgrade process broke your settings.

fichtner commented 2 years ago

Closing for heated discussion.