openwrt / openwrt

This repository is a mirror of https://git.openwrt.org/openwrt/openwrt.git It is for reference only and is not active for check-ins. We will continue to accept Pull Requests here. They will be merged via staging trees then into openwrt.git.

netifd: not bringing up tunnel interface with NO_DEVICE #14616

Open danog opened 9 months ago

danog commented 9 months ago

Describe the bug

Randomly, changing some configuration parameters or creating a 6in4 or wireguard interface leads to netifd not being able to bring up the interface when running ifup, with ifstatus returning NO_DEVICE.

OpenWrt version

r23497-6637af95aa

OpenWrt release

23.05.0

OpenWrt target/subtarget

ramips/mt7621

Device

UniElec U7621-06 (16M flash)

Image kind

Official downloaded image

Steps to reproduce

Config:

config interface 'wan6'
        option proto '6in4'
        option peeraddr 'xxx.xxx.xxx.xxx'
        option ip6addr '2a0e:97c0:38f:fffd::1/64'
        list ip6prefix '2a0e:97c0:38f::/64'
        option ip6assign '64'
        option mtu '1472'
        option ipaddr 'yyy.yyy.yyy.yyy'
        option ttl '255'

Debug config in /etc/init.d/network:

start_service() {
        init_switch

        procd_open_instance
        procd_set_param command /sbin/netifd -l 5 -d 15
        procd_set_param stdout 1
        procd_set_param stderr 1
        procd_set_param respawn
        procd_set_param watch network.interface
        [ -e /proc/sys/kernel/core_pattern ] && {
                procd_set_param limits core="unlimited"
        }
        procd_close_instance
}

Logs (no errors or debug-level logs strangely):

Sun Feb 11 17:42:59 2024 daemon.err netifd[1493]: device_apply_config(1188): Device 'br-lan': config applied
Sun Feb 11 17:42:59 2024 daemon.err netifd[1493]: interface_update(1402): Update interface 'loopback'
Sun Feb 11 17:42:59 2024 daemon.err netifd[1493]: interface_update(1402): Update interface 'lan'
Sun Feb 11 17:42:59 2024 daemon.err netifd[1493]: interface_add_dns_server(1435): Add IPv4 DNS server: 192.168.69.1
Sun Feb 11 17:42:59 2024 daemon.err netifd[1493]: interface_update(1402): Update interface 'wan'
Sun Feb 11 17:42:59 2024 daemon.err netifd[1493]: interface_update(1402): Update interface 'wan6'
Sun Feb 11 17:42:59 2024 daemon.err netifd[1493]: interface_update(1402): Update interface 'wg_securebit'
Sun Feb 11 17:42:59 2024 daemon.err netifd[1493]: wdev_update(754): Update wireless device 'radio0'
Sun Feb 11 17:42:59 2024 daemon.err netifd[1493]: wdev_update(754): Update wireless device 'radio1'

Creating the interface through luci yields the same result.

Actual behaviour

Running ifup does not create the iface, with ifstatus returning NO_DEVICE.

Same thing happens when restarting the interface through luci, even across reboots.

Deleting and recreating the config of the exact same interface a few times fixes it.
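
A minimal sketch of how the failing state described above could be detected from a script. It assumes that `ifstatus` prints JSON in which the error appears as a `"code": "NO_DEVICE"` entry; the exact layout may differ between netifd versions, and the helper name `has_no_device` is hypothetical.

```shell
#!/bin/sh
# has_no_device: read the JSON printed by `ifstatus <iface>` on stdin and
# exit 0 if it contains the NO_DEVICE error code, non-zero otherwise.
# Assumption: the error is reported as a "code": "NO_DEVICE" pair.
has_no_device() {
    grep -q '"code"[[:space:]]*:[[:space:]]*"NO_DEVICE"'
}

# Hypothetical usage on the router:
#   ifstatus wan6 | has_no_device && echo "wan6: NO_DEVICE"
```

This only inspects text, so it can also be run against a saved copy of the `ifstatus` output.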

Expected behaviour

The interface gets created correctly.

Additional info

No response

Diffconfig

No response

brada4 commented 9 months ago

How did you get any effect whatsoever from debian ifupdown config files?

danog commented 9 months ago

That was a typo; I meant to point to /etc/init.d/network, with the edited lines that enable logging. As you can see, the files are clearly UCI configs created by LuCI.

brada4 commented 9 months ago

Remove the 2 other occurrences of AI spellchecker and let's get to solving the issue.

danog commented 9 months ago

I do not use AI or chatbots to generate code or issues. I kindly ask you to stop being toxic and try to investigate the issue.

danog commented 9 months ago

I'm quite frankly amazed at the toxicity I've encountered in all the openwrt issues I've opened so far. I have taken some time to report an issue that is actually quite frequent, judging from reports I see all over the internet, and so far I've only received toxic replies and redirects instead of support. I understand openwrt is a FOSS project maintained by volunteers; I too maintain multiple projects and reply and contribute when I can and if I have time, but this is no excuse to be this toxic.

I am sure you have received many AI-generated issues made by noobs who do not understand a line of what they're copy-pasting; I myself have recently received many such issues and support requests. But this is no reason to be this toxic: I just made a few typos, easily made when hastily opening an issue from a phone in what little free time I have.

brada4 commented 9 months ago

You need option tunlink 'wan' to hint at ordering, so that the tunnel waits for the lower interface to come up.

ip link should show whether the 6in4 tunnel was set up correctly, and the wan interface config is relevant.

Maybe you are careless, but I'm no 3x spellchecker.

danog commented 9 months ago

  1. The underlying wan interface is already up.
  2. There is no way to specify that parameter via LuCI that I can see, and it is not mentioned in the guide at https://openwrt.org/docs/guide-user/network/ipv6/ipv6tunnel-luci, even if it is a valid parameter.
  3. Adding that parameter manually, running /etc/init.d/network reload and then ifup cesi does not bring up the iface.

Config:

config interface 'wan'
        option device 'wan'
        option proto 'pppoe'
        option username 'user'
        option password 'pw'
        option ipv6 'auto'
        option peerdns '0'
        list dns '192.168.69.1'
        option defaultroute '0'

config interface 'cesi'
        option tunlink 'wan'
        option proto '6in4'
        option peeraddr 'xxx.xxx.xxx.xxx'
        option ip6addr '2a0e:97c0:38f:1234::2/24'
        option ipaddr 'yyy.yyy.yy.yyy'

Logread when running /etc/init.d/network reload and then ifup cesi:

Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: device_apply_config(1188): Device 'br-lan': config applied
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'loopback'
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'lan'
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_add_dns_server(1435): Add IPv4 DNS server: 192.168.69.1
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'wan'
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'sb'
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'cesi'
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_change_config(1363): Reload interface 'cesi' because of config changes
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_set_available(487): Interface 'cesi', available=1
Tue Feb 13 16:55:38 2024 daemon.notice netifd: Interface 'cesi' is setting up now
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: proto_shell_handler(231): run setup for interface 'cesi'
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: wdev_update(754): Update wireless device 'radio0'
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: wdev_update(754): Update wireless device 'radio1'
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_set_available(487): Interface 'cesi', available=0
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: proto_shell_handler(231): run teardown for interface 'cesi'
Tue Feb 13 16:55:38 2024 daemon.notice netifd: Interface 'cesi' is now down
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: interface_queue_event(124): Queue hotplug handler for interface 'cesi', event 'ifdown'
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: call_hotplug(100): Call hotplug handler for interface 'cesi', event 'ifdown' (none)
Tue Feb 13 16:55:38 2024 daemon.err netifd[1495]: task_complete(109): Complete hotplug handler for interface 'cesi'
Tue Feb 13 16:55:41 2024 daemon.err netifd[1495]: device_apply_config(1188): Device 'br-lan': config applied
Tue Feb 13 16:55:41 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'loopback'
Tue Feb 13 16:55:41 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'lan'
Tue Feb 13 16:55:41 2024 daemon.err netifd[1495]: interface_add_dns_server(1435): Add IPv4 DNS server: 192.168.69.1
Tue Feb 13 16:55:41 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'wan'
Tue Feb 13 16:55:41 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'sb'
Tue Feb 13 16:55:41 2024 daemon.err netifd[1495]: interface_update(1402): Update interface 'cesi'
Tue Feb 13 16:55:41 2024 daemon.err netifd[1495]: wdev_update(754): Update wireless device 'radio0'
Tue Feb 13 16:55:41 2024 daemon.err netifd[1495]: wdev_update(754): Update wireless device 'radio1'
root@OpenWrt:~# ip link | grep cesi
root@OpenWrt:~#

Note that the issue is random: sometimes running ifup multiple times brings up the interface anyway.

danog commented 9 months ago

Another big issue I see is the complete absence of any logs that indicate the source of the error. Why is the interface not available in interface_set_available(487): Interface 'cesi', available=0?

How can I further debug the issue?
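
A small sketch for isolating the relevant lines from the log excerpts above. It assumes netifd keeps logging interface names in single quotes and availability changes as `available=0` or `available=1`, as in the excerpts; the helper names `iface_events` and `availability` are hypothetical.

```shell
#!/bin/sh
# iface_events NAME: read log lines on stdin and print only those that
# mention the interface NAME in single quotes, the way netifd logs it.
iface_events() {
    grep -F "'$1'"
}

# availability NAME: reduce those lines to the available=N transitions.
availability() {
    iface_events "$1" | sed -n 's/.*available=\([01]\).*/available=\1/p'
}

# Hypothetical usage on the router:
#   logread | availability cesi
```

This does not produce any additional logging; it only makes the order of the existing availability transitions easier to read.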

danog commented 9 months ago

An additional detail: as you can see, the wan interface does not have a default route configured. This is on purpose, as I have configured IPv4 RPKI filtering.

However, an explicit static route is separately configured towards the destination of the tunnel, via the next hop on the wan, and the destination of the tunnel is pingable, even if the tunnel interface still won't come up.

To reiterate, the issue is random: sometimes running ifup multiple times brings up the interface anyway.

brada4 commented 9 months ago

tunlink should be the interface that has the 6in4 local IPv4 address configured, likely pppoe-wan. Alternatively, you can set force_link 1 on the pppoe-involved interfaces so that they come up without the lower layer being up; you will lose the first few IPv6 pings, but be happy after.

danog commented 9 months ago

Using pppoe-wan instead of wan as tunlink does not fix the issue.

The strangest thing is that I've recently successfully created one more tunnel, with a nearly identical config, and it came up, but the other one does not:

config interface 'wan6'
        option proto '6in4'
        option peeraddr 'xxx.xxx.xxx.xxx'
        option ip6addr '2a0e:97c0:38f:ffff::2/64'
        list ip6prefix '2a0e:97c0:38f:0::/64'
        option ipaddr 'yyy.yyy.yyy.yyy'

config interface 'cesi'
        option proto '6in4'
        option peeraddr 'zzz.zzz.zzz.zzz'
        option ip6addr '2a0e:97c0:38f:1234::2/24'
        list ip6prefix '2a0e:97c0:38f:4321::/64'
        option ipaddr 'yyy.yyy.yyy.yyy'

cesi is not coming up after an ifup; wan6 is coming up just fine.

Already tried specifying wan and pppoe-wan as tunlink.

danog commented 9 months ago

Again, is there any way to enable even more logging, to get the exact reason why bringing up the interface failed?

brada4 commented 9 months ago

Logging: https://openwrt.org/docs/techref/netifd

zzzzz and xxxxx should run 6in4 tunnel endpoints; you can tcpdump on any involved interface while ping6-ing the other end and see if any response arrives.

danog commented 9 months ago

The endpoints on the other side are obviously running 6in4 tunnels. The problem is that netifd does not bring up the interface: it is absent from ip link readouts, and the LuCI UI shows "Network device not present" errors when trying to start the tunnel.

Creating the interface manually with

ip link add name 6in4-cesi type sit local yyy.yyy.yyy.yyy remote zzz.zzz.zzz.zzz

works perfectly.

I have already enabled logging as described in the first post, with no result.
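
For reference, a hedged sketch of the full manual bring-up that the `ip link add` command above is the first step of. The helper `sit_cmds` is hypothetical and only prints the iproute2 commands instead of running them; the addresses in the usage line are documentation placeholders, not values from this report.

```shell
#!/bin/sh
# sit_cmds NAME LOCAL4 REMOTE4 IP6ADDR
# Print (do not run) the iproute2 commands that create a 6in4 (sit) tunnel,
# bring it up and assign its IPv6 address. Review the output, then run it
# as root on the router.
sit_cmds() {
    name=$1
    local4=$2
    remote4=$3
    ip6addr=$4
    echo "ip link add name $name type sit local $local4 remote $remote4"
    echo "ip link set dev $name up"
    echo "ip -6 addr add $ip6addr dev $name"
}

# Hypothetical usage with placeholder addresses:
#   sit_cmds 6in4-cesi 192.0.2.1 198.51.100.1 2001:db8::2/64
```

A tunnel created this way is not managed by netifd, so it will not show up in the UCI config and will not survive a reboot.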

brada4 commented 9 months ago

@jow- or @nbd168 should be able to move the issue to the ../netifd tracker.

PS: from past experience, if an obvious config failed, I started with ip/ifconfig/route and then tried to recreate the config via (l)uci.