Closed joaochainho closed 5 years ago
cc @feckert
@joaochainho did this already work with the self interface added directly to the network configuration (as described in https://wiki.openwrt.org/doc/howto/mwan3#the_routable_loopback_self)?
I only automated this!
The only problem I can think of is that I do this without netifd knowing about it. I am using the ip command directly:
https://github.com/openwrt/packages/blob/7b34c7689c1ffa2bf772987c787a038c23e74634/net/mwan3/files/etc/hotplug.d/iface/14-mwan3#L27-L38
Maybe I have to update netifd as well.
Thanks for the feedback.
Hi @feckert, thanks for your quick reply.
I forgot to mention that this also happens with the 'self' interface added directly to the network. The device I'm having trouble with isn't a PC or mobile/tablet. It's a non-Linux embedded device with a custom OS and IP stack. So maybe the issue isn't related to MWAN3. Could it be something related to dnsmasq itself?
BTW, should DHCP Reply messages be sent through the 'lo' interface? Here's a simultaneous dump from both the 'lan' and 'lo' interfaces.
root@LEDE:~# tcpdump -ni br-lan portrange 67-68
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-lan, link-type EN10MB (Ethernet), capture size 262144 bytes
10:24:39.825620 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
10:24:42.835626 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
10:24:42.844432 IP 192.168.90.254.67 > 192.168.90.115.68: BOOTP/DHCP, Reply, length 300
10:24:45.845478 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
10:24:48.855405 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
10:24:48.857965 IP 192.168.90.254.67 > 192.168.90.115.68: BOOTP/DHCP, Reply, length 300
root@LEDE:~# tcpdump -ni lo portrange 67-68
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
10:24:39.827725 IP 192.168.90.254.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
10:24:45.847131 IP 192.168.90.254.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
10:24:51.866793 IP 192.168.90.254.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
No other device I tested behaves this way, and only with this device do I see DHCP-related packets sent through the 'lo' interface. Here's a dump from a 'normal' device.
root@LEDE:~# tcpdump -ni br-lan portrange 67-68
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-lan, link-type EN10MB (Ethernet), capture size 262144 bytes
10:42:11.289839 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 314
10:42:11.294436 IP 192.168.90.254.67 > 192.168.90.153.68: BOOTP/DHCP, Reply, length 318
10:42:14.934019 IP 192.168.90.153.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
10:42:14.937379 IP 192.168.90.254.67 > 192.168.90.153.68: BOOTP/DHCP, Reply, length 300
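To compare captures like the ones above more quickly, the replies can be tallied by destination. This is only a minimal sketch: the function name count_reply_destinations is made up for illustration, and it assumes the one-line tcpdump format shown in these dumps.

```shell
#!/bin/sh
# Count DHCP replies by destination type in tcpdump text read from stdin.
# Assumes the one-line output format shown in the captures above.
count_reply_destinations() {
    awk '
        /BOOTP\/DHCP, Reply/ {
            # The destination is the field after ">"; strip the trailing ":".
            for (i = 1; i < NF; i++) if ($i == ">") { dst = $(i + 1); break }
            sub(/:$/, "", dst)
            if (dst == "255.255.255.255.68") bcast++; else ucast++
        }
        END { printf "broadcast=%d unicast=%d\n", bcast, ucast }
    '
}
```

For example, `tcpdump -ni lo -c 20 portrange 67-68 | count_reply_destinations` would summarize the next 20 DHCP packets seen on 'lo'; the same pipeline with br-lan gives the other side of the comparison.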
I will test this again (hopefully tomorrow) with the 'self' interface added directly to the network and without MWAN3. This time I'll be running a firmware taken from the snapshots repository, so a minimal number of variables are at play.
Tested with the 'self' interface added directly to the network and without MWAN3, and I stopped having the issue. DHCP works fine and no DHCP related packets are being sent through the 'lo' interface. I also tested with MWAN3 installed but with all its interfaces disabled, and DHCP also works fine. So it seems that it's something connected to MWAN3. Hope this helps.
@joaochainho thanks for the feedback. That means you have added the self interface to the network config. And mwan3 is completely disabled. Then everything works as expected.
If you enable a wwan interface in mwan3 then firewall and routing are applied. After this the device will not get an IP. All other devices still get an IP. Is this correct?
Hi @feckert, thanks once more for your feedback. Regarding your comments:
That means you have added the self interface to the network config.
Yes.
And mwan3 is completely disabled.
Yes, all interfaces (wan and wwan) are disabled in the mwan3 config, and globals --> local_source = none.
Then everything works as expected.
I checked again and saw that the device does get an IP address, but only after several attempts. No packets are sent through the 'lo' interface though, so it's different from when mwan3 is enabled.
# tcpdump
-----------
14:57:12.942739 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
14:57:15.952514 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
14:57:16.186287 IP 192.168.90.254.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
14:57:16.189694 IP 192.168.90.254.67 > 192.168.90.115.68: BOOTP/DHCP, Reply, length 300
14:57:16.202831 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
14:57:16.206324 IP 192.168.90.254.67 > 192.168.90.115.68: BOOTP/DHCP, Reply, length 300
14:57:19.212285 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
14:57:19.215366 IP 192.168.90.254.67 > 192.168.90.115.68: BOOTP/DHCP, Reply, length 300
14:57:22.222100 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
14:57:22.225179 IP 192.168.90.254.67 > 192.168.90.115.68: BOOTP/DHCP, Reply, length 300
14:57:25.231911 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
14:57:25.234958 IP 192.168.90.254.67 > 192.168.90.115.68: BOOTP/DHCP, Reply, length 300
14:57:25.251965 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
14:57:25.253627 IP 192.168.90.254.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
14:57:25.272240 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300
14:57:25.275216 IP 192.168.90.254.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
# Logs
-----------
Fri Sep 8 14:57:16 2017 daemon.info dnsmasq-dhcp[1509]: DHCPDISCOVER(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:16 2017 daemon.info dnsmasq-dhcp[1509]: DHCPOFFER(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:16 2017 daemon.info dnsmasq-dhcp[1509]: DHCPDISCOVER(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:16 2017 daemon.info dnsmasq-dhcp[1509]: DHCPOFFER(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:16 2017 daemon.info dnsmasq-dhcp[1509]: DHCPREQUEST(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:16 2017 daemon.info dnsmasq-dhcp[1509]: DHCPACK(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:19 2017 daemon.info dnsmasq-dhcp[1509]: DHCPREQUEST(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:19 2017 daemon.info dnsmasq-dhcp[1509]: DHCPACK(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:22 2017 daemon.info dnsmasq-dhcp[1509]: DHCPREQUEST(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:22 2017 daemon.info dnsmasq-dhcp[1509]: DHCPACK(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:25 2017 daemon.info dnsmasq-dhcp[1509]: DHCPREQUEST(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:25 2017 daemon.info dnsmasq-dhcp[1509]: DHCPACK(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:25 2017 daemon.info dnsmasq-dhcp[1509]: DHCPDISCOVER(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:25 2017 daemon.info dnsmasq-dhcp[1509]: DHCPOFFER(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:25 2017 daemon.info dnsmasq-dhcp[1509]: DHCPREQUEST(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
Fri Sep 8 14:57:25 2017 daemon.info dnsmasq-dhcp[1509]: DHCPACK(br-lan) 192.168.90.115 xx:xx:xx:xx:xx:xx
# Routing table (eth0: wan, eth1.2: wwan)
-----------
default via 192.168.90.254 dev lo proto static
default via 192.168.100.254 dev eth0 proto static src 192.168.100.205 metric 10
default via 192.168.0.1 dev eth1.2 proto static metric 20
192.168.0.0/24 dev eth1.2 proto static scope link metric 20
192.168.100.0/24 dev eth0 proto static scope link metric 10
192.168.100.254 dev eth0 proto static scope link src 192.168.100.205 metric 10
If you enable a wwan interface in mwan3 then firewall and routing are applied. After this the device will not get an IP. All other devices still get an IP. Is this correct?
That's correct. Even if I enable just one interface in mwan3 (e.g. wan), this device can't get an IP address.
I just activated mwan3 in my environment:
Linksys WRT1900ACS
FW: LEDE r6302-0f54d96d24, LuCI Master (git-18.051.28524-09ea6db)
Kernel Version: 4.9.82
mwan3: 2.6.10-1
Device: HDHomeRun EXTEND
I observed the same behavior as described before by @joaochainho, and also confirmed that the workaround he provided fixes the problem:
root@LEDE:~# ip addr del 192.168.90.254/32 dev lo
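Before deleting anything, it may help to confirm which non-loopback addresses are actually assigned to 'lo'. The helper below is a rough sketch with a hypothetical name (routable_loopback_addrs); it only parses the output of `ip -o -4 addr show dev lo` and changes nothing on its own.

```shell
#!/bin/sh
# List non-loopback IPv4 addresses found in `ip -o -4 addr show dev lo`
# output (read from stdin). Anything outside 127.0.0.0/8 is a candidate
# "routable loopback" address.
routable_loopback_addrs() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "inet" && $(i + 1) !~ /^127\./) print $(i + 1)
    }'
}

# Example (prints the delete commands instead of running them):
#   ip -o -4 addr show dev lo | routable_loopback_addrs |
#       while read -r a; do echo ip addr del "$a" dev lo; done
```

Printing the commands first keeps the workaround reviewable; whether removing the address has side effects on mwan3 is the question raised in this thread.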
Question to @feckert or anybody else close enough to the mwan3 code to confirm: Is there any side effect to this fix on the way mwan3 works? I have checked, and all my rules seem to work just fine.
From my point of view, I am not aware of any side effects. If you add the LAN IP of the router to the lo interface, router-initiated connections/traffic are routed via the mangle table. Independently of this, known networks, for example the LAN, are not routed via the mangle table anyway.
Is there any side effect to this fix on the way mwan3 works?
From my experience, this breaks option local_source 'lan', which means that mwan3 rules won't apply to the traffic originated from the router.
If you don't need that, please try setting option local_source 'none'.
Interestingly enough, I'm also seeing this with the same type of device (SiliconDust HDHomeRun). I'm happy to help further debug, but, as expected, it seems to be a broadcast issue.
I did some more digging. I power-cycled two embedded devices on the LAN. We are sending the HDHomeRun (fails) a DHCP offer to 255.255.255.255, per the DHCP spec. The offer to the Eagle Rainforest (working) is sent to a unicast IP address (which seems odd).
Here are the DHCP DISCOVER and OFFER messages for two embedded devices, one works, one fails:
HDHomeRun (fails):
discover:
13:35:37.221694 00:18:dd:04:69:99 (oui Unknown) > Broadcast, ethertype IPv4 (0x0800), length 342: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 328)
0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:18:dd:04:69:99 (oui Unknown), length 300, xid 0xaca8e9ab, Flags [Broadcast]
Client-Ethernet-Address 00:18:dd:04:69:99 (oui Unknown)
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Discover
Client-ID Option 61, length 7: ether 00:18:dd:04:69:99
Hostname Option 12, length 13: "HDHR-1046999D"
Parameter-Request Option 55, length 3:
Domain-Name-Server, Default-Gateway, Subnet-Mask
offer:
13:35:37.723184 30:b5:c2:96:62:fe (oui Unknown) > Broadcast, ethertype IPv4 (0x0800), length 342: (tos 0xc0, ttl 64, id 14767, offset 0, flags [none], proto UDP (17), length 328)
192.168.88.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300, xid 0xaca8e9ab, Flags [Broadcast]
Your-IP 192.168.88.79
Server-IP 192.168.88.1
Client-Ethernet-Address 00:18:dd:04:69:99 (oui Unknown)
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Offer
Server-ID Option 54, length 4: 192.168.88.1
Lease-Time Option 51, length 4: 14400
RN Option 58, length 4: 7200
RB Option 59, length 4: 12600
Subnet-Mask Option 1, length 4: 255.255.255.0
BR Option 28, length 4: 192.168.88.255
Default-Gateway Option 3, length 4: 192.168.88.1
Domain-Name-Server Option 6, length 4: 192.168.88.1
Eagle rainforest:
discover:
13:36:02.482399 d8:d5:b9:00:0c:5a (oui Unknown) > Broadcast, ethertype IPv4 (0x0800), length 342: (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 328)
0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from d8:d5:b9:00:0c:5a (oui Unknown), length 300, xid 0x35e5c655, secs 6, Flags [none]
Client-Ethernet-Address d8:d5:b9:00:0c:5a (oui Unknown)
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Discover
Client-ID Option 61, length 7: ether d8:d5:b9:00:0c:5a
MSZ Option 57, length 2: 576
Parameter-Request Option 55, length 7:
Subnet-Mask, Default-Gateway, Domain-Name-Server, Hostname
Domain-Name, BR, NTP
Vendor-Class Option 60, length 12: "udhcp 1.22.1"
offer:
13:36:02.985627 30:b5:c2:96:62:fe (oui Unknown) > d8:d5:b9:00:0c:5a (oui Unknown), ethertype IPv4 (0x0800), length 361: (tos 0xc0, ttl 64, id 1569, offset 0, flags [none], proto UDP (17), length 347)
192.168.88.1.67 > 192.168.88.236.68: BOOTP/DHCP, Reply, length 319, xid 0x35e5c655, secs 6, Flags [none]
Your-IP 192.168.88.236
Server-IP 192.168.88.1
Client-Ethernet-Address d8:d5:b9:00:0c:5a (oui Unknown)
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Offer
Server-ID Option 54, length 4: 192.168.88.1
Lease-Time Option 51, length 4: 14400
RN Option 58, length 4: 7200
RB Option 59, length 4: 12600
Subnet-Mask Option 1, length 4: 255.255.255.0
BR Option 28, length 4: 192.168.88.255
Default-Gateway Option 3, length 4: 192.168.88.1
Domain-Name-Server Option 6, length 4: 192.168.88.1
Domain-Name Option 15, length 18: "home.shockwave.org"
Hostname Option 12, length 5: "eagle"
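One difference visible in the two DISCOVER messages above is the BOOTP flags field: the HDHomeRun sets Flags [Broadcast], the Eagle sends Flags [none]. Under RFC 2131 a server broadcasts its reply when the client sets that bit and may unicast otherwise, which would account for the different OFFER destinations. The sketch below pulls out the flag per client from `tcpdump -v` style text; the function name dhcp_request_flags is hypothetical.

```shell
#!/bin/sh
# Print "<client MAC> <flags>" for each DHCP request line in verbose
# tcpdump text read from stdin, e.g. "00:18:dd:04:69:99 Broadcast".
dhcp_request_flags() {
    awk '
        /BOOTP\/DHCP, Request from/ {
            mac = ""; flags = ""
            for (i = 1; i < NF; i++) {
                if ($i == "from")  mac = $(i + 1)
                if ($i == "Flags") flags = $(i + 1)
            }
            # Drop the surrounding square brackets, e.g. "[none]" becomes "none".
            flags = substr(flags, 2, length(flags) - 2)
            print mac, flags
        }
    '
}
```

A possible use is `tcpdump -vni br-lan portrange 67-68 | dhcp_request_flags`, which lists the clients that ask for broadcast replies, i.e. the ones that appear to be affected here.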
Reference debugging (`iptables -t raw -j TRACE -p udp -m udp --dport 67:68`):
HD HomeRun:
Mon Jul 9 13:01:30 2018 kern.warn kernel: [ 2019.318962] TRACE: filter:zone_lan_dest_ACCEPT:rule:1 IN= OUT=br-lan SRC=192.168.88.1 DST=255.255.255.255 LEN=328 TOS=0x00 PREC=0xC0 TTL=64 ID=32317 PROTO=UDP SPT=67 DPT=68 LEN=308 UID=0 GID=0 MARK=0x3f00
(end of filter table processing)
<!!! FAILS HERE, output interface is now lo, not br-lan>
(beginning of mangle table processing)
Mon Jul 9 13:01:30 2018 kern.warn kernel: [ 2019.337500] TRACE: mangle:POSTROUTING:rule:1 IN= OUT=lo SRC=192.168.88.1 DST=255.255.255.255 LEN=328 TOS=0x00 PREC=0xC0 TTL=64 ID=32317 PROTO=UDP SPT=67 DPT=68 LEN=308 UID=0 GID=0 MARK=0x3f00
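To spot the hook where the output interface changes without reading every TRACE line, the log can be reduced to table, chain and OUT= interface. This is a sketch only: trace_out_interfaces is a hypothetical name, and it assumes the kernel log format shown above. The full rule used for the trace is not given; one plausible form, as an assumption, is `iptables -t raw -A OUTPUT -p udp --sport 67 --dport 68 -j TRACE`.

```shell
#!/bin/sh
# Print "table:chain OUT=<iface>" for each TRACE line on stdin so that a
# change of output interface between hooks stands out.
trace_out_interfaces() {
    awk '
        /TRACE: / {
            hook = ""; out = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "TRACE:") {
                    # The next field looks like table:chain:rule:N; keep table:chain.
                    split($(i + 1), p, ":")
                    hook = p[1] ":" p[2]
                }
                if ($i ~ /^OUT=/) out = $i
            }
            print hook, out
        }
    '
}
```

On the router this could be fed from the system log, for example `logread | trace_out_interfaces`. For the two lines above it reports OUT=br-lan at the filter hook and OUT=lo at mangle POSTROUTING.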
@joaochainho Could you please close this issue? The option local_source is no longer supported in OpenWrt master, and the problem was fixed, so it should now work out of the box.
Is it possible to re-open this issue?
I am able to reproduce the exact scenario described in https://github.com/openwrt/packages/issues/4802#issuecomment-403618872 on a factory-refreshed OpenWrt 18.06.4 with nothing but mwan3 installed.
Explicitly:
I set up a factory-image OpenWrt 18.06.4 and connect a Logitech Squeezebox, a Silicondust HDHomerun and a couple of other clients to it. They get their IP addresses.
I then run
opkg update && opkg install mwan3
and reboot.
The Squeezebox and the HDHomerun can no longer get IP addresses, but other clients still can.
When I examine the tcpdumps I can see that Squeezebox and HDHomerun are being sent OFFER via broadcast, whereas other devices are being sent OFFER by unicast. Something in the mwan3 iptables is dropping the broadcast OFFER.
My journey to reach this point is documented in the openwrt forum https://forum.openwrt.org/t/certain-devices-ignore-dhcpoffer/44828/24
You have to set local_source to the value none or delete it from the mwan3 config section. I expect it should then work again. As discussed, the option local_source is no longer supported in the upcoming openwrt-19.07 and master branch.
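For anyone making this change from the command line, the setting can also be written through UCI. The following is a sketch under one assumption: that the section in /etc/config/mwan3 is declared as config globals 'globals'. The wrapper name set_local_source_none is made up for illustration.

```shell
#!/bin/sh
# Set mwan3's local_source to 'none' via UCI.
# Assumes /etc/config/mwan3 declares the section as: config globals 'globals'
# UCI is overridable (e.g. UCI=echo) so the sequence can be dry-run off-router.
set_local_source_none() {
    ${UCI:-uci} set mwan3.globals.local_source='none' &&
    ${UCI:-uci} commit mwan3
}
```

Running it with UCI=echo prints the two uci invocations without touching the configuration. After the real change, a reboot should make sure the new value is picked up.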
Apologies, I didn't understand that part when I read it previously.
It now makes sense; I have made the update in the config globals 'globals' section of /etc/config/mwan3
Thank you very much for your very swift response to my question.
Is there any side effect to this fix on the way mwan3 works?
From my experience, this breaks option local_source 'lan', which means that mwan3 rules won't apply to the traffic originated from the router. If you don't need that, please try setting option local_source 'none'.
Many thanks. Setting local_source to none works for me. All devices can connect without problems now. For those who use the LuCI web interface, it's very easy to change the local_source setting with the luci-app-mwan3 package at http://[your router IP]/cgi-bin/luci/admin/network/mwan/globals
FYI, I believe I had the same issue with my setup and I've been searching for AGES for a fix. DHCP addresses would randomly not be assigned, and it would feel like the router was locked up or frozen despite having 30%+ memory available, etc. It would also show our primary WAN connection as constantly down, and even after rebooting the modem the primary WAN would show as up for only a few minutes before going down again.
Even after upgrading from 18.04 to 19.07 I was having the same issue, but I added the option local_source 'none' to my config manually (couldn't find it in the mwan3 settings through LuCI) and then rebooted the router. Now my interfaces are staying up and I'm not seeing disconnections like I was before.
Running mwan3 2.8.8-1 and OpenWRT 19.07.3 r11063-85e04e9f46
Package: mwan3 Rev: commit faa86fe.
Some devices can't get an IP address via DHCP if the routable loopback (self) is enabled.
DHCP negotiation succeeds after removing the IP address from 'lo'.
Adding the IP address back brings the issue back.
MWAN3 config.
Am I doing something wrong? TIA.