Closed nicefile closed 7 months ago
r22967-f18cb0ba63 on the freshly supported WR3000 still doesn't register some of the connections.
Running /etc/init.d/bridger restart fixes this for current traffic.
To reproduce this issue, just run an iperf3 test between a wired and a wireless host.
I see none of the CPU hogging that plagued the previous bridger build.
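A minimal sketch of that reproduction, assuming iperf3 is installed on both hosts. The function name, the DRY_RUN switch, the address 192.168.1.10 and the 60-second default are illustrative placeholders, not values from this thread.

```shell
#!/bin/sh
# On the wired host, start the server first:
#   iperf3 -s

# On the wireless host, run the client against the wired host's address.
# Hypothetical helper: $1 = server address, $2 = duration in seconds (default 60).
# With DRY_RUN=1 it only prints the command instead of running it.
wifi_to_wired_test() {
    server_ip="$1"
    duration="${2:-60}"
    set -- iperf3 -c "$server_ip" -t "$duration"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `wifi_to_wired_test 192.168.1.10 120` would push traffic for two minutes, which leaves time to check whether new flows still appear on the router.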
I can confirm I am seeing the same as well. I posted details here: https://forum.openwrt.org/t/mt76-wireless-driver-debugging/154514/177?u=_failsafe
Running on RT3200 build r24615-25e215c14e. However, I am no longer seeing WED crashing as noted here: https://github.com/openwrt/mt76/issues/754#issue-1601750830
This seems to solve the issue for me, but I'm not sure why it does:
diff --git a/flow.c b/flow.c
index 61564c0..c7a599a 100644
--- a/flow.c
+++ b/flow.c
@@ -160,7 +160,6 @@ bridger_flow_update_cb(struct uloop_timeout *timeout)
 	avl_for_each_element_safe(&flows, flow, node, tmp) {
 		avl_delete(&sorted_flows, &flow->sort_node);
 		bridger_bpf_flow_update(flow);
-		bridger_nl_flow_offload_update(flow);
 		avl_insert(&sorted_flows, &flow->sort_node);
 		flow_debug_msg(flow, "Update");
I do not know what's wrong with https://github.com/nbd168/bridger/blob/3159bbe0a2ebcea9f209bbca88dcd5ac86f7a7f1/nl.c#L734-L739 but I don't think it's an issue in handle_filter(). I made handle_filter() a noop and there was no change; only not sending that RTM_GETTFILTER command to cmd_sock (by not calling bridger_nl_flow_offload_update) fixed it.
Of course, I'm sharing this only in the hope that it helps find the source of the issue, not for you to use the patch, though it does seem to work fine.
Very interesting find! I rebuilt the firmware for my three RT3200s including the change you made in flow.c and sure enough, I'm still seeing flows in /sys/kernel/debug/ppe0/bind even after about 20 minutes of uptime. That's the longest I've ever seen it keep working.
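One way to track this over time is to log how many entries the bind file holds. This is only a sketch: the function name is made up here, and it assumes each non-empty line of /sys/kernel/debug/ppe0/bind corresponds to one bound flow, which may not match the real file format (headers, for instance, would be counted too).

```shell
#!/bin/sh
# Hypothetical helper: count non-empty lines in the PPE bind table.
# $1 = path to the bind file (defaults to the path used in this thread).
count_bound_flows() {
    bind_file="${1:-/sys/kernel/debug/ppe0/bind}"
    if [ -r "$bind_file" ]; then
        # grep -c exits non-zero when the count is 0, so ignore its status.
        grep -c . "$bind_file" || true
    else
        # File missing or unreadable: report no flows.
        echo 0
    fi
}

# Log the count once a minute with a timestamp (stop with Ctrl-C):
#   while :; do echo "$(date '+%F %T') $(count_bound_flows)"; sleep 60; done
```

A count that drops to 0 and stays there while traffic is still running would match the symptom described in this issue.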
Update: This is wild! It is still working nearly 12 hours later!
I know @nbd168 has to be pulled in a million other directions, but hopefully he can give this a look and get some updates into bridger. 😃
It's weird that RTM_GETTFILTER causes this issue because, as far as I know, it shouldn't cause any changes. Even with bridger_nl_flow_offload_update commented out as above, I can trigger the issue again by running:
while :; do tc -s filter show dev eth0 ingress >/dev/null; sleep 1; done
I think it could be a kernel bug, but I'm not sure.
I'm still seeing this issue with "OpenWrt SNAPSHOT r25136-6497cdba09" and "bridger 2023-05-12-d0f79a16". Is there any update, or is it still better to keep WED disabled on a dumb AP?
The WED crash issue seems to be fixed. See details toward the end of: https://github.com/openwrt/mt76/issues/754#issue-1601750830
I'm using @rany2's patch from here and it has kept the WED offloading working for me.
Thanks. So there is no pre-packaged build of it, and I need to build it myself?
Correct, at this point you'd have to build and patch yourself.
I'm testing my BPI-R3 as a dumb AP with the latest snapshot. I have also tested kernel 6.6 and bridger with the same result. I see that bridger does not get new flows after a minute or so...
However, I've now been testing bridger with the line from https://github.com/nbd168/bridger/issues/3#issuecomment-1865342049 removed, and I can see new flows again:
- bridger_nl_flow_offload_update(flow);
Thanks @rany2 👍
@rany2 I've taken the liberty of creating a PR with your proposed workaround. Maybe this will catch @nbd168's attention.
How do we apply this patch? Sorry, I am relatively new to OpenWrt. Thanks for your help!
@gssjshark Let's assume you're in the OpenWrt folder:
mkdir -p package/network/services/bridger/patches
wget -O package/network/services/bridger/patches/10-fix-issue-3.patch "https://github.com/nbd168/bridger/pull/5/commits/c73bf1f80999db1fe5dbf5c082a9e77862b35d58.patch"
Then build your package or the whole firmware.
Alternatively, you can install the package for OpenWrt 23.05.3 from https://github.com/nbd168/bridger/issues/3#issuecomment-2016912962
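The steps above could be wrapped in a small script, sketched here under a few assumptions: the function names are hypothetical, the OpenWrt tree location is passed as an argument, and the rebuild commands in the closing comment are the usual OpenWrt buildroot per-package targets rather than something confirmed in this thread.

```shell
#!/bin/sh
# Patch URL and file name are taken from the comment above.
PATCH_URL="https://github.com/nbd168/bridger/pull/5/commits/c73bf1f80999db1fe5dbf5c082a9e77862b35d58.patch"

# Hypothetical helper: print where the patch belongs inside an OpenWrt tree.
# $1 = OpenWrt root directory (defaults to the current directory).
bridger_patch_path() {
    openwrt_root="${1:-.}"
    printf '%s/package/network/services/bridger/patches/10-fix-issue-3.patch\n' "$openwrt_root"
}

# Hypothetical helper: create the patches directory and download the patch.
fetch_bridger_patch() {
    dest=$(bridger_patch_path "$1")
    mkdir -p "$(dirname "$dest")"   # -p: no error if the directory already exists
    wget -O "$dest" "$PATCH_URL"
}

# Afterwards, from the OpenWrt root, rebuild just the package (assumed targets):
#   make package/bridger/clean
#   make package/bridger/compile V=s
```

Calling `fetch_bridger_patch ~/openwrt` would place the patch so that the build system applies it the next time the bridger package is compiled.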
@nbd168 Hey Felix, do you have any feedback around the findings from @rany2 in post https://github.com/nbd168/bridger/issues/3#issuecomment-1865342049?
Please try the latest version
I'll test it out tomorrow, thanks as always for your efforts. Hopefully you could find a tester that can respond earlier.
@nbd168 Updated my build to run with https://github.com/nbd168/bridger/commit/c77a7a1ff74d9d4065270239240366c1e6bd9986. So far, so good.
After 50 minutes of uptime, I am still seeing flows when watching /sys/kernel/debug/ppe0/bind. I typically would have seen the flows "disappear" within a handful of minutes (often less than 5 mins). I'll give another update after I let this cook overnight and see how things look.
Thank you, @nbd168!
@nbd168 Still seeing flows 12+ hours later. Commit https://github.com/nbd168/bridger/commit/c77a7a1ff74d9d4065270239240366c1e6bd9986 seems golden, IMHO. Many thanks!
I think this issue could be closed; it seems solved for me.
thanks for testing!
Some time after starting, bridger doesn't register new connections and the file /sys/kernel/debug/ppe0/bind stays empty for new/current traffic, but after a while it works again.
Running /etc/init.d/bridger restart fixes this for current traffic instantly.
Link to the forum thread where others confirm this: https://forum.openwrt.org/t/mt76-wireless-driver-debugging/154514/147?u=nicefile
Build from 21-04-2023 on a Cudy WR3000 (MT7981).