berlin-open-wireless-lab / DAWN

Decentralized WiFi Controller
GNU General Public License v2.0

With 0721 release Network Overview showing only own Hostname #190

Closed edrikk closed 2 years ago

edrikk commented 2 years ago

I just installed snapshot version 20220721 of DAWN on Master Head.

I noticed that after this upgrade, the Network Overview page on each AP (4 in total) shows only the clients connected to that specific AP. Clients from other APs (on the same SSID) are no longer shown.

I first noticed this after upgrading the 1st AP, then watched the others as I upgraded them: after each upgrade, the page that had previously shown clients connected to each of the other APs showed only that AP's own clients.

Usually when I see this, restarting umdns and then DAWN 'resolves' it. This time, however, it appears to be due to the recent changes.

I do not believe the July 18th commits caused it, as that was the version I upgraded from. Could it be that ap_array_unlink_entry always returns NULL?

PolynomialDivision commented 2 years ago

In my network everything is running smoothly. Did you add the correct timeout value to your config?

config times
    option con_timeout '60'

edrikk commented 2 years ago

Yes I had noted that from the commit message and added:

config times
    option con_timeout '60'
    option update_client '10'

In case it makes any difference, my setup:

All devices are exhibiting the same behavior still this morning.

XiaoliChan commented 2 years ago

Same issue.

Devices:

Redmi-AX6

Error logs:

Fri Jul 22 19:36:02 2022 daemon.err dawn: client_to_server_state()=tcpsocket.c@106 eof!, pending: 0, total: 33816578
Fri Jul 22 19:36:02 2022 daemon.warn dawn: Connection to server closed
Fri Jul 22 19:36:03 2022 daemon.err dawn: client_to_server_state()=tcpsocket.c@106 eof!, pending: 0, total: 33816578
Fri Jul 22 19:36:03 2022 daemon.warn dawn: Connection to server closed
Fri Jul 22 19:36:07 2022 daemon.err dawn: client_notify_state()=tcpsocket.c@76 eof!, pending: 0, total: 0
Fri Jul 22 19:36:07 2022 daemon.warn dawn: Connection closed
Fri Jul 22 19:36:16 2022 daemon.err dawn: client_to_server_state()=tcpsocket.c@106 eof!, pending: 0, total: 33816578
Fri Jul 22 19:36:16 2022 daemon.warn dawn: Connection to server closed
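
For comparing APs, it can help to tally the eof! entries by the function that reported them. The following is a minimal Python sketch, assuming the log format shown above; the regular expression is inferred from this excerpt rather than from DAWN's source, and the name count_eof_events is purely illustrative.

```python
import re
from collections import Counter

# Matches entries such as:
#   dawn: client_to_server_state()=tcpsocket.c@106 eof!, pending: 0, total: 33816578
# The pattern is inferred from the log excerpt in this thread (an assumption),
# not taken from DAWN's source code.
EOF_RE = re.compile(
    r"dawn: (?P<func>\w+)\(\)=(?P<file>[\w.]+)@(?P<line>\d+) eof!, "
    r"pending: (?P<pending>\d+), total: (?P<total>\d+)"
)


def count_eof_events(log_text):
    """Count eof! entries per reporting function in a logread dump."""
    return Counter(m.group("func") for m in EOF_RE.finditer(log_text))


# Sample copied from the error log posted above.
SAMPLE = """\
Fri Jul 22 19:36:02 2022 daemon.err dawn: client_to_server_state()=tcpsocket.c@106 eof!, pending: 0, total: 33816578
Fri Jul 22 19:36:02 2022 daemon.warn dawn: Connection to server closed
Fri Jul 22 19:36:03 2022 daemon.err dawn: client_to_server_state()=tcpsocket.c@106 eof!, pending: 0, total: 33816578
Fri Jul 22 19:36:03 2022 daemon.warn dawn: Connection to server closed
Fri Jul 22 19:36:07 2022 daemon.err dawn: client_notify_state()=tcpsocket.c@76 eof!, pending: 0, total: 0
Fri Jul 22 19:36:07 2022 daemon.warn dawn: Connection closed
Fri Jul 22 19:36:16 2022 daemon.err dawn: client_to_server_state()=tcpsocket.c@106 eof!, pending: 0, total: 33816578
Fri Jul 22 19:36:16 2022 daemon.warn dawn: Connection to server closed
"""
```

Calling count_eof_events(SAMPLE) on this excerpt yields three entries for client_to_server_state and one for client_notify_state. Because the function scans the whole text with finditer, it works whether or not the entries are separated by newlines.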

Proof:

AP-1 image

AP-2 image

PolynomialDivision commented 2 years ago

Can you maybe go through the latest 3 commits and check which commit breaks it?

PolynomialDivision commented 2 years ago

I guess it is because of this: https://github.com/berlin-open-wireless-lab/DAWN/blob/bb362db2facd8ce7a39c430a353b6413ec24d70d/src/network/tcpsocket.c#L108-L109

I'm not sure anymore why, but we close the connection if no more data is pending. I would say we just remove that.
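
The distinction at stake here, "nothing pending right now" versus "the peer really closed the stream", can be illustrated outside of DAWN. The sketch below is generic Python and not DAWN's code (tcpsocket.c is C); poll_once and demo are hypothetical names chosen for the illustration. It shows that a quiet connection and an EOF are different events, and that only the latter has to end the connection.

```python
import select
import socket


def poll_once(sock, bufsize=4096):
    """Do one non-blocking read and classify the result.

    Returns ("data", bytes), ("idle", b"") or ("eof", b"").
    """
    try:
        chunk = sock.recv(bufsize)
    except BlockingIOError:
        # Nothing is pending right now; the peer is merely quiet.
        # A long-lived connection should be kept open in this case.
        return ("idle", b"")
    if not chunk:
        # recv() returning empty bytes is the actual end-of-stream signal.
        return ("eof", b"")
    return ("data", chunk)


def demo():
    """Walk one connection through idle -> data -> idle -> eof."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    states = [poll_once(reader)[0]]        # nothing sent yet
    writer.sendall(b"hello")
    select.select([reader], [], [], 1.0)   # wait until the bytes are readable
    states.append(poll_once(reader)[0])    # the payload arrives
    states.append(poll_once(reader)[0])    # drained, but still connected
    writer.close()
    select.select([reader], [], [], 1.0)   # wait for the close to be visible
    states.append(poll_once(reader)[0])    # now the stream has ended
    reader.close()
    return states
```

In this sketch, closing the socket on "idle" would tear down a healthy connection after every drained read, which is the kind of behavior the comment above proposes to remove.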

PolynomialDivision commented 2 years ago

Can you check https://github.com/berlin-open-wireless-lab/DAWN/pull/191 ?

ptpt52 commented 2 years ago

Please provide the output of ubus call umdns browse on each AP

XiaoliChan commented 2 years ago

> Please provide the output of ubus call umdns browse on each AP

Here are the results from each AP. AP1-ubus.txt AP2-ubus.txt AP3-ubus.txt

ptpt52 commented 2 years ago

> > Please provide the output of ubus call umdns browse on each AP
>
> Here are the results from each AP. AP1-ubus.txt AP2-ubus.txt AP3-ubus.txt

Looks good in umdns. Try running lsof -ni :1026 on each AP; if lsof is not available, try netstat -tpn | grep :1026

XiaoliChan commented 2 years ago

> Looks good in umdns. Try running lsof -ni :1026 on each AP; if lsof is not available, try netstat -tpn | grep :1026

Note:

AP1 is 192.168.1.101, AP2 is 192.168.1.102, AP3 is 192.168.1.103

AP1

root@AP-LivingRoom-AX:~# netstat -tpn | grep :1026
tcp        0      0 192.168.1.101:1026      192.168.1.103:44736     TIME_WAIT   -
tcp        0      0 192.168.1.101:1026      192.168.1.103:43784     TIME_WAIT   -
tcp        0      0 192.168.1.101:1026      192.168.1.103:47638     ESTABLISHED 3694/dawn
tcp        0      0 192.168.1.101:1026      192.168.1.103:44134     TIME_WAIT   -
tcp        0      0 192.168.1.101:1026      192.168.1.103:55462     TIME_WAIT   -
tcp    68825      0 192.168.1.101:1026      192.168.1.102:51622     ESTABLISHED 3694/dawn
tcp        0      0 192.168.1.101:1026      192.168.1.103:37430     TIME_WAIT   -
tcp        0  69057 192.168.1.101:44408     192.168.1.102:1026      ESTABLISHED 3694/dawn
tcp        0  96459 192.168.1.101:43318     192.168.1.103:1026      ESTABLISHED 3694/dawn
tcp        0      0 192.168.1.101:1026      192.168.1.103:41062     TIME_WAIT   -

AP2

root@AP-Room1-AX:~# netstat -tpn | grep :1026
tcp        0 104571 192.168.1.102:51622     192.168.1.101:1026      ESTABLISHED 4294/dawn
tcp    71694      0 192.168.1.102:1026      192.168.1.103:47700     ESTABLISHED 4294/dawn
tcp        0  74461 192.168.1.102:49324     192.168.1.103:1026      ESTABLISHED 4294/dawn
tcp    72353      0 192.168.1.102:1026      192.168.1.101:44408     ESTABLISHED 4294/dawn

AP3

root@AP-Room2-AX:~# netstat -tpn | grep :1026
tcp        0  79538 192.168.1.103:47700     192.168.1.102:1026      ESTABLISHED 3765/dawn
tcp    72182      0 192.168.1.103:1026      192.168.1.101:43318     ESTABLISHED 3765/dawn
tcp    68461      0 192.168.1.103:1026      192.168.1.102:49324     ESTABLISHED 3765/dawn
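
In netstat output the second and third columns are Recv-Q and Send-Q, and several ESTABLISHED connections above carry large values there. A non-zero Recv-Q generally means the local process has not yet read bytes that already arrived; whether that explains this issue is not confirmed in the thread. The Python sketch below only filters such lines out of pasted output. The function name stalled_connections is made up for illustration, and the sample is a subset of the AP1 output above.

```python
def stalled_connections(netstat_text, port=1026):
    """Return ESTABLISHED connections on `port` that have bytes queued.

    Each result is (local, remote, recv_q, send_q).
    """
    suffix = ":%d" % port
    stalled = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, Recv-Q, Send-Q, local, remote, state, pid/program
        if len(fields) < 6 or fields[0] != "tcp" or fields[5] != "ESTABLISHED":
            continue
        recv_q, send_q = int(fields[1]), int(fields[2])
        local, remote = fields[3], fields[4]
        if not (local.endswith(suffix) or remote.endswith(suffix)):
            continue
        if recv_q or send_q:
            stalled.append((local, remote, recv_q, send_q))
    return stalled


# Subset of the AP1 output posted above.
AP1_SAMPLE = """\
tcp        0      0 192.168.1.101:1026      192.168.1.103:44736     TIME_WAIT   -
tcp        0      0 192.168.1.101:1026      192.168.1.103:47638     ESTABLISHED 3694/dawn
tcp    68825      0 192.168.1.101:1026      192.168.1.102:51622     ESTABLISHED 3694/dawn
tcp        0  69057 192.168.1.101:44408     192.168.1.102:1026      ESTABLISHED 3694/dawn
"""
```

For AP1_SAMPLE this keeps the two connections to 192.168.1.102 and drops both the TIME_WAIT entry and the ESTABLISHED connection whose queues are empty.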

ptpt52 commented 2 years ago

@XiaoliChan and what about logread -e dawn

ptpt52 commented 2 years ago

Looks like you are using different versions of dawn?

XiaoliChan commented 2 years ago

> @XiaoliChan and what about logread -e dawn

Lots of errors & warning messages. image

ptpt52 commented 2 years ago

Likely one of your APs is using an old version of dawn.

XiaoliChan commented 2 years ago

> Looks like you are using different versions of dawn?

Nope, I used the newest commit of the main upstream. image

XiaoliChan commented 2 years ago

> Likely one of your APs is using an old version of dawn.

dawn is up to date on every AP.

ptpt52 commented 2 years ago

> > Likely one of your APs is using an old version of dawn.
>
> dawn is up to date on every AP.

Or maybe one of your dawn instances keeps restarting?

ptpt52 commented 2 years ago

Watch the PID of dawn on each AP:

pidof dawn
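
A changing PID between samples indicates that the daemon was restarted. As a rough sketch of how repeated pidof samples could be compared, here is a small Python helper; it assumes Python 3 is available where it runs, which is not a given on OpenWrt, and the names read_pids and count_restarts are hypothetical.

```python
import subprocess


def read_pids(name="dawn"):
    """Return the PIDs reported by `pidof name` as a sorted tuple.

    An empty tuple means no matching process was found.
    """
    result = subprocess.run(["pidof", name], capture_output=True, text=True)
    return tuple(sorted(int(pid) for pid in result.stdout.split()))


def count_restarts(samples):
    """Count how often two consecutive PID samples differ."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)
```

Collecting read_pids() every few seconds into a list and passing that list to count_restarts gives the number of observed changes; a result of 0 over several minutes suggests the process is not restarting.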

XiaoliChan commented 2 years ago

> Or maybe one of your dawn instances keeps restarting?

Not sure what happened.

One thing I'm sure of is that version "e596ff131735821684f7ecea73d7634733319f94" works fine.

ptpt52 commented 2 years ago

Is your device an mt7621?

XiaoliChan commented 2 years ago

> Watch the PID of dawn on each AP

image

XiaoliChan commented 2 years ago

> Is your device an mt7621?

No, they are all Redmi AX6 devices running robimarko's firmware.

Link: robimarko 5.15

ptpt52 commented 2 years ago

Try capturing a pcap on TCP port 1026, like:

tcpdump -i br-lan -nvv tcp port 1026

ptpt52 commented 2 years ago

Enable dawn logging to see more info:

config local
        option loglevel '1'

XiaoliChan commented 2 years ago

> Try capturing a pcap on TCP port 1026, like:

pcap.zip

XiaoliChan commented 2 years ago

> Enable dawn logging to see more info:

Here are the logs from each AP at log level 1.

dawn-moreinfo-ap1.txt dawn-moreinfo-ap2.txt dawn-moreinfo-ap3.txt

ptpt52 commented 2 years ago

try https://github.com/berlin-open-wireless-lab/DAWN/pull/192

XiaoliChan commented 2 years ago

> try #192

Issue solved by this commit. image

PolynomialDivision commented 2 years ago

Thanks for testing.

edrikk commented 2 years ago

Confirmed as well. Closing issue. Thank you all!