berlin-open-wireless-lab / DAWN

Decentralized WiFi Controller
GNU General Public License v2.0

logs are flooded with Neigbor-Report is null! #108

Closed morhimi closed 3 years ago

morhimi commented 4 years ago

Trying DAWN for the first time. I am testing it on two Xiaomi Redmi AC2100 routers running the latest master snapshot as of last week.

logread is flooded with this message:

daemon.err dawn[1612]: Neigbor-Report is null!

I tried changing some settings, but that didn't help.

Here is my config:

root@OpenWrt_RM2100_MBR:~# cat /etc/config/wireless

config wifi-device 'radio0'
    option type 'mac80211'
    option hwmode '11g'
    option path '1e140000.pcie/pci0000:00/0000:00:01.0/0000:02:00.0'
    option htmode 'HT20'
    option channel 'auto'
    option country 'IL'
    option legacy_rates '0'

config wifi-iface 'default_radio0'
    option device 'radio0'
    option network 'lan'
    option mode 'ap'
    option ft_over_ds '1'
    option ssid '****'
    option encryption 'psk2'
    option ft_psk_generate_local '1'
    option key '****'
    option ieee80211r '1'
    option nasid '88C3973DE85A'
    option ieee80211k '1'
    option bss_transition '1'
    option time_advertisement '2'
    option time_zone 'IST-2IDT,M3.4.4/26,M10.5.0'
    option ieee80211v '0'
    option wnm_sleep_mode '0'

config wifi-device 'radio1'
    option type 'mac80211'
    option hwmode '11a'
    option path '1e140000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0'
    option htmode 'VHT80'
    option country 'IL'
    option legacy_rates '0'
    option channel 'auto'

config wifi-iface 'default_radio1'
    option device 'radio1'
    option network 'lan'
    option mode 'ap'
    option ft_over_ds '1'
    option ssid '****'
    option encryption 'psk2'
    option ft_psk_generate_local '1'
    option key '****'
    option ieee80211r '1'
    option nasid '88C3973DE85B'
    option ieee80211k '1'
    option bss_transition '1'
    option time_advertisement '2'
    option time_zone 'IST-2IDT,M3.4.4/26,M10.5.0'
    option ieee80211v '0'
    option wnm_sleep_mode '0'

root@OpenWrt_RM2100_MBR:~# cat /etc/config/dawn

config network
    option broadcast_ip '10.0.0.255'
    option broadcast_port '1025'
    option tcp_port '1026'
    option network_option '2'
    option shared_key 'Niiiiiiiiiiiiiik'
    option iv 'Niiiiiiiiiiiiiik'
    option use_symm_enc '1'
    option collision_domain '-1'
    option bandwidth '-1'

config ordering
    option sort_order 'cbfs'

config hostapd
    option hostapd_dir '/var/run/hostapd'

config times
    option denied_req_threshold '30'
    option remove_client '15'
    option remove_probe '30'
    option remove_ap '460'
    option update_hostapd '10'
    option update_chan_util '5'
    option update_beacon_reports '120'
    option update_tcp_con '60'
    option update_client '60'

config metric
    option ap_weight '0'
    option ht_support '0'
    option vht_support '0'
    option no_ht_support '0'
    option no_vht_support '0'
    option rssi '10'
    option low_rssi '-500'
    option freq '100'
    option chan_util '0'
    option max_chan_util '-500'
    option rssi_val '-60'
    option low_rssi_val '-80'
    option chan_util_val '140'
    option max_chan_util_val '170'
    option min_probe_count '0'
    option bandwidth_threshold '6'
    option use_station_count '1'
    option max_station_diff '1'
    option deny_auth_reason '1'
    option deny_assoc_reason '17'
    option use_driver_recog '1'
    option min_number_to_kick '3'
    option chan_util_avg_period '3'
    option set_hostapd_nr '1'
    option op_class '0'
    option duration '0'
    option mode '0'
    option scan_channel '0'
    option eval_probe_req '1'
    option eval_auth_req '1'
    option evalcd_assoc_req '1'
    option kicking '1'
    option eval_assoc_req '1'

Ian-Clowes commented 4 years ago

Try setting eval_probe_req to zero or min_probe_count to non-zero (I think the default is 2). I'm not sure that evaluating against a count of zero makes sense. That should be described or prevented, or handled more gracefully in the code if it is valid.

PolynomialDivision commented 4 years ago

@morhimi Please set update_beacon_reports to 0, to disable 802.11k frames. Or set that to a much higher value.

@Ian-Clowes eval_probe_req is a bool. If it is set, the APs check whether they are the best AP for the client before sending back a probe response. min_probe_count is the number of probe requests that must be received before a probe response is sent back. That should force the client to scan the whole spectrum and allow a hearing map to be created. There are a lot of problems with this, such as MAC randomization in probe scanning.
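
To make the interaction between the two options concrete, here is a minimal sketch in Python (not DAWN's C code). The class `ProbeGate`, the method `should_respond` and the `this_ap_is_best` flag are hypothetical names used only for illustration, and the assumption that eval_probe_req '0' skips the counting entirely is a guess, not something confirmed in this thread.

```python
from collections import defaultdict


class ProbeGate:
    """Toy model of the min_probe_count / eval_probe_req behaviour described above."""

    def __init__(self, min_probe_count: int, eval_probe_req: bool):
        self.min_probe_count = min_probe_count
        self.eval_probe_req = eval_probe_req
        # Probe requests counted so far, keyed by the client MAC seen in the probe.
        self.seen = defaultdict(int)

    def should_respond(self, client_mac: str, this_ap_is_best: bool) -> bool:
        # Assumption: with evaluation disabled, every probe is answered.
        if not self.eval_probe_req:
            return True
        self.seen[client_mac] += 1
        # Hold back until enough probes from this MAC have been counted.
        if self.seen[client_mac] < self.min_probe_count:
            return False
        # Enough probes seen: answer only if this AP scores best for the client.
        return this_ap_is_best


gate = ProbeGate(min_probe_count=2, eval_probe_req=True)
first = gate.should_respond("aa:bb:cc:dd:ee:ff", this_ap_is_best=True)   # held back
second = gate.should_respond("aa:bb:cc:dd:ee:ff", this_ap_is_best=True)  # answered
```

In this model a client that uses a fresh random MAC for every probe never reaches the threshold, which is one way to picture the MAC randomization problem mentioned above. With min_probe_count '0' the threshold never holds anything back, so the comparison runs on the very first probe.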

morhimi commented 4 years ago

@PolynomialDivision how much is a "much higher value"? I tried setting it to 300, but I still get 1-2 log lines every second. This is with:

option update_beacon_reports '300'
option min_probe_count '1'
option eval_probe_req '1'

morhimi commented 4 years ago

I tried setting option update_beacon_reports '0' and restarted dawn of course, but the log is still flooded with that message (btw, the spelling should probably be changed to Neighbor).

Thanks for your assistance.

PolynomialDivision commented 4 years ago

@morhimi Are you running the latest dawn? Which OpenWrt version? That should not happen. Did you change that on all APs? I will test again. That null-mac thing happens if a client just reports an empty beacon report back with mac "00:00:00:00" ...

morhimi commented 4 years ago

@PolynomialDivision Every change is made on both APs running DAWN, and then I restart dawn using /etc/init.d/dawn restart. I do want to mention that I have two additional APs that serve the same SSID, running 19.07.3 without DAWN. To summarise: 4 APs, 2 running the snapshot with DAWN and 2 running 19.07.3. All 4 are dual-band with the same SSID for both the 2.4 GHz and 5 GHz bands.

OpenWrt version: OpenWrt SNAPSHOT, r13719-66e04abbb6

root@OpenWrt_RM2100_Kids:~# opkg list-installed | grep dawn
dawn - 2020-06-12-ada3bf3f-1
luci-app-dawn - git-20.188.40628-3a5ee5c

Ian-Clowes commented 4 years ago

The test that prints the message is for a NULL pointer, not a valid pointer to a 00:00:00:00:00:00 MAC. decide_function() explicitly calls better_ap_available() with a NULL char *neighbor parameter, while the other path sets it to a valid pointer to a char array. decide_function() looks at min_probe_count and eval_probe_req before reaching that call, so if they are set "appropriately" they should prevent the path that gives the NULL message.
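
The distinction between "no pointer at all" and "a pointer to an all-zero MAC" can be sketched as follows. This is a Python analogy, not DAWN's C code: `None` stands in for the NULL pointer, and `classify_neighbor_report` and `ZERO_MAC` are hypothetical names introduced for illustration.

```python
from typing import Optional

ZERO_MAC = "00:00:00:00:00:00"


def classify_neighbor_report(report: Optional[str]) -> str:
    """Separate a missing buffer from an all-zero MAC and from a usable value."""
    if report is None:
        # Analogue of the NULL char * case: the caller passed no buffer at all.
        return "null"
    if report == ZERO_MAC:
        # A buffer exists, but it only holds the all-zero MAC.
        return "zero-mac"
    return "valid"


results = [
    classify_neighbor_report(None),
    classify_neighbor_report(ZERO_MAC),
    classify_neighbor_report("aa:bb:cc:dd:ee:ff"),
]
```

Only the first case corresponds to the check described above. An empty beacon report from a client would fall into the second case in this model, so it would not by itself explain the NULL message.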

morhimi commented 4 years ago

It seems like my version of DAWN is from June 12 (I assume so based on the pkg name dawn - 2020-06-12-ada3bf3f-1). There are newer commits from June 18 (and of course additional ones made yesterday).

How come the online pkg (in the opkg repo) didn't include the June 18 deliveries? Could it be that the issue I'm seeing has been fixed already? What is the recommended way to run the latest DAWN (without re-flashing the entire OpenWrt image)?

morhimi commented 4 years ago

Still seeing many of these logs, on all of my APs:

root@OWRT_RM2100_Entry:~# logread -f
Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
Sat Aug 15 15:13:02 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
Sat Aug 15 15:13:02 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!
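
To compare the message rate before and after a config change, the logread output can be summarised per minute. The following is a small Python sketch to run on a PC against saved logread output with one entry per line. The function name `count_per_minute` and the regular expression are illustrative assumptions based on the line format shown in this thread, not part of DAWN or OpenWrt.

```python
import re
from collections import Counter

# Matches e.g. "Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!"
# and captures "Aug 15 15:12" (month, day, hour, minute) as the bucket key.
LINE_RE = re.compile(
    r"^\w{3} (\w{3} +\d+ \d{2}:\d{2}):\d{2} \d{4} daemon\.err dawn\[\d+\]: Neigbor-Report is"
)


def count_per_minute(lines):
    """Count 'Neigbor-Report is ...' messages per minute; other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


# Sample built from the timestamps in the paste above.
sample = (
    ["Sat Aug 15 15:12:54 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!"] * 8
    + ["Sat Aug 15 15:13:02 2020 daemon.err dawn[3730]: Neigbor-Report is NULL!"] * 2
)
summary = count_per_minute(sample)
# summary == {"Aug 15 15:12": 8, "Aug 15 15:13": 2}
```

The pattern keeps the log's own spelling ("Neigbor") on purpose, so it would need adjusting if the message text is ever corrected.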

PolynomialDivision commented 4 years ago

> How come the online pkg (in the opkg repo) didn't include the June 18 deliveries?

The buildbot is very busy and needs a lot of time to get to a recent version.

> Could it be that the issue I'm seeing has been fixed already?

We fixed a lot of memory allocation bugs. I strongly suggest updating to the newest version.

> What is the recommended way to run the latest DAWN (without re-flashing the entire openwrt image)

You can compile the ipkg package yourself, copy it to the device, and install it with opkg (first uninstall the existing package with opkg remove dawn).

PolynomialDivision commented 4 years ago

> I tried setting option update_beacon_reports '0' and restarted dawn of course, but that log is still there in masses (btw probably should change it to Neighbor)

Did you change the value on all APs that are running dawn? This should switch off 802.11k...

eems01 commented 4 years ago

Adding an observation from my 3 x R7800 setup on the latest master. I was also seeing a lot of Neigbor-Report is NULL! errors in the system log. While troubleshooting, I noticed that if I reboot all 3 routers and do not visit the luci-app-dawn pages in LuCI, I do not get these errors. I'm not a coder, but maybe this observation can help track down the cause in the current builds.

morhimi commented 4 years ago

> > I tried setting option update_beacon_reports '0' and restarted dawn of course, but that log is still there in masses (btw probably should change it to Neighbor)

> Did u changed the value on all APs u are running dawn? Since this should switch off 802.11k...

I upgraded all APs (and added 2 more, so I am now running 4 APs with DAWN) to the August 13 snapshot, with the DAWN version from August 7. I started from the default DAWN configuration and then changed some values. Here is my latest config (same for all 4 APs):

root@OpenWrt_RM2100_Kids:~# cat /etc/config/dawn

config network
    option broadcast_ip '10.0.0.255'
    option broadcast_port '1025'
    option tcp_port '1026'
    option network_option '2'
    option shared_key 'Niiiiiiiiiiiiiik'
    option iv 'Niiiiiiiiiiiiiik'
    option use_symm_enc '1'
    option collision_domain '-1'
    option bandwidth '-1'

config ordering
    option sort_order 'cbfs'

config hostapd
    option hostapd_dir '/var/run/hostapd'

config times
    option update_client '10'
    option denied_req_threshold '30'
    option remove_client '15'
    option remove_probe '30'
    option remove_ap '460'
    option update_hostapd '10'
    option update_tcp_con '10'
    option update_chan_util '5'
    option update_beacon_reports '20'

config metric
    option ap_weight '0'
    option ht_support '0'
    option vht_support '0'
    option no_ht_support '0'
    option no_vht_support '0'
    option rssi '10'
    option low_rssi '-500'
    option chan_util '0'
    option max_chan_util '-500'
    option chan_util_val '140'
    option max_chan_util_val '170'
    option min_probe_count '0'
    option bandwidth_threshold '6'
    option use_station_count '1'
    option max_station_diff '1'
    option deny_auth_reason '1'
    option deny_assoc_reason '17'
    option use_driver_recog '1'
    option chan_util_avg_period '3'
    option set_hostapd_nr '1'
    option op_class '0'
    option duration '0'
    option mode '0'
    option scan_channel '0'
    option kicking '1'
    option eval_probe_req '1'
    option rssi_val '-58'
    option low_rssi_val '-68'
    option freq '200'
    option min_number_to_kick '1'
    option eval_auth_req '1'
    option eval_assoc_req '1'
    option evalcd_assoc_req '1'

morhimi commented 4 years ago

@PolynomialDivision if you have something like a debug version of DAWN, I can run it on my system and report the results.

PolynomialDivision commented 4 years ago

@morhimi The new version has some fixes you need. The log message will continue to appear. It would be useful if you could take a screenshot of the hearing map, so I can see how clients behave, e.g. whether they report RCPI and RSNI values.

Further, try changing option update_beacon_reports '20' to option update_beacon_reports '120'.

morhimi commented 4 years ago

Thanks. I'm still running the August 21 version of DAWN; the version from the 25th isn't built yet for my platform (https://downloads.openwrt.org/snapshots/packages/mipsel_24kc/packages/), so I'll probably upgrade all APs tomorrow and report. For now I'll make the changes you suggested.

PolynomialDivision commented 4 years ago

> probably tomorrow I'll upgrade all AP and report.

The buildbot needs a lot of time to build all packages, so I'm not sure the latest version will be there by tomorrow. But you could compile it yourself, or if you have mips24kc you can download the ipkg from the GitHub workflow.

morhimi commented 4 years ago

In order to compile it, do I need the full OpenWrt code environment? Have any of you done it on a Mac? I will be happy to compile it myself and help with this project.

mcaptur commented 4 years ago

@PolynomialDivision So I have the latest build on 4 APs. How can I help you solve this? Please find the captures of the maps attached.

I get daemon.err dawn[2114]: Neigbor-Report is NULL! on all 4 routers: 1 and 2 are Xiaomi Mi R3G, 3 is a D-Link DIR-878, and 4 is a Redmi AC2100.

[Screenshots: Capture1, Capture2, Capture3, Capture4]

PolynomialDivision commented 4 years ago

@mcaptur Which hostapd version?

mcaptur commented 4 years ago

I am using wpad-openssl, which I think contains the full hostapd.

mcaptur commented 4 years ago

> @mcaptur Which hostapd version?

root@OpenWrt-RedmiAC2100:~# hostapd -v
hostapd v2.10-devel
User space daemon for IEEE 802.11 AP management,
IEEE 802.1X/WPA/WPA2/EAP/RADIUS Authenticator
Copyright (c) 2002-2019, Jouni Malinen <j@w1.fi> and contributors

amaumene commented 4 years ago

Hi,

I have the same issue with wpad-wolfssl:

Fri Sep 25 14:23:32 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:23:33 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:23:41 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:23:42 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:23:42 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:24:03 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:24:10 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:24:12 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:24:12 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:24:37 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:24:37 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:25:09 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:25:38 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:25:42 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:26:00 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:26:14 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:26:33 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:26:34 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:26:34 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:26:34 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:26:38 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:26:51 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:27:16 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:27:33 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:27:34 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:27:38 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:28:39 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:28:43 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:28:49 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:29:02 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:29:36 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!
Fri Sep 25 14:29:36 2020 daemon.err dawn[3590]: Neigbor-Report is NULL!

turboproc commented 3 years ago

Running OpenWrt SNAPSHOT r15115-760952ad02 on a Netgear Nighthawk X4S R7800, this issue still seems to exist. What I can't assess yet is how much this affects the proper functioning of DAWN.

The DAWN snapshot in this case is dawn - 2020-09-03-b639145c-1.

Any ideas?

NilsRo commented 3 years ago

Same here, any ideas?

PolynomialDivision commented 3 years ago

Should be fixed.