kaloz / mwlwifi

mac80211 driver for the Marvell 88W8864 802.11ac chip

88W8964 issues with multiple SSID #162

Closed kubrickfr closed 7 years ago

kubrickfr commented 7 years ago

Driver / hardware properties:

driver name: mwlwifi
chip type: 88W8964
hw version: 7
driver version: 10.3.4.0-20170512
firmware version: 0x09020105
power table loaded from dts: no
firmware region code: 0x30
mac address: 60:38:e0:b6:97:ca
2g: disable
5g: enable
antenna: 4 4
irq number: 45
ap macid support: 0000ffff
sta macid support: 00010000
macid used: 00000003
radio: enable
iobase0: e0e00000
iobase1: e1080000
tx limit: 1024
rx limit: 16384

With multiple SSIDs configured on the 5GHz radio under LEDE, only one SSID is usable at a time: an existing connection is dropped as soon as another device connects to a different SSID.

yuhhaurlin commented 7 years ago

I will check it. Thanks.

kb3tbx commented 7 years ago

@kubrickfr, how do you make that list?

kubrickfr commented 7 years ago

@kb3tbx cat /sys/kernel/debug/ieee80211/phy0/mwlwifi/info

@yuhhaurlin I don't know if it's linked to this problem, to power reporting, or to something else, but jitter is absolutely awful. I tried to play OpenRA on it and it was just unusable: the game would pause every second.
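For anyone who wants to compare these dumps between routers, here is a minimal Python sketch that parses the "key: value" lines of the info file into a dict. The function names are made up for illustration. Treating "macid used" as a hex bitmap with one bit per active interface is an assumption on my part, not something confirmed in this thread.

```python
def parse_info(text):
    """Parse 'key: value' lines (as printed by the mwlwifi info file) into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def macids_in_use(info):
    # ASSUMPTION: "macid used" is a hex bitmap, one bit per active interface.
    return bin(int(info["macid used"], 16)).count("1")


# Excerpt of the dump posted at the top of this issue.
SAMPLE = """driver name: mwlwifi
chip type: 88W8964
firmware version: 0x09020105
mac address: 60:38:e0:b6:97:ca
macid used: 00000003
"""

info = parse_info(SAMPLE)
print(info["chip type"], info["firmware version"], macids_in_use(info))
```

On the router itself the text would come from reading /sys/kernel/debug/ieee80211/phy0/mwlwifi/info instead of the SAMPLE string.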

yuhhaurlin commented 7 years ago

So is the problem related to multiple SSIDs or to throughput?

Voltara commented 7 years ago

I'm glad to see this ticket here; I just set up LEDE on a new WRT3200ACM (same chip type), and had lots of trouble when I tried to use multiple SSID on a radio. I ended up working around the problem by using only 1 SSID for each radio (one on the 5GHz radio0, the other on the 2.4GHz radio1.) Since I made that change, it has been running stable.

@kubrickfr I am curious, does this affect your 5GHz radio only, or do you also have problems with 2.4GHz? I had problems with both.

I tried version 10.3.4.0-20170512 and also HEAD (commit 6d1595e9db5dec8e61332f6be431b3bf972d6d08), and neither version reliably handled multiple SSIDs.

My experience in troubleshooting wifi at this layer (beyond what tcpdump can tell me) is very limited, but I'll do what I can to help. I am building my own LEDE images, so I can test driver changes fairly easily.

yuhhaurlin commented 7 years ago

Thanks. I will check it.

kubrickfr commented 7 years ago

@Voltara: I haven't tried 2.4GHz at all; it's just too crowded where I live. I tried 5GHz only.

@yuhhaurlin: throughput is not a problem, jitter is.

ad019 commented 7 years ago

Using DD-WRT v3.0-r32104 std (05/19/17) on WRT3200ACM. I did some testing to set up the guest network (on both 2.4 GHz and 5 GHz) but it does not seem to work. Clients cannot connect even to an open guest network. When security is enabled, I keep getting a message that the password is incorrect.

yuhhaurlin commented 7 years ago

This one will be checked later. Thanks.

ValCher1961 commented 7 years ago

For the record: I created two different protected SSIDs on a WRT3200 (Debian), and everything works.

BrainSlayer commented 7 years ago

@ValCher1961 are the encryption keys on the second network different too? @ad019 can you post the exact setup? especially the encryption type etc. (without showing your keys for sure)

ValCher1961 commented 7 years ago

Yes, I did it with a different key, and I also tried an open guest access point. Clients don't have any problems.

ValCher1961 commented 7 years ago

I have hostapd 2.6, and that may be related. Which version do LEDE and DD-WRT use?

ad019 commented 7 years ago

@BrainSlayer here are the snapshots of the settings - wireless, security and DHCPD.

[Three screenshots, 2017-05-21: wireless, security, and DHCPD settings]

ad019 commented 7 years ago

Here's the error that I get

[Screenshot, 2017-05-21: the error message]

BrainSlayer commented 7 years ago

can you try the same with a bridged vap interface? it could make a difference

ad019 commented 7 years ago

Tried that. No difference.

ad019 commented 7 years ago

It seems that the guest network is accessible only if no other client is connected to the radio. I disconnected all other clients from the 2.4 GHz radio and was then able to connect to the guest network. But as soon as other clients joined, the laptop connected to the guest network lost its connection. This matches what @kubrickfr originally reported.

loomy commented 7 years ago

same here, with 5 and 2.4GHz on openwrt latest master and mwlwifi from HEAD of this repo

wlans are bridged in the lan and guest interfaces

multiple SSIDs worked on the older mwlwifi version that's shipped by default in the openwrt master branch ....at least for some time....until it needs a reboot

Voltara commented 7 years ago

I can confirm that multiple SSID works on commit ccdfdac28f7666474745b1f46f0769f3a2879b5f as @loomy pointed out. It does not work as of at least 7b7611dcb978bdba4465ef0a093a7397136a0053.

I could not test any of the 39 intervening commits because my wifi clients failed to connect at all. I may try testing again later tonight, assuming I can get them working by cherry-picking the beacon fix from 7b7611dcb978bdba4465ef0a093a7397136a0053.

Edit: So far I have been able to further narrow down the range to these commits. To test 87b163f176a4cb83dc709b9c9ab8f793607efd5c, I needed to cherry-pick a7cb7ca05cea6c754621d1fd6fb699adee0c4602, fb05f8805d05d829b1ed1a7f05c4d93866b4a61b, and 7b7611dcb978bdba4465ef0a093a7397136a0053. So I technically still haven't ruled out those specific commits.

# possible first bad commit: [87b163f176a4cb83dc709b9c9ab8f793607efd5c] Fixed problem: restart mwlwifi to let AP work.
# possible first bad commit: [ca699af60b7066fdfee714bf7d535fdbb513b294] Connected rx antenna setting for 88W8964.
# possible first bad commit: [25b90b19a657c240f9f288e97fdf2fb9ede3f9da] Added debugfs "ratetable" to get rate table.
# possible first bad commit: [ce314329672be1e6e8fbf4147d5505631c307d8d] Added draft version for new data path.
# possible first bad commit: [618bbc0d580e0ca428dd02709281520ac2783eee] Change driver version to 10.3.4.0-20170216.
# possible first bad commit: [f834af06a18eeacd300cfe2c3bddaf1f3eaa95f4] Re-architecture mwlwifi.
# possible first bad commit: [7b96b8a8d9871a5fe7028da24122917f8da9f84a] Modification of the code to load firmware 9.1.2.5.
# possible first bad commit: [5fac04c8f9b1631d9c402c79b65342be251faaaa] Upgrade 88W8964 firmware to 9.1.2.5.
Voltara commented 7 years ago

I think I'm starting to get somewhere in this troubleshooting. This may be an issue with station ID (stnid in the driver) conflict between multiple VIFs (SSIDs) on the same phy.

This sequence always fails ("alpha" and "beta" are SSIDs on the same phy):

  1. Client A connects to SSID "alpha" (assigned stnid 1 - success)
  2. Client B connects to SSID "beta" (also assigned stnid 1 - failure)

This always succeeds:

  1. Client A connects to SSID "beta" (assigned stnid 1 - success)
  2. Client B connects to SSID "beta" (assigned stnid 2 - success)
  3. Client A disconnects from "beta" (frees up stnid 1)
  4. Client A connects to SSID "alpha" (assigned stnid 1 - success)
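
The two sequences above can be reproduced with a toy model. This is only a sketch of the suspected behaviour, not driver code. It assumes that each SSID hands out its lowest free ID independently, and that the receive path looks stations up in a single table shared by the whole phy and keyed by that ID alone. The class and method names are hypothetical.

```python
class Phy:
    """Toy model: per-SSID ID allocation, one phy-wide lookup table."""

    def __init__(self, ssids):
        self.used = {ssid: set() for ssid in ssids}  # IDs handed out per SSID
        self.table = {}                              # phy-wide: id -> (client, ssid)

    def connect(self, client, ssid):
        # ASSUMPTION: each SSID picks its own lowest free ID.
        n = 1
        while n in self.used[ssid]:
            n += 1
        self.used[ssid].add(n)
        # The shared table is keyed by ID only, so an ID that is already
        # present (from another SSID) is a conflict.
        ok = n not in self.table
        if ok:
            self.table[n] = (client, ssid)
        return n, ok

    def disconnect(self, client):
        # Free the ID in both the shared table and the owning SSID.
        for n, (c, ssid) in list(self.table.items()):
            if c == client:
                del self.table[n]
                self.used[ssid].discard(n)


# Failing sequence: both clients end up with ID 1 on the same phy.
failing = Phy(["alpha", "beta"])
print(failing.connect("A", "alpha"))
print(failing.connect("B", "beta"))

# Succeeding sequence: IDs never collide in the shared table.
working = Phy(["alpha", "beta"])
print(working.connect("A", "beta"))
print(working.connect("B", "beta"))
working.disconnect("A")
print(working.connect("A", "alpha"))
```

In this model the second connect in the failing sequence is the only one that reports a conflict, which lines up with the behaviour I observed.
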
ValCher1961 commented 7 years ago

Curiously, it worked for me. Keep in mind that my installation is Debian. The only problem I've encountered is that IP addresses cannot be issued to clients connected to wlan1.1. I didn't investigate this any deeper, but when I assigned the IP manually, it worked. Another interesting point is that the configuration in wlan1.conf (wlan1) uses two channels (6 + 10), while wlan1.1 for some reason only uses channel 6, although it should also use channel 10.

BrainSlayer commented 7 years ago

the channel config for wlan1.1 is not necessary. remove it

ValCher1961 commented 7 years ago

Of course. Only the recommended settings are specified for wlan1.1 and nothing else: bssid, bss, ssid, wpa, wpa_passphrase, wpa_key_mgmt, rsn_pairwise.

Voltara commented 7 years ago

Confirmed this is caused by aid conflict. The sta_link table won't work as intended because it is keyed on a number that's unique per SSID (vif), not per radio; and pcie_rx_process_fast_data is using it to try and look up the vif. The old firmware used the bssid from the packet header to decide which vif to use, but the new firmware evidently strips out the bssid before it gets to the driver.

As a workaround, I patched hostapd to assign aid numbers per-interface (radio) instead of per-bss (SSID). So far multiple SSID have been working well with the modified hostapd in preliminary testing; I will test more thoroughly later.

Here are the patched hostapd sources: https://github.com/Voltara/hostap/tree/mwlwifi-workaround
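
To illustrate the idea behind the workaround (this is not the actual patch), here is a small Python sketch comparing aid assignment scoped per BSS with assignment scoped per radio. It uses a simple counter per scope, which is a simplification I chose for brevity; the function names are hypothetical.

```python
from itertools import count


def assign_aids(joins, per_radio):
    """joins: list of (client, ssid) in association order. Returns {client: aid}."""
    counters = {}
    aids = {}
    for client, ssid in joins:
        # Per radio: one shared counter. Per BSS: one counter for each SSID.
        scope = "radio" if per_radio else ssid
        counter = counters.setdefault(scope, count(1))
        aids[client] = next(counter)
    return aids


def has_conflict(aids):
    # A conflict means two clients on the same radio share an aid.
    return len(set(aids.values())) < len(aids)


joins = [("A", "alpha"), ("B", "beta"), ("C", "beta")]

per_bss = assign_aids(joins, per_radio=False)
per_radio = assign_aids(joins, per_radio=True)

print(per_bss, has_conflict(per_bss))
print(per_radio, has_conflict(per_radio))
```

With per-BSS scoping, clients A and B both get aid 1, so a table keyed on the aid alone cannot tell them apart. With per-radio scoping every client on the radio gets a distinct number.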

yuhhaurlin commented 7 years ago

Thanks. In fact, STA mode is also related to aid. I am trying to fix it now, and I think this problem can be fixed as well.

loomy commented 7 years ago

@Voltara thx for the patch. workaround works for me so the problem seems to be identified.

@yuhhaurlin if you need some more debugging output from a 3200ACM, I could help out

yuhhaurlin commented 7 years ago

Please help to check https://github.com/kaloz/mwlwifi/commit/b3d5924f039085fb86a13a606d8f4e9ed63c97a6. Thanks.

Voltara commented 7 years ago

I am tentatively going to say that it is working. I have clients successfully connecting to 2 SSIDs on the same radio.

# cat /sys/kernel/debug/ieee80211/phy0/mwlwifi/stnid

stnid: 1 macid: 0 aid: 1
stnid: 2 macid: 1 aid: 1
stnid: 3 macid: 1 aid: 2
stnid: 4 macid: 1 aid: 3
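
A quick way to read this table is to check which column is unique across the radio. The sketch below parses the lines shown above; the regular expression and function name are my own, and the reading of macid as "one value per SSID" is an assumption based on this output.

```python
import re

# Matches lines such as "stnid: 2 macid: 1 aid: 1".
LINE = re.compile(r"stnid:\s*(\d+)\s+macid:\s*(\d+)\s+aid:\s*(\d+)")


def parse_stnid(text):
    """Return a list of (stnid, macid, aid) integer tuples."""
    return [tuple(int(g) for g in m.groups()) for m in LINE.finditer(text)]


SAMPLE = """stnid: 1 macid: 0 aid: 1
stnid: 2 macid: 1 aid: 1
stnid: 3 macid: 1 aid: 2
stnid: 4 macid: 1 aid: 3
"""

rows = parse_stnid(SAMPLE)
stnids = [r[0] for r in rows]
aids = [r[2] for r in rows]

print(len(set(stnids)) == len(stnids))  # is stnid unique across the radio?
print(len(set(aids)) == len(aids))      # is aid unique across the radio?
```

For the sample, stnid is unique while aid 1 appears under both macid 0 and macid 1, so the lookup key no longer collides even though the aids still repeat.
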

I had problems connecting when I configured a 3rd SSID, but I believe that issue to be unrelated to this commit because it also didn't work when I tried with the previous driver version.

I will continue running version b3d5924 and will report back if I run into any problems.

Voltara commented 7 years ago

Here is a new issue which started after I installed commit b3d5924. My laptop started throwing "wrong command queue" warnings to kern.log. These are new: no prior instances of the warning in the past month of kern.log entries:

May 27 12:10:56 meki kernel: [19602.128708] ------------[ cut here ]------------
May 27 12:10:56 meki kernel: [19602.128737] WARNING: CPU: 2 PID: 557 at /build/linux-0XAgc4/linux-4.4.0/drivers/net/wireless/iwlwifi/pcie/tx.c:1619 iwl_pcie_hcmd_complete+0x376/0x4d0 [iwlwifi]()
May 27 12:10:56 meki kernel: [19602.128741] wrong command queue 16 (should be 4), sequence 0x1053 readp=61 writep=61
May 27 12:10:56 meki kernel: [19602.128743] Modules linked in: snd_seq_dummy ctr ccm pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) bbswitch(OE) ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack binfmt_misc iptable_filter ip_tables x_tables pn544_mei mei_phy pn544 hci nfc dell_wmi sparse_keymap dell_laptop dcdbas dell_smm_hwmon uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core v4l2_common videodev media intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel arc4 kvm irqbypass iwldvm nvidia_uvm(POE) joydev mac80211 input_leds serio_raw iwlwifi snd_hda_codec_hdmi snd_hda_codec_realtek snd_soc_rt5640 snd_hda_codec_generic cfg80211 snd_soc_rl6231 lpc_ich snd_soc_core snd_hda_intel snd_hda_codec snd_compress mei_me snd_hda_core ac97_bus ie31200_edac snd_hwdep snd_pcm_dmaengine shpchp mei edac_core snd_pcm snd_seq_midi snd_seq_midi_event wmi snd_rawmidi snd_seq snd_seq_device snd_timer elan_i2c 8250_fintek snd soundcore dw_dmac dell_smo8800 i2c_designware_platform dw_dmac_core i2c_designware_core snd_soc_sst_acpi spi_pxa2xx_platform 8250_dw dell_rbtn mac_hid coretemp sunrpc parport_pc ppdev lp parport autofs4 jitterentropy_rng drbg ansi_cprng algif_skcipher af_alg dm_crypt hid_logitech_hidpp hid_logitech_dj hid_generic usbhid mmc_block crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd nvidia_drm(POE) nvidia_modeset(POE) i915 psmouse nvidia(POE) i2c_algo_bit drm_kms_helper syscopyarea ahci sysfillrect libahci sysimgblt fb_sys_fops e1000e sdhci_pci drm ptp pps_core sdhci_acpi sdhci video i2c_hid hid fjes
May 27 12:10:56 meki kernel: [19602.128900] CPU: 2 PID: 557 Comm: irq/34-iwlwifi Tainted: P        W  OE   4.4.0-78-generic #99-Ubuntu
May 27 12:10:56 meki kernel: [19602.128903] Hardware name: Dell Inc. Precision M4800/0W7R2C, BIOS A16 12/01/2015
May 27 12:10:56 meki kernel: [19602.128907]  0000000000000286 000000009d8db1ec ffff880418b47b88 ffffffff813f8dd3
May 27 12:10:56 meki kernel: [19602.128912]  ffff880418b47bd0 ffffffffc1326f88 ffff880418b47bc0 ffffffff81081302
May 27 12:10:56 meki kernel: [19602.128917]  ffff88025985b000 0000000000001053 0000000000000850 ffff880415a812e0
May 27 12:10:56 meki kernel: [19602.128921] Call Trace:
May 27 12:10:56 meki kernel: [19602.128930]  [<ffffffff813f8dd3>] dump_stack+0x63/0x90
May 27 12:10:56 meki kernel: [19602.128938]  [<ffffffff81081302>] warn_slowpath_common+0x82/0xc0
May 27 12:10:56 meki kernel: [19602.128943]  [<ffffffff8108139c>] warn_slowpath_fmt+0x5c/0x80
May 27 12:10:56 meki kernel: [19602.128957]  [<ffffffffc1311e56>] iwl_pcie_hcmd_complete+0x376/0x4d0 [iwlwifi]
May 27 12:10:56 meki kernel: [19602.128966]  [<ffffffff8184055e>] ? _raw_spin_unlock_bh+0x1e/0x20
May 27 12:10:56 meki kernel: [19602.128985]  [<ffffffffc1375d65>] ? iwl_add_sta_callback+0x125/0x150 [iwldvm]
May 27 12:10:56 meki kernel: [19602.128999]  [<ffffffffc137a323>] ? iwl_rx_dispatch+0x83/0xe0 [iwldvm]
May 27 12:10:56 meki kernel: [19602.129011]  [<ffffffffc130d035>] iwl_pcie_rx_handle+0x375/0x960 [iwlwifi]
May 27 12:10:56 meki kernel: [19602.129021]  [<ffffffff8102d66c>] ? __switch_to+0x1dc/0x5c0
May 27 12:10:56 meki kernel: [19602.129033]  [<ffffffffc130e727>] iwl_pcie_irq_handler+0x5b7/0x8d0 [iwlwifi]
May 27 12:10:56 meki kernel: [19602.129041]  [<ffffffff810dc3e0>] ? irq_finalize_oneshot.part.35+0xe0/0xe0
May 27 12:10:56 meki kernel: [19602.129046]  [<ffffffff810dc400>] irq_thread_fn+0x20/0x50
May 27 12:10:56 meki kernel: [19602.129050]  [<ffffffff810dc768>] irq_thread+0x138/0x1c0
May 27 12:10:56 meki kernel: [19602.129055]  [<ffffffff810dc4a0>] ? irq_forced_thread_fn+0x70/0x70
May 27 12:10:56 meki kernel: [19602.129059]  [<ffffffff810dc630>] ? irq_thread_check_affinity+0xe0/0xe0
May 27 12:10:56 meki kernel: [19602.129065]  [<ffffffff810a0bf8>] kthread+0xd8/0xf0
May 27 12:10:56 meki kernel: [19602.129069]  [<ffffffff810a0b20>] ? kthread_create_on_node+0x1e0/0x1e0
May 27 12:10:56 meki kernel: [19602.129075]  [<ffffffff81840dcf>] ret_from_fork+0x3f/0x70
May 27 12:10:56 meki kernel: [19602.129080]  [<ffffffff810a0b20>] ? kthread_create_on_node+0x1e0/0x1e0
May 27 12:10:56 meki kernel: [19602.129084] ---[ end trace eff9a5ab9a0a5811 ]---
May 27 12:10:56 meki kernel: [19602.129088] iwl data: 00000000: 4c 08 80 0a 18 4e 53 10 39 38 62 ef e1 82 54 52  L....NS.98b...TR
May 27 12:10:56 meki kernel: [19602.129093] iwl data: 00000010: 31 10 5f f8 42 a9 21 ce da 66 27 59 61 17 8c 0b  1._.B.!..f'Ya...
yuhhaurlin commented 7 years ago

@Voltara This is iwlwifi, I don't know how to check it.

Voltara commented 7 years ago

@yuhhaurlin Understood. I'm reporting it here because it's a new error which started happening when my laptop is connected to my AP running mwlwifi commit b3d5924.

I ended up reverting to 6d1595e (and the hostapd workaround) because my laptop kept crashing while using the wireless. Whatever changed between 6d1595e and b3d5924 is evidently triggering some bugs in iwlwifi.

I have not yet tried commit 2d4b9bc (with the 9.3.0.7 firmware) as I was not sure if it is ready for testing.

yuhhaurlin commented 7 years ago

9.3.0.7 fixes an incorrect A-MSDU size for 11n clients, so maybe you can try it. Thanks.

Voltara commented 7 years ago

9.3.0.7 fixed the iwlwifi warnings and crashes. I will continue using and testing 2d4b9bc, but so far this seems to correct the multiple SSID issue.

yuhhaurlin commented 7 years ago

Thanks. I will close this one.

lewisdiamond commented 7 years ago

I'm on kernel 4.9.30 (LEDE 61eb18d3f7449fd3379d9bd995af85d669dbc9ff) and using mwlwifi 2d4b9bcd6c82043284fc83a3b4c92d93db39a9c1

I see a problem with having more than 2 SSID on the same radio. 2 works apparently fine (not much testing was done), but as soon as I add a 3rd, the first two break and I can't connect to them. I can still connect to the last one added (or enabled).

Voltara commented 7 years ago

I ran into the same problem when I tried adding a 3rd SSID. I haven't had a chance to experiment further yet, but I think it is a separate issue from the one reported here.

yuhhaurlin commented 7 years ago

I will check it. Thanks.

kb3tbx commented 7 years ago

Yes, I see this too. Two ESSIDs associate OK; with three ESSIDs, only the last one works. The first two disconnect clients immediately but still broadcast. iOS hints that the encryption key is wrong, but it isn't: I changed one of the original networks to 'no encryption' and was still unable to join those networks.

WRT3200ACM

root@LEDE:~# uname -a
Linux LEDE 4.9.30 #0 SMP Fri May 26 22:48:37 2017 armv7l GNU/Linux

root@LEDE:~# cat /sys/kernel/debug/ieee80211/phy0/mwlwifi/info
driver name: mwlwifi
chip type: 88W8964
hw version: 7
driver version: 10.3.4.0-20170512
firmware version: 0x09030007
power table loaded from dts: no
firmware region code: 0x10
mac address: 60:38:e0:b4:6b:82
2g: disable
5g: enable
antenna: 4 4
irq number: 45
ap macid support: 0000ffff
sta macid support: 00010000
macid used: 00000007
radio: enable
iobase0: e1000000
iobase1: e1280000
tx limit: 1024
rx limit: 16384

root@LEDE:~# iwinfo
5G-test3  ESSID: "Lede-3"
          Access Point: 66:38:E0:B4:6B:82
          Mode: Master  Channel: 36 (5.180 GHz)
          Tx-Power: 23 dBm  Link Quality: unknown/70
          Signal: unknown  Noise: -104 dBm
          Bit Rate: unknown
          Encryption: none
          Type: nl80211  HW Mode(s): 802.11nac
          Hardware: 11AB:2B40 11AB:0000 [Generic MAC80211]
          TX power offset: unknown
          Frequency offset: unknown
          Supports VAPs: yes  PHY name: phy0

wlan0     ESSID: "LEDE_AC"
          Access Point: 60:38:E0:B4:6B:82
          Mode: Master  Channel: 36 (5.180 GHz)
          Tx-Power: 23 dBm  Link Quality: unknown/70
          Signal: unknown  Noise: -104 dBm
          Bit Rate: unknown
          Encryption: WPA2 PSK (CCMP)
          Type: nl80211  HW Mode(s): 802.11nac
          Hardware: 11AB:2B40 11AB:0000 [Generic MAC80211]
          TX power offset: unknown
          Frequency offset: unknown
          Supports VAPs: yes  PHY name: phy0

wlan0-1   ESSID: "OpenWrt"
          Access Point: 62:38:E0:B4:6B:82
          Mode: Master  Channel: 36 (5.180 GHz)
          Tx-Power: 23 dBm  Link Quality: unknown/70
          Signal: unknown  Noise: -104 dBm
          Bit Rate: unknown
          Encryption: WPA2 PSK (CCMP)
          Type: nl80211  HW Mode(s): 802.11nac
          Hardware: 11AB:2B40 11AB:0000 [Generic MAC80211]
          TX power offset: unknown
          Frequency offset: unknown
          Supports VAPs: yes  PHY name: phy0

wlan1     ESSID: "LEDE-G"
          Access Point: 60:38:E0:B4:6B:81
          Mode: Master  Channel: 11 (2.462 GHz)
          Tx-Power: 30 dBm  Link Quality: unknown/70
          Signal: unknown  Noise: -104 dBm
          Bit Rate: unknown
          Encryption: WPA2 PSK (CCMP)
          Type: nl80211  HW Mode(s): 802.11bgn
          Hardware: 11AB:2B40 11AB:0000 [Generic MAC80211]
          TX power offset: unknown
          Frequency offset: unknown
          Supports VAPs: yes  PHY name: phy1

lewisdiamond commented 7 years ago

For the record, I have the same behavior. Only the third ESSID works, the first two keep broadcasting but it's impossible to connect to them. Unfortunately I don't have much information about the connection failure.

lewisdiamond commented 7 years ago

I suggest reopening this issue since it only works up to 2 SSIDs. Even if there was a fix for 2, another fix is needed to go beyond that and this issue is still relevant.

yuhhaurlin commented 7 years ago

I think it should use different BSSIDs for these VAPs.

yuhhaurlin commented 7 years ago

It looks like current LEDE will create all VAPs with same BSSID.

BrainSlayer commented 7 years ago

@yuhhaurlin : look closer. they are different 60:38:E0:B4:6B:82 vs 62:38:E0:B4:6B:82
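
To make the comparison explicit, here is a short Python sketch that checks the three phy0 BSSIDs from the iwinfo output earlier in this thread. It only inspects the addresses; how LEDE derives them is not covered here. Bit 0x02 of the first octet is the standard "locally administered" flag of a MAC address, and the helper names are made up for illustration.

```python
def first_octet(bssid):
    """Return the first octet of a colon-separated MAC address as an int."""
    return int(bssid.split(":")[0], 16)


def is_locally_administered(bssid):
    # Bit 0x02 of the first octet marks a locally administered address.
    return bool(first_octet(bssid) & 0x02)


# BSSIDs of the three interfaces on phy0, copied from the iwinfo output.
bssids = {
    "wlan0": "60:38:E0:B4:6B:82",
    "wlan0-1": "62:38:E0:B4:6B:82",
    "5G-test3": "66:38:E0:B4:6B:82",
}

all_distinct = len(set(bssids.values())) == len(bssids)
print(all_distinct)
for name, bssid in bssids.items():
    print(name, bssid, is_locally_administered(bssid))
```

The addresses differ only in the first octet (0x60, 0x62, 0x66), which is easy to miss when reading them side by side.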

yuhhaurlin commented 7 years ago

Sorry. You are right. I will check this problem.

ad019 commented 7 years ago

I tried to create 2 different BSSIDs for VAPs on the 2.4 GHz radio and the radio refused to start up. Using dd-wrt r32149. After I removed one, the radio came up again.

yuhhaurlin commented 7 years ago

You need to check whether hostapd is running. If not, you need to check the hostapd configuration file.

BrainSlayer commented 7 years ago

did a quick test with 2 vaps in dd-wrt. hostapd is running. but i see also all 3 ssid's broadcasting so far. i will now enable encryption for all 3 interfaces for testing

yuhhaurlin commented 7 years ago

I can reproduce the problem with three SSIDs. I will try to fix it. Thanks.

BrainSlayer commented 7 years ago

okay. at least for me i see all 3 broadcasting. i also don't know what the user means by "i tried to create 2 different bssid's". dd-wrt always uses different bssids and does not allow changing them manually within the wifi configuration