greearb / ath10k-ct

Stand-alone ath10k driver based on Candela Technologies Linux kernel.

Dynamic VLAN and IPv6 #134

Open dhess opened 4 years ago

dhess commented 4 years ago

Description of the problem

Dynamic VLAN breaks IPv6 (tested with macOS Catalina and iOS 13 clients -- neither works). IPv6 works without dynamic VLAN enabled, and IPv4 and VLAN assignment work with dynamic VLAN enabled.

Wireless clients configure their wireless interfaces with SLAAC and receive the IPv6 route advertisement, but can't pass IPv6 traffic, including ping.

It looks like an NDP issue. On macOS, after associating with the WAP, there are no neighbors in the client's cache other than the client's own IPv6 addresses. If I manually ping the router, IPv6 starts working, but then fails again as soon as the neighbor cache entry for the router expires.
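One way to watch for the symptom described above is to poll the neighbor cache and check whether a usable router entry is present. The following is a minimal sketch, not something from the original report: it assumes a Linux client and the line format printed by iproute2's `ip -6 neigh show` (macOS uses `ndp -an`, which has a different layout). The names `parse_neigh` and `usable_routers`, the sample text, and its addresses are all made up for illustration.

```python
from typing import List, NamedTuple, Optional

# Neighbor states that iproute2 prints as a bare keyword on each line.
NUD_STATES = {"PERMANENT", "NOARP", "REACHABLE", "STALE", "NONE",
              "INCOMPLETE", "DELAY", "PROBE", "FAILED"}


class Neighbor(NamedTuple):
    address: str
    device: Optional[str]
    lladdr: Optional[str]
    is_router: bool
    state: Optional[str]


def parse_neigh(text: str) -> List[Neighbor]:
    """Parse text in the style of `ip -6 neigh show` (assumed format)."""
    entries = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        device = lladdr = state = None
        is_router = False
        i = 1
        while i < len(tokens):
            tok = tokens[i]
            if tok == "dev" and i + 1 < len(tokens):
                device = tokens[i + 1]
                i += 2
            elif tok == "lladdr" and i + 1 < len(tokens):
                lladdr = tokens[i + 1]
                i += 2
            else:
                if tok == "router":
                    is_router = True
                elif tok in NUD_STATES:
                    state = tok
                i += 1
        entries.append(Neighbor(tokens[0], device, lladdr, is_router, state))
    return entries


def usable_routers(entries: List[Neighbor]) -> List[Neighbor]:
    """Router entries that have a link-layer address and have not failed."""
    return [e for e in entries
            if e.is_router and e.lladdr
            and e.state not in ("FAILED", "INCOMPLETE")]


# Made-up sample text, only to show the expected shape of the input.
SAMPLE = """\
fe80::1 dev wlan0 lladdr 02:00:00:00:00:01 router REACHABLE
2001:db8::10 dev wlan0 lladdr 02:00:00:00:00:10 STALE
fe80::2 dev wlan0  FAILED
"""

if __name__ == "__main__":
    for entry in usable_routers(parse_neigh(SAMPLE)):
        print(entry.address, entry.lladdr, entry.state)
```

If `usable_routers` comes back empty shortly after association and only fills in after a manual ping, that would match the behavior reported here.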

Software (OS, Firmware version, kernel, driver, etc)

OpenWRT 19.07.2, built from the image builder. I'm using the -CT firmware and -CT ath10k driver.

Hardware (NIC chipset, platform, etc)

Netgear R7800.

Logs (dmesg, maybe supplicant and/or hostap)

Here's the relevant dmesg output:

[   12.233615] ath10k_pci 0000:01:00.0: assign IRQ: got 67
[   12.233637] ath10k 4.19 driver, optimized for CT firmware, probing pci device: 0x46.
[   12.234495] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
[   12.240495] ath10k_pci 0000:01:00.0: enabling bus mastering
[   12.240954] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   12.416306] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/fwcfg-pci-0000:01:00.0.txt failed with error -2
[   12.416351] ath10k_pci 0000:01:00.0: Falling back to user helper
[   12.451926] firmware ath10k!fwcfg-pci-0000:01:00.0.txt: firmware_loading_store: map pages failed
[   12.452174] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:01:00.0.bin failed with error -2
[   12.459885] ath10k_pci 0000:01:00.0: Falling back to user helper
[   12.695901] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/ct-firmware-5.bin failed with error -2
[   12.695935] ath10k_pci 0000:01:00.0: Falling back to user helper
[   12.723420] firmware ath10k!QCA9984!hw1.0!ct-firmware-5.bin: firmware_loading_store: map pages failed
[   12.723636] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/ct-firmware-2.bin failed with error -2
[   12.731722] ath10k_pci 0000:01:00.0: Falling back to user helper
[   12.759135] firmware ath10k!QCA9984!hw1.0!ct-firmware-2.bin: firmware_loading_store: map pages failed
[   12.759297] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/firmware-6.bin failed with error -2
[   12.767409] ath10k_pci 0000:01:00.0: Falling back to user helper
[   12.827148] firmware ath10k!QCA9984!hw1.0!firmware-6.bin: firmware_loading_store: map pages failed
[   13.162576] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[   13.162610] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 0
[   13.173691] ath10k_pci 0000:01:00.0: firmware ver 10.4b-ct-9984-fW-012-17ba98334 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-CT,pingpong-CT,ch-regs-CT,nop-CT,set-special-CT,tx-rc-CT,cust-stats-CT,txrate2-CT,beacon-cb-CT,wmi-block-ack-CT,wmi-bcn-rc-CT crc32 877928bc
[   15.491485] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1 crc32 85498734
[   21.318471] ath10k_pci 0000:01:00.0: 10.4 wmi init: vdevs: 16  peers: 48  tid: 96
[   21.318497] ath10k_pci 0000:01:00.0: msdu-desc: 2500  skid: 32
[   21.400390] ath10k_pci 0000:01:00.0: wmi print 'P 48/48 V 16 K 144 PH 176 T 186  msdu-desc: 2500  sw-crypt: 0 ct-sta: 0'
[   21.401231] ath10k_pci 0000:01:00.0: wmi print 'free: 81784 iram: 23220 sram: 14440'
[   21.658729] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 32 raw 0 hwcrypto 1
[   21.753048] ath: EEPROM regdomain: 0x0
[   21.753061] ath: EEPROM indicates default country code should be used
[   21.753068] ath: doing EEPROM country->regdmn map search
[   21.753084] ath: country maps to regdmn code: 0x3a
[   21.753095] ath: Country alpha2 being used: US
[   21.753103] ath: Regpair used: 0x3a
[   21.757693] ath10k_pci 0001:01:00.0: assign IRQ: got 100
[   21.757729] ath10k 4.19 driver, optimized for CT firmware, probing pci device: 0x46.
[   21.758573] ath10k_pci 0001:01:00.0: enabling device (0140 -> 0142)
[   21.764746] ath10k_pci 0001:01:00.0: enabling bus mastering
[   21.765398] ath10k_pci 0001:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   21.935766] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/fwcfg-pci-0001:01:00.0.txt failed with error -2
[   21.935797] ath10k_pci 0001:01:00.0: Falling back to user helper
[   22.054699] firmware ath10k!fwcfg-pci-0001:01:00.0.txt: firmware_loading_store: map pages failed
[   22.054889] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0001:01:00.0.bin failed with error -2
[   22.062557] ath10k_pci 0001:01:00.0: Falling back to user helper
[   22.329700] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/ct-firmware-5.bin failed with error -2
[   22.329729] ath10k_pci 0001:01:00.0: Falling back to user helper
[   22.355198] firmware ath10k!QCA9984!hw1.0!ct-firmware-5.bin: firmware_loading_store: map pages failed
[   22.355345] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/ct-firmware-2.bin failed with error -2
[   22.363396] ath10k_pci 0001:01:00.0: Falling back to user helper
[   22.399752] firmware ath10k!QCA9984!hw1.0!ct-firmware-2.bin: firmware_loading_store: map pages failed
[   22.399914] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/firmware-6.bin failed with error -2
[   22.408048] ath10k_pci 0001:01:00.0: Falling back to user helper
[   22.435505] firmware ath10k!QCA9984!hw1.0!firmware-6.bin: firmware_loading_store: map pages failed
[   22.435632] ath10k_pci 0001:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[   22.443359] ath10k_pci 0001:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 0
[   22.455435] ath10k_pci 0001:01:00.0: firmware ver 10.4b-ct-9984-fW-012-17ba98334 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-CT,pingpong-CT,ch-regs-CT,nop-CT,set-special-CT,tx-rc-CT,cust-stats-CT,txrate2-CT,beacon-cb-CT,wmi-block-ack-CT,wmi-bcn-rc-CT crc32 877928bc
[   24.773668] ath10k_pci 0001:01:00.0: board_file api 2 bmi_id 0:2 crc32 85498734
[   30.606539] ath10k_pci 0001:01:00.0: 10.4 wmi init: vdevs: 16  peers: 48  tid: 96
[   30.606571] ath10k_pci 0001:01:00.0: msdu-desc: 2500  skid: 32
[   30.690333] ath10k_pci 0001:01:00.0: wmi print 'P 48/48 V 16 K 144 PH 176 T 186  msdu-desc: 2500  sw-crypt: 0 ct-sta: 0'
[   30.691212] ath10k_pci 0001:01:00.0: wmi print 'free: 81784 iram: 23220 sram: 14440'
[   30.957081] ath10k_pci 0001:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 32 raw 0 hwcrypto 1

Let me know if you need any more information. I've reverted the WAP to the pre-dynamic VLAN config as I need IPv6, but I can get back to the dynamic VLAN config to run some tests.

greearb commented 4 years ago

Can you try latest top-of-tree OpenWrt?

dhess commented 4 years ago

@greearb Thanks for the quick reply. It would be a bit of a PITA. Is there a particular commit in OpenWRT master that you think might fix this issue?

Edit: It appears I can tell the image builder to use snapshots, which should be easy, in theory. Let me look into that.

dhess commented 4 years ago

OK, I've tested this with tonight's snapshot:

[   14.452751] ath10k_pci 0000:01:00.0: assign IRQ: got 35
[   14.452773] ath10k 5.1 driver, optimized for CT firmware, probing pci device: 0x46.
[   14.453213] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
[   14.459286] ath10k_pci 0000:01:00.0: enabling bus mastering
[   14.459803] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   15.112031] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[   15.112063] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 0
[   15.122563] ath10k_pci 0000:01:00.0: firmware ver 10.4b-ct-9984-fW-013-d81f62d97 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-CT,pingpong-CT,ch-regs-CT,nop-CT,set-special-CT,tx-rc-CT,cust-stats-CT,txrate2-CT,beacon-cb-CT,wmi-block-ack-CT,wmi-bcn-rc-CT crc32 46b728ef
[   17.440755] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1 crc32 85498734
[   23.264421] ath10k_pci 0000:01:00.0: unsupported HTC service id: 1536
[   23.265549] ath10k_pci 0000:01:00.0: 10.4 wmi init: vdevs: 16  peers: 48  tid: 96
[   23.269844] ath10k_pci 0000:01:00.0: msdu-desc: 2500  skid: 32
[   23.353181] ath10k_pci 0000:01:00.0: wmi print 'P 48/48 V 16 K 144 PH 176 T 186  msdu-desc: 2500  sw-crypt: 0 ct-sta: 0'
[   23.354033] ath10k_pci 0000:01:00.0: wmi print 'free: 84856 iram: 13140 sram: 11224'
[   23.643613] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 32 raw 0 hwcrypto 1
[   23.732903] ath: EEPROM regdomain: 0x0
[   23.732919] ath: EEPROM indicates default country code should be used
[   23.732928] ath: doing EEPROM country->regdmn map search
[   23.732946] ath: country maps to regdmn code: 0x3a
[   23.732959] ath: Country alpha2 being used: US
[   23.732970] ath: Regpair used: 0x3a
[   23.738345] ath10k_pci 0001:01:00.0: assign IRQ: got 37
[   23.738379] ath10k 5.1 driver, optimized for CT firmware, probing pci device: 0x46.
[   23.739492] ath10k_pci 0001:01:00.0: enabling device (0140 -> 0142)
[   23.745063] ath10k_pci 0001:01:00.0: enabling bus mastering
[   23.745809] ath10k_pci 0001:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   24.336896] ath10k_pci 0001:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[   24.336955] ath10k_pci 0001:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 0
[   24.350996] ath10k_pci 0001:01:00.0: firmware ver 10.4b-ct-9984-fW-013-d81f62d97 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-CT,pingpong-CT,ch-regs-CT,nop-CT,set-special-CT,tx-rc-CT,cust-stats-CT,txrate2-CT,beacon-cb-CT,wmi-block-ack-CT,wmi-bcn-rc-CT crc32 46b728ef
[   26.683557] ath10k_pci 0001:01:00.0: board_file api 2 bmi_id 0:2 crc32 85498734
[   32.546217] ath10k_pci 0001:01:00.0: unsupported HTC service id: 1536
[   32.547144] ath10k_pci 0001:01:00.0: 10.4 wmi init: vdevs: 16  peers: 48  tid: 96
[   32.551709] ath10k_pci 0001:01:00.0: msdu-desc: 2500  skid: 32
[   32.636629] ath10k_pci 0001:01:00.0: wmi print 'P 48/48 V 16 K 144 PH 176 T 186  msdu-desc: 2500  sw-crypt: 0 ct-sta: 0'
[   32.637502] ath10k_pci 0001:01:00.0: wmi print 'free: 84856 iram: 13140 sram: 11224'
[   32.954238] ath10k_pci 0001:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 32 raw 0 hwcrypto 1

The result is more or less the same. I can communicate with IPv6 hosts on the same subnet as the client (including wired clients), but can't make it past the router. If I ping the router address, then I can communicate with any IPv6 host.

greearb commented 4 years ago

Does this work with stock ath10k driver and firmware?

dhess commented 4 years ago

With 19.07.2 on the R7800, the stock driver + firmware don't appear to support dynamic VLANs at all:

Sun May 10 19:22:53 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-EAP-RETRANSMIT2 d4:90:9c:d2:c4:67
Sun May 10 19:22:55 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-EAP-RETRANSMIT2 08:e6:89:3c:fd:4d
Sun May 10 19:23:02 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-EAP-RETRANSMIT2 d4:90:9c:d6:ab:58
Sun May 10 19:23:05 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-EAP-RETRANSMIT2 d4:90:9c:d2:c4:67
Sun May 10 19:23:06 2020 daemon.notice hostapd: wlan1: CTRL-EVENT-EAP-RETRANSMIT2 8c:86:1e:4e:bb:95
Sun May 10 19:23:09 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-EAP-RETRANSMIT2 8c:86:1e:4e:bb:95
Sun May 10 19:23:15 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-EAP-STARTED 08:e6:89:3c:fd:4d
Sun May 10 19:23:15 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-EAP-PROPOSED-METHOD vendor=0 method=1
Sun May 10 19:23:15 2020 daemon.info hostapd: wlan0: STA 08:e6:89:3c:fd:4d IEEE 802.11: authenticated
Sun May 10 19:23:15 2020 daemon.info hostapd: wlan0: STA 08:e6:89:3c:fd:4d IEEE 802.11: associated (aid 1)
Sun May 10 19:23:15 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-EAP-STARTED 08:e6:89:3c:fd:4d
Sun May 10 19:23:15 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-EAP-PROPOSED-METHOD vendor=0 method=1
Sun May 10 19:23:16 2020 daemon.err hostapd: Failed to create interface wlan0.100: -95 (Not supported)
# iw phy0 info
Wiphy phy0
    max # scan SSIDs: 16
    max scan IEs length: 199 bytes
    max # sched scan SSIDs: 0
    max # match sets: 0
    max # scan plans: 1
    max scan plan interval: -1
    max scan plan iterations: 0
    Retry short limit: 7
    Retry long limit: 4
    Coverage class: 0 (up to 0m)
    Device supports AP-side u-APSD.
    Available Antennas: TX 0xf RX 0xf
    Configured Antennas: TX 0xf RX 0xf
    Supported interface modes:
         * managed
         * AP
         * monitor
         * mesh point
    Band 2:
        Capabilities: 0x19ef
            RX LDPC
            HT20/HT40
...

On tonight's snapshot, I can't get the stock driver + firmware to start on the 5GHz interface, and there are quite a few backtraces in dmesg, but when I connect to the 2.4GHz interface, everything seems to be working fine. The only change between the stock and CT builds is kmod-ath10k and ath10k-firmware-qca9984.

dhess commented 4 years ago

I just tested the 2.4GHz interface with the CT snapshot, and it's the same as 5GHz -- no IPv6 traffic past the router until I ping the router address.

So, to summarize:

/etc/config exactly the same in all 4 cases.

dhess commented 4 years ago

I should also point out that in the statically bridged configuration (no dynamic VLAN, all wireless client traffic bridged to the same wired interface), 19.07.2 with the CT driver + firmware works great with the same IPv6 networks.

dhess commented 4 years ago

I can reproduce this issue on completely different hardware and OS:

Using different combinations of driver + firmware:

[1] http://www.candelatech.com/downloads/ath10k-9984-10-4b/firmware-5-ct-full-htt-mgt-community-12.bin-lede.018

greearb commented 4 years ago

Can you test with the stock ath10k driver plus ath10k-ct firmware? (Get the non-htt-mgt firmware variant; it should work with the stock driver.) I'd like to know if it is driver or firmware related. If just firmware, then I can build images for you to bisect if you are willing.

dhess commented 4 years ago

Sure, I'm happy to. The CT driver has been rock solid for me in OpenWRT 19.07.2 and the only thing missing is working IPv6 in dynamic VLAN mode!

I'll post some results in a bit.

dhess commented 4 years ago

APU2E4, NixOS, Linux 5.4.39, stock driver, CT firmware (non-htt-mgt): IPv6 doesn't work until I ping the router. (No other changes to config.)

Here's the relevant dmesg:

[   11.257903] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   11.301793] gpio-keys-polled gpio-keys-polled: unable to claim gpio 0, err=-517
[   11.313210] ath10k_pci 0000:05:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   11.351854] ath10k_pci 0000:01:00.0: Unknown FW IE: 30
[   11.351869] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[   11.351875] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
[   11.353651] ath10k_pci 0000:01:00.0: firmware ver 10.4b-ct-9984-fW-013-d81f62d97 api 5 features mfp,peer-flow-ctrl crc32 46b728ef
[   11.369254] gpio-keys-polled gpio-keys-polled: unable to claim gpio 0, err=-517
[   11.432823] ath10k_pci 0000:05:00.0: Unknown FW IE: 30
[   11.432839] ath10k_pci 0000:05:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043222ff sub 0000:0000
[   11.432846] ath10k_pci 0000:05:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
[   11.433609] ath10k_pci 0000:05:00.0: firmware ver 10.1-ct-8x-__fW-022-538f0906 api 5 features wmi-10.x,has-wmi-mgmt-tx,mfp crc32 e1c91a74
[   11.467064] ath10k_pci 0000:05:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
[   12.207877] audit: type=1130 audit(1589331056.646:8): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-timesyncd comm="systemd" exe="/nix/store/7s71p4xl1djjg4zi4skzx8nk5j2jw3hv-systemd-245.5/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.321084] audit: type=1130 audit(1589331056.760:9): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=irqbalance comm="systemd" exe="/nix/store/7s71p4xl1djjg4zi4skzx8nk5j2jw3hv-systemd-245.5/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.335643] audit: type=1130 audit(1589331056.775:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=logrotate comm="systemd" exe="/nix/store/7s71p4xl1djjg4zi4skzx8nk5j2jw3hv-systemd-245.5/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.360748] audit: type=1305 audit(1589331056.800:11): op=set audit_enabled=0 old=1 auid=4294967295 ses=4294967295 subj=kernel res=1
[   12.390169] ath10k_pci 0000:05:00.0: unsupported HTC service id: 1536
[   12.401886] ath10k_pci 0000:05:00.0: htt-ver 2.1 wmi-op 2 htt-op 2 cal otp max-sta 128 raw 0 hwcrypto 1

(As you can see, there's also a WLE600vx in the APU2E4, but it doesn't seem to work at all with the stock driver + CT firmware for the QCA988X. I can associate with the AP but I don't get an IP address. The WLE600vx works great with stock driver + latest stock firmware, however.)

dhess commented 4 years ago

And, as you might have guessed: APU2E4, NixOS, Linux 5.4.39, CT driver, stock firmware: IPv6 works out of the box.

[   11.293651] ath10k 5.1 driver, optimized for CT firmware, probing pci device: 0x3c.
[   11.297351] ath10k_pci 0000:05:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   11.336860] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[   11.336871] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
[   11.338933] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.9.0.2-00091 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast,no-ps,peer-fixed-rate crc32 bffdd0ad
[   11.353572] gpio-keys-polled gpio-keys-polled: unable to claim gpio 0, err=-517
[   11.420305] ath10k_pci 0000:05:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043222ff sub 0000:0000
[   11.420315] ath10k_pci 0000:05:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
[   11.421175] ath10k_pci 0000:05:00.0: firmware ver 10.2.4-1.0-00047 api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 35bd9258
[   11.454474] ath10k_pci 0000:05:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
[   12.340739] audit: type=1130 audit(1589332216.780:8): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-timesyncd comm="systemd" exe="/nix/store/7s71p4xl1djjg4zi4skzx8nk5j2jw3hv-systemd-245.5/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.452595] audit: type=1130 audit(1589332216.893:9): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=irqbalance comm="systemd" exe="/nix/store/7s71p4xl1djjg4zi4skzx8nk5j2jw3hv-systemd-245.5/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.468709] audit: type=1130 audit(1589332216.909:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=logrotate comm="systemd" exe="/nix/store/7s71p4xl1djjg4zi4skzx8nk5j2jw3hv-systemd-245.5/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.545457] audit: type=1130 audit(1589332216.986:11): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=dbus comm="systemd" exe="/nix/store/7s71p4xl1djjg4zi4skzx8nk5j2jw3hv-systemd-245.5/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   12.570584] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:31 crc32 85498734
[   12.573070] ath10k_pci 0000:05:00.0: unsupported HTC service id: 1536
[   12.582473] ath10k_pci 0000:05:00.0: wmi print 'P 135 V 16 T 433'
[   12.593875] ath10k_pci 0000:05:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 cal otp max-sta 128 raw 0 hwcrypto 1

The CT driver is applied as a patch to the 5.4.39 kernel, generated from your driver here:

https://github.com/greearb/ath10k-ct/tree/master/ath10k-5.4

It includes the patch here as well:

https://github.com/greearb/ath10k-ct/blob/master/patches/0001-wireless-Relax-beacon_int_min_gcd-and-ADHOC-check.patch

greearb commented 4 years ago

I'll build a series of firmware to bisect...will take a day or two...

greearb commented 4 years ago

Please try bisecting using these images. They should work on stock ath10k driver or ath10k-ct driver. http://www.candelatech.com/downloads/ath10k-9984-10-4b/bisect/all_builds-9984-may-14-2020-full-community.tar.gz

dhess commented 4 years ago

Wow, that's a lot of builds :)

I assume that, e.g., firmware-5-full-community-commit-218-65a183480.bin precedes firmware-5-full-community-commit-219-17a168111.bin, etc?

greearb commented 4 years ago

Yes and yes! If you find yourself thinking you will need to test all 1000+ builds, please google bisect :)
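For anyone following along, the bisect idea can be sketched in a few lines. This is only an illustration under stated assumptions: the build filenames follow the `...-commit-<number>-<hash>.bin` pattern mentioned above, the commit number gives the ordering, and once a build shows the bug every later build shows it too. `commit_index`, `first_bad`, and the `is_bad` callback (which would stand in for flashing a build and testing it by hand) are hypothetical names.

```python
import re
from typing import Callable, List, Optional

# Matches the numeric commit index in names such as
# firmware-5-full-community-commit-218-65a183480.bin
_COMMIT_RE = re.compile(r"-commit-(\d+)-[0-9a-f]+\.bin$")


def commit_index(filename: str) -> int:
    """Extract the commit number used to order the builds."""
    match = _COMMIT_RE.search(filename)
    if match is None:
        raise ValueError(f"unrecognized build name: {filename}")
    return int(match.group(1))


def first_bad(builds: List[str],
              is_bad: Callable[[str], bool]) -> Optional[str]:
    """Binary-search for the earliest build where is_bad() is true.

    Assumes every build after the first bad one is also bad.
    Returns None if no build is bad.
    """
    ordered = sorted(builds, key=commit_index)
    lo, hi = 0, len(ordered)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(ordered[mid]):
            hi = mid          # first bad build is at mid or earlier
        else:
            lo = mid + 1      # first bad build is after mid
    return ordered[lo] if lo < len(ordered) else None
```

With roughly 1000 builds this needs about ten manual tests rather than a thousand.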

dhess commented 4 years ago

I understand the concept :) Working on it now, I'll get back to you ASAP.

dhess commented 4 years ago

Most of these builds don't support AP VLAN, which means they don't exhibit the dynamic VLAN problem with IPv6 because they don't support dynamic VLANs ;)

greearb commented 4 years ago

Please bisect to find the first build that supports it at all, then; maybe I can backport that to the earlier images so you can bisect the real bug.

dhess commented 4 years ago

They're probably going to be the same build, i.e., IPv6 never worked with dynamic VLANs.

It's either 1002 or 1003; I'll let you know in a sec.

dhess commented 4 years ago

OK, all builds up to and including firmware-5-full-community-commit-1002-547058dc0.bin appear not to have support for AP/VLAN, so I can't reproduce the test case on those, obviously. (Edit: I didn't test them all, of course, but every build that I did test from 008 to 1002 did not support AP/VLAN, so I assume it was first added in 1003.)

firmware-5-full-community-commit-1003-9a3b6a55b.bin is the first build that supports AP/VLAN, and I can reproduce the problem with that build: IPv6 broken until the router is pinged.

Let me know if there's anything else I can do. And thanks so much for your help on this!

greearb commented 4 years ago

If you can edit the ath10k-ct driver, please change this code to just always report VLAN support and see if early builds work?

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 97f5865ae421..285eb9c0f92b 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -10374,7 +10374,10 @@ int ath10k_mac_register(struct ath10k *ar)
 		goto err_dfs_detector_exit;
 	}

dhess commented 4 years ago

Sorry, that diff was garbled by GitHub's comment renderer. Can you post it as a gist?

greearb commented 4 years ago

Please look in the ath10k/mac.c file in your ath10k-ct driver code (try searching for IFTYPE_AP_VLAN).

Find the code that has condition checks for PER_PACKET_SW_ENCRYPT, and just comment out the conditionals and always enable the IFTYPE_AP_VLAN bits.

That will let earlier firmware attempt to support the feature. I'm not sure if the build 006 (stock-ish firmware) will really support the feature or not, but somewhere along the way support was added.

Thanks, Ben

-- Ben Greear greearb@candelatech.com Candela Technologies Inc http://www.candelatech.com

dhess commented 4 years ago

It was a good idea to check this, because with the CT driver hack you suggested, the very first firmware you sent me (008-46433c339) works with AP/VLAN. Clients are assigned to the proper VLAN and IPv4 seems to work perfectly.

Unfortunately, with this first firmware, IPv6 doesn't work as expected and exhibits the same issue as described above.

Here's the dmesg with the hacked CT driver and the 008-46433c339 firmware:

[   11.217894] ath10k 5.1 driver, optimized for CT firmware, probing pci device: 0x46.
[   11.224477] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   11.298947] gpio-keys-polled gpio-keys-polled: unable to claim gpio 0, err=-517
[   11.305117] ath10k 5.1 driver, optimized for CT firmware, probing pci device: 0x3c.
[   11.310751] ath10k_pci 0000:05:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[   11.352226] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[   11.352236] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
[   11.354478] ath10k_pci 0000:01:00.0: firmware ver 10.4b-ct-9984-fW-000-46433c339 api 5 features mfp,peer-flow-ctrl,wmi-10.x-CT crc32 bc4a0c1d
[   11.367130] gpio-keys-polled gpio-keys-polled: unable to claim gpio 0, err=-517
[   11.432185] ath10k_pci 0000:05:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043222ff sub 0000:0000
[   11.432194] ath10k_pci 0000:05:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
[   11.432943] ath10k_pci 0000:05:00.0: firmware ver 10.1-ct-8x-__fH-022-538f0906 api 5 features wmi-10.x,mfp,txstatus-noack,wmi-10.x-CT,ratemask-CT,txrate-CT,get-temp-CT,tx-rc-CT,cust-stats-CT,retry-gt2-CT,txrate2-CT,beacon-cb-CT,wmi-block-ack-CT crc32 d870ee1d
[   11.462544] ath10k_pci 0000:05:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
[   12.375071] ath10k_pci 0000:05:00.0: unsupported HTC service id: 1536
[   12.375720] ath10k_pci 0000:05:00.0: 10.1 wmi init: vdevs: 16  peers: 127  tid: 256
[   12.385261] ath10k_pci 0000:05:00.0: wmi print 'P 128 V 8 T 410'
[   12.385466] ath10k_pci 0000:05:00.0: wmi print 'msdu-desc: 1424  sw-crypt: 0 ct-sta: 0'
[   12.385987] ath10k_pci 0000:05:00.0: wmi print 'alloc rem: 21000 iram: 25960'
[   12.418135] ath10k_pci 0000:05:00.0: htt-ver 2.2 wmi-op 2 htt-op 2 cal otp max-sta 128 raw 0 hwcrypto 1

I tested 008 and 1002, and both had this IPv6 issue. Therefore, I didn't bother testing any of the builds in between.

If you have earlier firmware builds you'd like to test, let me know as I have a relatively easy way to test them now.

greearb commented 4 years ago

I guess the good news is that it is not a regression in my code, but the QCA firmware I based it on must have the bug.

Sometime soon I am going to try to port in some newer upstream code, maybe I can find the fix somewhere in there.

Thanks, Ben

greearb commented 4 years ago

I should add, in case you want to dig into the details: I suspect the issue is that some frame is not received when it should be. If you can do a detailed sniff of all packets on the air, as well as on the ath10k-ct device (i.e., not monitor mode on the real device), and find which packet(s) are either not transmitted or not received, then maybe I can find the fix based on that...
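As a rough illustration of that comparison, the sketch below labels raw IPv6 packets by NDP message type and reports which types appear in an over-the-air capture but not in the capture taken on the device. It assumes the packets have already been extracted from the captures as raw IPv6 bytes with no link-layer header and no extension headers; `ndp_type`, `missing_on_device`, and `make_packet` are hypothetical helper names, and `make_packet` builds dummy packets (zero addresses, checksum not computed) purely for demonstration.

```python
import struct
from collections import Counter
from typing import Iterable, Optional

ICMPV6 = 58  # IPv6 Next Header value for ICMPv6

# ICMPv6 message types used by Neighbor Discovery (RFC 4861).
NDP_TYPES = {
    133: "router solicitation",
    134: "router advertisement",
    135: "neighbor solicitation",
    136: "neighbor advertisement",
    137: "redirect",
}


def ndp_type(packet: bytes) -> Optional[str]:
    """Return the NDP message name for a raw IPv6 packet, else None.

    Assumes ICMPv6 directly follows the fixed 40-byte IPv6 header.
    """
    if len(packet) < 41 or packet[0] >> 4 != 6:
        return None
    if packet[6] != ICMPV6:
        return None
    return NDP_TYPES.get(packet[40])


def missing_on_device(air: Iterable[bytes],
                      device: Iterable[bytes]) -> Counter:
    """Count NDP messages seen on the air but not on the device."""
    seen_on_air = Counter(t for t in map(ndp_type, air) if t)
    seen_on_device = Counter(t for t in map(ndp_type, device) if t)
    return seen_on_air - seen_on_device


def make_packet(icmp_type: int) -> bytes:
    """Build a dummy IPv6+ICMPv6 packet for demonstration only."""
    icmp = struct.pack("!BBH", icmp_type, 0, 0) + bytes(4)
    header = struct.pack("!IHBB16s16s", 6 << 28, len(icmp), ICMPV6, 255,
                         bytes(16), bytes(16))
    return header + icmp


if __name__ == "__main__":
    ns, ra = make_packet(135), make_packet(134)
    print(missing_on_device([ns, ns, ra], [ra, ns]))
```

A count-based comparison like this only hints at where to look; matching individual frames by address and timestamp would still be needed to pin down the exact packet.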

Thanks, Ben

jannispinter commented 3 years ago

I have exactly the same issue with OpenWrt snapshot builds for Aruba Instant On AP11. Linux, Android and iOS as clients will get an IPv6 address and configure routes correctly, but there is no IPv6 traffic passing through. I also have 802.11r enabled, and when roaming between APs, it sometimes magically works.

wadegerencser commented 2 years ago

Not sure if this helps, but Dynamic VLAN for IPv6 is simply not supported in AirOS. That's all.