lwfinger / rtw88

A backport of the Realtek Wifi 5 drivers from the wireless-next repo.

rtw_8821au: Connection breaks after a while #205

Open stkw0 opened 2 months ago

stkw0 commented 2 months ago

Sometimes my wifi connection suddenly disconnects. When it happens, dmesg repeatedly shows the following message: "rtw_8821au 1-8:1.0: MAC has not been powered on yet". No matter what I try, it seems to be resolved only by rebooting the computer. This message also shows up when I first boot the computer, but it connects properly that first time. It may also be important to note that before using the rtw88 source I used the aircrack-ng/rtl8812au driver. With that driver I also had random disconnections (maybe due to the AP?), but after restarting iwd or NetworkManager the connection recovered. This does not happen now.

Module: rtw_8821au
Hardware: ID 2357:0120 TP-Link Archer T2U PLUS [RTL8821AU]
Linux: 6.9.5

dubhater commented 2 months ago

How long does it usually take to lose the connection like that?

Please attach the full journalctl output from a boot where your connection broke.

stkw0 commented 2 months ago

From a couple of hours (say ~3h) to 1-2 days. I don't use systemd. I will add the relevant logs once it happens again.

Thank you.

stkw0 commented 2 months ago

Here is the log from the last time the connection broke:

[36259.610821] random: crng reseeded on system resumption
[36259.610832] PM: suspend exit
[36259.642000] Generic FE-GE Realtek PHY r8169-0-600:00: attached PHY driver (mii_bus:phy_addr=r8169-0-600:00, irq=MAC)
[36259.717371] Loading firmware: rtw88/rtw8821a_fw.bin
[36259.717517] rtw_8821au 1-8:1.0: Firmware version 42.4.0, H2C version 0
[36259.759666] 00000000: 29 81 00 7c 01 00 01 00 4c 00 04 00 10 00 00 00  )..|....L.......
[36259.759671] 00000010: 25 26 26 27 27 27 2e 2e 2e 2e 2e ff ff ff ff ff  %&&'''..........
[36259.759672] 00000020: ff ff 1d 1b 19 19 1b 19 17 16 15 15 17 16 16 16  ................
[36259.759674] 00000030: fd ff ff ff ff ff 10 ff ff ff ff ff ff ff ff ff  ................
[36259.759675] 00000040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759677] 00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759678] 00000060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759680] 00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759681] 00000080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759683] 00000090: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759684] 000000a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759685] 000000b0: ff ff ff ff ff ff ff ff ba 27 1e 00 01 00 00 08  .........'......
[36259.759687] 000000c0: ff 09 00 ff 00 00 00 55 00 ff ff ff ff ff ff ff  .......U........
[36259.759688] 000000d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759690] 000000e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759691] 000000f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759693] 00000100: 57 23 20 01 ff ff 03 98 48 27 20 4c 84 0a 03 52  W# .....H' L...R
[36259.759694] 00000110: 65 61 6c 74 65 6b 20 18 03 38 30 32 2e 31 31 61  ealtek ..802.11a
[36259.759696] 00000120: 63 20 57 4c 41 4e 20 41 64 61 70 74 65 72 20 00  c WLAN Adapter .
[36259.759697] 00000130: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759698] 00000140: ff ff ff ff ff ff ff 0f ff ff ff ff ff ff ff ff  ................
[36259.759700] 00000150: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759701] 00000160: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759702] 00000170: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759703] 00000180: ff ff ff ff ff ff ff ff 83 ab 99 2d 03 93 98 a0  ...........-....
[36259.759705] 00000190: fc 8c 00 11 9b c4 00 ff ff ff ff ff ff ff ff ff  ................
[36259.759706] 000001a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759708] 000001b0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759709] 000001c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759710] 000001d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759711] 000001e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.759713] 000001f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[36259.810289] r8169 0000:06:00.0 enp6s0: Link is Down
[36259.812234] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[36260.352182] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[36261.372967] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[36261.907218] wlan0: authenticate with 10:50:72:2f:39:35 (local address=98:48:27:20:4c:84)
[36262.081267] wlan0: send auth to 10:50:72:2f:39:35 (try 1/3)
[36262.083125] wlan0: authenticated
[36262.085015] wlan0: associate with 10:50:72:2f:39:35 (try 1/3)
[36262.087171] wlan0: RX AssocResp from 10:50:72:2f:39:35 (capab=0x11 status=0 aid=6)
[36262.092191] wlan0: associated
[36262.151352] wlan0: Limiting TX power to 23 (23 - 0) dBm as advertised by 10:50:72:2f:39:35
[36263.035092] ata5: link is slow to respond, please be patient (ready=0)
[36264.030098] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[36264.031208] sd 4:0:0:0: [sdc] Starting disk
[36264.032200] ata5.00: configured for UDMA/133
[36319.610307] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[36319.643464] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[36777.234567] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[37832.508261] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[37832.541419] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[38275.786428] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[38329.737498] wlan0: disconnect from AP 10:50:72:2f:39:35 for new auth to 10:50:72:2f:39:31
[38329.800541] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[38330.335880] wlan0: authenticate with 10:50:72:2f:39:31 (local address=98:48:27:20:4c:84)
[38330.444110] wlan0: send auth to 10:50:72:2f:39:31 (try 1/3)
[38330.447348] wlan0: authenticated
[38330.447920] wlan0: associate with 10:50:72:2f:39:31 (try 1/3)
[38330.453675] wlan0: RX ReassocResp from 10:50:72:2f:39:31 (capab=0x411 status=0 aid=6)
[38330.458176] wlan0: associated
[38463.948846] wlan0: disconnect from AP 10:50:72:2f:39:31 for new auth to 10:50:72:2f:39:35
[38464.001962] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[38464.541377] wlan0: authenticate with 10:50:72:2f:39:35 (local address=98:48:27:20:4c:84)
[38464.714467] wlan0: send auth to 10:50:72:2f:39:35 (try 1/3)
[38464.716286] wlan0: authenticated
[38464.717166] wlan0: associate with 10:50:72:2f:39:35 (try 1/3)
[38464.719309] wlan0: RX ReassocResp from 10:50:72:2f:39:35 (capab=0x11 status=0 aid=5)
[38464.724093] wlan0: associated
[38464.736072] wlan0: Limiting TX power to 23 (23 - 0) dBm as advertised by 10:50:72:2f:39:35
[38811.532263] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[38811.565235] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[38813.971004] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[39907.529515] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[39907.562552] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[40478.563592] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[40959.541325] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[40959.574468] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[41070.043780] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[41741.542724] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[41741.575683] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]
[43038.223867] wlan0: disconnect from AP 10:50:72:2f:39:35 for new auth to 10:50:72:2f:39:31
[43038.284851] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[43038.815360] wlan0: authenticate with 10:50:72:2f:39:31 (local address=98:48:27:20:4c:84)
[43038.933215] wlan0: send auth to 10:50:72:2f:39:31 (try 1/3)
[43038.938685] wlan0: authenticated
[43038.939114] wlan0: associate with 10:50:72:2f:39:31 (try 1/3)
[43038.944417] wlan0: RX ReassocResp from 10:50:72:2f:39:31 (capab=0x411 status=0 aid=7)
[43038.949300] wlan0: associated
[43093.767547] rtw_8821au 1-8:1.0: failed to get tx report from firmware
[43094.575609] rtw_8821au 1-8:1.0: failed to get tx report from firmware
[43096.559608] rtw_8821au 1-8:1.0: failed to get tx report from firmware
[43098.751643] rtw_8821au 1-8:1.0: failed to get tx report from firmware
[43102.318251] rtw_8821au 1-8:1.0: failed to send h2c command
[43102.555949] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[43103.047680] rtw_8821au 1-8:1.0: failed to get tx report from firmware
[43103.648155] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[43118.877138] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[43144.104385] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[43189.328248] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[43274.552931] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[43439.778400] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[43745.005699] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[44050.232440] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[44355.459071] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[44660.685905] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[44965.912856] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[45271.139777] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[45576.367454] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[45881.597447] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[46186.824281] rtw_8821au 1-8:1.0: MAC has not been powered on yet
[46433.317237] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000068000 engine 03 [IFB] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel 2 [00bfb9a000 X[2976]]

dubhater commented 2 months ago

It looks like the firmware stops working, no idea why.

"MAC has not been powered on yet" is printed when the chip is being powered on.

Instead of rebooting, have you tried reloading rtw_8821au? Have you tried unplugging the device?

Also, I see that your computer was suspended. Does the problem happen if the computer stays awake the whole time?

stkw0 commented 2 months ago

I waited ~50 hours without suspending it, and it didn't fail. Then I suspended it twice, leaving some hours in between, without failures. I don't know if some changes that I made to the kernel (for other reasons) could affect this issue. I haven't had this issue again for the past 5 days.

Will report back if at some point I have more clues :/

dubhater commented 2 months ago

Did you update rtw88 since your original report? I pushed some changes recently.

stkw0 commented 2 months ago

No. I have updated now; if it happens again I will report back. I also find it weird that it switches automatically between the 2.4 GHz and 5 GHz networks (this being a desktop). I don't know if that could be related.

dubhater commented 2 months ago

It must be switching because the signal strength varies over time. The switching could be related. If you can give the 2.4 GHz and 5 GHz networks separate SSIDs, you could try to make it switch from one to the other in a loop using nmcli/iwctl. Maybe give it a few seconds after each switch.
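
A rough sketch of such a switching loop, assuming NetworkManager and two saved connection profiles, one per band. The profile names `home-2g` and `home-5g` and the helper names `pick_profile` and `switch_loop` are hypothetical; only `nmcli connection up id <name>` and `sleep` are real commands here.

```bash
#!/usr/bin/env bash
# Sketch: alternate between two NetworkManager profiles to try to provoke the hang.
# "home-2g" and "home-5g" are hypothetical profile names; list yours with
# `nmcli connection show`.

# Print the profile to activate on iteration $1: even -> $2, odd -> $3.
pick_profile() {
    local i=$1 first=$2 second=$3
    if (( i % 2 == 0 )); then
        echo "$first"
    else
        echo "$second"
    fi
}

# Switch back and forth $3 times (default 100), pausing $4 seconds (default 10)
# after each switch. Stops at the first failed activation so the moment of
# failure is easy to find in dmesg.
switch_loop() {
    local first=$1 second=$2 rounds=${3:-100} pause=${4:-10}
    local i
    for (( i = 0; i < rounds; i++ )); do
        nmcli connection up id "$(pick_profile "$i" "$first" "$second")" || return 1
        sleep "$pause"
    done
}

# Usage: switch_loop home-2g home-5g 100 10
```

With iwd instead of NetworkManager the activation command would differ; that variant is not shown here.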

stkw0 commented 2 months ago

For now there are no problems. If I have time I will try to build a test script. Thank you

stkw0 commented 2 months ago

I had some disconnects, but it now seems to reconnect correctly without hanging forever. I guess this issue can be closed as it could not be reproduced.

tratum commented 2 months ago

I was having the same problem, and it resolved itself when I rebooted my system, but now I am having the same problem again today.

dubhater commented 2 months ago

I'm trying to reproduce it now:

for i in {001..100}; do nmcli connection down 64e4328c-6606-4648-93bc-247763c3bc5a; sleep 10; nmcli connection up 64e4328c-6606-4648-93bc-247763c3bc5a; sleep 10; done

dubhater commented 2 months ago

Still works.

dubhater commented 2 months ago

By the way, are either of you using KDE Plasma and its NetworkManager applet?

stkw0 commented 2 months ago

I am, and I'm also using the iwd backend instead of wpa_supplicant.

stkw0 commented 2 months ago

It happened again just now. Running rmmod rtw_8821au and then modprobe again fixed the issue. Here is the log of the failure and the recovery. I haven't pulled new commits from this repository since last time. Nothing changed except that I updated to Linux 6.9.8:

[ 7860.564657] wlan0: Limiting TX power to 30 (30 - 0) dBm as advertised by 10:50:72:2f:39:35
[ 8613.130077] wlan0: disconnect from AP 10:50:72:2f:39:35 for new auth to 10:50:72:2f:39:31
[ 8613.731694] wlan0: authenticate with 10:50:72:2f:39:31 (local address=98:48:27:20:4c:84)
[ 8613.839173] wlan0: send auth to 10:50:72:2f:39:31 (try 1/3)
[ 8613.844410] wlan0: authenticated
[ 8613.846222] wlan0: associate with 10:50:72:2f:39:31 (try 1/3)
[ 8613.851052] wlan0: RX ReassocResp from 10:50:72:2f:39:31 (capab=0x411 status=0 aid=3)
[ 8613.855522] wlan0: associated
[ 8623.976358] rtw_8821au 1-8:1.0: write register 0x8c4 failed with -71
[ 8623.976478] rtw_8821au 1-8:1.0: read register 0x848 failed with -71
[ 8623.976596] rtw_8821au 1-8:1.0: write register 0x848 failed with -71
[ 8623.976718] rtw_8821au 1-8:1.0: read register 0xc00 failed with -71
[ 8623.976838] rtw_8821au 1-8:1.0: read register 0x8b0 failed with -71
[ 8623.976956] rtw_8821au 1-8:1.0: write register 0x8b0 failed with -71
[ 8627.011948] wlan0: disconnect from AP 10:50:72:2f:39:31 for new auth to 10:50:72:2f:39:35
[ 8627.516349] rtw_8821au 1-8:1.0: failed to get tx report from firmware
[ 8627.623288] wlan0: authenticate with 10:50:72:2f:39:35 (local address=98:48:27:20:4c:84)
[ 8627.796034] wlan0: send auth to 10:50:72:2f:39:35 (try 1/3)
[ 8628.300319] rtw_8821au 1-8:1.0: failed to get tx report from firmware
[ 8628.836342] wlan0: send auth to 10:50:72:2f:39:35 (try 2/3)
[ 8629.340327] rtw_8821au 1-8:1.0: failed to get tx report from firmware
[ 8629.860645] wlan0: send auth to 10:50:72:2f:39:35 (try 3/3)
[ 8630.364338] rtw_8821au 1-8:1.0: failed to get tx report from firmware
[ 8630.884403] wlan0: authentication with 10:50:72:2f:39:35 timed out
[ 8670.516554] rtw_8821au 1-8:1.0: rtw8821a_power_off: bailing because RTW_FLAG_POWERON
[ 8681.133490] usbcore: deregistering interface driver rtw_8821au
[ 8681.154267] rtw_8821au 1-8:1.0: rtw8821a_power_off: bailing because RTW_FLAG_POWERON
[ 8681.295984] usb 1-8: reset high-speed USB device number 3 using xhci_hcd
[ 8685.031857] Loading firmware: rtw88/rtw8821a_fw.bin
[ 8685.032059] rtw_8821au 1-8:1.0: Firmware version 42.4.0, H2C version 0
[ 8685.073861] 00000000: 29 81 00 7c 01 00 01 00 4c 00 04 00 10 00 00 00  )..|....L.......
[ 8685.073863] 00000010: 25 26 26 27 27 27 2e 2e 2e 2e 2e ff ff ff ff ff  %&&'''..........
[ 8685.073864] 00000020: ff ff 1d 1b 19 19 1b 19 17 16 15 15 17 16 16 16  ................
[ 8685.073865] 00000030: fd ff ff ff ff ff 10 ff ff ff ff ff ff ff ff ff  ................
[ 8685.073866] 00000040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073867] 00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073868] 00000060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073869] 00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073870] 00000080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073870] 00000090: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073871] 000000a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073872] 000000b0: ff ff ff ff ff ff ff ff ba 27 1e 00 01 00 00 08  .........'......
[ 8685.073873] 000000c0: ff 09 00 ff 00 00 00 55 00 ff ff ff ff ff ff ff  .......U........
[ 8685.073874] 000000d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073874] 000000e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073875] 000000f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073876] 00000100: 57 23 20 01 ff ff 03 98 48 27 20 4c 84 0a 03 52  W# .....H' L...R
[ 8685.073877] 00000110: 65 61 6c 74 65 6b 20 18 03 38 30 32 2e 31 31 61  ealtek ..802.11a
[ 8685.073878] 00000120: 63 20 57 4c 41 4e 20 41 64 61 70 74 65 72 20 00  c WLAN Adapter .
[ 8685.073879] 00000130: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073880] 00000140: ff ff ff ff ff ff ff 0f ff ff ff ff ff ff ff ff  ................
[ 8685.073880] 00000150: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073881] 00000160: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073882] 00000170: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073883] 00000180: ff ff ff ff ff ff ff ff 83 ab 99 2d 03 93 98 a0  ...........-....
[ 8685.073884] 00000190: fc 8c 00 11 9b c4 00 ff ff ff ff ff ff ff ff ff  ................
[ 8685.073884] 000001a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073885] 000001b0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073886] 000001c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073887] 000001d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073888] 000001e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.073888] 000001f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[ 8685.075057] usbcore: registered new interface driver rtw_8821au
[ 8687.195869] wlan0: authenticate with 10:50:72:2f:39:35 (local address=98:48:27:20:4c:84)
[ 8687.367864] wlan0: send auth to 10:50:72:2f:39:35 (try 1/3)
[ 8687.369748] wlan0: authenticated
[ 8687.370819] wlan0: associate with 10:50:72:2f:39:35 (try 1/3)
[ 8687.372767] wlan0: RX AssocResp from 10:50:72:2f:39:35 (capab=0x11 status=0 aid=4)
[ 8687.377249] wlan0: associated
[ 8687.466278] wlan0: Limiting TX power to 30 (30 - 0) dBm as advertised by 10:50:72:2f:39:35

dubhater commented 2 months ago

I pushed something that may help. Maybe it won't. Please pull and test.

tratum commented 2 months ago

Here's something silly that works for me on Fedora whenever my connection breaks:

sudo systemctl restart NetworkManager
sudo systemctl restart NetworkManager.service
sudo reboot

dubhater commented 2 months ago

> I pushed something that may help. Maybe it won't. Please pull and test.

Well, I ran into the disconnection problem again yesterday, and today too. I'm thinking it's somehow caused by my torrent client. @stkw0 and @tratum were you downloading or uploading a lot of Linux ISOs when the connection died? :)

When the connection died yesterday, qBittorrent was showing over 5 GiB downloaded and about the same uploaded. Today it showed 22 GiB downloaded and 7 uploaded.

I tried to trigger the disconnection using iperf3, but it downloaded and uploaded a lot with no issues.

I was wrong earlier, the firmware doesn't die. Everything keeps working, except the driver doesn't receive anything from the chip anymore. I can see it transmitting probe requests on channels 48 and 149, so it's switching the channel and transmitting fine.

tratum commented 2 months ago

@dubhater, I mean I was in the process of downloading the Windows ISO to set up a dual-boot configuration. However, I encountered frequent network disconnections randomly even before initiating the download. Additionally, upon switching to the Windows operating system, I faced another issue where I was unable to establish a connection to my Wi-Fi network even in the Windows OS.

I've been thinking about it, and I don't think the torrent client is the root cause of the disconnections.

tratum commented 2 months ago

I'm happy to share that for now the issues I was experiencing with frequent disconnects and WiFi interruptions have been resolved. Here are the detailed system specifications for my current system:

         .';:cccccccccccc:;,.
      .;cccccccccccccccccccccc;.      OS: Fedora Linux 40 (Workstation Edition) x86_64 
    .:cccccccccccccccccccccccccc:.     Host: TUF Gaming FX505DT_FX505DT 1.0 
  .;ccccccccccccc;.:dddl:.;ccccccc;.     Kernel: 6.9.8-200.fc40.x86_64 
 .:ccccccccccccc;OWMKOOXMWd;ccccccc:.     Shell: bash 5.2.26 
.:ccccccccccccc;KMMc;cc;xMMc:ccccccc:.     DE: GNOME 46.3.1 
,cccccccccccccc;MMM.;cc;;WW::cccccccc,     WM: Mutter 
:cccccccccccccc;MMM.;cccccccccccccccc:      Terminal: gnome-terminal 
:ccccccc;oxOOOo;MMM0OOk.;cccccccccccc:     CPU: AMD Ryzen 7 3750H with Radeon Vega Mobile Gfx (8) @ 2.300GHz 
cccccc:0MMKxdd:;MMMkddc.;cccccccccccc;     GPU: NVIDIA GeForce GTX 1650 Mobile / Max-Q 
ccccc:XM0';cccc;MMM.;cccccccccccccccc'      GPU: AMD ATI Radeon Vega Series / Radeon Vega Mobile Series 
ccccc;MMo;ccccc;MMW.;ccccccccccccccc;     Memory: 7958MiB / 30007MiB 
ccccc;0MNc.ccc.xMMd:ccccccccccccccc;     
cccccc;dNMWXXXWM0::cccccccccccccc:,      
cccccccc;.:odl:.;cccccccccccccc:,.       
:cccccccccccccccccccccccccccc:'.         
.:cccccccccccccccccccccc:;,..            
  '::cccccccccccccc::;,.

stkw0 commented 2 months ago

> were you downloading or uploading a lot of Linux ISOs when the connection died? :)

I had qBittorrent open, but it was not using much bandwidth. If the problem is related to that, maybe it has more to do with opening and closing connections (DHT and so on) than with bandwidth.

dubhater commented 1 month ago

Next time it happens, before you do anything else, please gather some information with these simple steps:

  1. Mount debugfs: `# mount -t debugfs none /sys/kernel/debug`
  2. Prepare this command but don't run it yet: `# cat /sys/kernel/debug/ieee80211/phy0/rtw88/{mac_{0..2},mac_{4..7},bb_{8,9},bb_{a..f}} > registers.txt` On your system it may not be phy0.
  3. When you see the LED blinking, press enter to run the command. That's when the chip is definitely powered on, because it's transmitting probe requests while scanning.

If registers.txt is filled mostly with eaeaeaea eaeaeaea eaeaeaea eaeaeaea it means you missed (the chip was powered off) and need to run cat again.
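
The "run cat again if you missed" step could be automated roughly as follows. This is only a sketch: the helper names `dump_missed` and `dump_registers`, the "more than half of all words" threshold and the retry count are assumptions, and nothing is assumed about the dump format beyond whitespace-separated words.

```bash
#!/usr/bin/env bash
# Sketch: retry the register dump until it is no longer mostly "eaeaeaea".

# Exit 0 if more than half of the whitespace-separated words in file $1 are
# "eaeaeaea" (the chip was powered off), non-zero otherwise.
dump_missed() {
    awk '{ for (i = 1; i <= NF; i++) { total++; if ($i == "eaeaeaea") ea++ } }
         END { exit !(total > 0 && ea * 2 > total) }' "$1"
}

# $1: the rtw88 debugfs directory, e.g. /sys/kernel/debug/ieee80211/phy0/rtw88
# $2: output file (default registers.txt), $3: maximum attempts (default 20)
# Returns 0 on a usable dump, 1 if every attempt missed, 2 if cat itself failed.
dump_registers() {
    local dir=$1 out=${2:-registers.txt} tries=${3:-20}
    local n
    for (( n = 0; n < tries; n++ )); do
        cat "$dir"/{mac_{0..2},mac_{4..7},bb_{8,9},bb_{a..f}} > "$out" || return 2
        dump_missed "$out" || return 0
        sleep 0.5
    done
    return 1
}

# Usage (as root, once the connection has died):
#   dump_registers /sys/kernel/debug/ieee80211/phy0/rtw88 registers.txt
```

Polling like this only approximates waiting for the LED to blink, so it is worth looking at the resulting file before attaching it.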

stkw0 commented 1 month ago

It happened again, but it seems to be far less common now (still using rtw88 commit 5db150854a0d4dde0c121ad10486c3854b5139d1). Running rmmod and modprobe works around the problem. I could not gather the requested information since I didn't have debugfs enabled. I will enable it now and update to the latest master.

stkw0 commented 1 month ago

> Next time it happens, before you do anything else, please gather some information with these simple steps:
>
> 1. Mount debugfs: `# mount -t debugfs none /sys/kernel/debug`
> 2. Prepare this command but don't run it yet: `# cat /sys/kernel/debug/ieee80211/phy0/rtw88/{mac_{0..2},mac_{4..7},bb_{8,9},bb_{a..f}} > registers.txt` On your system it may not be phy0.
> 3. When you see the LED blinking, press enter to run the command. That's when the chip is definitely powered on, because it's transmitting probe requests while scanning.
>
> If registers.txt is filled mostly with eaeaeaea eaeaeaea eaeaeaea eaeaeaea it means you missed (the chip was powered off) and need to run cat again.

I tried to follow these instructions, but the phy number changed from phy5 (on my machine) to phy6 when I reloaded the module. Is there a way to fix it so that the number is predictable? Also, would it be fine to use tail -f so I don't miss anything?

dubhater commented 1 month ago

I don't think you can make it more predictable. tail -f doesn't work for this case.
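
The number cannot be pinned, but it can be looked up at run time: sysfs exposes the current phy name of a wireless interface under `/sys/class/net/<iface>/phy80211/name`. A small sketch, assuming the interface is called `wlan0` as in the logs above; `rtw88_debugfs_dir` is a hypothetical helper name, and the root directories are parameters only so the function can be tried without real hardware.

```bash
#!/usr/bin/env bash
# Sketch: find the rtw88 debugfs directory for an interface, whatever its
# current phy number is.
# $1: interface name (default wlan0, an assumption)
# $2: sysfs root (default /sys), $3: debugfs root (default /sys/kernel/debug)
rtw88_debugfs_dir() {
    local iface=${1:-wlan0} sysfs=${2:-/sys} debugfs=${3:-/sys/kernel/debug}
    local phy
    # The "name" attribute holds the phy name, e.g. "phy6".
    phy=$(cat "$sysfs/class/net/$iface/phy80211/name") || return 1
    echo "$debugfs/ieee80211/$phy/rtw88"
}

# Usage:
#   dir=$(rtw88_debugfs_dir wlan0) &&
#       cat "$dir"/{mac_{0..2},mac_{4..7},bb_{8,9},bb_{a..f}} > registers.txt
```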

dubhater commented 1 month ago

I got another idea. Please apply this patch and let me know if you see the error message "rtw_usb: probably just ran out of RX URBs" when the connection dies:

diff --git a/drivers/net/wireless/realtek/rtw88/usb.c b/drivers/net/wireless/realtek/rtw88/usb.c
index bf55360f9daf..149a200ffe19 100644
--- a/drivers/net/wireless/realtek/rtw88/usb.c
+++ b/drivers/net/wireless/realtek/rtw88/usb.c
@@ -671,6 +671,9 @@ static void rtw_usb_read_port_complete(struct urb *urb)
        }
        if (skb)
            dev_kfree_skb_any(skb);
+       rtwusb->skipped_resubmit++;
+       if (rtwusb->skipped_resubmit >= RTW_USB_RXCB_NUM)
+           pr_err_once("rtw_usb: probably just ran out of RX URBs\n");
    }
 }

diff --git a/drivers/net/wireless/realtek/rtw88/usb.h b/drivers/net/wireless/realtek/rtw88/usb.h
index 86697a5c0103..85bcb09b7997 100644
--- a/drivers/net/wireless/realtek/rtw88/usb.h
+++ b/drivers/net/wireless/realtek/rtw88/usb.h
@@ -82,6 +82,7 @@ struct rtw_usb {
    struct rx_usb_ctrl_block rx_cb[RTW_USB_RXCB_NUM];
    struct sk_buff_head rx_queue;
    struct work_struct rx_work;
+   int skipped_resubmit;
 };

 static inline struct rtw_usb_tx_data *rtw_usb_get_tx_data(struct sk_buff *skb)

I would test it myself but my RTL8812AU just died and I don't want to reload the driver until I'm sure I don't need any more information from it.

dubhater commented 1 month ago

Haha, after the RTL8812AU died I plugged in the RTL8811AU and it also died a few hours later. Only when I'm not trying to make it happen...

I got the register contents from both. I confirmed that rtw88 is not even receiving messages from the firmware (this is the cause of the "failed to get tx report from firmware" errors).

If you haven't started yet, here is a better patch which shows the error code:

diff --git a/drivers/net/wireless/realtek/rtw88/usb.c b/drivers/net/wireless/realtek/rtw88/usb.c
index bf55360f9daf..4dbcc276a76c 100644
--- a/drivers/net/wireless/realtek/rtw88/usb.c
+++ b/drivers/net/wireless/realtek/rtw88/usb.c
@@ -664,7 +664,6 @@ static void rtw_usb_read_port_complete(struct urb *urb)
        case -ECOMM:
        case -EOVERFLOW:
        case -EINPROGRESS:
-           break;
        default:
            rtw_err(rtwdev, "status %d\n", urb->status);
            break;

If this is indeed the right direction, you will see something like rtw_8821au 1-8:1.0: status -XYZ. The connection will break when this appears for the fourth time.
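
To see how many times the message has appeared so far, the kernel log can be filtered for it. A minimal sketch, assuming the message looks like the example quoted above; `count_urb_errors` is a hypothetical helper name, and the actual error number is not known in advance.

```bash
#!/usr/bin/env bash
# Sketch: count "status -N" lines printed by an rtw88 USB driver.
# Reads kernel log text on stdin and prints the number of matching lines.
count_urb_errors() {
    # grep -c prints 0 and exits non-zero when nothing matches; mask that exit code.
    grep -Ec 'rtw_[0-9a-z]+ [^ ]+: status -[0-9]+' || true
}

# Usage: dmesg | count_urb_errors
```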

stkw0 commented 1 month ago

In Linux 6.10.3 it seems that this patch is already applied? I will try with the latest Linux kernel and the latest master of this repo.

dubhater commented 1 month ago

Never mind, I tried it today. The problem is not there.

stkw0 commented 5 days ago

I don't really know if it's related, but I noticed a much higher number of these failures while doing intensive I/O on a USB 3 port with an HDD connected.

After running rmmod && modprobe I dumped the registers in case it helps: registers.txt