lwfinger / rtw88

A backport of the Realtek Wifi 5 drivers from the wireless-next repo.

[rt8821cu] Frequent disconnections with "failed to get tx report from firmware" #243

Open rcky844 opened 1 week ago

rcky844 commented 1 week ago

Hi, I installed an RTL8821CU module in my SurgeTab PH-101 to replace a failed 2.4GHz Realtek module. It is wired for USB only, and the antennas are properly soldered as well. However, under Linux, I often get poor WiFi signal strength.

The more problematic issue is frequent disconnections with "failed to get tx report from firmware" in dmesg:

[ 6029.374186] rtw_8821cu 1-2.2:1.2: failed to get tx report from firmware
[ 6029.934415] rtw_8821cu 1-2.2:1.2: firmware failed to leave lps state
[ 6031.358173] rtw_8821cu 1-2.2:1.2: failed to get tx report from firmware
[ 6033.342200] rtw_8821cu 1-2.2:1.2: failed to get tx report from firmware
[ 6039.606068] wlan0: authenticate with xx:xx:xx:xx:xx:xx (local address=xx:xx:xx:xx:xx:xx)
[ 6039.653509] wlan0: send auth to xx:xx:xx:xx:xx:xx (try 1/3)
[ 6040.159135] rtw_8821cu 1-2.2:1.2: failed to get tx report from firmware
[ 6040.853226] wlan0: send auth to xx:xx:xx:xx:xx:xx (try 2/3)
[ 6041.358145] rtw_8821cu 1-2.2:1.2: failed to get tx report from firmware
[ 6041.865881] wlan0: send auth to xx:xx:xx:xx:xx:xx (try 3/3)
[ 6042.366116] rtw_8821cu 1-2.2:1.2: failed to get tx report from firmware
[ 6042.831862] wlan0: authentication with xx:xx:xx:xx:xx:xx timed out

With the in-kernel driver, this happens within 10 minutes. Switching to this out-of-kernel driver with dkms dramatically lengthens the time before a disconnection. I am able to bring WiFi back with a simple sudo rmmod rtw_8821cu; sudo modprobe rtw_8821cu.
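To put a number on how often the error appears, a log like the dmesg excerpt above could be scanned with a short script. This is only a sketch: the function name `tx_report_failures` and the regular expression are illustrative assumptions based on the log lines quoted in this issue, not part of the driver or any tool.

```python
import re

# Matches kernel log lines of the form quoted in this issue, e.g.
# [ 6029.374186] rtw_8821cu 1-2.2:1.2: failed to get tx report from firmware
# The pattern is an assumption derived from those lines only.
TX_REPORT_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+rtw_\w+\s+\S+\s+"
    r"failed to get tx report from firmware"
)


def tx_report_failures(log_text):
    """Return the timestamps (seconds since boot) of tx report failures."""
    stamps = []
    for line in log_text.splitlines():
        match = TX_REPORT_RE.match(line.strip())
        if match:
            stamps.append(float(match.group("ts")))
    return stamps


# A few lines copied from the excerpt above.
SAMPLE = """\
[ 6029.374186] rtw_8821cu 1-2.2:1.2: failed to get tx report from firmware
[ 6029.934415] rtw_8821cu 1-2.2:1.2: firmware failed to leave lps state
[ 6031.358173] rtw_8821cu 1-2.2:1.2: failed to get tx report from firmware
[ 6039.653509] wlan0: send auth to xx:xx:xx:xx:xx:xx (try 1/3)
"""

failures = tx_report_failures(SAMPLE)
print(len(failures), "tx report failures found")
```

Feeding it the full output of dmesg after each disconnection would give comparable counts for the in-kernel and out-of-kernel drivers.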

Please note that I have specifically put the WiFi USB ID into the denylist of TLP, and have also tried configuring USB autosuspend and the NetworkManager powersave option, but to no avail.

This may be the same issue as #205.

Please advise if there are any patches available, or any logs you would like.

dubhater commented 1 week ago

The disconnection looks like #205 to me. I added an error message to make sure. I tried to fix it in two ways so far, but no luck. I'll keep trying.

How is the signal strength with the code from this repository?

Do you have pictures of the modification you made inside the tablet? I'm just curious.

rcky844 commented 1 week ago

> The disconnection looks like #205 to me. I added an error message to make sure. I tried to fix it in two ways so far, but no luck. I'll keep trying.
>
> How is the signal strength with the code from this repository?
>
> Do you have pictures of the modification you made inside the tablet? I'm just curious.

Comparing the in-kernel driver with the code in this repository, the in-kernel one shows full signal strength when there is network activity. Otherwise, both are at around 1~2 bars under KDE Plasma Desktop / Mobile (NetworkManager). The device is fairly close to the router, and other devices with Intel WiFi get full bars. The only network visible and discoverable with this module is the one at my home. I live in a rather populated area, so this may point to a serious issue with the firmware (?).

I have attached a picture of my solder work. I aligned the USB pins of this module, which I bought off TaoBao (China's internal AliExpress). The antenna pins are connected to the IPEX connectors on the board as needed. I have tried different antenna placements, but they are currently where they were originally.

IMG_20241021_185906_195

Edit: I have since populated both antennas, and they should be the same arrangement as with the old WiFi card, which was no longer detected when I got it.

rcky844 commented 1 week ago

Oh btw, I am on Linux Kernel 6.11.4, linux-zen flavour provided by Arch Linux

dubhater commented 1 week ago

Nice!

> the one in-kernel will show full signal strength when there is network activity

This is a bug, actually. It's fixed in this repository. "Network activity" means data frames. This chip doesn't report the signal strength for (some?) data frames. The driver mistakenly reports a signal strength of 0 for data frames, which is very high.
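A toy illustration of why this inflates the displayed strength, assuming the bogus value is treated as 0 dBm and the readings are combined with a plain arithmetic mean. Both are simplifying assumptions: this is not the driver's actual averaging code, and the sample values are hypothetical.

```python
def mean_dbm(samples):
    """Plain arithmetic mean of dBm readings (a deliberate simplification)."""
    return sum(samples) / len(samples)


# Hypothetical readings: beacons carry a real signal report,
# data frames are wrongly reported as 0.
beacons = [-80, -81, -80, -80]
data_frames = [0, 0, 0, 0]

idle = mean_dbm(beacons)                # only beacons, no network activity
busy = mean_dbm(beacons + data_frames)  # beacons mixed with bogus data frames

print("idle:", idle, "dBm")
print("busy:", busy, "dBm")
```

With traffic flowing, the bogus zeros pull the average towards 0 dBm, so the indicator shows a much stronger signal than the radio actually receives.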

The bars are a bit vague. What's the signal displayed by iw dev wlan0 station dump? You should compare it with this driver: https://github.com/morrownr/8821cu-20210916

rcky844 commented 1 week ago

rtw_8821cu (this repo):

Station xx:xx:xx:xx:xx:xx (on wlan0)
        inactive time:  2019 ms
        rx bytes:       196189121
        rx packets:     347286
        tx bytes:       18398696
        tx packets:     68102
        tx retries:     0
        tx failed:      0
        beacon loss:    10
        beacon rx:      111549
        rx drop misc:   154
        signal:         -81 [-81] dBm
        signal avg:     -80 [-80] dBm
        beacon signal avg:      -80 dBm
        tx bitrate:     27.0 MBit/s VHT-MCS 1 40MHz VHT-NSS 1
        tx duration:    0 us
        rx bitrate:     6.0 MBit/s
        rx duration:    0 us
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            no
        TDLS peer:      no
        DTIM period:    1
        beacon interval:100
        short preamble: yes
        short slot time:yes
        connected time: 11477 seconds
        associated at [boottime]:       8212.822s
        associated at:  1729501988314 ms
        current time:   1729513464527 ms

8821cu (the other repo): Nothing returned

Signal strengths are similar.
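For a side-by-side comparison between drivers, the signal fields of a station dump like the one above could be extracted with a short script. This is a sketch under the assumption that the output keeps the field names shown in this comment; `parse_signals` is an illustrative helper, not part of iw.

```python
import re

# Matches lines such as:
#   signal:         -81 [-81] dBm
#   beacon signal avg:      -80 dBm
SIGNAL_RE = re.compile(
    r"^\s*(signal|signal avg|beacon signal avg):\s*(-?\d+)\b"
)


def parse_signals(dump_text):
    """Map each signal-related field name to its value in dBm."""
    fields = {}
    for line in dump_text.splitlines():
        match = SIGNAL_RE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


# Lines copied from the station dump above.
SAMPLE = """\
Station xx:xx:xx:xx:xx:xx (on wlan0)
        signal:         -81 [-81] dBm
        signal avg:     -80 [-80] dBm
        beacon signal avg:      -80 dBm
        tx bitrate:     27.0 MBit/s VHT-MCS 1 40MHz VHT-NSS 1
"""

print(parse_signals(SAMPLE))
```

Running it on the output captured with each driver, at the same distance from the access point, gives numbers that are easier to compare than bars in an applet.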

rcky844 commented 1 week ago

Station xx:xx:xx:xx:xx:xx (on wlan0)
        inactive time:  283 ms
        rx bytes:       57775
        rx packets:     435
        tx bytes:       30702
        tx packets:     324
        tx retries:     0
        tx failed:      17
        beacon loss:    0
        beacon rx:      62
        rx drop misc:   11
        signal:         -62 [-62] dBm
        signal avg:     -48 [-48] dBm
        beacon signal avg:      -53 dBm
        tx bitrate:     26.0 MBit/s MCS 3
        tx duration:    0 us
        rx bitrate:     52.0 MBit/s MCS 5
        rx duration:    0 us
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            yes
        TDLS peer:      no
        DTIM period:    2
        beacon interval:100
        CTS protection: yes
        short preamble: yes
        short slot time:yes
        connected time: 21 seconds
        associated at [boottime]:       705.996s
        associated at:  1729515068488 ms
        current time:   1729515089164 ms

I tried again with the driver in this tree, but with a hotspot from my phone placed right above the antenna. I still think it can be improved with some sort of calibration, because moving it a few centimeters away causes the signal strength to drop significantly.

But then again, Bluetooth headphones using SBC-XQ are unusable with this module, though that is not necessarily related to this driver.

rcky844 commented 1 week ago

I got this result on speedtest.net:

We have a 1000 Mbps up/down connection with "normal" Intel WiFi 6/6E devices (AX201 on Surface Go 2; AX211 on Lenovo Yoga 14c), and it is also fine on some Qualcomm cards. The BSSID I am connecting to on this rtw card is already set to 5GHz band.

Please tell me what you would need to debug this further.

dubhater commented 1 week ago

I need to know the signal strength you get with the other driver: https://github.com/morrownr/8821cu-20210916 That's the known good driver. If iw doesn't say anything, look at the applet in Plasma.

dubhater commented 1 week ago

Right, so click on the Networks icon in Plasma, click on your network, then on the Details tab. It says the signal strength there.

dubhater commented 1 week ago

If you have a good signal strength with the 8821cu driver, it's a bug in rtw88.

If you have bad signal strength with 8821cu, there is something wrong with your antennas. Maybe they're not connected, or they are not suitable for 5 GHz. The tablet had 2.4 GHz wifi originally, right? So maybe the antennas are not designed for both bands.

rcky844 commented 1 week ago

> If you have a good signal strength with the 8821cu driver, it's a bug in rtw88.
>
> If you have bad signal strength with 8821cu, there is something wrong with your antennas. Maybe they're not connected, or they are not suitable for 5 GHz. The tablet had 2.4 GHz wifi originally, right? So maybe the antennas are not designed for both bands.

"Antennas" are, in essence, just wires connected in a certain way to radiate electrons. There is no need for them to be "designed" specifically for both bands. Any antennas that will fit this tablet will look the same and work the same.

But of course, I can provide a schematic diagram from the seller I bought it from: 1729564132007

See RF_0 and RF_1, I have connected them to the antenna pins respectively, but it still yields poor signal strength. As a test, I disabled Bluetooth and switched to 2.4GHz, but the signal strengths are still similar. The module seems to only be able to detect 2~3 networks in my surroundings, whereas other devices could see up to 20.

I will do more testing when I have time, and would definitely try some different antennas or re-work the connections.

rcky844 commented 1 week ago

I found a better way to measure the differences between the two drivers. In conclusion, the findings revealed that the driver in this repo is inferior to the Realtek 8821cu one.

The antennas connected on the motherboard have been swapped around to check for differences, but none have been found so far. It should be noted that I have connected the ground plate on the PCB as an antenna too.

It appears that my previous assumptions might be false, as the 8821cu driver did in fact perform similarly well to the Intel WiFi, but slightly worse for devices further away.

A control is used in the series of photographs below, with a known stable WiFi card (AX201). The command used on both devices is watch nmcli dev wifi list. I am not censoring the MAC addresses here, and that is a conscious decision.

Findings

rtw_8821cu driver, tablet @ ~2m away from AP

20241022_104211

8821cu driver, tablet @ ~2m away from AP

20241022_104306

rtw_8821cu driver, tablet @ ~15m away from AP

20241022_104516

8821cu driver, tablet @ ~15m away from AP

20241022_104404

rcky844 commented 1 week ago

Speeds under the 8821cu driver are also better than under rtw_8821cu.

dubhater commented 1 week ago

> The module seems to only be able to detect 2~3 networks in my surroundings, whereas other devices could see up to 20.

Rtw88 will detect fewer networks when it's connected to one because it adjusts the RX gain to hear the connected network. 8821cu turns up the gain during scans in order to hear all the networks it can. So if you want to compare the number of networks found by each driver you have to do it when not connected to a network.
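A fair comparison along these lines could count the networks each driver reports while disconnected. The sketch below parses terse nmcli output, assuming it was captured with nmcli -t -f SSID,SIGNAL dev wifi list; the helper name `parse_nmcli_terse` and the sample SSIDs are hypothetical.

```python
def parse_nmcli_terse(output):
    """Parse 'SSID:SIGNAL' lines into a list of (ssid, signal) tuples.

    Terse mode separates fields with ':' and escapes literal colons
    inside a value as '\\:', so the split is done on the last colon.
    """
    networks = []
    for line in output.splitlines():
        if not line:
            continue
        ssid, _, signal = line.rpartition(":")
        if not signal.isdigit():
            continue  # skip anything that is not an SSID:SIGNAL record
        networks.append((ssid.replace("\\:", ":"), int(signal)))
    return networks


# Hypothetical scan results; a hidden network has an empty SSID.
SCAN = "HomeAP:72\n:40\nCafe\\:5G:31\n"

found = parse_nmcli_terse(SCAN)
print(len(found), "networks visible")
```

Comparing the counts from both drivers, each taken while not associated with any network, avoids the RX gain effect described above.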

To verify, you connected the middle pin of one antenna to pad 2 and the shield to pad 1? And the middle pin of the other antenna to pad 3 and the shield to pad 4?

> It should be noted that I have connected the ground plate on the PCB as an antenna too.

But what did you mean here? Is this connection still there?

rcky844 commented 1 week ago

> > The module seems to only be able to detect 2~3 networks in my surroundings, whereas other devices could see up to 20.
>
> Rtw88 will detect fewer networks when it's connected to one because it adjusts the RX gain to hear the connected network. 8821cu turns up the gain during scans in order to hear all the networks it can. So if you want to compare the number of networks found by each driver you have to do it when not connected to a network.
>
> To verify, you connected the middle pin of one antenna to pad 2 and the shield to pad 1? And the middle pin of the other antenna to pad 3 and the shield to pad 4?
>
> > It should be noted that I have connected the ground plate on the PCB as an antenna too.
>
> But what did you mean here? Is this connection still there?

I actually bridged both antennas together with the shield provided by the mainboard, which is a bad design to begin with, as most of the signal would go to the lower resistance connection instead of traveling through the antenna, but I don't think a -10% signal strength would be a thing.

I will try to desolder that antenna part and re-work it when I have access to a soldering station.

dubhater commented 5 days ago

There is a fix for the disconnections if you want to try it: https://github.com/lwfinger/rtw88/issues/205#issuecomment-2437776828

rcky844 commented 5 days ago

> There is a fix for the disconnections if you want to try it: #205 (comment)

Thanks, I am now trying it with some content playback. My disconnections occur pretty frequently compared to those of #205's OP and the linux-wireless people, especially when the tablet is under stress (e.g. high RAM usage, running into OOMs a lot).