pop-os / pop

A project for managing all Pop!_OS sources
https://system76.com/pop

Default kernel Realtek Driver RTL8822CE drops speed or dies randomly. #1302

Open VentGrey opened 3 years ago

VentGrey commented 3 years ago

Distribution (run cat /etc/os-release):

NAME="Pop!_OS"
VERSION="20.04 LTS"
ID=pop
ID_LIKE="ubuntu debian"
PRETTY_NAME="Pop!_OS 20.04 LTS"
VERSION_ID="20.04"
HOME_URL="https://pop.system76.com"
SUPPORT_URL="https://support.system76.com"
BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
PRIVACY_POLICY_URL="https://system76.com/privacy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
LOGO=distributor-logo-pop-os

Related Application and/or Package Version (run apt policy $PACKAGE NAME):

linux-image-5.4.0-7642-generic:
  Installed: 5.4.0-7642.46~1598628707~20.04~040157c
  Candidate: 5.4.0-7642.46~1598628707~20.04~040157c
  Version table:
 *** 5.4.0-7642.46~1598628707~20.04~040157c 1001
       1001 http://ppa.launchpad.net/system76/pop/ubuntu focal/main amd64 Packages
        100 /var/lib/dpkg/status

Issue/Bug Description:

The default Pop!_OS kernel provides a driver for the Realtek RTL8822CE wireless card. However, this card has certain issues / regressions with the current LTS kernel. I don't know what happens at a low level, but the receiver seems to get "clogged" after connecting to certain WiFi networks, making WiFi connections die suddenly or making download speeds drop to zero or to unbearably slow rates.

Steps to reproduce (if you know):

Expected behavior:

WiFi working normally.

Other Notes:

This error also floods dmesg after some hours of use:

[25409.551984] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25409.692438] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25430.439963] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25430.583973] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25430.735967] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25430.883967] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25438.296472] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25438.436469] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25438.576049] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25438.716039] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25443.060011] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25443.203980] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25444.951992] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25445.092016] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25445.231995] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25445.371997] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25488.676537] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25488.816478] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25488.956167] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25489.096474] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25490.212511] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25490.352094] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25490.492102] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25490.632087] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25510.836544] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25510.976517] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25511.776132] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25511.916663] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25512.056070] rtw_pci 0000:02:00.0: timed out to flush queue 1
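
To get a quick picture of how often each queue times out, a log like the one above can be summarized with a short script. This is only a minimal sketch, assuming the dmesg output is available as text; the function name `summarize_flush_timeouts` and the regular expression are illustrative and not part of any driver tooling.

```python
import re
from collections import Counter

# Matches lines such as:
# [25409.551984] rtw_pci 0000:02:00.0: timed out to flush queue 1
FLUSH_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<driver>\S+)\s+(?P<dev>\S+):\s+"
    r"timed out to flush queue (?P<queue>\d+)"
)


def summarize_flush_timeouts(dmesg_text):
    """Return (timeouts per queue, list of kernel timestamps in seconds)."""
    per_queue = Counter()
    timestamps = []
    for line in dmesg_text.splitlines():
        match = FLUSH_RE.search(line)
        if match:
            per_queue[int(match.group("queue"))] += 1
            timestamps.append(float(match.group("ts")))
    return per_queue, timestamps


# First three lines of the log in this report.
sample = """\
[25409.551984] rtw_pci 0000:02:00.0: timed out to flush queue 1
[25409.692438] rtw_pci 0000:02:00.0: timed out to flush queue 2
[25430.439963] rtw_pci 0000:02:00.0: timed out to flush queue 1
"""

per_queue, timestamps = summarize_flush_timeouts(sample)
print(dict(per_queue))  # {1: 2, 2: 1}
```

The timestamps are returned as well so that gaps between bursts can be inspected if needed.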

Some other notes about this error:

  • I cannot confirm it, but apparently on kernel 5.8 this issue is no longer present or doesn't happen as often.

Here are some links I found on Google while trying to solve this issue myself:

jacobgkau commented 3 years ago

Thank you for the report! I don't have a device with an RTL8822CE wireless card to test with right now (current System76 products use Intel wireless cards), but Pop!_OS 20.10 (and Ubuntu 20.10) will include kernel version 5.8.0, so if that version includes a fix, it should be shipping within the next three weeks or so.

VentGrey commented 3 years ago

I tried installing linux-image-5.8.X-generic directly from the repositories, but I only ended up with a strange systemd-boot layout and with the wireless card and the ELAN touchpad not working. I don't know how the kernel + modules upgrade process works on Pop!_OS. There are a few fixes in external GitHub repositories, but they don't seem to work smoothly and cause kernel panics quite often. I will keep investigating and update this issue if I find a workaround for users who might stick to 20.04 :smiley_cat:

VentGrey commented 3 years ago

UPDATE ON 20.10:

This issue still happens on kernel 5.8, shipped with Pop!_OS 20.10. It does happen less often, but it now comes with a warning:

pcieport 0000:00:08.1: PME: Spurious native interrupt!
rtw_8822ce 0000:02:00.0: timed out to flush queue 1

VentGrey commented 3 years ago

More updates on 20.10:

A partial solution posted on Reddit which didn't work on 5.8 was to:

Disable btrtl (bluetooth driver for same wireless card) and btusb (because it uses the former).

Reason:

The old rtlwifi has a "btcoexist" folder, meaning that the new rtw88 driver might be lacking the capability to run with btrtl. It is not a priority for me to look into at this point.

It works in the first few minutes after boot, but it gets unusable, most probably still needing the equivalent of "option rtl8822be aspm=0" to turn off power saving. No idea at this point.

I already tried disabling the modules; it didn't work. This issue was also reported on Ubuntu's Launchpad and on the official Reddit, but the OPs didn't get a useful response, most replies being "Upgrade to 5.5" or "No, it didn't happen to me". Is there any tool to debug kernel modules and see whether a conflict between two or more modules is actually causing the flush problem?

cantappa commented 3 years ago

Since the issue was closed: Did you find a solution for the problem? If so I would be very interested in it since I'm facing the same problem on Ubuntu 20.04, kernel 5.8.0-23.

VentGrey commented 3 years ago

I didn't find one, and I hope there is one soon. According to Bugzilla reports, the 5.9 kernel breaks this even further. As for the issue, it does happen less often on Pop!_OS 20.10 (but it keeps happening).

cantappa commented 3 years ago

Good to know that it gets worse on 5.9. Then I guess I'll just wait... at some point it will hopefully be fixed.

VentGrey commented 3 years ago

I tried compiling the driver repos that are lying around GitHub. No success, only kernel panics.

crazyaccess commented 2 years ago

1 up on the same issue :->

popanz commented 2 years ago

I can confirm the problem

leviport commented 2 years ago

Is this still a problem? I think there were some recent fixes in 5.16 for Realtek cards.

VentGrey commented 2 years ago

While I'm no longer on Pop!_OS, this issue still exists in 5.15.x. I don't know about 5.16; when it arrives in my current distribution I'll let you know. For now, a temporary fix I found was to compile the kernel myself without any distribution patches.

burfi-evooq commented 2 years ago

I still encountered this issue on Pop! OS and Ubuntu 22.04 LTS and couldn't fix it either. Tried to make any Linux Distribution run on my HP Elitebook G8 but none worked because of this issue. Edit: finally had enough and just bought an Intel AX200.

mytja commented 2 years ago

> While I'm no longer on Pop!_OS, this issue still exists in 5.15.x. I don't know about 5.16; when it arrives in my current distribution I'll let you know. For now, a temporary fix I found was to compile the kernel myself without any distribution patches.

If you're not using an Ubuntu-based distribution, you can use the lwfinger rtw88 driver. I had this issue on Fedora, but it went away with this driver. According to the author, Ubuntu has changed kernel APIs and it should not work there, but I haven't tested that (he provides instructions for Ubuntu, so it might work after all).

cantappa commented 2 years ago

@mytja Thanks a lot for the hint to the lwfinger rtw88 driver! Meanwhile I'm on Linux Mint 21 (kernel 5.15.0-41-generic) where I also still had the WIFI problems (WIFI connection randomly dying) with the error messages "rtw_8822ce 0000:02:00.0: timed out to flush queue 2" just as I had it on Ubuntu. Installing the lwfinger rtw88 driver and blacklisting the rtw88_* modules fully solved the problem for me.
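
For anyone wanting to try the blacklisting step, the following is a minimal sketch of how the blacklist entries could be generated from the modules currently loaded. It assumes the in-kernel modules share the `rtw88_` prefix mentioned above and that the text comes from `/proc/modules`; the function name and the sample module lines are hypothetical. The resulting `blacklist` lines would go into a file under `/etc/modprobe.d/`, but check the lwfinger repository's own instructions before relying on this.

```python
def rtw88_blacklist_lines(proc_modules_text, prefix="rtw88_"):
    """Build modprobe.d 'blacklist' lines for loaded modules with the prefix.

    proc_modules_text is the content of /proc/modules, where the first
    whitespace-separated field of each line is the module name.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith(prefix):
            names.append(fields[0])
    return ["blacklist " + name for name in sorted(names)]


# Hypothetical excerpt of /proc/modules, for illustration only.
sample = """\
rtw88_8822ce 16384 0 - Live 0x0000000000000000
rtw88_pci 28672 1 rtw88_8822ce, Live 0x0000000000000000
btusb 65536 0 - Live 0x0000000000000000
"""

for entry in rtw88_blacklist_lines(sample):
    print(entry)
```

On a real system the input would be read with `open("/proc/modules").read()` instead of the sample string.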

ushby commented 2 years ago

> @mytja Thanks a lot for the hint to the lwfinger rtw88 driver! Meanwhile I'm on Linux Mint 21 (kernel 5.15.0-41-generic) where I also still had the WIFI problems (WIFI connection randomly dying) with the error messages "rtw_8822ce 0000:02:00.0: timed out to flush queue 2" just as I had it on Ubuntu. Installing the lwfinger rtw88 driver and blacklisting the rtw88_* modules fully solved the problem for me.

Is there a way to achieve this result on Ubuntu 20.04?

cantappa commented 2 years ago

> Is there a way to achieve this result on Ubuntu 20.04?

Don't know, haven't tested it on Ubuntu. I moved to Linux Mint some time ago for other reasons.

cantappa commented 2 years ago

Short update/correction: I noticed that the error messages "rtw_8822ce 0000:02:00.0: timed out to flush queue 1" are still appearing from time to time but I did not notice network problems. Could be that the network connection is still dying without me noticing it. But if so, it is now recovering by itself. Before, I always needed to disable/enable the wifi connection manually. Thus, for me the initial problem is solved.

cantappa commented 2 years ago

Again a short update: For a few days now I have noticed my wifi network connection suddenly (randomly?) crashing again and only coming back to life after disabling and re-enabling the wifi. Thus, I'm back to the old erroneous behavior :-(. I'm not aware of any changes on my side that could have caused that.

VentGrey commented 1 year ago

Reopening this, as it seems that Realtek is still a huge problem even 2 years after this bug was opened :disappointed: I think this is not a problem for the Pop! devs; Realtek has proven to be awful on Linux systems :s

drujd commented 1 year ago

I have exactly the same problems on Steam Deck (with RTL8822CE soldered on), which uses the rtw88 driver. It happens when connected to QCA QCA9880 AP (in 5GHz vht80 mode). That's what happens when you cheap out and go with Realtek... :( Just typing it here so we know for sure this is probably a hardware/firmware/Qualcomm-Atheros interop problem that seems to happen on Windows as well. At least with Steam Decks, everyone I know that has Turris Omnia (with QCA QCA9880) has this problem.

erayrafet commented 1 year ago

<4>[306924.648314] rtw_8822be 0000:01:00.0: timed out to flush queue 1

RTL8822BE on ChromeOS Flex with Kernel 5.10 LTS.

VentGrey commented 1 year ago

Kernel 6.0.0 with manually compiled drivers: I still get the dmesg log errors, but WiFi seems to be working properly. No speed drops / dead adapter.

erayrafet commented 1 year ago

> Reopening this, as it seems that Realtek is still a huge problem even 2 years after this bug was opened 😞 I think this is not a problem for the Pop! devs; Realtek has proven to be awful on Linux systems :s

TBH, Broadcom is the worst with their licensing mockery.

javierfurus commented 1 year ago

> Kernel 6.0.0 with manually compiled drivers: I still get the dmesg log errors, but WiFi seems to be working properly. No speed drops / dead adapter.

Which driver did you compile? These? https://github.com/lwfinger/rtw88

I am having the same issue on Ubuntu 22.10 (and previously on Pop), where the performance starts smooth but becomes abysmal soon after. I can't use remote play because it keeps spiking and slowing down.

That is not even the deal-breaker for me, though; the constant slowdowns and crashes during meetings are.

dungtrihp commented 1 year ago

I don't know why, but it works for me if I turn off Bluetooth and run:

sudo modprobe -r rtw88_8822ce
sudo modprobe rtw88_8822ce

EgorChernik commented 5 months ago

> sudo modprobe -r rtw88_8822ce
> sudo modprobe rtw88_8822ce

as a temp solution to avoid reboot - great! It worked for me for 20.04.6 LTS (kernel 5.15.0-105-generic)
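
The reload workaround can be wrapped in a small helper so that it is easy to repeat when the connection dies. This is a minimal sketch, assuming the module is named `rtw88_8822ce` as in the commands above; the `reload_module` function and its `dry_run` flag are hypothetical, and actually running the commands requires root.

```python
import subprocess


def reload_module(module="rtw88_8822ce", dry_run=True):
    """Unload and reload a kernel module with modprobe.

    With dry_run=True (the default) nothing is executed; the commands are
    only returned so they can be inspected first.
    """
    commands = [
        ["modprobe", "-r", module],  # remove the module
        ["modprobe", module],        # load it again
    ]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if modprobe fails.
            subprocess.run(cmd, check=True)
    return commands


for cmd in reload_module():
    print(" ".join(cmd))
```

Calling `reload_module(dry_run=False)` from a script started with sudo would perform the actual reload.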

mxchist commented 4 months ago

I encountered the "timed out to flush queue" messages on Ubuntu 23.10, 6.5.0-35-generic #35-Ubuntu SMP PREEMPT_DYNAMIC. There were no specific conditions; it happened during normal work while my laptop was connected to Wi-Fi. It doesn't look like it affects anything; there are just many messages in the log. They clog the system log for about a minute, then stop. After a few minutes they clog the log again, then stop. After going through the system log, I discovered that this behavior began in January 2024.

           *-network
                description: Wireless interface
                product: RTL8822BE 802.11a/b/g/n/ac WiFi adapter
                vendor: Realtek Semiconductor Co., Ltd.
                physical id: 0
                bus info: pci@0000:05:00.0
                logical name: wlo1
                width: 64 bits
                clock: 33MHz
                capabilities: bus_master cap_list ethernet physical wireless
                configuration: broadcast=yes driver=rtw_8822be driverversion=6.5.0-35-generic firmware=N/A ip=192.168.0.135 latency=0 link=yes multicast=yes wireless=IEEE 802.11
                resources: irq:142 ioport:3000(size=256) memory:b1200000-b120ffff

mxchist commented 4 months ago

> just many messages in log

Count of "timed out to flush queue" messages in the system log:

2024, January: 10
2024, February: 22
2024, March: 12
2024, April: 358
2024, May: 336
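
A per-month count like this can be reproduced with a few lines of Python. This is a minimal sketch, assuming the kernel log has been exported with ISO timestamps at the start of each line (for example via `journalctl -k -o short-iso`); the function name and the sample lines are hypothetical and are not the actual log from this report.

```python
import re
from collections import Counter

ISO_MONTH_RE = re.compile(r"^(\d{4}-\d{2})-\d{2}T")


def flush_timeouts_per_month(log_text):
    """Count 'timed out to flush queue' lines per 'YYYY-MM' month."""
    counts = Counter()
    for line in log_text.splitlines():
        if "timed out to flush queue" not in line:
            continue
        match = ISO_MONTH_RE.match(line)
        if match:  # skip lines without a leading ISO timestamp
            counts[match.group(1)] += 1
    return dict(sorted(counts.items()))


# Hypothetical log excerpt, for illustration only.
sample = """\
2024-01-05T09:12:44+0100 host kernel: rtw_8822be 0000:05:00.0: timed out to flush queue 1
2024-01-05T09:12:45+0100 host kernel: rtw_8822be 0000:05:00.0: timed out to flush queue 2
2024-02-11T18:30:02+0100 host kernel: rtw_8822be 0000:05:00.0: timed out to flush queue 1
2024-02-11T18:30:03+0100 host kernel: usb 1-1: new high-speed USB device
"""

print(flush_timeouts_per_month(sample))  # {'2024-01': 2, '2024-02': 1}
```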