lwfinger / rtw88

A backport of the Realtek Wifi 5 drivers from the wireless-next repo.

Some errors with a Realtek 8821CE RFE Type 6 #98

Open SirLouen opened 2 years ago

SirLouen commented 2 years ago

I have installed this driver following the guidelines, but some questions remain:

  1. The kernel has this same driver by default. Does this installation overwrite the default kernel driver? I'm currently using a 5.19 liquorix kernel.

  2. Here is the dmesg log:

[   14.307750] rtw_core: loading out-of-tree module taints kernel.
[   14.308178] rtw_core: module verification failed: signature and/or required key missing - tainting kernel
[   14.340763] rtw_8821ce 0000:01:00.0: enabling device (0000 -> 0003)
[   14.341659] rtw_8821ce 0000:01:00.0: Firmware version 24.11.0, H2C version 12
[   14.346728] intel_telemetry_core Init
[   14.359971] rtw_8821ce 0000:01:00.0: mac power on failed
[   14.359984] rtw_8821ce 0000:01:00.0: failed to power on mac
[   14.359989] rtw_8821ce 0000:01:00.0: failed to setup chip efuse info
[   14.359994] rtw_8821ce 0000:01:00.0: failed to setup chip information
[   14.372489] rtw_8821ce: probe of 0000:01:00.0 failed with error -114

According to the information I've gathered, Firmware version 24.11.0 corresponds to an RTL 8821CE chipset with RFE Type 6. I'm not sure why this is failing (and I'm not even sure whether this driver is being loaded instead of the kernel default one).

Any ideas on this?

lwfinger commented 2 years ago

When you run this driver using a kernel with the device built in, you have to take precautions. Blacklisting the kernel driver is the best way to ensure that you are using drivers from this repo. The command 'lsmod | grep 88' will show what drivers are loaded. Having competing drivers would explain what you see.

I was careful in selecting the driver name. The kernel versions are rtw88_xxx, and this repo uses rtw_xxx.
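
The naming convention described here can be turned into a quick check of what is actually loaded. The following is a minimal sketch, assuming the prefixes are exactly as stated in this comment (rtw88_ for the in-kernel modules, rtw_ for this repo). The function names are made up for illustration, and the check relies on names alone, so it says nothing about where a module file was installed.

```shell
#!/bin/sh
# Hypothetical helpers: classify module names by prefix only.
# Assumes rtw88_* = in-kernel driver, rtw_* = driver from this repo.

classify_rtw_module() {
    case "$1" in
        rtw88_*) echo "kernel" ;;
        rtw_*)   echo "repo" ;;
        *)       echo "other" ;;
    esac
}

# Reads `lsmod` output on stdin and prints "<module>: <origin>" for
# every rtw module found. The first line of lsmod output is a header.
list_rtw_modules() {
    awk 'NR > 1 { print $1 }' | while read -r mod; do
        origin=$(classify_rtw_module "$mod")
        if [ "$origin" != "other" ]; then
            echo "$mod: $origin"
        fi
    done
}

# Usage: lsmod | list_rtw_modules
```

If both a "kernel" and a "repo" entry show up for the same chip, the two drivers are competing for the device.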

I have no idea what tomaspinho has in his repo. I will be updating this one to kernel 6.1 specs in the next few days.

SirLouen commented 2 years ago

I was careful in selecting the driver name. The kernel versions are rtw88_xxx, and this repo uses rtw_xxx.

I blacklisted the kernel version in modprobe.d with rtw_8821ce:

blacklist rtw_8821ce

lwfinger commented 2 years ago

That will blacklist the driver from this repo, not the kernel version. The kernel has rtw88_8821ce.

SirLouen commented 2 years ago

That will blacklist the driver from this repo, not the kernel version. The kernel has rtw88_8821ce.

I'm not 100% confident of this, check this:

# locate rtw_8821ce.ko
/usr/lib/modules/5.19.0-9.1-liquorix-amd64/kernel/drivers/net/wireless/realtek/rtw88/rtw_8821ce.ko

Have you implemented a method to fully wipe this kernel module after an installation so I can check this more thoroughly?
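
A file path alone may not settle where a module came from, since an out-of-tree build can be installed into the same directory tree as the distro's modules. One hedged alternative is the `intree` field that `modinfo` reports for modules built as part of the kernel source. The sketch below assumes that field is present for in-tree modules and absent otherwise; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: interpret the output of `modinfo -F intree <module>`.
# Assumption: in-tree modules report "Y"; out-of-tree builds report nothing.

module_origin() {
    if [ "$1" = "Y" ]; then
        echo "in-tree"
    else
        echo "out-of-tree"
    fi
}

# Usage:
#   module_origin "$(modinfo -F intree rtw_8821ce)"
```

This fits with the "loading out-of-tree module taints kernel" line in the first dmesg log, which the kernel prints for modules lacking that marker.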

SirLouen commented 2 years ago

I can confirm that both drivers are using the exact same identifier

 cat /etc/modprobe.d/dkms.conf 
# modprobe information used for DKMS modules
#
# This is a stub file, should be edited when needed,
# used by default by DKMS.

blacklist rtl8821ce
blacklist wl
blacklist rtw_8821ce
# blacklist rtw88_8821ce
blacklist 8821ce

As we can see here, I have removed rtw88_8821ce from my test blacklist (to see if the kernel one loads), and nothing loads.
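
With several blacklist files and commented-out entries in play, it helps to list only the entries that are really active. This is a small sketch that reads modprobe.d-style text from stdin; the helper names are hypothetical, and it only looks at `blacklist` lines, not at `install` or `alias` directives.

```shell
#!/bin/sh
# Hypothetical helpers for modprobe.d-style text read from stdin.
# Lines that start with "#" are comments and are ignored.

# Print the module names that are actively blacklisted.
active_blacklist() {
    awk '$1 == "blacklist" && NF >= 2 { print $2 }'
}

# Print "yes" if module $1 is actively blacklisted, otherwise "no".
is_blacklisted() {
    if active_blacklist | grep -qx -- "$1"; then
        echo "yes"
    else
        echo "no"
    fi
}

# Usage: cat /etc/modprobe.d/*.conf | is_blacklisted rtw88_8821ce
```

Fed the dkms.conf shown above, this would report rtw_8821ce as blacklisted and rtw88_8821ce as not blacklisted.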

Maybe the old kernel driver was called that way (here is an example on a 5.10 kernel):

# locate rtw88_8821ce
/usr/lib/modules/5.10.0-18-amd64/kernel/drivers/net/wireless/realtek/rtw88/rtw88_8821ce.ko

lwfinger commented 2 years ago

In my 6.0.0-rc6 kernel, I get the following:

finger@localhost:~/linux-2.6> find /lib/modules/$(uname -r) -name rtw88_8821ce.ko
/lib/modules/6.0.0-rc6-00291-g31412942dc7b-dirty/kernel/drivers/net/wireless/realtek/rtw88/rtw88_8821ce.ko

I do not rename the kernel version. Note: dkms is not building anything that is in the kernel.

SirLouen commented 2 years ago

I've booted the 5.10 kernel, which, as I commented before, uses the rtw88_ name, and after blacklisting rtw88 I was able to make this driver work. I think the name rtw_8821ce is going to cause conflicts in the future, as I see that a kernel like liquorix uses the same name. Maybe a completely different name like rtk_8821ce, one that really moves away from all conventions, would have been better.

Anyway I will test this thoroughly, because I've been wasting 2 days with multiple drivers, distros and configurations and this is the first time I've been able to make this chipset work under this weird revision.

The first issue I've found is that I left the computer idle, and when I returned the wifi driver was not working.

I wonder if this applies also to this driver

Checking dmesg, a ton of "mac power on failed" messages were found there. Has it already been reported that suspending the system makes this driver fail?

lwfinger commented 2 years ago

I am not going to rename the drivers. If liquorix is renaming the kernel drivers, that is a good reason to avoid that distro! It is more likely that they are packaging my drivers.

In file /usr/lib/modprobe.d/60-blacklist_rtw88.conf, you should have a line that says "blacklist rtw88_8821ce". That will keep the kernel version from loading. Note that the kernel drivers in 5.10 are much older than the ones in this repo.

If your interface fails to come back up after sleep, that usually means that the BIOS is not coded correctly. In that case, you need to prepare a script named /lib/systemd/system-sleep/8821ce or /etc/pm/sleep.d/8821ce, depending on how new your distro is. The contents of this file should be

#!/bin/sh

# Unload the driver before sleep and reload it after wake-up.
if [ "${1}" = "pre" ]; then
    modprobe -rv rtw_8821ce
elif [ "${1}" = "post" ]; then
    modprobe -v rtw_8821ce
fi

After you have created the file (as root), then make the script executable with the command

sudo chmod a+x /8821ce

Substitute the actual directory used in your system. Once this script is in place, the driver will be unloaded upon going to sleep or hibernation, and be reloaded when the system wakes up.

SirLouen commented 2 years ago

This is completely mindblowing

[    9.977410] rtw_8821ce 0000:01:00.0: enabling device (0000 -> 0003)
[    9.983219] rtw_8821ce 0000:01:00.0: firmware: direct-loading firmware rtw88/rtw8821c_fw.bin
[    9.983231] rtw_8821ce 0000:01:00.0: Firmware version 24.11.0, H2C version 12
[    9.983281] rtw_8821ce 0000:01:00.0: mac power on failed
[    9.983287] rtw_8821ce 0000:01:00.0: failed to power on mac
[    9.983289] rtw_8821ce 0000:01:00.0: failed to setup chip efuse info
[    9.983291] rtw_8821ce 0000:01:00.0: failed to setup chip information
[    9.999481] rtw_8821ce: probe of 0000:01:00.0 failed with error -114

I found this error with the original kernel driver rtw88_8821ce, as I mentioned yesterday. Today I did a clean install of Windows 11 to check whether the WiFi driver was working correctly, and it works seamlessly. I also deactivated the Windows Fastboot option, because another user commented that it could cause conflicts with certain wireless chipsets. Personally I think this is irrelevant (because I saw the wifi chipset working with this driver before doing this), but I did it just to rule out other possible issues.

After doing all this, I formatted the hard drive again and installed the latest MX Linux.

Before installing this driver, I tried again with rtw88_8821ce, including the rtw_pci disable_aspm modprobe option just in case. I rebooted several times, but the error above appeared every time.

So finally, with everything sorted out, I removed the disable_aspm option, blacklisted the rtw88_8821ce and the MX Linux rtl8821ce drivers, and proceeded to install this rtw_8821ce driver.

So far, so good. After the installation I rebooted and the wifi networks were there, ready to connect.

As soon as I connected to any of them, the connection dropped and messages about the drop started appearing (mostly "failed to power on mac").

Now I've rebooted again and it doesn't display the wifi chipset, with the errors above. It's like a power failure for the chipset, which doesn't make any sense because under Windows it works flawlessly without drops. If I reset the module

# modprobe -r rtw_8821ce
# modprobe rtw_8821ce

The network connection returns briefly, only to drop again after a couple of seconds. Here is the dmesg log:

[  682.704511] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[  693.838604] rtw_8821ce 0000:01:00.0: timed out to flush queue 2
[  693.982584] rtw_8821ce 0000:01:00.0: timed out to flush queue 2
[  750.579286] rtw_8821ce 0000:01:00.0: timed out to flush queue 2
[  750.719227] rtw_8821ce 0000:01:00.0: timed out to flush queue 2
[  772.359461] rtw_8821ce 0000:01:00.0: timed out to flush queue 2
[  772.499226] rtw_8821ce 0000:01:00.0: timed out to flush queue 2
[  794.183420] rtw_8821ce 0000:01:00.0: timed out to flush queue 2
[  794.327750] rtw_8821ce 0000:01:00.0: timed out to flush queue 2
[  841.324392] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[0]
[  841.324617] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[1]
[  841.324838] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[2]
[  841.325059] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[3]
[  841.325279] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[5]
[  841.325500] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[6]
[  853.746784] rtw_8821ce 0000:01:00.0: failed to poll offset=0x5 mask=0x2 value=0x0
[  853.746808] rtw_8821ce 0000:01:00.0: mac power on failed
[  853.746814] rtw_8821ce 0000:01:00.0: failed to power on mac
[  853.746820] rtw_8821ce 0000:01:00.0: leave idle state failed
[  853.747005] rtw_8821ce 0000:01:00.0: failed to leave ips state
[  853.747014] rtw_8821ce 0000:01:00.0: failed to leave idle state
[  853.747353] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[0]
[  853.747591] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[1]
[  853.747811] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[2]
[  853.748215] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[3]
[  853.748435] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[5]
[  853.748654] rtw_8821ce 0000:01:00.0: timed out to flush pci tx ring[6]

Some people are reporting this issue

https://github.com/pop-os/pop/issues/1302
https://patchwork.kernel.org/project/netdevbpf/patch/20211210081659.4621-1-jhp@endlessos.org/

After a couple of module reloads, the wifi stops loading and only shows this log in dmesg:

[ 1513.751952] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 1513.752293] cfg80211: Loaded X.509 cert 'benh@debian.org: 577e021cb980e0e820821ba7b54b4961b8b4fadf'
[ 1513.752624] cfg80211: Loaded X.509 cert 'romain.perier@gmail.com: 3abbc6ec146e09d1b6016ab9d6cf71dd233f0328'
[ 1513.752927] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[ 1513.752989] platform regulatory.0: firmware: direct-loading firmware regulatory.db
[ 1513.753013] platform regulatory.0: firmware: direct-loading firmware regulatory.db.p7s
[ 1514.085552] rtw_8821ce 0000:01:00.0: firmware: direct-loading firmware rtw88/rtw8821c_fw.bin
[ 1514.085569] rtw_8821ce 0000:01:00.0: Firmware version 24.11.0, H2C version 12
[ 1514.095395] rtw_8821ce 0000:01:00.0: mac power on failed
[ 1514.095407] rtw_8821ce 0000:01:00.0: failed to power on mac
[ 1514.095411] rtw_8821ce 0000:01:00.0: failed to setup chip efuse info
[ 1514.095416] rtw_8821ce 0000:01:00.0: failed to setup chip information
[ 1514.114615] rtw_8821ce: probe of 0000:01:00.0 failed with error -114

I wonder if I can recompile the driver with more dmesg verbosity, so we can try to debug this issue further.
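
Even without a more verbose build, the existing logs can be condensed so that the repeated failures stand out. The sketch below is an illustration only: it assumes dmesg lines shaped like the ones pasted in this thread (a bracketed timestamp, then `rtw_8821ce` with an optional PCI address), and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count distinct rtw_8821ce messages in dmesg text
# read from stdin. Timestamps and the PCI address prefix are stripped so
# that identical messages are grouped together.

summarize_rtw_log() {
    sed -n 's/^\[ *[0-9.]*\] rtw_8821ce\( [0-9a-f:.]*\)\{0,1\}: //p' |
        awk '{ count[$0]++ } END { for (m in count) print count[m] "\t" m }' |
        sort -k1,1nr -k2
}

# Usage: dmesg | summarize_rtw_log
```

Each output line is a count, a tab, and the message text, with the most frequent message first.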

SirLouen commented 2 years ago

New findings: I powered the laptop with the power cable, wireless became stable. But then I decided to reboot, and the driver has not come up since then.

This is the weirdest thing I've seen in my life

I would say with 100% confidence that the laptop or the wireless chipset is broken, just because of this behaviour. The only problem is that if I run W10/W11, the wireless connection is perfect. I've tested this 3 times, just to make sure I'm not wrong about it.

I'm completely lost at this point on what to do next.

lwfinger commented 2 years ago

If it behaves differently when connected to power than it does on battery, your chip is extremely voltage dependent. That I cannot fix as your laptop should provide the same voltage in both instances. I agree that the laptop is broken. The Windows driver is completely separate, although my neighbor has a cheap laptop in which this chip failed to keep a connection. I gave him an RTL8192EUS USB dongle, disabled the RTL8821CE, and he has been happy since.

After you reboot and the driver does not come up, what does 'lsmod | grep 88' show? If this result shows that the driver is loaded, then the laptop's power-up sequence is locking up the wifi chip.

SirLouen commented 2 years ago

After you reboot and the driver does not come up, what does 'lsmod | grep 88' show? If this result shows that the driver is loaded, then the laptop's power-up sequence is locking up the wifi chip.

Yep, driver is always loaded, probably being locked out but this is weird.

The Windows driver is completely separate

What do you mean by this? Obviously, if the chipset works in Windows, we might assume that there could be a solution. Probably you don't have it if you are simply replicating the work of the Realtek employees. Currently I'm building 6.0-rc6 to see what's going on here.

In your case, don't you have a way to debug further? Why the heck is the chipset being locked exclusively on Linux (and not on Windows)?

lwfinger commented 2 years ago

It is possible that Linux loads drivers earlier in the power-on sequence than Windows does. As Windows is a black box, who knows? You could blacklist rtw_8821ce, and then do the loading in a start-up script, which will be delayed.
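
A delayed load of that kind might look like the sketch below. It assumes rtw_8821ce is blacklisted so that it is not auto-loaded at boot (an explicit modprobe by name still loads a blacklisted module). The function name, the MODPROBE override, and the 10-second delay in the usage line are illustrative choices, not values taken from this thread.

```shell
#!/bin/sh
# Hypothetical start-up helper: wait, then load the module by name.
# MODPROBE can be overridden (for example MODPROBE=echo) for a dry run.

delayed_load() {
    delay_seconds=$1
    module=$2
    sleep "$delay_seconds"
    ${MODPROBE:-modprobe} "$module"
}

# Usage, as root, from a start-up script:
#   delayed_load 10 rtw_8821ce
```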

You obviously do not understand my role. I am a volunteer, without any knowledge of the chip internals, who is taking the Realtek Linux driver and modifying it to build on older kernels. I did not write ANY of these drivers.

Somewhere, there is a Linux version of the driver used by Windows. My latest experience with one of these was for the RTW8852BE. It took about a month of work to make it reliable on Linux. Two major bugs were found by Linux that apparently caused no problems with Windows. I have no interest in finding and implementing such a driver for the RTL8821ce, even at my normal consulting rate.

If it is not possible to return the laptop, I would determine the form factor of the 8821ce, and replace it with a different chip.

SirLouen commented 2 years ago

It is possible that Linux loads drivers earlier in the power-on sequence than Windows does

Nope, this doesn't make sense.

Because I've found the exact same error just after I was able to load the driver and see the wireless networks.

I also hit the issue when I disconnected the power cord. This has something to do with the power management of the driver, but I'm not sure what's going on.

I'm going to contact the Realtek employees directly through the kernel Bugzilla to see if they can find a solution.

What is completely weird to me is that your driver works sometimes, but the "official" rtw88_8821ce never works. This is why I thought you had some extra technical knowledge that let you improve the driver itself.

lwfinger commented 2 years ago

My repo is running the driver as it will be in kernel 6.1. You are running an older version. That is why mine works better.

SirLouen commented 2 years ago

The latest is 6.0-rc7, released just today. Is there a link to 6.1? PS: I'm using kernel 6.0-rc6 right now and it is still failing.

lwfinger commented 2 years ago

The wireless drivers for 6.1 are found in git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git, but the rtw88 drivers there are essentially the same as in 6.0.

SirLouen commented 2 years ago

Yes, now I've found that I'm using the exact same driver as yours with my current kernel.

I've found the (weird) method to make this work every time (but with a ton of flaws):

  1. First I have to shut down the computer without the power cord
  2. Then I have to plug in the power cord and reload the module (modprobe -r rtw_8821ce && modprobe rtw_8821ce)

As soon as I unplug the power cord, the connection drops.

This happens 100% of the time.

lwfinger commented 2 years ago

I just finished some long-term tests with an 8852ae, and installed my 8821ce card.

I find it a bit flaky on power up. I have to uninstall and reinstall the driver before it can make a connection. It authenticates and associates, but then it cannot handle a DHCP request for an IP. I am running the driver from v6.0-rc7.

There are a couple of module parameters for rtw88_pci that handle some problems with BIOS coding: disable_msi:Set=y disable_aspm:Set=y

You might try these individually or together to see if it handles the power disconnect. My Toshiba laptop, which comes from 2014, emits a couple of strange messages to the dmesg log when I unplug, but the wifi stays up. I am logging some "timed out to flush queue 2" messages that I need to examine, but they do not seem to be critical.

I will also investigate the power-up problem.

lwfinger commented 2 years ago

I finished my long-term test of an rtw8852ae, thus I can install an rtl8821ce card.

On my system, I found that it did not power-up from a cold start. It could associate and authenticate with WPA2, but it failed to get an IP address. The attached patch fixed that problem. I trust that you have the ability to apply a patch. If that patch does not work for you on startup, you should try msleep(2000) rather than msleep(1000). I would hesitate to delay the routine more than 2 sec. If the 1 sec delay works, you might try changing the 1000 to 500 to see if 1/2 sec is sufficient. I will also see how small a sleep is required here.

Disconnecting the power connection on my laptop does not interrupt the connection.

There are two parameters that can be applied to rtw88_pci or rtw_pci, namely

disable_msi:Set=y disable_aspm:Set=y

These two have fixed problems with certain BIOS errors on Lenovo laptops and some HP models.

rtw8821ce_delay_start.txt

SirLouen commented 2 years ago

Disconnecting the power connection on my laptop does not interrupt the connection.

Mine does, so weird.

There are two parameters that can be applied to rtw88_pci or rtw_pci, namely

I have this on modprobe.d/local.conf

options rtw88_pci disable_aspm=1

I'm not 100% sure this is doing anything at all
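
One way to see whether such an option reached the loaded module is to read the parameter back from sysfs. This is a sketch under the assumption that the driver exports the parameter with read permission under /sys/module/<module>/parameters/. The function name and the SYSFS_ROOT override (there only so the function can be pointed at a test directory) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the live value of a module parameter.
# Prints "unavailable" if the module is not loaded or the parameter
# is not exported.

module_param_value() {
    param_file="${SYSFS_ROOT:-/sys}/module/$1/parameters/$2"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unavailable"
    fi
}

# Usage: module_param_value rtw88_pci disable_aspm
```

Boolean parameters normally read back as Y or N. An "unavailable" result for rtw88_pci would suggest that the option line targets a module that is not the one in use.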

These two have fixed problems with certain BIOS errors on Lenovo laptops and some HP models.

I'm going to test this patch. Does this work with your repo driver? Or with the kernel driver?

lwfinger commented 2 years ago

The file paths in the patch are for the repo. Other than that, the patch would apply to the kernel as well.

The options do not necessarily do anything with a given BIOS. You need to try the other one. For completeness, add a similar line for rtw_pci. That will cover you no matter which driver you load.
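
Covering both module names could be sketched as follows. The parameter names come from this thread; the generator function and the output file name in the usage line are illustrative assumptions, and whether either parameter helps depends on the BIOS.

```shell
#!/bin/sh
# Hypothetical generator for modprobe.d option lines. It emits one line
# per PCI module name so the options apply whichever driver is loaded.

rtw_pci_option_lines() {
    # "$@" holds parameter assignments, e.g. disable_aspm=y disable_msi=y
    for module in rtw88_pci rtw_pci; do
        echo "options $module $*"
    done
}

# Usage (as root; the file name is an arbitrary example):
#   rtw_pci_option_lines disable_aspm=y disable_msi=y > /etc/modprobe.d/rtw-pci.conf
```

To test the parameters individually, pass only one assignment at a time and reload the modules in between.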

SirLouen commented 2 years ago

I've tried your patch but it doesn't do anything

If the computer boots with the power cord connected, it loads the drivers, but they do nothing except show the errors I posted in the first message: https://github.com/lwfinger/rtw88/issues/98#issue-1384760032

The only way to activate the card is to boot the computer without the power cord and connect it once the OS has fully loaded, as I described in my procedure here: https://github.com/lwfinger/rtw88/issues/98#issuecomment-1258456820

I've tried both the ASPM and MSI options in modprobe, but they make no difference (same as the msleep patch).

The weirdest part is that once I boot the OS with the power cord connected, it won't ever turn the card on, even if I reset the card 10 times with modprobe. It's as if booting the OS with the power cord connected completely blocks the card.

Also, if I boot the OS with the power cord disconnected, then plug it in and restart the driver with modprobe, it works. But if I unplug the power cable, the network card drops. And it drops for good: I have no way to bring it back (unless I power off the computer and repeat the procedure).

I clearly think that the issue boils down to the message

failed to power on mac

There is something about the power that drives this driver completely mad. In Windows this doesn't happen: I can connect and disconnect the power cord 10 times and the connection is 100% stable. One could say that it is a BIOS issue, but it clearly is not, because I can start the wireless card well after the BIOS has loaded (I boot without the power cable connected, then I plug it in, and it starts working). So the BIOS has nothing to do with this; it is a power management issue in the driver.

My big question is what is "power on mac"? What is MAC?

lwfinger commented 2 years ago

MAC == Media Access Controller - the hardware layer that supports wireless. See https://fossbytes.com/data-link-layer-explained-mac-layer-llc-layer/ to see what that layer looks like.

The Windows driver is completely separate from this Linux driver. Most of the Windows drivers also can be built for Linux, but I was unable to find that source anywhere.

The BIOS still controls lots of things after it passes control on to the main operating system. It is responsible to provide a standard interface for the specific setup of the motherboard. If that were not the case, every MB would require a special driver for every OS run on it.

Your power problem is something that Realtek may have coded around in the Windows driver. If you could provide details on what changes at the base connector of the NIC, then we might be able to provide a quirk for your computer. It may be a change in the level of some voltage, but I think it is more likely to be a voltage spike or dip such that the internal state of the device is scrambled, and the only way to recover is to power it off.

I will discuss this problem with my contact at Realtek, but I am not sure we will be able to help.

lwfinger commented 2 years ago

He suggested that you add "rtw88_core.ko disable_lps_deep=y" to the options.

SirLouen commented 2 years ago

He suggested that you add "rtw88_core.ko disable_lps_deep=y" to the options.

Tested yesterday, but forgot to report back: it doesn't solve the issue.

Your power problem is something that Realtek may have coded around in the Windows driver. If you could provide details on what changes at the base connector of the NIC, then we might be able to provide a quirk for your computer.

What do you mean by this? What is the base connector of the NIC? This is a Chinese white-label laptop.

lwfinger commented 2 years ago

I did not expect you to be able to make the necessary measurements. Even without them, it is clear that the power supply in the laptop is sub-standard.

I just committed an alternative driver for RTW8821CE in directory alt_rtl8821ce. I have only compiled it - not tested at all.

bragma commented 2 years ago

Hi, it seems that the alternative driver supports "concurrent mode" via the flag "-DCONFIG_CONCURRENT_MODE". This should allow running AP + client mode at the same time. Where is the source coming from?

lwfinger commented 2 years ago

It comes from a different Realtek group.

The kernel version allows multiple virtual interfaces because that is a feature of mac80211, and can run an AP and a STA at the same time. That situation is very easy to set up with NetworkManager. The alternate driver only uses cfg80211, thus it has to do the multiple interfaces in the driver. Sometimes concurrent mode works, and sometimes it does not. I did not test that driver at all. It is provided for comparison.

bragma commented 1 year ago

Hi @lwfinger, thanks for the answer. Do you mean it is already possible to setup AP + client with the regular rtw88 driver? I’m using an 8821ce chip.

lwfinger commented 1 year ago

That is what I said. Read the NetworkManager documentation.

bragma commented 1 year ago

@lwfinger thanks for confirming this. I am setting up AP + client, but I can't bring both interfaces up at the same time (error RTNETLINK answers: Device or resource busy on up). Maybe someone with more experience with this driver can help me out. Maybe I'm missing something trivial.

Here is a log, trying to bring up ap0 and wlp3s0 at the same time:

root@test:~# iw phy phy0 interface add ap0 type __ap
root@test:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 00:ce:39:d1:5c:e7 brd ff:ff:ff:ff:ff:ff
4: wlp3s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 5c:c5:63:b1:eb:4e brd ff:ff:ff:ff:ff:ff
5: ap0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 5c:c5:63:b1:eb:4e brd ff:ff:ff:ff:ff:ff
root@test:~# ip link set dev wlp3s0 address 5c:c5:63:b1:eb:40
root@test:~# ip link set dev ap0 up
root@test:~# ip link set dev wlp3s0 up
RTNETLINK answers: Device or resource busy
root@test:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 00:ce:39:d1:5c:e7 brd ff:ff:ff:ff:ff:ff
4: wlp3s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 5c:c5:63:b1:eb:40 brd ff:ff:ff:ff:ff:ff
5: ap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 5c:c5:63:b1:eb:4e brd ff:ff:ff:ff:ff:ff
root@test:~# ip link set dev ap0 down
root@test:~# ip link set dev wlp3s0 up
root@test:~# ip link set dev ap0 up
RTNETLINK answers: Device or resource busy
root@test:~# iw phy
Wiphy phy0
...
        software interface modes (can always be added):
                 * AP/VLAN
                 * monitor
        interface combinations are not supported
...

Thanks!
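
The "interface combinations are not supported" line in the `iw phy` output above is worth checking first, because a phy that advertises no combinations would be consistent with the "Device or resource busy" error when a second interface is brought up. The helper below is a hypothetical sketch that only looks for the two strings iw prints for this section.

```shell
#!/bin/sh
# Hypothetical helper: report whether `iw phy` output (read from stdin)
# advertises any valid interface combinations.

iface_combination_support() {
    phy_info=$(cat)
    case "$phy_info" in
        *"interface combinations are not supported"*) echo "unsupported" ;;
        *"valid interface combinations"*)             echo "supported" ;;
        *)                                            echo "unknown" ;;
    esac
}

# Usage: iw phy phy0 info | iface_combination_support
```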

lwfinger commented 1 year ago

I have only done it with the NetworkManager GUI, and not with the 8821CE. Perhaps someone at the linux-wireless ML can help, but cut your diagnostics to a minimum. When I see lots of stuff, I turn off. If the person that answers needs more, they will ask.

bragma commented 1 year ago

Ok thanks. Just added a few command outputs to explain the problem, after getting a few "how can I understand if I don't see what happens?" answers elsewhere. Have a good day!

SirLouen commented 1 year ago

I've been absent for a while because I'm a little bored of this laptop. After this odyssey I've found that the sound card is not working either, not even in kernel 6.0-rc7, so I feel this is a super-unsupported laptop for Linux.

That said, I've tried the alt driver. Checking the directory, I see that there is an 8821ce.ko file, so I thought that this could simply be the 8821ce from tomaspinho or similar: https://github.com/tomaspinho/rtl8821ce

And it is. Same output, and everything works the exact same way as with any other driver: if I start the laptop without the power cord and then, when it's fully loaded, I plug in the cord and restart the kernel module, it all of a sudden starts working.

Same debug output as in this repo: https://github.com/tomaspinho/rtl8821ce/issues/298

This is plainly not working. Probably, as you say, because some power configs are not set up in a standard way, making HALMAC fail to initialize for some reason.

This is when I load with the cord in from the beginning:

[    9.426205] RTW: module init start
[    9.426208] RTW: rtl8821ce v5.5.2_34066.20200325_COEX20180712-3232
[    9.426210] RTW: build time: Sep 28 2022 18:25:12
[    9.426211] RTW: rtl8821ce BT-Coex version = COEX20180712-3232
[    9.426231] RTW: rtw_inetaddr_notifier_register
[    9.426277] rtl8821ce 0000:01:00.0: enabling device (0000 -> 0003)
[    9.427188] RTW: Memory mapped space start: 0x80200000 len:00010000 flags:00140204, after map:0xffffa132406a0000
[    9.427194] RTW: CHIP TYPE: RTL8821CE
[    9.427197] RTW: Bus master is not enabled by BIOS! usPciCommand=0
[    9.427202] RTW: Failed to enable bus master! usPciCommand=0
[    9.427203] RTW: Pci Bridge Vendor is found: VID=0x8086, VendorIdx=0
[    9.427247] RTW: [HALMAC]11692M
               HALMAC_MAJOR_VER = 1
               HALMAC_PROTOTYPE_VER = 5
               HALMAC_MINOR_VER = 0
               HALMAC_PATCH_VER = 2
[    9.427253] RTW: ERROR [HALMAC][ERR]Chip id is undefined
[    9.427255] RTW: ERROR rtw_halmac_init_adapter: halmac_init_adapter fail!(status=54)
[    9.427257] RTW: rtl8821ce_set_hal_ops: [ERROR]HALMAC initialize FAIL!
[    9.427259] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.read_chip_version ###
[    9.427261] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.init_default_value ###
[    9.427262] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.intf_chip_configure ###
[    9.427263] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.read_adapter_info ###
[    9.427264] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.hal_power_on ###
[    9.427265] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.hal_power_off ###
[    9.427267] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.hal_init ###
[    9.427268] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.hal_deinit ###
[    9.427269] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.init_xmit_priv ###
[    9.427270] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.free_xmit_priv ###
[    9.427271] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.hal_xmit ###
[    9.427272] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.mgnt_xmit ###
[    9.427273] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.hal_xmitframe_enqueue ###
[    9.427275] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.init_recv_priv ###
[    9.427276] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.free_recv_priv ###
[    9.427277] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.inirp_init ###
[    9.427278] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.inirp_deinit ###
[    9.427279] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.irp_reset ###
[    9.427281] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.interrupt_handler ###
[    9.427282] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.enable_interrupt ###
[    9.427283] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.disable_interrupt ###
[    9.427284] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.dm_init ###
[    9.427285] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.dm_deinit ###
[    9.427286] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.hal_dm_watchdog ###
[    9.427288] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.set_chnl_bw_handler ###
[    9.427289] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.set_hw_reg_handler ###
[    9.427290] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.GetHwRegHandler ###
[    9.427291] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.get_hal_def_var_handler ###
[    9.427292] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.SetHalDefVarHandler ###
[    9.427294] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.GetHalODMVarHandler ###
[    9.427295] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.SetHalODMVarHandler ###
[    9.427296] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.SetBeaconRelatedRegistersHandler ###
[    9.427297] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.fill_h2c_cmd ###
[    9.427298] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.hal_mac_c2h_handler ###
[    9.427300] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.fill_fake_txdesc ###
[    9.427301] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.fw_dl ###
[    9.427302] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.set_tx_power_index_handler ###
[    9.427303] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.get_tx_power_index_handler ###
[    9.427304] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.sreset_init_value ###
[    9.427306] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.sreset_reset_value ###
[    9.427307] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.silentreset ###
[    9.427308] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.sreset_xmit_status_check ###
[    9.427309] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.sreset_linked_status_check ###
[    9.427310] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.sreset_get_wifi_status ###
[    9.427312] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.sreset_inprogress ###
[    9.427313] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.init_mac_register ###
[    9.427314] RTW: ### rtw_hal_ops_check - Error : Please hook hal_func.init_phy ###
[    9.427321] RTW: rtw_pci_primary_adapter_init Failed!
[    9.427421] RTW: module init ret=0

And this is the log when I load the module without the power cord attached, then plug the cord in and restart the module:

[  100.181445] RTW: module init start
[  100.181452] RTW: rtl8821ce v5.5.2_34066.20200325_COEX20180712-3232
[  100.181455] RTW: build time: Sep 28 2022 18:25:12
[  100.181456] RTW: rtl8821ce BT-Coex version = COEX20180712-3232
[  100.181476] RTW: rtw_inetaddr_notifier_register
[  100.181541] rtl8821ce 0000:01:00.0: enabling device (0000 -> 0003)
[  100.181731] RTW: Memory mapped space start: 0x80200000 len:00010000 flags:00140204, after map:0xffffa13240660000
[  100.181735] RTW: CHIP TYPE: RTL8821CE
[  100.181738] RTW: Bus master is enabled. usPciCommand=7
[  100.181750] RTW: PCIe Header Offset =70
[  100.181754] RTW: PCIe Capability =2
[  100.181757] RTW: Link Control Register =40
[  100.181760] RTW: Clock Request =0
[  100.181763] RTW: Driver Sets default Cache Line Size...
[  100.181772] RTW: Pci Bridge Vendor is found: VID=0x8086, VendorIdx=0
[  100.181814] RTW: [HALMAC]11692M
               HALMAC_MAJOR_VER = 1
               HALMAC_PROTOTYPE_VER = 5
               HALMAC_MINOR_VER = 0
               HALMAC_PATCH_VER = 2
[  100.181857] RTW: rtw_hal_config_rftype RF_Type is 0 TotalTxPath is 1
[  100.181861] RTW: Chip Version Info: CHIP_8821C_Normal_Chip_UMC_E_CUT_1T1R_RomVer(4)
[  100.181871] RTW: ERROR [HALMAC][ERR]Dump efuse in suspend
[  100.184677] RTW: is_valid_id_status: HALMAC_FEATURE_DUMP_LOGICAL_EFUSE
[  100.184689] RTW: HW EFUSE
[  100.184693] RTW: 0x000: 29 81 00 BC  09 10 0B 00  AC 04 A4 35  10 03 30 0B  
[  100.184708] RTW: 0x010: 25 26 26 26  26 20 22 22  23 24 24 13  FF FF FF FF  
[  100.184722] RTW: 0x020: FF FF 2C 2B  2D 2E 26 26  26 24 22 22  24 22 1F 21  
[  100.184735] RTW: 0x030: 22 FF FF FF  FF FF EC FF  FF FF 27 28  29 29 29 29  
[  100.184748] RTW: 0x040: 28 29 2A 2A  2A 02 FF FF  FF FF FF FF  FF FF FF FF  
[  100.184762] RTW: 0x050: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.184775] RTW: 0x060: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.184788] RTW: 0x070: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.184802] RTW: 0x080: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.184815] RTW: 0x090: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.184828] RTW: 0x0A0: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.184842] RTW: 0x0B0: FF FF FF FF  FF FF FF FF  7F 55 20 00  FF FF FF FF  
[  100.184855] RTW: 0x0C0: FF 21 00 00  00 00 00 00  00 FF 06 FF  FF FF FF FF  
[  100.184868] RTW: 0x0D0: E0 51 D8 3A  DF 21 EC 10  21 C8 EC 10  21 C8 C3 FF  
[  100.184881] RTW: 0x0E0: 80 8D 80 08  00 00 11 3C  27 00 10 20  01 21 C8 FE  
[  100.184896] RTW: 0x0F0: FF 4C E0 00  04 0C 00 80  02 00 00 FF  1F 1E F0 00  
[  100.184909] RTW: 0x100: DA 0B 21 C8  E7 46 03 00  E0 4C C8 21  01 0A 03 52  
[  100.184922] RTW: 0x110: 65 61 6C 74  65 6B 20 12  03 42 6C 75  65 74 6F 6F  
[  100.184936] RTW: 0x120: 74 68 20 52  61 64 69 6F  20 00 FF FF  FF FF FF FF  
[  100.184986] RTW: 0x130: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185001] RTW: 0x140: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185016] RTW: 0x150: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185031] RTW: 0x160: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185046] RTW: 0x170: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185059] RTW: 0x180: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185073] RTW: 0x190: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185086] RTW: 0x1A0: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185100] RTW: 0x1B0: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185114] RTW: 0x1C0: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185127] RTW: 0x1D0: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185141] RTW: 0x1E0: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185155] RTW: 0x1F0: FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF  
[  100.185170] RTW: EEPROM ID = 0x8129
[  100.185172] RTW: EEPROM Version = 0
[  100.185181] RTW: EEPROM Regulatory=0x01
[  100.185183] RTW: EEPROM Board Type=0x01
[  100.185186] RTW: EEPROM Enable BT-coex, ant_num=2
[  100.185188] RTW: hal_com_config_channel_plan chplan:0x7F
[  100.185189] RTW: EEPROM crystal_cap=0x55
[  100.185190] RTW: EEPROM ThermalMeter=0x20
[  100.185191] RTW: EEPROM Customer ID=0x00
[  100.185192] RTW: EEPROM SupportRemoteWakeup=0
[  100.185193] RTW: EEPROM rfe_type=0x6
[  100.185195] RTW: WIFI Module is iPA/iLNA
[  100.185196] RTW: EEPROM tx_bbswing_24G =0x00
[  100.185197] RTW: EEPROM tx_bbswing_5G =0x00
[  100.189629] RTW: SetHwReg: bMacPwrCtrlOn=1
[  100.206265] RTW: _rtw_hal_set_fw_rsvd_page((null)) Get [ NOR ] RsvdPageNUm  ==>
[  100.206269] RTW: LocPsPoll: 4
[  100.206272] RTW: LocNullData: 5
[  100.206275] RTW: LocQosNull: 6
[  100.206277] RTW: LocBTQosNull: 7
[  100.206278] RTW: _rtw_hal_set_fw_rsvd_page((null)) Get [ NOR ] RsvdPageNUm <==
[  100.206553] RTW: rtl8821c_fw_dl Download Firmware from array success
[  100.206556] RTW: NIC FW Version:20 SubVersion:1
[  100.206673] RTW: SetHwReg: bMacPwrCtrlOn=0
[  100.206674] RTW: hal_read_mac_hidden_rpt OK! (1, 0ms), fwdl:1, id:0x19
[  100.206681] RTW: ERROR [HALMAC][ERR]Dump efuse in suspend
[  100.206685] RTW: is_valid_id_status: HALMAC_FEATURE_DUMP_PHYSICAL_EFUSE
[  100.206689] RTW: ERROR [HALMAC][ERR]Dump efuse in suspend
[  100.206693] RTW: is_valid_id_status: HALMAC_FEATURE_DUMP_PHYSICAL_EFUSE
[  100.206696] RTW: ERROR [HALMAC][ERR]Dump efuse in suspend
[  100.206699] RTW: is_valid_id_status: HALMAC_FEATURE_DUMP_PHYSICAL_EFUSE
[  100.206702] RTW: ERROR [HALMAC][ERR]Dump efuse in suspend
[  100.206706] RTW: is_valid_id_status: HALMAC_FEATURE_DUMP_PHYSICAL_EFUSE
[  100.206709] RTW: ERROR [HALMAC][ERR]Dump efuse in suspend
[  100.206713] RTW: is_valid_id_status: HALMAC_FEATURE_DUMP_PHYSICAL_EFUSE
[  100.206715] RTW: ERROR [HALMAC][ERR]Dump efuse in suspend
[  100.206719] RTW: is_valid_id_status: HALMAC_FEATURE_DUMP_PHYSICAL_EFUSE
[  100.206722] RTW: ERROR [HALMAC][ERR]Dump efuse in suspend
[  100.206725] RTW: is_valid_id_status: HALMAC_FEATURE_DUMP_PHYSICAL_EFUSE
[  100.206727] RTW: rtw_hal_read_chip_info in 24 ms
[  100.206737] RTW: init_channel_set((null)) ChannelPlan ID:0x7f, ch num:37
[  100.206815] RTW: init_mlme_default_rate_set: support CCK
[  100.206817] RTW: init_mlme_default_rate_set: support OFDM
[  100.207042] RTW: rtw_alloc_macid((null)) if1, mac_addr:ff:ff:ff:ff:ff:ff macid:1
[  100.207053] RTW: IQK FW offload:enable
[  100.207057] RTW: init_phydm_cominfo: fab_ver=1 cut_ver=4
[  100.207060] RTW: rtw_regsty_chk_target_tx_power_valid return _FALSE for band:0, path:0, rs:0, t:-1
[  100.207110] RTW: phy_ConfigBBWithPgParaFile(): No File PHY_REG_PG.txt, Load from HWImg Array!
[  100.207118] RTW: default power by rate loaded
[  100.207120] RTW: phy_txpwr_by_rate_chk_for_path_dup duplicate 2.4G [A] to [B]
[  100.207167] RTW: PHY_ConfigRFWithPowerLimitTableParaFile(): No File TXPWR_LMT.txt, Load from HWImg Array!
[  100.207425] RTW: default power limit loaded
[  100.207466] RTW: default mapping domain:0x7f to regd_name:FCC
[  100.207859] RTW: rtl8821ce_init_txbd_ring entries num:128
[  100.207862] RTW: rtl8821ce_init_txbd_ring queue:0, ring_addr:0000000075a4ecbc
[  100.207867] RTW: rtl8821ce_init_txbd_ring entries num:128
[  100.207869] RTW: rtl8821ce_init_txbd_ring queue:1, ring_addr:00000000277a3aa5
[  100.207871] RTW: rtl8821ce_init_txbd_ring entries num:128
[  100.207874] RTW: rtl8821ce_init_txbd_ring queue:2, ring_addr:0000000014490f41
[  100.207875] RTW: rtl8821ce_init_txbd_ring entries num:128
[  100.207878] RTW: rtl8821ce_init_txbd_ring queue:3, ring_addr:000000005d843107
[  100.207879] RTW: rtl8821ce_init_txbd_ring entries num:2
[  100.207882] RTW: rtl8821ce_init_txbd_ring queue:4, ring_addr:0000000084f9c9e8
[  100.207883] RTW: rtl8821ce_init_txbd_ring entries num:128
[  100.207886] RTW: rtl8821ce_init_txbd_ring queue:5, ring_addr:00000000ce9b5bba
[  100.207887] RTW: rtl8821ce_init_txbd_ring entries num:128
[  100.207889] RTW: rtl8821ce_init_txbd_ring queue:6, ring_addr:000000005fcff288
[  100.207891] RTW: rtl8821ce_init_txbd_ring entries num:128
[  100.207893] RTW: rtl8821ce_init_txbd_ring queue:7, ring_addr:00000000bbd298fe
[  100.207896] RTW: rtw_macaddr_cfg mac addr:e0:51:d8:3a:df:21
[  100.207899] RTW: GetHalDefVar: [WARNING] HAL_DEF_VARIABLE(30) not defined!
[  100.207902] RTW: SetHalDefVar: [WARNING] HAL_DEF_VARIABLE(29) not defined!
[  100.207903] RTW: bDriverStopped:True, bSurpriseRemoved:False, bup:0, hw_init_completed:False
[  100.207938] RTW: rtw_wiphy_alloc(phy0)
[  100.207941] RTW: rtw_wdev_alloc(padapter=00000000ba7a68b2)
[  100.207948] RTW: rtw_wiphy_register(phy0)
[  100.207950] RTW: Register RTW cfg80211 vendor cmd(0x67) interface
[  100.208063] RTW: rtw_reg_notifier: NL80211_REGDOM_SET_BY_CORE
[  100.208505] RTW: rtw_ndev_init(wlan0) if1 mac_addr=e0:51:d8:3a:df:21
[  100.208591] RTW: rtw_ndev_notifier_call(wlan0) state:17
[  100.208917] RTW: cfg80211_rtw_get_txpower
[  100.208922] RTW: rtw_ndev_notifier_call(wlan0) state:5
[  100.209743] RTW: cfg80211_rtw_get_txpower
[  100.210169] RTW: pci_enable_msi ret=0
[  100.210192] RTW: Request_irq OK, IRQ 131
[  100.210361] RTW: module init ret=0

I don't understand the logs well, but these parts look interesting to me:

ON FAIL:

ERROR [HALMAC][ERR]Chip id is undefined

RTW: Bus master is not enabled by BIOS! usPciCommand=0

ON SUCCESS:

Chip Version Info: CHIP_8821C_Normal_Chip_UMC_E_CUT_1T1R_RomVer(4)

RTW: Bus master is enabled. usPciCommand=7

It is so weird that "Bus master is not enabled" is reported, and then, after I plug in the power cord and restart the kernel module with modprobe, the bus master is suddenly enabled and everything is happy. Maybe the power cord can magically manipulate the BIOS without even restarting the computer.

Basically, something at boot is preventing Linux and the driver from reading the bus master bit.
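The two usPciCommand values in the logs can be decoded by hand. The sketch below assumes the driver prints the standard PCI command register in hex, where bit 0 is I/O space, bit 1 is memory space and bit 2 is bus master enable. The function name `decode_pci_command` is made up for this illustration.

```shell
#!/bin/sh
# Decode a PCI command register value given in hex.
# Assumption: usPciCommand in the log is this register, printed in hex.
# Bit 0 = I/O space, bit 1 = memory space, bit 2 = bus master enable.
decode_pci_command() {
    hex=${1#0x}                  # accept "7", "0007" or "0x7"
    val=$((0x$hex))
    io=$(( val & 1 ))
    mem=$(( (val >> 1) & 1 ))
    bm=$(( (val >> 2) & 1 ))
    echo "io=$io mem=$mem busmaster=$bm"
}

# On real hardware (as root; 01:00.0 is the address seen in the logs):
#   decode_pci_command "$(setpci -s 01:00.0 COMMAND)"
```

With the values from the logs, 7 decodes to all three bits set and 0 to none, so in the failing case the device has neither memory access nor bus mastering enabled, not only the bus master bit.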

lwfinger commented 1 year ago

More likely is that when the power cord is attached, the behavior of the PCI bus matches what the BIOS expects.

The best I can offer you is a method that would hold off loading the driver until the machine is fully booted, and then start the driver, but that would not be much better than doing it manually.
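A rough sketch of such a hold-off approach, not a tested fix: retry a reload step a few times after boot until the wireless interface shows up. The helper names `retry_until` and `reload_and_check` are hypothetical, and the module and interface names in the example are assumptions that must be replaced with whatever `lsmod` and `ip link` show on the machine.

```shell
#!/bin/sh
# retry_until MAX DELAY CMD...: run CMD until it succeeds, at most MAX
# times, sleeping DELAY seconds between attempts.
retry_until() {
    max=$1; delay=$2; shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        [ "$attempt" -le "$max" ] && sleep "$delay"
    done
    return 1
}

# reload_and_check MODULE IFACE: reload MODULE and succeed only if the
# network interface IFACE exists afterwards. Both names are assumptions.
reload_and_check() {
    modprobe -r "$1" 2>/dev/null
    modprobe "$1" && [ -e "/sys/class/net/$2" ]
}

# Example (as root, from a script that runs late in the boot sequence):
#   retry_until 5 10 reload_and_check 8821ce wlan0
```

Checking for the interface rather than the modprobe exit status matters here, because the failing log above still ends with "module init ret=0".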

That ERROR [HALMAC][ERR]Chip id is undefined shows me that the driver cannot access the most basic parts of the PCI interface. Everything is screwed up in that condition!!!!

SirLouen commented 1 year ago

That ERROR [HALMAC][ERR]Chip id is undefined shows me that the driver cannot access the most basic parts of the PCI interface. Everything is screwed up in that condition!!!!

Yes. I tried removing the cord after the connection was established and the bus master was recognized, but once I remove the cord a second time, the connection can no longer be re-established.

The errors that persist are these ones:

[    9.427253] RTW: ERROR [HALMAC][ERR]Chip id is undefined
[    9.427255] RTW: ERROR rtw_halmac_init_adapter: halmac_init_adapter fail!(status=54)
[    9.427257] RTW: rtl8821ce_set_hal_ops: [ERROR]HALMAC initialize FAIL!

As you say, the driver cannot access the PCI interface. So maybe the problem is with the motherboard, not with the network card?

SirLouen commented 1 year ago

This is the system info

# dmidecode
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.2.0 present.
Table at 0x6DFDC000.

Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
        Vendor: American Megatrends Inc.
        Version: X13GTE.E.L4XB376.6S.S3E3P2W7.SDZ.AOC.L003
        Release Date: 07/21/2022
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 4928 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                BIOS ROM is socketed
                EDD is supported
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 1.20

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: Default string
        Product Name: Default string
        Version: Default string
        Serial Number: Default string
        UUID: 03000200-0400-0500-0006-000700080009
        Wake-up Type: Power Switch
        SKU Number: Default string
        Family: Notebook

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: Default string
        Product Name: Default string
        Version: Default string
        Serial Number: Default string
        Asset Tag: Default string
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis: Default string
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0

Looks seriously bad

lwfinger commented 1 year ago

Oh, I am absolutely convinced that the problem is with the motherboard, or the power regulator. Clearly, the problem is NOT with the network card. It is just the canary in the coal mine.

yurym commented 1 year ago

Same ASPM issue on my Rock 3A board with Armbian, kernel versions 5.18.0 and 6.1.11. Sometimes the wifi module works after a reboot, but usually it does not.

[10504.680704] rtw_8821ce 0000:01:00.0: Firmware version 24.11.0, H2C version 12
[10504.725404] rtw_8821ce 0000:01:00.0 wlp1s0: renamed from wlan0
0000:01:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821CE 802.11ac PCIe Wireless Network Adapter
        Subsystem: AzureWave RTL8821CE 802.11ac PCIe Wireless Network Adapter
        Kernel driver in use: rtw_8821ce
        Kernel modules: rtw_8821ce, 8821ce

I get the exact same problem with tomaspinho's rtl8821ce driver.

lwfinger commented 1 year ago

When you report a problem, please do not say "same problem". Always report your log entries. The problem may look the same to you, but it may be from a different cause.

I am not sure how much ARM usage there has been with any of these drivers. Since you mention an ASPM issue, have you tried the module option to disable ASPM? If not, do the following: As root, edit the file /usr/lib/modules.d/70-rtw88_core.conf and add 2 lines:

options rtw_pci disable_aspm=y
options rtw88_pci disable_aspm=y
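For anyone who prefers to script this step, here is a minimal sketch that appends the two option lines to a file only if they are not already present. The helper names `append_once` and `set_disable_aspm` are invented for this example, and the file path is passed in as an argument rather than hard-coded, since the right location may differ between distributions.

```shell
#!/bin/sh
# append_once FILE LINE: add LINE to FILE unless an identical line exists.
append_once() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# set_disable_aspm FILE: add the two option lines quoted above to FILE.
set_disable_aspm() {
    append_once "$1" "options rtw_pci disable_aspm=y"
    append_once "$1" "options rtw88_pci disable_aspm=y"
}

# As root, using the path given in the comment above:
#   set_disable_aspm /usr/lib/modules.d/70-rtw88_core.conf
```

Running it twice leaves the file unchanged the second time, so it is safe to call from an install script.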

yurym commented 1 year ago

Sorry for the short message. I will try to write more. For example, today I turned on the board for the first time and the wifi module was not found. I think it is sleeping. After that I added the options to /usr/lib/modules-load.d/70-rtw88_core.conf and rebooted the board. After the reboot the wifi module was found.

Mar  5 06:29:08 rock-3a kernel: [   12.072530] cfg80211: Loading compiled-in X.509 certificates for regulatory database
Mar  5 06:29:08 rock-3a kernel: [   12.073622] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
Mar  5 06:29:08 rock-3a kernel: [   12.079957] cfg80211: loaded regulatory.db is malformed or signature is missing/invalid
Mar  5 06:29:08 rock-3a kernel: [   12.215281] rtw_core: loading out-of-tree module taints kernel.
Mar  5 06:29:08 rock-3a kernel: [   12.215589] rtw_core: module verification failed: signature and/or required key missing - tainting kernel
Mar  5 06:29:08 rock-3a kernel: [   12.241594] rtw_8821ce 0000:01:00.0: enabling device (0000 -> 0003)
Mar  5 06:29:08 rock-3a kernel: [   12.248590] rtw_8821ce 0000:01:00.0: Firmware version 24.11.0, H2C version 12
Mar  5 06:29:08 rock-3a kernel: [   12.554492] rtw_8821ce 0000:01:00.0 wlp1s0: renamed from wlan0

Then I rebooted the board again and the module was gone. I tried to load the driver manually, without success.

root@rock-3a:~# lspci
0000:00:00.0 PCI bridge: Fuzhou Rockchip Electronics Co., Ltd Device 3566 (rev 01)
0002:00:00.0 PCI bridge: Fuzhou Rockchip Electronics Co., Ltd Device 3566 (rev 01)
root@rock-3a:~# modprobe rtw_pci disable_aspm=y
root@rock-3a:~# cat /sys/module/rtw_pci/parameters/disable_aspm
Y
root@rock-3a:~# modprobe rtw_8821ce
root@rock-3a:~# lsmod|grep rtw
rtw_8821ce             16384  0
rtw_8821c              86016  1 rtw_8821ce
rtw_pci                24576  1 rtw_8821ce
rtw_core              155648  2 rtw_8821c,rtw_pci
mac80211              954368  2 rtw_core,rtw_pci
cfg80211              925696  2 rtw_core,mac80211
root@rock-3a:~# lspci
0000:00:00.0 PCI bridge: Fuzhou Rockchip Electronics Co., Ltd Device 3566 (rev 01)
0002:00:00.0 PCI bridge: Fuzhou Rockchip Electronics Co., Ltd Device 3566 (rev 01)

Repeated reboots do not help. This is the typical situation in my case. The wifi module works fine on my laptop with Windows 10 and Ubuntu 20.10.

lwfinger commented 1 year ago

If the command 'lspci' does not show the device, then it will not be found when the system is booted. You will need to find out where Raspbian stores the file that controls modules loaded at boot, and add rtw_8821ce and rtw88_8821ce to that list. There are reports on the web that /etc/modules is the correct file, but that may not be right.
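A quick way to make the lspci check scriptable is sketched below. The function name `pci_lists` is hypothetical, and the search string is an assumption based on the device description shown earlier in this thread.

```shell
#!/bin/sh
# pci_lists PATTERN: read lspci-style output on stdin and succeed if any
# line mentions PATTERN (case-insensitive match).
pci_lists() {
    grep -qi -- "$1"
}

# On the affected board:
#   lspci | pci_lists rtl8821ce || echo "device not enumerated on the PCI bus"
```

If this reports that the device is not enumerated, as in the two-line lspci output above that lists only the Rockchip bridges, no driver can bind to it.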

Aford-Liuke commented 3 months ago

I have the same issue. I found these two URLs: https://bugzilla.kernel.org/show_bug.cgi?id=216530 and https://bbs.archlinux.org/viewtopic.php?id=281984. I suspect it's a power management issue. Follow the steps in the first link, and the wireless network will work normally.