morrownr / 88x2bu

Linux Driver for USB WiFi Adapters that are based on the RTL8812BU and RTL8822BU Chipsets

iw dev wlanX del kernel hang #46

Closed crgorect closed 3 years ago

crgorect commented 3 years ago

I am looking into a kernel hang that seems to be happening with an 8822bu chipset wifi module that we are working with on a custom board.

This is running on a Raspberry Pi CM4 with a fairly bare-bones 64-bit CentOS 8 Stream variant. We are also running an 8812au chipset variant with your other driver. We can call iw dev wlan8812au del without any issues, but upon running the same command on the interface for the 8822bu, the kernel seems to hang. I have tried various versions of iw and all paths lead back to a driver issue. I have been having a hard time getting any sort of logs off the device, because after execution the entire device hangs and a hard power off and reboot is required. None of the logs seem to survive the reboot, so I am looking for any recommendations on how to debug this and rule the driver in or out. It does not seem to cause a core dump, and I have not seen a kernel panic, only a process that is stuck.

morrownr commented 3 years ago

This sounds familiar. I've been working on what sounds like your issue, and others that are closely related, for over 2 months and am still working on it. First let me tell you what I know and then we can see if we can figure this out.

If you are going to be using usb wifi adapters on a RasPi4x, my extensive testing shows only 2 ultra stable chipsets in the AC1200 class. The rtl8812au and the mt7612u. How do I know this? You can see that I maintain 5 out-of-kernel drivers here and maintain an information site here as well. I use these drivers and am constantly testing them.

Of the 5 Realtek drivers here, 3 of them show issues similar to what you are seeing on the RasPi4x: 88x2bu, 8814au and 8821cu. The 8812au and 8821au do not show the problem. Initially I thought it was a driver problem, so I spent a lot of time trying to trace the flow through the drivers. I could not track down a problem, as the symptoms seemed to be moving around. To make a long story short, I can get stable function on a RasPi4b now with the drivers as they are, but the USB adapter has to be the only USB device plugged in and I have to shut off other subsystems that use power, such as the onboard wifi and bt.

I ordered a USB power meter the other day. I don't have it yet but I will document what I find when able. I think the problem is RasPi4x specific and has to do with power and the USB subsystem. Of course, there could be a lot of things at play so as I have time, I will continue to investigate.

Something that could help is if you would provide very specific details of your entire setup.
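If it makes collecting those details easier, here is a rough Python sketch that snapshots a few commands into one file and forces the file to disk, so it has a better chance of surviving a hard power-off. The command list and the `snapshot` and `run` helper names are only assumptions for illustration; adjust them to whatever tools are actually installed on your image.

```python
import os
import subprocess

# Commands to snapshot. These are common tools, but whether each one is
# installed on a bare-bones image is an assumption.
COMMANDS = [
    ["uname", "-a"],
    ["lsusb", "-t"],
    ["iw", "dev"],
    ["dmesg"],
]


def run(cmd, timeout=10):
    """Return the command's combined stdout/stderr, or a note if it could not run."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            encoding="utf-8",
            errors="replace",
            timeout=timeout,
        )
        return result.stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "<could not run: {}>\n".format(exc)


def snapshot(path, commands=COMMANDS):
    """Write each command's output to path and push it out to storage."""
    with open(path, "w") as fh:
        for cmd in commands:
            fh.write("### " + " ".join(cmd) + "\n")
            fh.write(run(cmd))
            fh.write("\n")
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to write the data to the device now
```

Calling `snapshot("wifi-setup-report.txt")` right before the command that triggers the hang would give a last known state to compare against after the reboot.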

Regards.

crgorect commented 3 years ago

Very interesting.

To give you a better description of our custom board: we have a built-in 5 amp power controller with two surface-mount wifi modules. I can also verify that the wifi modules are getting the correct amount of power, as the power controller powers them directly, outside of the CM4. I had previously thought a power issue might be at play, but after moving from our Raspberry Pi prototypes to the new board I am fairly convinced power was not the problem (I am not certain exactly how to prove this within the module, as we did not design these specific modules).

I was leaning more towards either a driver or an ifconfig issue. I had suspected that ifconfig had been creating the given interface instead of iw and that this was causing some underlying problem. I happened upon a successful call when I typed in the wrong interface name and managed to properly delete the 8812au interface without issue. This drove me back to thinking that it had to be a driver issue, but maybe there is something else that I am just not thinking of.

morrownr commented 3 years ago

I'm perfectly willing to work on the driver but without a good handle on what the problem is, I can't see that as something productive to do. I just started testing on a RasPi3b to see if that helps narrow things down.

I didn't ask you what mode you are using. I only see this "going down" problem when in AP mode and generally only when using 80 MHz channel width.
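A quick way to check both is the output of iw dev wlanX info, which reports the interface type and the channel width. Below is a minimal Python sketch that pulls those two fields out of that output. The function names are hypothetical and the sample text is illustrative only, not output captured from your board.

```python
import re


def parse_iw_info(text):
    """Extract interface type and channel width from `iw dev <iface> info` output."""
    info = {"type": None, "width_mhz": None}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("type "):
            info["type"] = line.split(None, 1)[1]
        else:
            match = re.search(r"width:\s*(\d+)\s*MHz", line)
            if match:
                info["width_mhz"] = int(match.group(1))
    return info


def matches_failure_pattern(info):
    """True for the combination described above: AP mode at 80 MHz width."""
    return info["type"] == "AP" and info["width_mhz"] == 80


# Illustrative sample in the usual `iw dev <iface> info` layout.
SAMPLE = """Interface wlan0
    ifindex 3
    type AP
    wiphy 0
    channel 36 (5180 MHz), width: 80 MHz, center1: 5210 MHz
"""

print(matches_failure_pattern(parse_iw_info(SAMPLE)))
```

If the interface reports a managed type or a narrower width and the hang still occurs, that would suggest your case differs from the one I have been seeing.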

Have you checked to see if backfeeding is an issue? That is, current from your external power supply feeding back into the board? The RasPi4 is well known to not handle that well at all.

Why this is a problem with 3 of the wifi chipsets and not with the other 3 is not known. I compared the source and could not find a cause. Here is what I noticed, though I don't know if it has anything to do with the problem:

The chipsets that don't work well:

rtl8812bu, rtl8814au, rtl8821cu

The chipsets that work well:

mt7612u, rtl8812au, rtl8811au

The chipsets that work well are the 3 oldest chipsets. The driver for the mt7612u is in the Linux kernel and is a very good, standards-compliant driver. The driver for the 8812au is the best driver I have ever seen come out of Realtek. It is a really good driver.