diederikdehaas / rtl8812AU

Realtek 8812AU USB WiFi driver
Other
475 stars 177 forks source link

Netis WF2190 woes [LONG] #48

Closed bullfrogreal closed 7 years ago

bullfrogreal commented 7 years ago

Hi all, first time I write here so please be gentle! :)

I got this usb wifi adapter after long time checking online and trusting that the RTL chip could work decently on Linux. I'm using this with my desktop, running Ubuntu 16.04. My router is just in the next room and currently I'm connected using a "hacked together" wifi bridge, which works great and fast but it isn't convenient for me, with all the desk real estate that it's using. A wifi adapter seems the right compromise to make.

Now, I followed the instructions here and all looked like working fine. Adapter recognized, connects to router (a Linksys EA9500) and speed is decent for such a small thingy. Browsing and casual internet use works ok. But suddenly, I realized I had a couple of issues. Note that all the config bits are done by NetworkManager.

FIRST ISSUE

First, the "easy problem"... the adapter sometimes just decides to stop working... when it does, it still shows in iwconfig but it's dead... cannot scan, cannot connect... modprobe -r and reload brings it back.

Here's what I get when I plug in the adapter:

[   69.897650] usb 3-4.2: new high-speed USB device number 9 using xhci_hcd
[   69.986149] usb 3-4.2: New USB device found, idVendor=0bda, idProduct=8812
[   69.986151] usb 3-4.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   69.986152] usb 3-4.2: Product: 802.11n NIC
[   69.986153] usb 3-4.2: Manufacturer: Realtek
[   69.986154] usb 3-4.2: SerialNumber: 123456
[   71.019816] cfg80211: World regulatory domain updated:
[   71.019820] cfg80211:  DFS Master region: unset
[   71.019821] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
[   71.019823] cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
[   71.019824] cfg80211:   (2457000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm), (N/A)
[   71.019826] cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (N/A, 2000 mBm), (N/A)
[   71.019827] cfg80211:   (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (N/A)
[   71.019829] cfg80211:   (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2000 mBm), (0 s)
[   71.019831] cfg80211:   (5490000 KHz - 5730000 KHz @ 160000 KHz), (N/A, 2000 mBm), (0 s)
[   71.019832] cfg80211:   (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 2000 mBm), (N/A)
[   71.019833] cfg80211:   (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 0 mBm), (N/A)
[   71.022362] RTL871X: module init start
[   71.022364] RTL871X: rtl8821au v4.3.14
[   71.022365] RTL871X: rtl8821au BT-Coex version = BTCOEX20150128-51
[   71.124382] RTL871X: rtw_ndev_init(wlan0)
[   71.124657] usbcore: registered new interface driver rtl8821au
[   71.124659] RTL871X: module init ret=0
[   71.134265] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   71.501497] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   71.547221] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   75.554260] RTL871X: nolinked power save enter
[   76.080312] RTL871X: nolinked power save leave
[   76.080388] RTL871X: rtw_set_802_11_connect(wlan0)  fw_state=0x00000008
[   76.240385] RTL871X: start auth
[   76.541123] RTL871X: auth success, start assoc
[   76.543101] RTL871X: rtw_cfg80211_indicate_connect(wlan0) BSS not found !!
[   76.543110] RTL871X: assoc success
[   76.543166] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   76.546136] RTL871X: send eapol packet
[   76.550684] RTL871X: send eapol packet
[   76.550715] RTL871X: set pairwise key camid:4, addr:60:38:e0:71:a2:f2, kid:0, type:AES
[   76.551689] RTL871X: set group key camid:5, addr:60:38:e0:71:a2:f2, kid:2, type:AES
[   76.577969] audit: type=1400 audit(1481960284.979:66): apparmor="DENIED" operation="open" profile="/sbin/dhclient" name="/opt/HDD/var/lib/NetworkManager/dhclient-wlan0.conf" pid=2814 comm="dhclient" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[   76.577984] audit: type=1400 audit(1481960284.979:67): apparmor="DENIED" operation="mknod" profile="/sbin/dhclient" name="/opt/HDD/var/lib/NetworkManager/dhclient-b3caa731-7c2f-4cee-bd69-cebfe601c35a-wlan0.lease" pid=2814 comm="dhclient" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
[   76.617562] audit: type=1400 audit(1481960285.019:68): apparmor="DENIED" operation="mknod" profile="/sbin/dhclient" name="/opt/HDD/var/lib/NetworkManager/dhclient-b3caa731-7c2f-4cee-bd69-cebfe601c35a-wlan0.lease" pid=2814 comm="dhclient" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
[   85.574740] audit: type=1400 audit(1481960293.969:69): apparmor="DENIED" operation="open" profile="/sbin/dhclient" name="/opt/HDD/var/lib/NetworkManager/dhclient6-wlan0.conf" pid=3044 comm="dhclient" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[   85.574750] audit: type=1400 audit(1481960293.969:70): apparmor="DENIED" operation="open" profile="/sbin/dhclient" name="/opt/HDD/var/lib/NetworkManager/dhclient6-b3caa731-7c2f-4cee-bd69-cebfe601c35a-wlan0.lease" pid=3044 comm="dhclient" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

Here's iwconfig while it's working:

wlan0     IEEE 802.11AC  ESSID:"Milhouse_5_high"  Nickname:"<WIFI@REALTEK>"
          Mode:Managed  Frequency:5.785 GHz  Access Point: 60:38:E0:71:A2:F2
          Bit Rate:867 Mb/s   Sensitivity:0/0
          Retry:off   RTS thr:off   Fragment thr:off
          Encryption key:****-****-****-****-****-****-****-****   Security mode:open
          Power Management:off
          Link Quality=78/100  Signal level=100/100  Noise level=0/100
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

This is what dmesg shows when it stops (nothing before that "disconnect or roaming" bit):

[ 4431.431229] RTL871X: linked_status_chk(wlan0) disconnect or roaming
[ 4433.441320] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)NOT empty(180) in 2000 ms
[ 4433.441323] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)0x200=0x00d61010, 0x204=0x008d0000 (0x00d61010, 0x008d0000)
[ 4439.455984] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)NOT empty(180) in 2004 ms
[ 4439.455986] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)0x200=0x00d61010, 0x204=0x00390000 (0x00d61010, 0x00390000)
[ 4443.465974] RTL871X: There are some pkts to transmit
[ 4443.465980] RTL871X: free_xmitbuf_cnt: 4, free_xmit_extbuf_cnt: 0
[ 4445.561850] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)NOT empty(180) in 2000 ms
[ 4445.561852] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)0x200=0x00d61010, 0x204=0x00050000 (0x00d61010, 0x00050000)
[ 4449.574183] RTL871X: There are some pkts to transmit
[ 4449.574186] RTL871X: free_xmitbuf_cnt: 4, free_xmit_extbuf_cnt: 0
[ 4451.578252] RTL871X: There are some pkts to transmit
[ 4451.578255] RTL871X: free_xmitbuf_cnt: 4, free_xmit_extbuf_cnt: 0
[ 4453.583332] RTL871X: There are some pkts to transmit
[ 4453.583334] RTL871X: free_xmitbuf_cnt: 4, free_xmit_extbuf_cnt: 0
[ 4456.576009] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)NOT empty(180) in 2000 ms
[ 4456.576013] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)0x200=0x00d61010, 0x204=0x00050000 (0x00d61010, 0x00050000)
[ 4458.554690] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 4460.594937] RTL871X: There are some pkts to transmit
[ 4460.594939] RTL871X: free_xmitbuf_cnt: 4, free_xmit_extbuf_cnt: 0
[ 4462.688544] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)NOT empty(181) in 2008 ms
[ 4462.688547] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)0x200=0x00d61010, 0x204=0x00050000 (0x00d61010, 0x00050000)
[ 4466.707544] RTL871X: There are some pkts to transmit
[ 4466.707546] RTL871X: free_xmitbuf_cnt: 4, free_xmit_extbuf_cnt: 0
[ 4468.711557] RTL871X: There are some pkts to transmit
[ 4468.711560] RTL871X: free_xmitbuf_cnt: 4, free_xmit_extbuf_cnt: 0
[ 4470.716539] RTL871X: There are some pkts to transmit
[ 4470.716542] RTL871X: free_xmitbuf_cnt: 4, free_xmit_extbuf_cnt: 0
[ 4473.710399] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)NOT empty(180) in 2004 ms
[ 4473.710402] RTL871X: SetHwReg8812A:(HW_VAR_CHECK_TXBUF)0x200=0x00d61010, 0x204=0x00050000 (0x00d61010, 0x00050000)

And this is how iwconfig shows after it goes dead:

wlan0     unassociated  Nickname:"<WIFI@REALTEK>"
          Mode:Managed  Frequency=5.785 GHz  Access Point: Not-Associated
          Sensitivity:0/0
          Retry:off   RTS thr:off   Fragment thr:off
          Encryption key:off
          Power Management:off
          Link Quality:0  Signal level:0  Noise level:0
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

If I have a ping in the background, constantly using the connection, it doesn't go down so often but it still does so it doesn't seem strictly a power saving thing. FWIW, power saving looks like being off from what iwconfig says and a further "iwconfig wlan0 power off" returns:

Error for wireless request "Set Power Management" (8B2C) :
    SET failed on device wlan0 ; Operation not permitted.

SECOND ISSUE

The second, more troublesome (but fun) issue is that I noticed regular drops in connectivity... like 5 seconds or so every few minutes. I dug deeper and here's what I found running a simple ping from my desktop to my router/laptop:

  1. drops in connectivity are Swiss clock regular: there's a pattern where connectivity will pause (that's right, PAUSE, read on) for exactly 60, 80 and 120 seconds (in cycling sequence)
  2. desktop (origin) tcpdump would show packets leaving at the expected timestamp but nothing would arrive to the other side
  3. once the "pause interval" ends, the packets all get sent (and received on the other side), like if they were queued but the hw was sleeping or busy doing something else
  4. logs don't show anything happening at that time (I checked syslog but no lines were produced during those blackouts)
  5. I tried from desktop to both router and laptop and from router/laptop to desktop. I also changed my router to an ASUS RT-AC88U. I tried at 5Ghz (on different bands), 2.4Ghz, on 802.11AC, on 802.11G... same result in all cases
  6. I tried with 4.3.14 and with 4.3.22-beta drivers but no noticeable change

The interesting bits from tcpdump logs which drove me to the above are (192.168.1.98 is desktop, Milhouse is my router):

  1. here it works:

    13:52:46.967226 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 60, length 64
    13:52:46.970113 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 60, length 64
    13:52:47.969209 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 61, length 64
    13:52:47.972557 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 61, length 64
    13:52:48.970643 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 62, length 64
    13:52:48.973914 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 62, length 64
    13:52:49.971990 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 63, length 64
    13:52:49.976145 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 63, length 64
    13:52:50.973218 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 64, length 64
    13:52:50.976063 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 64, length 64
  2. then it suddenly looks like no answer is coming back:

    13:52:51.975140 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 65, length 64
    13:52:52.974962 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 66, length 64
    13:52:53.974962 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 67, length 64
    13:52:54.974993 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 68, length 64
  3. and after about 4-5 seconds, looks like the answers to those packets are coming back all in one shot:

    13:52:55.902298 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 65, length 64
    13:52:55.902300 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 66, length 64
    13:52:55.902301 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 67, length 64
    13:52:55.902302 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 68, length 64
    13:52:55.974452 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 69, length 64
    13:52:55.977190 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 69, length 64
    13:52:56.976267 IP 192.168.1.98 > Milhouse: ICMP echo request, id 5198, seq 70, length 64
    13:52:56.979298 IP Milhouse > 192.168.1.98: ICMP echo reply, id 5198, seq 70, length 64
  4. Finally, here is where you can see the patterns:

    ... (it's working normally until suddenly...)
    16:25:11.616929 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 109, length 64
    16:25:12.625751 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 110, length 64
    16:25:13.633732 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 111, length 64
    16:25:14.641770 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 112, length 64
    16:25:14.917280 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 109, length 64
    16:25:14.918805 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 110, length 64
    16:25:14.918807 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 111, length 64
    16:25:14.918809 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 112, length 64
    ... (~80 sec of normal operations)
    16:26:34.754975 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 192, length 64
    16:26:35.761709 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 193, length 64
    16:26:36.769809 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 194, length 64
    16:26:37.777768 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 195, length 64
    16:26:37.914155 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 192, length 64
    16:26:37.914156 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 193, length 64
    16:26:37.914158 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 194, length 64
    16:26:37.914159 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 195, length 64
    ... (~102 sec of normal operations)
    16:28:16.924273 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 294, length 64
    16:28:17.923671 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 295, length 64
    16:28:18.927645 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 296, length 64
    16:28:19.927693 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 297, length 64
    16:28:20.914872 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 294, length 64
    16:28:20.914873 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 295, length 64
    16:28:20.914874 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 296, length 64
    16:28:20.914875 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 297, length 64
    ... (~120 sec of normal operations)
    16:30:17.084328 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 414, length 64
    16:30:18.083668 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 415, length 64
    16:30:19.083672 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 416, length 64
    16:30:20.083669 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 417, length 64
    16:30:20.910408 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 414, length 64
    16:30:20.910412 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 415, length 64
    16:30:20.910416 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 416, length 64
    16:30:20.910419 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 417, length 64
    ... (~120 sec of normal operations)
    16:32:17.252843 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 534, length 64
    16:32:18.261750 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 535, length 64
    16:32:19.269758 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 536, length 64
    16:32:20.271653 IP 192.168.1.98 > Milhouse: ICMP echo request, id 6714, seq 537, length 64
    16:32:20.911465 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 534, length 64
    16:32:20.911471 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 535, length 64
    16:32:20.911474 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 536, length 64
    16:32:20.911477 IP Milhouse > 192.168.1.98: ICMP echo reply, id 6714, seq 537, length 64
    ... (and so on, staying at 120 sec intervals thereafter)

The intervals start sometime at ~60 seconds right after the driver is loaded, then increase to ~80, ~100 and finally ~120 seconds thereafter as you see above.

To note that running tcpdump on the other side (laptop) doesn't show ANY packet coming during those few seconds, seeing them all in one shot afterwards (and that's why the receiving side replies in bulk for those 4 packets seq 65-68). It makes me think that perhaps the adapter as some internal queue which suddently stops being processed for those few seconds and tcpdump is not aware of it.

The adapter is cheap and it isn't a big deal if I have to throw it in the trash but it's an intriguing problem and I enjoyed the digging but now I've reached the end of my sparse knowledge and looking for someone willing/able to take a stab at it. Oh, and yeah, it could be that this is a faulty adapter, perhaps... I don't have another machine to use it on and test Windows drivers... I may look for someone to lend me something to experiment with. Just that faulty seems an odd chance when there are such regular patterns.

Happy to provide other logs and outputs if needed. Bullfrog

bullfrogreal commented 7 years ago

A little unimpressed with the (lack of) replies here but meh... best effort is best effort so can't complain.

Anyway, decided to dig deeper, installed other drivers, enabled debug info and found the culprit.

Apparently either the chip, the device itself or the base rtl driver (or a mix of those) are not able to get through band scanning fast enough so the gaps I saw were caused by NetworkManager doing the usual regular iw scan.

If you find yourself in the same situation, one of the ways to fix it is to force NetworkManager to join a specific BSSID (may be not great if you are on a laptop and moving a lot around, though). That way the scans will not happen anymore at regular intervals. Another more effective solution would be to move away from NetworkManager completely and handle your connectivity in other ways.

Hope this can help others facing a similar problem.