morrownr / USB-WiFi

USB WiFi Adapter Information for Linux

AWUS036ACM stability issues #300

Open da-mkay opened 1 year ago

da-mkay commented 1 year ago

Hi,



I set up a 2.4 GHz AP using a Raspberry Pi 4B and an ALFA AWUS036ACM.
Besides the USB stick (plugged into a USB2 port), only LAN and USB-C (for power) are connected.
iperf3 measures around 941 Mbps from the RPi4 via LAN to a Linux server in the same LAN.

A client connected to the AP measures 90-100 Mbps to that same server using iperf3. That's enough, as I just want to fully utilise my 100 Mbps internet connection.
An internet speed test measures 85 to 90 Mbps. That's OK.



Now to the problem:
I have 3 devices connected to the AP: one MacBook Pro 2013, one MacBook Pro 2022 and one iPhone 14 Pro.
For testing purposes I ran a download on the MacBooks in an infinite loop using curl and started a download on the iPhone manually in parallel.
I always see the same picture:


Unfortunately nothing was logged by hostapd when downloads failed on the MacBook Pro.
I saw some "disassociated" logs for the iPhone, but after adjusting some hostapd settings and looking at the logs as soon as the download hung on the iPhone, I saw that nothing was logged for the iPhone.

 Since the ALFA AWUS036ACM is advertised here as highly recommended, I wonder if someone has similar issues. 
Previously I had a Fritz AC 860, which had similar issues. Even worse: there I experienced those freezes even on the MBP 2013, so that I had to reconnect. One time, I even had to reboot the RPi, because WiFi AP was not working at all anymore. 

I already tried different USB Ports, even USB3 with disable_usb_sg, but this did not help and speed was worse. 
I also tried different hostapd settings, see below.



OS: Raspberry Pi OS Lite (64-bit) - 2023-05-03



$ cat /etc/issue
Debian GNU/Linux 11 \n \l

$ uname -r
6.1.21-v8+

hostapd config:

driver=nl80211
ctrl_interface=/var/run/hostapd
ctrl_interface_group=0
auth_algs=1
wpa_key_mgmt=WPA-PSK
beacon_int=100
ssid=MYSSID
channel=11
hw_mode=g
ieee80211n=1
wmm_enabled=1
wpa_passphrase=MYPASSWORD
interface=wlan0
wpa=2
wpa_pairwise=CCMP
country_code=DE
ignore_broadcast_ssid=0

ap_isolate=1

# For Alfa
dtim_period=2
max_num_sta=32
rts_threshold=2347
fragm_threshold=2346

ieee80211d=1
ieee80211h=1
require_ht=1
ht_capab=[LDPC][GF][SHORT-GI-20][TX-STBC][RX-STBC1]
# having the same problem with default ht_capab

#ieee80211ac=1

# Test against iPhone-WiFi-freezes. Does not work.
#ap_max_inactivity=86400
#dtim_period=3
#disassoc_low_ack=0




Onboard WiFi is disabled.

morrownr commented 1 year ago

Hi @da-mkay

Cool. Time to play with a Pi.

Main Menu item 8 is about AP mode. The main document provides example hostapd.conf files and tested settings for many options. For the mt7612u based adapters, it recommends:

ht_capab=[HT40+][HT40-][GF][SHORT-GI-20][SHORT-GI-40]

I think your ht_capab line is including some items that are not supported with the ACM. It is very important to get the ht_capab and vht_capab lines right. I can show you how to determine the right settings for any specific adapter if you like.
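One common way to work out candidate ht_capab flags for an adapter is to read the HT capability lines that `iw list` prints for the phy and translate them into hostapd's bracket notation. The sketch below does that translation for a handful of capabilities. It may not be the exact method offered above, the helper name `ht_capab_from_iw` is made up for illustration, and it only looks at the first band that `iw list` reports. A driver can also advertise a capability that still misbehaves in AP mode, so treat the result as a starting point rather than a verified setting.

```shell
#!/bin/sh
# Hypothetical helper: translate the HT capability lines printed by
# `iw list` (read from stdin) into a hostapd ht_capab= line.
ht_capab_from_iw() {
    awk '
        # Append a flag only once.
        function add(flag) { if (!(flag in seen)) { seen[flag] = 1; out = out flag } }

        # The HT block starts at "Capabilities: 0x..." and ends at the
        # "Maximum RX AMPDU length" line; VHT lines are ignored.
        /^[ \t]*Capabilities: 0x/  { if (!done) in_ht = 1; next }
        /Maximum RX AMPDU length/  { if (in_ht) done = 1; in_ht = 0 }
        !in_ht                     { next }

        /RX LDPC/                  { add("[LDPC]") }
        /HT20\/HT40/               { add("[HT40+]"); add("[HT40-]") }
        /RX Greenfield/            { add("[GF]") }
        /RX HT20 SGI/              { add("[SHORT-GI-20]") }
        /RX HT40 SGI/              { add("[SHORT-GI-40]") }
        /TX STBC/                  { add("[TX-STBC]") }
        /RX STBC 1-stream/         { add("[RX-STBC1]") }
        /Max AMSDU length: 7935/   { add("[MAX-AMSDU-7935]") }
        /^[ \t]*DSSS\/CCK HT40/    { add("[DSSS_CCK-40]") }

        END { print "ht_capab=" out }
    '
}
```

Typical use would be `iw list | ht_capab_from_iw`, then comparing the printed line against the per-chipset examples in the sample hostapd-2g.conf.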


require_ht=1

I've never seen that setting do anything but cause problems.


# Enables support for 5GHz DFS channels (requires ieee80211d=1)
ieee80211h=1

This is only needed if you are going to support DFS channels... of which, there are none for the 2.4 GHz band.
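The two remarks above (require_ht and ieee80211h on a 2.4 GHz AP) can be turned into a small check that reads a hostapd.conf on stdin. This is only a sketch: the helper name `lint_2g_conf` and the wording of the messages are made up, and it looks at nothing beyond these two options.

```shell
#!/bin/sh
# Hypothetical helper: flag two settings discussed above in a
# hostapd.conf read from stdin. Commented-out lines never match
# because their key starts with "#".
lint_2g_conf() {
    awk -F= '
        $1 == "hw_mode"    { mode = $2 }
        $1 == "ieee80211h" { dfs  = $2 }
        $1 == "require_ht" { req  = $2 }
        END {
            if (mode == "g" && dfs == "1")
                print "ieee80211h=1: only needed for DFS channels, which do not exist on 2.4 GHz"
            if (req == "1")
                print "require_ht=1: stations without HT support will be rejected"
        }
    '
}
```

For example, `lint_2g_conf < /etc/hostapd/hostapd.conf` prints nothing when neither option is active.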


ap_isolate=1



Do you need this?


Well, instead of going over various lines, let me post the example for 2.4 I have in the document I posted about above:

# /etc/hostapd/hostapd-2g.conf
# Documentation: https://w1.fi/cgit/hostap/plain/hostapd/hostapd.conf
# 2022-08-08

# SSID
ssid=myPI-2g
# PASSPHRASE
wpa_passphrase=myPW1234
# Band: a = 5g (a/n/ac), g = 2g (b/g/n)
hw_mode=g
# Channel
channel=6
# Country code
country_code=US

# Bridge interface
bridge=br0
# WiFi interface
interface=wlan1

# nl80211 is used with all Linux mac80211 (in-kernel) and modern Realtek drivers
driver=nl80211
#ctrl_interface=/var/run/hostapd
#ctrl_interface_group=0

beacon_int=100
dtim_period=2
max_num_sta=32
rts_threshold=2347
fragm_threshold=2346
#send_probe_response=1

# security
# auth_algs=3 is required for WPA3-SAE and WPA3-SAE Transition mode
auth_algs=1
macaddr_acl=0
ignore_broadcast_ssid=0
wpa=2
wpa_pairwise=CCMP
# WPA2-AES
wpa_key_mgmt=WPA-PSK
# WPA3-SAE
#wpa_key_mgmt=SAE
#wpa_group_rekey=1800
rsn_pairwise=CCMP
# ieee80211w=2 is required for WPA3-SAE
#ieee80211w=2
# If parameter is not set, 19 is the default value.
#sae_groups=19 20 21 25 26
#sae_require_mfp=1
# If parameter is not set, 5 is the default value.
#sae_anti_clogging_threshold=10

# IEEE 802.11n
ieee80211n=1
wmm_enabled=1
#
# Note: Only one ht_capab= line should be active. The content of these lines is
# determined by the capabilities of your adapter.
#
# generic 20 MHz setting
ht_capab=[SHORT-GI-20]
#
# generic 40 MHz setting
#ht_capab=[HT40+][HT40-][SHORT-GI-20][SHORT-GI-40]
#
# RasPi4B internal wifi
#ht_capab=[HT40+][HT40-][SHORT-GI-20][SHORT-GI-40][DSSS_CCK-40]
#
# rt5370 - rt3070
#ht_capab=[HT40+][HT40-][GF][SHORT-GI-20][SHORT-GI-40][RX-STBC1]
#
# ar9271
#ht_capab=[HT40+][HT40-][SHORT-GI-20][SHORT-GI-40][RX-STBC1][DSSS_CCK-40]
#
# mt7612u - mt7610u
#ht_capab=[HT40+][HT40-][GF][SHORT-GI-20][SHORT-GI-40]
#
# rtl8812au - rtl8811au -  rtl8812bu - rtl8811cu
#ht_capab=[HT40+][HT40-][SHORT-GI-20][SHORT-GI-40][MAX-AMSDU-7935]
# rtl8814au
#ht_capab=[LDPC][HT40+][HT40-][SHORT-GI-20][SHORT-GI-40][MAX-AMSDU-7935][DSSS_CCK-40]

# End of hostapd-2g.conf

There could be other causes but let's get hostapd.conf lined out and if it is not rock solid, we can continue.

Cheers,

@morrownr

da-mkay commented 1 year ago

First of all, thank you for the quick reply.

For the mt7612u based adapters, it recommends:

ht_capab=[HT40+][HT40-][GF][SHORT-GI-20][SHORT-GI-40]

I think your ht_capab line is including some items that are not supported with the ACM.

I got that ht_capab from some issue here about the ACM. I tried also the one from the main page. AFAIK I had the same issue then.

I can show you how to determine the right settings for any specific adapter if you like.

That would be great.

require_ht=1

I've never seen that setting do anything but cause problems.

Okay, THX, I will give it a try without the option.

# Enables support for 5GHz DFS channels (requires ieee80211d=1)
ieee80211h=1

This is only needed if you are going to support DFS channels... of which, there are none for the 2.4 GHz band.

Good point. At some point I was just trying to get the best speed and reliability and tried various settings from various sources. THX for the explanation.

ap_isolate=1



Do you need this?

Yes, actually that’s the reason why I‘m using the Raspi as an AP in the first place.

Well, instead of going over various lines, let me post the example for 2.4 I have in the document I posted about above: … There could be other causes but let's get hostapd.conf lined out and if it is not rock solid, we can continue.

As far as I remember I already tried the settings from the pages here, but I will give it a try again next week when I‘m back from holiday 🌴

morrownr commented 1 year ago

Check in when you get back and we'll do some work.

Enjoy and watch out for fires.

da-mkay commented 1 year ago

Hi, I did some more tests the last days.

I used the same hostapd settings from the pages here, well almost:

driver=nl80211
ctrl_interface=/var/run/hostapd
ctrl_interface_group=0
ssid=MYSSID
wpa_passphrase=MYPASSWORD
interface=wlan0

# -------------------------------

# Band: a = 5g (a/n/ac), g = 2g (b/g/n)
hw_mode=g
# Channel
channel=11
# Country code
country_code=DE

beacon_int=100
dtim_period=2
max_num_sta=32
rts_threshold=2347
fragm_threshold=2346
#send_probe_response=1

# security
# auth_algs=3 is required for WPA3-SAE and WPA3-SAE Transition mode
auth_algs=1
macaddr_acl=0
ignore_broadcast_ssid=0
wpa=2
wpa_pairwise=CCMP
# WPA2-AES
wpa_key_mgmt=WPA-PSK
# WPA3-SAE
#wpa_key_mgmt=SAE
#wpa_group_rekey=1800
rsn_pairwise=CCMP
# ieee80211w=2 is required for WPA3-SAE
#ieee80211w=2
# If parameter is not set, 19 is the default value.
#sae_groups=19 20 21 25 26
#sae_require_mfp=1
# If parameter is not set, 5 is the default value.
#sae_anti_clogging_threshold=10

# IEEE 802.11n
ieee80211n=1
wmm_enabled=1

# mt7612u - mt7610u
ht_capab=[HT40+][HT40-][GF][SHORT-GI-20][SHORT-GI-40]

I had the same problems when downloading files from the internet. To ensure the problems are not caused by the servers or the internet connection, I created a small script that runs on a server in the LAN. When a client such as an iPhone, MacBook or Android phone opens the server's IP in its browser, the server starts to send random data to that browser indefinitely (via websocket) as fast as possible. In the browser I then get visual feedback when the connection is lost, and it keeps retrying.

With this test I only see connection losses on the iPhone, so not on the MacBook anymore (even after 6 hours the MacBook was still connected and receiving data). The connection loss of the iPhone usually happens within 15 to 60 minutes. And it doesn‘t matter if I run the test on multiple clients in parallel or on iPhone only (having the full bandwidth available).

One could say that this is an iPhone problem then. However, using the WiFi of my router, the same test with the same devices shows no problems. The iPhone does not have a single connection loss within 3 hours in this case.

I wonder if there are some hostapd settings I can use to solve this or if this is some compatibility issue between iPhone and the ACM, such that I need a different adapter to solve this (Realtek based?).

morrownr commented 1 year ago

Hi again @da-mkay

Glad you made it back.

I had to reread the entire thread to get back up to speed, which was a challenge for me as I am having vision problems that hopefully will be corrected over the next couple of months.

My thoughts at this point: (correct me if you see something you don't agree with)

It appears that your hostapd.conf changes have improved but not fully corrected the problem. That is good and indicates we have better settings in place but there are other things that could be contributing so maybe some more investigation is in order.

My use of my ACM with a RasPi4B and hostapd has mostly been while using 5 GHz and with care, I can see 400 Mbps indefinitely with whatever I connect. I say with care because I know exactly how to set it up so that I don't see any problems. There are problems to be had with the bad USB3 chipset in the Pi4B and I know not to use a powered hub as they are problematic when setting up AP's on Pi's.

My recommendation for testing this setup is for you to put the ACM in the USB2 port that is on the same plane as the circuit board and remove the flash drive for now. This is for testing to eliminate possible sources of problems.

If you are not seeing any problems in the hostapd log, then how about we press on and see if anything is showing up in the PiOS log:

$ sudo dmesg

Something to keep in mind, given that you are working with a very modern iPhone and a very mature MacBook, is that incompatibilities are more likely to show up at the new and old ends of the spectrum. I wish the new PiOS based on Debian 12, which will hopefully be released soon, was already available, as it is a rebase that updates many badly needed parts that are really showing their age in the current PiOS. There are dated versions of wpa_supplicant, hostapd, and other things that could be contributing to this problem. I can show you how to compile and install new versions of these apps if you are interested and do not want to wait for the next PiOS.

I would like to try to duplicate this problem but I do not own an iPhone. My son is planning to buy an iPhone 14 soon so there is hope.

I think we have run into a very specific issue that is showing up due to the specific combination of hardware and software that is in use. If it is a bug in the ACM driver and we can show exactly where the problem happens, then we can report it.

such that I need a different adapter to solve this (Realtek based?).

I maintain 6 Realtek out-of-kernel drivers here and would tell you if there is a better solution. The mt7612u driver and chipset are pretty darn stable. Could there be a bug that is causing this? Sure. My recommendation is to continue testing to see if we can determine the problem.

Cheers

da-mkay commented 1 year ago

I had to reread the entire thread to get back up to speed, which was a challenge for me as I am having vision problems that hopefully will be corrected over the next couple of months.

I keep my fingers crossed!

It appears that your hostapd.conf changes have improved but not fully corrected the problem.

Oops, I forgot to mention that I did the same (new) test also with my old settings. The result was the same for my old settings and the new ones: MacBook Pro showed no error anymore, iPhone did. Maybe the MacBook errors in the past were caused by internet connection or the download servers.

My recommendation for testing this setup is for you to put the ACM in the USB2 port that is on the same plane as the circuit board and remove the flash drive for now. This is for testing to eliminate possible sources of problems.

I am already using the USB2 Ports (tested both). And no flash drive, USB hub etc. is connected. Only USB-C to power the Raspi4 is connected, Ethernet and the ACM via USB2.

If you are not seeing any problems in the hostapd log, then how about we press on and see if anything is showing up in the PiOS log:

$ sudo dmesg

Yes, nothing gets logged when the iPhone loses connection, even when running hostapd with the -dd flags. During connection loss dmesg does not show anything either.

The only thing logged regarding the ACM is:

$ sudo dmesg | grep mt76
[    8.794444] mt76x2u 1-1.3:1.0: ASIC revision: 76120044
[    8.863772] mt76x2u 1-1.3:1.0: ROM patch build: 20141115060606a
[    9.050146] mt76x2u 1-1.3:1.0: Firmware Version: 0.0.00
[    9.050190] mt76x2u 1-1.3:1.0: Build: 1
[    9.050206] mt76x2u 1-1.3:1.0: Build Time: 201507311614____
[    9.934644] usbcore: registered new interface driver mt76x2u

Something to keep in mind, given that you are working with a very modern iPhone and a very mature MacBook, is that incompatibilities are more likely to show up at the new and old ends of the spectrum. I wish the new PiOS based on Debian 12, which will hopefully be released soon, was already available, as it is a rebase that updates many badly needed parts that are really showing their age in the current PiOS. There are dated versions of wpa_supplicant, hostapd, and other things that could be contributing to this problem. I can show you how to compile and install new versions of these apps if you are interested and do not want to wait for the next PiOS.

I just compiled hostapd 2.10 from sources, but the problem was the same: connection loss of iPhone after 19 minutes.

I would like to try to duplicate this problem but I do not own an iPhone. My son is planning to buy an iPhone 14 soon so there is hope.

Haha, I bet he will not give his new phone away for a few testing hours 😋

I think we have run into a very specific issue that is showing up due to the specific combination of hardware and software that is in use. If it is a bug in the ACM driver and we can show exactly where the problem happens, then we can report it.

My recommendation is to continue testing to see if we can determine the problem.

Unfortunately I am running out of ideas what I can test. hostapd has so many settings to fine tune. Would be great to take some very stable configuration from a project like ddwrt or openwrt, but I couldn't find one.

morrownr commented 1 year ago

Would be great to take some very stable configuration from a project like ddwrt or openwrt, but I couldn't find one.

This might be a good idea and you do appear to be a rather knowledgeable person. I have used OpenWRT for a long time. It uses hostapd but it is in its own very stable OS. The Pi4B is one of the many systems supported by OpenWRT:

https://openwrt.org/toh/raspberry_pi_foundation/raspberry_pi

If you have an extra sd card and want to set your Pi4B up with OpenWRT, I say go for it. OpenWRT has a driver for the ACM but you do have to install it with Luci or manually depending on what you decide. This would be interesting. You might enjoy it. If you have questions, somebody around here can probably point you in the right direction but the OpenWRT forums are very active.

bjlockie commented 1 year ago

Maybe try disabling scatter gather. Google how. It doesn't appear to be the problem but it's worth a try.

Maybe some sort of power saving thing?

morrownr commented 1 year ago

@bjlockie

Maybe try disabling scatter gather.

I'm pretty sure I saw above somewhere where he said that he tried that. This thread is long and it is hard to remember everything. With that said, I have never seen the need for turning scatter gather off with AP mode using the 2.4 GHz band. I think that is because of the slower speeds on 2.4 GHz.

Maybe some sort of power saving thing?

I'm hesitant to rule this out but I'm beginning to form an opinion that the disconnection is not being initiated by the AP he has built. We are not seeing anything in the logs that would indicate such. I have no idea how to get information from the logs in an iPhone or MacBook Pro because I do not have that hardware but if I did, I would be looking for clues in the logs of those devices. This does not mean that those devices are at fault.

I recommended that he look at digging another sd card out so as to burn it with OpenWRT for these reasons:

da-mkay commented 1 year ago

Yes, I already have disable_usb_sg=1.

I also just tried OpenWRT on the RPi4 with ACM and experienced the same issue. The iPhone lost connection while running the test for 10 to 20 minutes. The MacBook from 2013 kept running for 90 minutes until I stopped the test manually. Again nothing was logged in dmesg during iPhone connection loss.

morrownr commented 1 year ago

@da-mkay

Random thoughts:

Would you consider changing to 5 GHz for testing? With OpenWRT on the Pi, it is easy with Luci. With RasPiOS, several hostapd.conf lines will need to be added or changed. Also, need to change the ACM to a USB3 port or the speed won't be where we would expect. I understand that you are probably using 2.4 for the range but the ACM does reasonably well on 5 GHz.

You could test different 2.4 GHz channels as well. This really is a puzzle and I know you want solid results right now but wifi can work in mysterious ways.

morrownr commented 1 year ago

@da-mkay

FYI: A couple of days ago I threw a clean installation of OpenWRT 22.03.5 on my ARM based wifi router. The router has a USB2 port and a USB3 port. I put the ACM in the USB3 port. I set it to channel 6. My router uses channel 1 as it is the best local 2.4 GHz channel. I installed iperf3 and the driver for the ACM.

My mission was to see if the ACM would drop the connection. I tried several USB WiFi adapters as clients. Nothing happened in about 30 minutes. I tried my notebook computer. I finally decided to do a long test, so I ran one overnight. It was still going this morning. Not a single retry.

Okay, so my RasPi4B was not used as it is busy on another project but I did what I could with what is available. I got nothing.

Do you have anything besides the RasPi4B to test? X86 boxes are generally very stable. A RasPi3B or 3B+ would be good boxes to test with as they can handle 2.4 GHz speeds and would eliminate some things that might be a problem in the 4B.

Where are we? The iPhone 14 and RasPi4B are still possibilities to be the guilty party.

Have you read any good iPhone 14 support forums to see if any other iPhone 14 users are reporting anything similar? Any recent updates to the iPhone 14 that are not installed yet?

da-mkay commented 1 year ago

Thanks for your effort @morrownr ! I think you really need an iPhone to see the issue. Because I don't have any issues with Android Phones or MacBooks.

I also tested 5 GHz now on OpenWRT where I installed kmod-mt76x2u:

Saying "connection loss" I mean that the websocket-connection was lost. Usually, the iPhone still thought for some time, that it was connected to the WiFi. But no new connection was possible until WiFi reconnected.

I also had the chance to test an iPhone 11 now running the same most recent iOS -> same problem. Once the test on the iPhone 11 failed, I saw this in dmesg:

[  639.545531] mt76x2u 1-1.4:1.0: error: mt76x02u_mcu_wait_resp failed with -110
[  642.105634] mt76x2u 1-1.4:1.0: error: mt76x02u_mcu_wait_resp failed with -110

Moreover, reconnect to the WiFi was not possible anymore. Actually, nobody could connect anymore. Not even the MacBook. I had to restart the Raspi.

I tested again using iPhone 14 Pro, and this time came to the same result:

[ 1958.199170] mt76x2u 1-1.4:1.0: error: mt76x02u_mcu_wait_resp failed with -110
[ 1960.759180] mt76x2u 1-1.4:1.0: error: mt76x02u_mcu_wait_resp failed with -110

Again, no connection to WiFi was possible and I had to restart the Raspi.
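The `-110` in these messages is a negated Linux errno. On ARM and x86 Linux, 110 is `ETIMEDOUT`, which fits a driver that gave up waiting for a response from the adapter. A minimal sketch for counting such lines in a kernel log, assuming the message format shown above (the helper name `mt76_error_summary` is hypothetical, and only the one errno seen in this thread is translated):

```shell
#!/bin/sh
# Hypothetical helper: count mt76x02u_mcu_wait_resp failures per errno
# in a kernel log read from stdin.
mt76_error_summary() {
    # Keep only the numeric error code of each matching line.
    sed -n 's/.*mt76x02u_mcu_wait_resp failed with -\([0-9][0-9]*\).*/\1/p' |
    sort -n | uniq -c |
    while read -r count code; do
        case "$code" in
            110) name=ETIMEDOUT ;;   # errno 110 on ARM and x86 Linux
            *)   name=unknown ;;
        esac
        echo "$count x errno $code ($name)"
    done
}
```

It could be fed with `sudo dmesg | mt76_error_summary` right after a failed test run.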

I also tested OpenWRT on a USB-Stick installation on my x86 PC. At first it looked like it was more stable, but in the end I saw the same problems on the iPhone. I tested it with USB3.

Since only iPhones are affected it could be an iPhone issue. But again, when connecting it to the WiFi of my router, I don't see those issues. Moreover, by now I am a bit worried about running the ACM as an AP, because it seems that any WiFi client could break the WiFi, not only accidentally but also on purpose by attackers. Doesn't feel good.

Regarding the errors in dmesg, I found something similar here, but with a different error code. The solution was to use a kernel 6.4, my RaspiOS has 6.1. Maybe I give that a try.

morrownr commented 1 year ago

The solution was to use a kernel 6.4, my RaspiOS has 6.1. Maybe I give that a try.

I have a good guide for compiling a new kernel for the RasPiOS if you are interested.

I'm seeing some good info in your post. I can do some searching.

It is very possible that this is a bug that has been fixed either intentionally or otherwise.

da-mkay commented 1 year ago

Shortly after sending my last comment I realised that the error shown in dmesg happened when using OpenWRT on x86 machine, not when using my Raspi. Nevertheless, I compiled kernel 6.4 for Raspi (official guide). However, nothing has improved. iPhones still show connection losses during my test. However, until now it did not break the whole AP, as it did before sometimes.

I also tried an iPad with iPadOS 15.x. No problems there within 6 hours. Even after upgrading to newest iPadOS 16.6: no problems within 3 hours.

Since the iPhones also showed no errors when connected to my router's WiFi, I tried again with the Raspi's internal WiFi. Here, also no connection loss happened within 3 hours.

So, somehow the ACM and the iPhones do not work well together in 2.4GHz mode. But I’m most concerned about the fact that sometimes the whole AP broke down.

morrownr commented 1 year ago

@ZerBea

I am adding you to this issue because we are at a point in troubleshooting this issue where the best option to continue may be to capture packets from an iPhone 14. I have some health issues that may limit my ability to help starting this week and you are far more knowledgeable than me. The OP is @da-mkay and he has worked hard to help determine where the issue is.

If you have time to take a look, I would appreciate it.

@morrownr

ZerBea commented 1 year ago

@morrownr , thanks for the invitation to this report.

Usually log files and configuration settings are a good starting point. Unfortunately, there are way too many screws you can turn to figure out what exactly went wrong. I recommend running tshark or Wireshark in parallel on the AWUS036ACM device (in monitor mode) to view what it receives and transmits. Additionally, I recommend setting up a second system and running tshark or Wireshark there to get an independent view of the traffic that is really going over the air. Then compare the log files of the AP machine with the dump file from the AP machine and the dump file from the independent monitor machine to hunt for the problem.

ZerBea commented 1 year ago

As a starting point, take a look at the Frame Control Field:

Frame Control Field: 0x0802
    .... ..00 = Version: 0
    .... 10.. = Type: Data frame (2)
    0000 .... = Subtype: 0
    Flags: 0x02
        .... ..10 = DS status: Frame from DS to a STA via AP(To DS: 0 From DS: 1) (0x2)
        .... .0.. = More Fragments: This is the last fragment
        .... 0... = Retry: Frame is not being retransmitted
        ...0 .... = PWR MGT: STA will stay up
        ..0. .... = More Data: No data buffered
        .0.. .... = Protected flag: Data is not protected
        0... .... = +HTC/Order flag: Not strictly ordered

where the Retry bit is a good indicator of the link quality:

.... 0... = Retry: Frame is not being retransmitted
The frame was successfully transmitted and acknowledged. The more retries (usually up to 7 frames) before the hardware gives up, the worse the quality.
.... 1... = Retry: Frame is being retransmitted

Check this in both directions:

AP to CLIENT
.... ..10 = DS status: Frame from DS to a STA via AP(To DS: 0 From DS: 1) (0x2)
CLIENT to AP
.... ..01 = DS status: Frame from STA to DS via an AP (To DS: 1 From DS: 0) (0x1)

I'd say that you have to figure out whether the problem is related to:
the RF connection between device and CLIENT (e.g. no RF connection any longer)
the device itself (e.g. overheating)
the connection between the device and the Raspberry (voltage unstable)
the software running on the Raspberry (usually hostapd: configuration or bug)
the kernel / driver running on the Raspberry (regression)
or the CLIENT.

If you "see" a stable connection on the RF side (frames are transmitted and acknowledged, no retries), but your download rate is going down, the problem is located on the Raspberry.
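To make the bit layout above concrete, here is a small sketch that decodes a Frame Control value written the way Wireshark displays it (for example 0x0802, where the first byte holds version, type and subtype and the second byte holds the flags). The helper name `decode_fc` and the output format are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: decode a Frame Control value as displayed by
# Wireshark, e.g. 0x0802 (first byte: version/type/subtype, second
# byte: flags).
decode_fc() {
    fc=$(( $1 ))                    # accepts hex such as 0x0802
    first=$(( (fc >> 8) & 0xff ))   # version / type / subtype byte
    flags=$(( fc & 0xff ))          # flags byte
    echo "type=$(( (first >> 2) & 3 )) subtype=$(( (first >> 4) & 15 ))" \
         "to_ds=$(( flags & 1 )) from_ds=$(( (flags >> 1) & 1 ))" \
         "retry=$(( (flags >> 3) & 1 )) pwr_mgt=$(( (flags >> 4) & 1 ))"
}

decode_fc 0x0802   # the AP-to-client data frame shown above
decode_fc 0x080a   # the same frame with the Retry bit set
```

In a capture, retransmitted frames can also be selected directly with the Wireshark display filter `wlan.fc.retry == 1`.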

ZerBea commented 1 year ago

There is a relationship between rate and RF-bandwidth. The higher the rate the greater the RF-bandwidth the lower the range. https://dsp.stackexchange.com/questions/29555/what-is-bit-rate-how-is-it-related-to-bandwidth

Compared to your MAC the iPhone antenna is poor.

MAC >>++++++++++++++++++++++++..AP..................-----> 
iPhone >>+++++++++++++++++++++.......AP------------------>
+ = good signal strength
. = usable signal strength
- = signal lost

Now we increase the rate (that will increase the bandwidth and reduce the range):

MAC >>+++++++++++++++++++..............AP....-----> 
iPhone >>++++++++++++++..............-----.AP------------------>
+ = good signal strength
. = usable signal strength
- = signal lost

In general:

CLIENT++++++++++++++++++++++++AP+++...........------>  == no retries
CLIENT++++++++++++++++++++++....AP..........------------->  == a few retries
CLIENT+++++++++++++++........--------.AP--------------------->  == maximum retries & connection gets lost

If everything is fine, you can leave this RF part, because you know that the device is working as expected (on the RF side).
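One way to put a number on "no retries / a few retries / maximum retries" is to compute the share of frames in a capture that have the Retry bit set. The sketch below assumes one value per frame on stdin, for instance the `wlan.fc.retry` field exported by tshark. How that field is printed depends on the tshark version, so the sketch accepts both `1` and `True`. The helper name `retry_rate` is hypothetical, and no threshold for "too many" is implied.

```shell
#!/bin/sh
# Hypothetical helper: report how many frames had the Retry bit set.
# Input: one value per line (1/0 or True/False), one line per frame.
retry_rate() {
    awk '
        $1 == "1" || $1 == "True" { retries++ }
        $1 != ""                  { total++ }
        END {
            if (total == 0) { print "no frames"; exit }
            printf "%d/%d frames retried (%.1f%%)\n", retries, total, 100 * retries / total
        }
    '
}
```

A possible invocation, with a placeholder capture file name and client MAC address, would be `tshark -r capture.pcapng -Y 'wlan.addr == aa:bb:cc:dd:ee:ff' -T fields -e wlan.fc.retry | retry_rate`.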

ZerBea commented 1 year ago

Next step is to check the connection between the RPi and the device, especially the power consumption. Voltage and current should be stable over a long time.
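On a Raspberry Pi, a quick first check of the supply side is `vcgencmd get_throttled`, which prints a bit mask such as `throttled=0x50005`. The sketch below decodes the under-voltage and throttling bits of that value. Note the assumptions: the helper name `decode_throttled` is made up, only four of the documented bits are handled, and the value reflects the Pi's own supply rail only. It does not measure what the USB adapter draws, which would need an inline USB power meter.

```shell
#!/bin/sh
# Hypothetical helper: decode selected bits of the value printed by
# `vcgencmd get_throttled`, e.g. "throttled=0x50005".
decode_throttled() {
    value=$(( ${1#throttled=} ))
    [ "$value" -eq 0 ] && { echo "no flags set"; return 0; }
    [ $(( value & 0x1 )) -ne 0 ]     && echo "under-voltage now"          # bit 0
    [ $(( value & 0x4 )) -ne 0 ]     && echo "throttled now"              # bit 2
    [ $(( value & 0x10000 )) -ne 0 ] && echo "under-voltage since boot"   # bit 16
    [ $(( value & 0x40000 )) -ne 0 ] && echo "throttled since boot"       # bit 18
    return 0
}
```

Usage on the Pi: `decode_throttled "$(vcgencmd get_throttled)"`, ideally once before and once after a test run.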

ZerBea commented 1 year ago

Now do the same as mentioned above, but on different kernels to make sure it is not a driver regression.

ZerBea commented 1 year ago

BTW: https://github.com/morrownr/USB-WiFi/issues/300#issuecomment-1672320973

It is always good to check (set) the regulatory domain setting, because the impact is huge:

$ sudo iw reg set 00
$ hcxdumptool -I wlp22s0f0u9u3

Requesting physical interface capabilities. This may take some time.
Please be patient...

interface information:

phy idx hw-mac       virtual-mac  m ifname           driver (protocol)
---------------------------------------------------------------------------------------------
  0   3 00c0caae1f73 d85dfb54fffd * wlp22s0f0u9u3    mt76x2u (NETLINK)

available frequencies: frequency [channel] tx-power of Regulatory Domain: 00

  2412 [  1] 20.0 dBm     2417 [  2] 20.0 dBm     2422 [  3] 20.0 dBm     2427 [  4] 20.0 dBm
  2432 [  5] 20.0 dBm     2437 [  6] 20.0 dBm     2442 [  7] 20.0 dBm     2447 [  8] 20.0 dBm
  2452 [  9] 20.0 dBm     2457 [ 10] 20.0 dBm     2462 [ 11] 20.0 dBm     2467 [ 12] 20.0 dBm
  2472 [ 13] 20.0 dBm     2484 [ 14] 20.0 dBm     5180 [ 36] 20.0 dBm     5200 [ 40] 20.0 dBm
  5220 [ 44] 20.0 dBm     5240 [ 48] 20.0 dBm     5260 [ 52] 20.0 dBm     5280 [ 56] 20.0 dBm
  5300 [ 60] 20.0 dBm     5320 [ 64] 20.0 dBm     5500 [100] 20.0 dBm     5520 [104] 20.0 dBm
  5540 [108] 20.0 dBm     5560 [112] 20.0 dBm     5580 [116] 20.0 dBm     5600 [120] 20.0 dBm
  5620 [124] 20.0 dBm     5640 [128] 20.0 dBm     5660 [132] 20.0 dBm     5680 [136] 20.0 dBm
  5700 [140] 20.0 dBm     5720 [144] 20.0 dBm     5745 [149] 20.0 dBm     5765 [153] 20.0 dBm
  5785 [157] 20.0 dBm     5805 [161] 20.0 dBm     5825 [165] 20.0 dBm     5845 [169] disabled
  5865 [173] disabled

$ sudo iw reg set US
$ hcxdumptool -I wlp22s0f0u9u3

Requesting physical interface capabilities. This may take some time.
Please be patient...

interface information:

phy idx hw-mac       virtual-mac  m ifname           driver (protocol)
---------------------------------------------------------------------------------------------
  0   3 00c0caae1f73 d85dfb54fffd * wlp22s0f0u9u3    mt76x2u (NETLINK)

available frequencies: frequency [channel] tx-power of Regulatory Domain: US

  2412 [  1] 23.0 dBm     2417 [  2] 23.0 dBm     2422 [  3] 23.0 dBm     2427 [  4] 23.0 dBm
  2432 [  5] 23.0 dBm     2437 [  6] 23.0 dBm     2442 [  7] 23.0 dBm     2447 [  8] 23.0 dBm
  2452 [  9] 23.0 dBm     2457 [ 10] 23.0 dBm     2462 [ 11] 23.0 dBm     2467 [ 12] disabled
  2472 [ 13] disabled     2484 [ 14] disabled     5180 [ 36] 20.0 dBm     5200 [ 40] 20.0 dBm
  5220 [ 44] 20.0 dBm     5240 [ 48] 20.0 dBm     5260 [ 52] 20.0 dBm     5280 [ 56] 20.0 dBm
  5300 [ 60] 20.0 dBm     5320 [ 64] 20.0 dBm     5500 [100] 20.0 dBm     5520 [104] 20.0 dBm
  5540 [108] 20.0 dBm     5560 [112] 20.0 dBm     5580 [116] 20.0 dBm     5600 [120] 20.0 dBm
  5620 [124] 20.0 dBm     5640 [128] 20.0 dBm     5660 [132] 20.0 dBm     5680 [136] 20.0 dBm
  5700 [140] 20.0 dBm     5720 [144] 20.0 dBm     5745 [149] 20.0 dBm     5765 [153] 20.0 dBm
  5785 [157] 20.0 dBm     5805 [161] 20.0 dBm     5825 [165] 20.0 dBm     5845 [169] 20.0 dBm
  5865 [173] 20.0 dBm

$ sudo iw reg set IN
$ hcxdumptool -I wlp22s0f0u9u3

Requesting physical interface capabilities. This may take some time.
Please be patient...

interface information:

phy idx hw-mac       virtual-mac  m ifname           driver (protocol)
---------------------------------------------------------------------------------------------
  0   3 00c0caae1f73 d85dfb54fffd * wlp22s0f0u9u3    mt76x2u (NETLINK)

available frequencies: frequency [channel] tx-power of Regulatory Domain: IN

  2412 [  1] 23.0 dBm     2417 [  2] 23.0 dBm     2422 [  3] 23.0 dBm     2427 [  4] 23.0 dBm
  2432 [  5] 23.0 dBm     2437 [  6] 23.0 dBm     2442 [  7] 23.0 dBm     2447 [  8] 23.0 dBm
  2452 [  9] 23.0 dBm     2457 [ 10] 23.0 dBm     2462 [ 11] 23.0 dBm     2467 [ 12] 23.0 dBm
  2472 [ 13] 23.0 dBm     2484 [ 14] disabled     5180 [ 36] 20.0 dBm     5200 [ 40] 20.0 dBm
  5220 [ 44] 20.0 dBm     5240 [ 48] 20.0 dBm     5260 [ 52] 20.0 dBm     5280 [ 56] 20.0 dBm
  5300 [ 60] 20.0 dBm     5320 [ 64] 20.0 dBm     5500 [100] 20.0 dBm     5520 [104] 20.0 dBm
  5540 [108] 20.0 dBm     5560 [112] 20.0 dBm     5580 [116] 20.0 dBm     5600 [120] 20.0 dBm
  5620 [124] 20.0 dBm     5640 [128] 20.0 dBm     5660 [132] 20.0 dBm     5680 [136] 20.0 dBm
  5700 [140] 20.0 dBm     5720 [144] 20.0 dBm     5745 [149] 20.0 dBm     5765 [153] 20.0 dBm
  5785 [157] 20.0 dBm     5805 [161] 20.0 dBm     5825 [165] 20.0 dBm     5845 [169] 20.0 dBm
  5865 [173] 20.0 dBm

da-mkay commented 1 year ago

@ZerBea Thank you for your input.

I recommend to run tshark / or Wireshark in parallel on the AWUS036ACM device (in monitor mode) to view what it receive/transmit.

You mean without running the ACM as AP? Because, when setting it to monitor mode I cannot use it as AP anymore. But if I do not run the ACM as AP I cannot reproduce the problem because the ACM is part of it 😉

If I run tshark while in master mode I do not get the WLAN packets.

ZerBea commented 1 year ago

Mixed mode (MONITOR and AP) should work fine:

$ lsusb
ID 0e8d:7612 MediaTek Inc. MT7612U 802.11a/b/g/n/ac Wireless Adapter
$ iw dev
phy#1
    Interface mon0
        ifindex 6
        wdev 0x100000002
        addr 00:c0:ca:ad:0e:49
        type monitor
        txpower 20.00 dBm
    Interface wlp22s0f0u9u3
        ifindex 5
        wdev 0x100000001
        addr 00:c0:ca:ad:0e:49
        ssid WPA3-Network
        type AP
        channel 1 (2412 MHz), width: 20 MHz (no HT), center1: 2412 MHz
        txpower 20.00 dBm
        multicast TXQ:
            qsz-byt qsz-pkt flows   drops   marks   overlmt hashcol tx-bytes    tx-packets
            0   0   0   0   0   0   0   0       0

run iw to add monitor interface
run ip link to bring interface up
run Wireshark on monitor interface
run hostapd on AP interface
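
The four steps above could be sketched roughly as follows. This is a minimal sketch, not necessarily the exact commands used here: the interface names are taken from the `iw dev` output above, the helper name `mixed_mode_cmds` and the hostapd config path are placeholders, and the function only prints the commands so they can be reviewed before running them.

```shell
#!/bin/sh
# Print the commands for running AP and monitor mode on the same phy.
# Assumption: $1 is the existing AP interface, $2 the name of the new monitor interface.
mixed_mode_cmds() {
    ap_if="$1"
    mon_if="$2"
    # 1. add a monitor interface next to the existing AP interface
    echo "sudo iw dev $ap_if interface add $mon_if type monitor"
    # 2. bring the monitor interface up
    echo "sudo ip link set dev $mon_if up"
    # 3. capture on the monitor interface (tshark here, Wireshark works the same way)
    echo "sudo tshark -i $mon_if"
    # 4. start hostapd on the AP interface (config path is a placeholder)
    echo "sudo hostapd /etc/hostapd/hostapd.conf"
}

mixed_mode_cmds wlp22s0f0u9u3 mon0
```

Steps 3 and 4 both stay in the foreground, so in practice they would run in separate terminals.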

ZerBea commented 1 year ago

To monitor the NETLINK communication, just set up a NETLINK MONITOR interface (nlmon):

sudo ip link add  nlmon0 type nlmon
sudo ip link set dev nlmon0 up

run hostapd on AP interface:
sudo hostapd wlan/sae/sae/hostapd.conf

run Wireshark on nlmon interface and look for NL80211 packets:
696 22:34:53,167838265 00:c0:ca:ad:0e:49 cc:f7:35:70:21:64 802.11 168 Probe Response, SN=0, FN=0, Flags=........, BI=100, SSID="WPA3-Network"

You can do that in parallel with the WLAN MONITOR interface (mon0).

Wait until the problem occurs. Then inspect the last packets (of WLAN MONITOR interface and NETLINK NL80211 packets).
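
Since the problem may take a long time to show up, one option is to let tshark write into a ring buffer so that only the most recent packets are kept on disk. The following is a sketch under assumptions: the helper name `ring_capture_cmd`, the file sizes and the `/tmp` output path are illustrative choices, and the function only prints the commands.

```shell
#!/bin/sh
# Print a tshark command that captures into a ring buffer of files.
# $1 = interface; 5 files of roughly 100 MB each are kept (filesize is given in kB).
ring_capture_cmd() {
    echo "sudo tshark -i $1 -b filesize:102400 -b files:5 -w /tmp/$1.pcapng"
}

# One capture per interface: WLAN monitor (mon0) and NETLINK monitor (nlmon0)
for ifname in mon0 nlmon0; do
    ring_capture_cmd "$ifname"
done
```

When the hang occurs, stop both captures and open the newest file of each set in Wireshark.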

da-mkay commented 1 year ago

Thank you again! I got it working in mixed mode. Unfortunately I missed your last message about monitoring NETLINK communication. So, I only captured packets on mon0 and checked the retry flag:

wlan.da == IPHONE-MAC && wlan.fc.type == 2
--> 3457 packets to iPhone
wlan.da == IPHONE-MAC && wlan.fc.type == 2 && wlan.fc.retry == 1
--> 0 packets to iPhone were retried
wlan.sa == IPHONE-MAC && wlan.fc.type == 2
--> 458673 packets from iPhone
wlan.sa == IPHONE-MAC && wlan.fc.type == 2 && wlan.fc.retry == 1
--> 7221 = 1.57% of packets from iPhone were retried
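
The same counts and the retry percentage could also be produced on the command line. This is a sketch, assuming the capture was saved to a file; the helper names `retry_pct` and `count_frames` and the file name in the usage note are placeholders.

```shell
#!/bin/sh
# Percentage of retried frames among all counted frames, two decimals.
retry_pct() {
    awk -v retried="$1" -v total="$2" \
        'BEGIN { if (total == 0) { print "n/a"; exit } printf "%.2f\n", 100 * retried / total }'
}

# Count the frames in a capture file that match a Wireshark display filter.
# $1 = capture file, $2 = display filter
count_frames() {
    tshark -r "$1" -Y "$2" | wc -l
}

# Numbers reported above for data frames sent by the iPhone
retry_pct 7221 458673
```

For example, `count_frames mon0.pcapng 'wlan.sa == IPHONE-MAC && wlan.fc.type == 2 && wlan.fc.retry == 1'` (with the real MAC address filled in) would give the retried-frame count used as the first argument.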

In the meantime I bought another adapter, an 8812bu based Alfa AWUS036ACU. Using this adapter and the same hostapd settings (except for ht_capab) works pretty well using the out-of-kernel driver. I did not get a single connection loss on the iPhone within hours of testing. And speed seems to be fine, if not better than ACM. Maybe speed was better on ACM when multiple clients tried to max out the bandwidth in parallel.

However, there is also one small issue with the ACU: I noticed when running my test on multiple clients in parallel, that somehow one device often wins, meaning that it gets almost all available bandwidth while the other clients slow down drastically. Often, the speed on those devices goes down to 0 for multiple minutes. Sometimes they then stabilize and start to receive data again at a few Mbps. Sometimes they stay in the state where they get a data packet only every few minutes. On rare occasions I also saw a websocket connection loss on the Android phone, but the connection to the WiFi was never lost.

Is there a way to ensure bandwidth is spread almost-equally across clients? Because besides this small issue the setup looks almost perfect now and is more stable than the ACM setup before. The wifi never broke down during my tests as it did with the ACM several times.

ZerBea commented 1 year ago

Thanks for the information, which is very useful. I can partially reproduce this behavior.

Partially means that not all combinations of WiFi hardware are affected.

Some combinations are working fine, while others run into serious problems. This is a combination that caused a lot of trouble. These devices don't work well with each other:
ALFA AWUS036ACM <---> RTL8821CE 802.11ac PCIe Wireless Network Adapter

ALFA AWUS036ACM (penetration device)
ID 0e8d:7612 MediaTek Inc. MT7612U 802.11a/b/g/n/ac Wireless Adapter
running in "active monitor mode" - transmit/receive 802.11 raw packets

CLIENT (target)
Realtek PCIe device
ID 04:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821CE 802.11ac PCIe Wireless Network Adapter
running in managed mode - respond 802.11 packets

The AWUS036ACM doesn't "see" all packets (the first time) transmitted by the CLIENT. That depends on:
the distance between the ACM and the CLIENT - the shorter the distance (e.g. < 50cm), the greater the loss
the higher the transmit (TX) power on both sides (in combination with a short distance), the greater the loss
the higher the rate and, as a result of this, the greater the bandwidth, the greater the loss
the higher the rate and the greater the bandwidth, combined with interference from neighbor channels: huge loss

I do not use hostapd, so I don't know if hostapd changes channel, rate and bandwidth in case of interference, like my test router does. If not, this might be one reason.

For me, it looks like the ACM is an excellent device to be used as a CLIENT, but it has some problems running as an ACCESS POINT (AP), because the requirements (running as AP or CLIENT) are very different. I assume the hardware of an AP is different from the hardware of a WiFi USB pen, and the ACM doesn't provide this.

To make sure it is not related to the chipset itself, I ordered an ALL WA1200AC (same chipset, same driver): https://www.allnet.de/en/allnet-brand/produkte/wlan/wlan-adapter/p/allnet-all-wa1200ac-1200mbit-wireless-ac-usb-30-dual-band-adapter/

BTW: This problem is not limited to the ACM. I discovered several other hardware combinations that don't work well with each other, too:

ipTIME <---> RTL8821CE 802.11ac PCIe Wireless Network Adapter

ipTIME WiFi USB pen ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter

Nearly the same problems.

FRITZ!Box 7430 (router) <---> RTL8821CE 802.11ac PCIe Wireless Network Adapter

No problems

da-mkay commented 1 year ago

Good to see that I am not the only one seeing these issues. I'm curious how the ALL WA1200AC will perform. Keep me updated 😉

In the meantime I returned the ACM. I am pretty pleased with the Alfa AWUS036ACU. But I am still trying to achieve fair bandwidth between clients. When running my test (server in LAN sends infinite content to clients connected to the ACU), bandwidth is not spread fairly between the clients. Instead, at some point in time one client gets stuck and receives the next data chunks only after seconds or even minutes. I played around with tc qdisc on wlan0 but could not really improve the situation. Could someone point me in the right direction? Could the in-kernel driver show improvements here?
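
For reference, a tc qdisc experiment of the kind mentioned above might look like the following. This is only a sketch of one possible attempt, not a confirmed fix: choosing fq_codel is an assumption, the helper name `fair_queue_cmds` is a placeholder, and a qdisc on the interface balances flows rather than WiFi stations, so queues inside the driver or firmware are not affected by it.

```shell
#!/bin/sh
# Print tc commands that put a flow-queueing qdisc on an interface.
# $1 = interface name (wlan0 in the test described above)
fair_queue_cmds() {
    # replace the root qdisc with fq_codel (per-flow queues with round-robin scheduling)
    echo "sudo tc qdisc replace dev $1 root fq_codel"
    # show qdisc statistics (drops, backlog) to see whether it has any effect
    echo "tc -s qdisc show dev $1"
}

fair_queue_cmds wlan0
```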

ZerBea commented 1 year ago

I got the ALLNET WA1200AC.

First tests are not very impressive, because it looks like mt76 USB 3.x support is completely broken: https://github.com/ZerBea/hcxdumptool/issues/337#issuecomment-1719231735

morrownr commented 1 year ago

@ZerBea

I don't have one of the ALLNET adapters but about a year ago or so I got a report from a user with a mt7612u based adapter that would not do USB3. He had 2 of the adapters and sent one to me and I still have it. Let me dig up my notes and pass them on to you.

Have you tried a different USB3 port on the same system? Have you tried a USB3 port on a different system?

it looks like mt76 USB 3.x support is completely broken

I have several adapters based on the mt7612u chipset and the only one that I have seen that had a problem with USB3 is the one I mentioned above. I have a laptop with gen 3.1 ports so let me take a look with some adapters and see what happens.

ZerBea commented 1 year ago

I did a lot more.

The motherboard is a MSI B-550A Pro (AMD).

Ports external:

USB-C 3.2 Gen 2 (10 Gbit/s) -> 1
USB-A 3.2 Gen 2 (10 Gbit/s) -> 1
USB-A 3.2 Gen 1 (5 Gbit/s) -> 2
USB-A 2.0 ->  4

Ports internal:

USB 3.2 Gen 1 (5 Gbit/s) ->  2
USB-C 3.2 Gen 2 (10 Gbit/s) -> 1
USB 2.0 -> 4

tested devices:

ALFA AWUS036ACM
ALLNET WA1200AC

Tested kernels 6.4 and 6.5

If the devices are connected to a USB 3.x port, they don't work.
If the devices are connected to a USB 2 port, everything is working as expected.
If an external USB 2 hub is connected to one of the USB 3.x ports and the ALFA or the ALLNET is connected to the USB 2 hub, everything is working as expected.

tested USB 2 hubs:

DIGITUS DA-70220
DIGITUS DA-70224

morrownr commented 1 year ago

Tested:

Alfa ACM (mt7612u)
Acer Aspire A514-54
11th Gen Intel® Core™ i5-1135G7 × 8
Ubuntu 23.04
Linux 6.2.0-32-generic

USB 3.2 Gen 1 (Type A)

Result:

$ lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 10000M
    |__ Port 4: Dev 2, If 0, Class=Vendor Specific Class, Driver=mt76x2u, 5000M

$ iw dev
phy#1
    Interface wlx00c0caadcb84
        ifindex 4
        wdev 0x100000001
        addr 00:c0:ca:ad:cb:84
        type managed
        txpower 23.00 dBm
        multicast TXQ:
            qsz-byt qsz-pkt flows   drops   marks   overlmt hashcol tx-bytes    tx-packets
            0   0   0   0   0   0   0   0       0
phy#0
    Interface wlp2s0
        ifindex 3
        wdev 0x1
        addr xx:4c:a1:76:1c:09
        ssid Txxxx
        type managed
        channel 1 (2412 MHz), width: 20 MHz, center1: 2412 MHz
        txpower 3.00 dBm
        multicast TXQ:
            qsz-byt qsz-pkt flows   drops   marks   overlmt hashcol tx-bytes    tx-packets
            0   0   0   0   0   0   0   0       0

The ACM is PHY#1 but you know that.

I'm sending this message connected with the ACM. I can test other adapters based on the mt7612u. Let me know if there is anything else I can test.
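
The last field of each `lsusb -t` line is the negotiated link speed (480M for USB 2.0, 5000M for a 5 Gbit/s USB3 link), which makes it easy to check whether an adapter really came up on USB3. A small sketch for pulling that field out; the helper name `usb_speed_for_driver` is a placeholder, and the sample line is copied from the output above.

```shell
#!/bin/sh
# Print the negotiated USB speed of every device bound to a given driver.
# Reads `lsusb -t` output on stdin; $1 is the driver name (e.g. mt76x2u).
usb_speed_for_driver() {
    awk -v pat="Driver=$1," 'index($0, pat) { print $NF }'
}

# Sample line from the lsusb -t output above
printf '%s\n' '|__ Port 4: Dev 2, If 0, Class=Vendor Specific Class, Driver=mt76x2u, 5000M' |
    usb_speed_for_driver mt76x2u
```

On a live system the call would be `lsusb -t | usb_speed_for_driver mt76x2u`.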

ZerBea commented 1 year ago

Thanks for the test. Intel systems are not affected.

I have a second AMD system MSI KRAIT. Same behavior as the MSI B-550A.

Not affected is my notebook (also AMD): ASUS TUF FX505DT. All three ports (2 x USB3 / 1 x USB 2) are working as expected - same kernel as the desktop system.

morrownr commented 1 year ago

Do you know what chipsets are used in the USB3 hubs in the systems where you are seeing the problem?

ZerBea commented 1 year ago

AMD Renoir/Cezanne https://linux-hardware.org/index.php?id=pci:1022-1639-1022-1639

morrownr commented 1 year ago

@ZerBea

AMD Renoir/Cezanne

Another piece of the puzzle.

@da-mkay probably thinks there are all kinds of incompatibilities when it comes to USB WiFi and he is right, but it is not as bad as it may look. The Plug and Play list that I started, which is on the Main Menu, is an attempt to select specific adapters for which there is evidence that they are relatively trouble free. The ACM that started this issue is about as solid and trouble free as any adapter that I am aware of.

However, the vast majority of USB3 capable adapters have only been tested on basic USB3, not USB3 genX. After testing my ACM on my modern laptop yesterday, I happened to test a rtl8832bu based adapter in a USB3 port on the other side of the system and the system would not recognize it. When I moved the adapter to the other side of the system, it showed up and worked fine.

I do answer a lot of questions from users given the 6 Realtek driver repos that are maintained here. Here is what I think to be true to minimize problems with USB WiFi adapters:

If you pay attention to the guidelines I just suggested, it won't guarantee you a problem free experience using USB WiFi adapters with Linux but it will increase the odds GREATLY.

If I come across any info to help @ZerBea, I will pass it on.

morrownr commented 10 months ago

@da-mkay

What is your status?

FYI: With the release of RasPiOS 2023-10-10, I have been working on my AP guide. Network Manager is now the default and dhcpcd is not even installed. I played around with some options but decided to disable NM and use systemd-networkd plus systemd-resolved along with hostapd. I am using my Alfa ACM with scatter-gather turned off. Three days into testing and abusing, it is a rock.
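
Turning scatter-gather off refers to the `disable_usb_sg` option mentioned at the start of this issue. A minimal sketch of how it could be set and checked, assuming the parameter belongs to the `mt76_usb` kernel module; the helper names and the config file name in the usage note are placeholders.

```shell
#!/bin/sh
# Print the modprobe option line that turns USB scatter-gather off.
# Assumption: disable_usb_sg is a parameter of the mt76_usb module.
sg_option_line() {
    echo "options mt76_usb disable_usb_sg=1"
}

# Report the current setting if the module is loaded (Y = scatter-gather disabled).
sg_status() {
    param=/sys/module/mt76_usb/parameters/disable_usb_sg
    if [ -r "$param" ]; then
        cat "$param"
    else
        echo "mt76_usb not loaded"
    fi
}

sg_option_line
sg_status
```

To make the setting persistent, the line could be written with `sg_option_line | sudo tee /etc/modprobe.d/mt76_usb.conf`, followed by a reboot.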

Setup: No powered hubs. ACM is plugged into a USB3 port. The only other device plugged into a USB port is a webcam plugged into a USB2 port. It provides video to my LAN for security reasons. I run the RasPi4B headless via VNC.

If you want to test the guide, please do.