kaloz / mwlwifi

mac80211 driver for the Marvell 88W8864 802.11ac chip

Increased memory usage on WRT3200ACM #152

Closed admiral0 closed 7 years ago

admiral0 commented 7 years ago

Hello,

I noticed that the 3200 has a "resting" memory usage of 190 MB of RAM. When I start transferring data heavily (300 Mbit/s), memory usage climbs until the system reboots.

  1. Any tips on how to debug this without a serial console on LEDE?
  2. This doesn't look like a memory leak, because it goes back to "resting" values if I stop transferring.

More info: I am running LEDE trunk and the latest master from this repo.
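
One way to watch the trend without a serial console is to sample /proc/meminfo over SSH and keep the log on another machine, so it survives the reboot. The following is only a minimal sketch, assuming Python 3 is installed on the router (it is not part of a default LEDE image); `parse_meminfo`, `format_sample` and `log_samples` are illustrative names, not part of LEDE or mwlwifi.

```python
import sys
import time


def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def format_sample(timestamp, fields):
    """Build one log line with the counters most relevant to this issue."""
    keys = ("MemFree", "Buffers", "Cached", "Slab")
    values = " ".join(f"{key}={fields.get(key, 0)}" for key in keys)
    return f"{int(timestamp)} {values}"


def log_samples(interval_s=5, count=None, out=sys.stdout, path="/proc/meminfo"):
    """Write one sample every interval_s seconds (forever if count is None)."""
    taken = 0
    while count is None or taken < count:
        with open(path) as handle:
            fields = parse_meminfo(handle.read())
        out.write(format_sample(time.time(), fields) + "\n")
        out.flush()  # flush each line so a sudden reboot loses very little
        taken += 1
        if count is None or taken < count:
            time.sleep(interval_s)
```

Calling `log_samples()` from a script started through SSH, with the output redirected on the client side, keeps the log off the router's RAM-backed /tmp. The last lines before the connection drops then show how low MemFree went.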

admiral0 commented 7 years ago

Dmesg : https://pastebin.com/3xgx3XKE Htop : http://i.imgur.com/uM1FDRl.png

yuhhaurlin commented 7 years ago

Thanks for your information. I suspect something; I will check it when I come back to the office next week.

farchord commented 7 years ago

Yeah, I'm getting the same problem. Within about 5-10 minutes tops, the router just freezes and reboots.

yuhhaurlin commented 7 years ago

Can you help confirm one thing for me? If only one interface is enabled (for example, 5 GHz), then this problem won't happen.

BrainSlayer commented 7 years ago

@yuhhaurlin I found out something. At least for me, this issue is only present if I start a bandwidth test on 5 GHz. It does not occur on the 2.4 GHz interface (both are activated). It's reproducible with a simple speed test with my iPhone 6 client.

yuhhaurlin commented 7 years ago

@BrainSlayer I can reproduce it on our AP DB board and WRT3200ACM: if only one interface is enabled, the memory leak problem won't happen. I only enabled 5 GHz on my AP DB board, so I can run an iperf test overnight without any problem. But if I enable 2.4 GHz at the same time, free memory drops quickly. I find it is the same on WRT3200ACM. So it is not related to any devices or any version of LEDE.

yuhhaurlin commented 7 years ago

That is the reason why I wanted to confirm this is the same problem encountered by the community.

BrainSlayer commented 7 years ago

@yuhhaurlin Sure, it's not related to LEDE. I did not even use LEDE for testing, but DD-WRT. Anyway, if I disable 2.4 GHz on my side, the problem is gone. But what I finally wanted to say, and this is more interesting: if both are activated, the issue is only present on 5 GHz and won't happen if I test 2.4 GHz. This is not related to any signal issue, since bandwidth is good in both cases in my test.

yuhhaurlin commented 7 years ago

I have not tested 2.4 GHz when both interfaces are enabled. I also tried 88W8864 (our AP DB board accepts different modules); this problem is not there. I will try to fix it. Thanks for your information.

farchord commented 7 years ago

I'd like to confirm: enabling only 5 GHz does not trigger this problem.

At least this makes my router a bit more usable in the meantime :)

yuhhaurlin commented 7 years ago

@farchord Thanks for your information.

BrainSlayer commented 7 years ago

It sounds to me like there is a global shared buffer which is not safe for multiple interfaces. I will try to do some research to find a possible cause.

yuhhaurlin commented 7 years ago

Yes. That is also one possible problem I am considering.

kb3tbx commented 7 years ago

Bringing my report from the other thread!

I have flashed the latest LEDE build (w/ 4.9.20 kernel) from @davidc502 on a WRT3200ACM, and I do not see a memory leak, with only limited testing so far: a slow 1.5 GB file download on the 5 GHz radio, 5.5 hours uptime.

I would like to confirm that this was my first use of LEDE, clean factory defaults w/ WPA2/AES encryption, maybe fq-codel configured, and I only had the 5 GHz radio enabled for several 1 GB files from a 3 MB/s cable ISP - easy work. I saw the memory come up reasonably during the transfer, and go back down afterwards. I had many more hours of uptime after...

I wanted to do a hairpin iperf3 test using a laptop on 5 GHz and an Android tablet with the Hurricane Electric app on 2.4 GHz, but a neighborhood power outage got in the way. Maybe I will be patient. Thanks again, Jim A.

farchord commented 7 years ago

If it helps, my wifi setup is essentially: all 3 interfaces have the same SSID, all on WPA2-Personal. And the memory doesn't increase during transfers, but it does incrementally as long as the interface is enabled.

EDIT: Never mind. It goes down MUCH faster if you transfer data.

jsgiv commented 7 years ago

I can confirm that this issue only occurs when the 2.4 GHz band is enabled....

Details:

WRT3200ACM; ATT Gigabit fiber connection

I'm currently running LEDE (last "stable" build from cybrnook -with latest "beta" driver - which appears to be compiled/pulled as of 4/28/2017) - located here: https://www.dropbox.com/sh/2a7hkorqir0ch5f/AAA0q3SrAWMVoHmXgnlYqFTQa/STABLE/r3356-STABLE-BETADRIVER?dl=0

The issue is very easy to reproduce:

If I go to dslreports.com/speedtest - and execute several back to back speed tests in a row:

I haven't tested Brainslayer's latest DD-WRT build - was going to but it appears he's also confirmed the same behavior already...

yuhhaurlin commented 7 years ago

Thanks, all of you. BTW, I closed #157. The log shows it is radio 2, which is not mwlwifi.

kb3tbx commented 7 years ago

Yes, we need to get more specific. Radio 0 and 1 have Marvell 88W8964 chips; only these need to be tested. Radio 0 is 5 GHz, 802.11 (a) (n) or (ac). Radio 1 is 2.4 GHz, 802.11 (b) (g) (n).

Even though 'Radio 2' has a very capable 88W8887 chip, it seems to be provisioned for receive-only monitoring, in conjunction with Radio 0, to accomplish the required DFS in the 5 GHz band. This should NOT be enabled as an access point while testing the mwlwifi driver.

kb3tbx commented 7 years ago

This graph from LEDE/LuCI shows memory while idle for a time, and the two dips are with 50 MB iperf3 transfers from laptop (server) on 5 GHz (radio0) to tablet (client) on 2.4 GHz (radio1). Quite a unique pattern.

[image: lede memory iperf3 capture]

Noltari commented 7 years ago

I tested the latest driver revision yesterday and I can say that memory issues are still present. For my test I only enabled the 5 GHz interface, and memory-related crashes of other running apps popped up (mcproxy, bird4, udpxy...).

farchord commented 7 years ago

@kb3tbx Sorry, I'm still a newb at all this stuff. I went ahead and enabled the other 5ghz radio (Radio0). I didn't know about that. I just woke up and yet can still say I'm gonna go to sleep tonight less stupid! XD

With the 2 5ghz bands enabled, we'll see if this goes well!

farchord commented 7 years ago

[image: memory]

So, I checked when I got back this afternoon. This morning around 5 I enabled the 5 GHz band. Also disabled radio2.

I get back this afternoon and I'm watching IPTV and it's pixelating as heck. I try to get in the router, I can't.

So I reboot it, and with the 5 GHz radio (radio0) enabled (radio1 and 2 are disabled) it's still filling its memory like crazy.

I'll go back to my ISP's modem for now I guess.

inteliboy commented 7 years ago

According to various sources, the WRT3200ACM has 128 MB of extra DDR3 memory for each radio, I assume for caching the data. Perhaps the mwlwifi driver does not take that into account but the firmware does, hence the increased usage of regular RAM? Just a guess.

Chadster766 commented 7 years ago

I can't reproduce this issue with kernel 4.9.26, hostapd 2.6 and wpa_supplicant 2.6.

root@MCDEBIAN:~# free
             total       used       free     shared    buffers     cached
Mem:        510404     262768     247636       9896       6956      71280
-/+ buffers/cache:     184532     325872
Swap:            0          0          0
root@MCDEBIAN:~# cat /sys/kernel/debug/kmemleak
root@MCDEBIAN:~#
root@MCDEBIAN:~# netstat -i
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
br0        1500 0    899345      0      0 0       1321699      0      0      0 BMRU
eth0       1500 0   1280634      0      0 0        846216      0      0      0 BMRU
eth1       1500 0  40481907      0      0 0      80630603      0      0      0 BMRU
lo        65536 0         9      0      0 0             9      0      0      0 LRU
wlan0      1500 0  79965508      0      0 0      40203411      0      0      0 BMRU
wlan1      1500 0    154521      0      0 0        220751      0      0      0 BMRU

BrainSlayer commented 7 years ago

@Chadster766 and you think 266 mb mem usage is normal?

BrainSlayer commented 7 years ago

@Chadster766 For me kernel 4.9 is not even working. I just get messages like "[ 4.051179] mwlwifi 0000:01:00.0: Refused to change power state, currently in D3" and the driver failed to work at all.

Chadster766 commented 7 years ago

Interesting. I'm not having any issue with mwlwifi and kernel 4.9.26:

root@MCDEBIAN:~# cat /proc/version
Linux version 4.9.26 (root@McDev2) (gcc version 4.8.4 20141219 (release) (4.8.4-1+11-1) ) #24 SMP Sun May 7 13:32:58 CDT 2017
root@MCDEBIAN:~# modinfo mwlwifi
filename:       /lib/modules/4.9.26/kernel/drivers/mwlwifi/mwlwifi.ko
license:        GPL v2
author:         Marvell Semiconductor, Inc.
version:        10.3.4.0-20170421
description:    Marvell Mac80211 Wireless PCIE Network Driver
srcversion:     DBEF86CABCDF914A7688AC7
alias:          pci:v000011ABd00002B40sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002B38sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A55sv*sd*bc*sc*i*
depends:
intree:         Y
vermagic:       4.9.26 SMP mod_unload ARMv7 p2v8
root@MCDEBIAN:~#

Chadster766 commented 7 years ago

If I load the same setup on a WRT1900AC V2 the memory usage is less:

root@MCDEBIAN:~# free
             total       used       free     shared    buffers     cached
Mem:        510392     106464     403928       5464       6348      56624
-/+ buffers/cache:      43492     466900
Swap:            0          0          0
root@MCDEBIAN:~# modinfo mwlwifi
filename:       /lib/modules/4.9.26/kernel/drivers/mwlwifi/mwlwifi.ko
license:        GPL v2
author:         Marvell Semiconductor, Inc.
version:        10.3.4.0-20170421
description:    Marvell Mac80211 Wireless PCIE Network Driver
srcversion:     DBEF86CABCDF914A7688AC7
alias:          pci:v000011ABd00002B40sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002B38sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A55sv*sd*bc*sc*i*
depends:
intree:         Y
vermagic:       4.9.26 SMP mod_unload ARMv7 p2v8

farchord commented 7 years ago

Hmmm I'll check with David if it's possible to compile a 4.9.26 kernel build and see if it's that simple...

Chadster766 commented 7 years ago

The 4.9.26 kernel isn't enough; you also need hostapd 2.6 and wpa_supplicant 2.6.

farchord commented 7 years ago

Thanks! :)

Chadster766 commented 7 years ago

The WRT3200ACM has been under heavy use.

root@MCDEBIAN:~# free
             total       used       free     shared    buffers     cached
Mem:        510400     232224     278176       9588      10984      33368
-/+ buffers/cache:     187872     322528
Swap:            0          0          0
root@MCDEBIAN:~# netstat -i
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
br0        1500 0   4149926      0      0 0       5909149      0      0      0 BMRU
eth0       1500 0   5768608      0      0 0       3970263      0      0      0 BMRU
eth1       1500 0   3117491      0      0 0       4503546      0      0      0 BMRU
lo        65536 0         3      0      0 0             3      0      0      0 LRU
wlan0      1500 0    438997      0      0 0        737777      0      0      0 BMRU
wlan1      1500 0    606534      0      0 0        776374      0      0      0 BMRU
root@MCDEBIAN:~#
root@MCDEBIAN:~# iw dev wlan0 station dump|grep -i station
Station 16:91:82:00:88:a8 (on wlan0)
Station bc:72:b1:79:6d:b8 (on wlan0)
Station 98:f1:70:08:10:0d (on wlan0)
root@MCDEBIAN:~# iw dev wlan1 station dump|grep -i station
Station 16:91:82:00:88:a6 (on wlan1)
Station ac:89:95:70:3d:83 (on wlan1)
Station 58:6d:8f:eb:67:33 (on wlan1)
Station 78:4f:43:10:6a:bb (on wlan1)
Station b0:05:94:11:ef:cb (on wlan1)
Station 6c:ad:f8:b8:53:c0 (on wlan1)

Chadster766 commented 7 years ago

Same setup on a WRT1900AC V1:

root@MCDEBIAN:~# free
             total       used       free     shared    buffers     cached
Mem:        250744     103472     147272       5400       4176      56724
-/+ buffers/cache:      42572     208172
Swap:            0          0          0
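
The "-/+ buffers/cache" row in these `free` outputs is derived from the "Mem:" row: used minus buffers and cached, and free plus buffers and cached. A small sketch of that arithmetic, applied to three of the outputs posted in this thread; `used_without_cache` is an illustrative helper name, and the column order is assumed to be the older procps layout shown above.

```python
def used_without_cache(mem_line):
    """Return (used, free) in kB after discounting buffers and page cache.

    mem_line is the 'Mem:' row printed by the older procps `free`, with
    columns: total used free shared buffers cached.
    """
    columns = mem_line.split()
    if columns[0] != "Mem:" or len(columns) < 7:
        raise ValueError("expected a 'Mem:' row with six numeric columns")
    total, used, free, shared, buffers, cached = (int(v) for v in columns[1:7])
    return used - buffers - cached, free + buffers + cached


# 'Mem:' rows copied from the outputs posted in this thread.
reports = {
    "WRT3200ACM": "Mem:        510404     262768     247636       9896       6956      71280",
    "WRT1900AC V2": "Mem:        510392     106464     403928       5464       6348      56624",
    "WRT1900AC V1": "Mem:        250744     103472     147272       5400       4176      56724",
}

for device, row in reports.items():
    used, free = used_without_cache(row)
    print(f"{device}: {used} kB used, {free} kB free (excluding buffers/cache)")
```

With page cache excluded, the WRT3200ACM sits at 184532 kB used against 43492 kB on the WRT1900AC V2 running the same software, a gap of 141040 kB that the cache columns do not explain.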

BrainSlayer commented 7 years ago

You cannot compare the WRT1900 with the WRT3200ACM. The chipsets are different and the driver has been implemented in a very different way on both platforms. They also use different chipset firmware.

anomeome commented 7 years ago

imo, relevant data points, from different devices, running a common wrapper around different BLOBs. For those with such inclinations, I have put up a LEDE build with 4.9.27k. Have only flashed to a rango, no real testing, but here if you want to kick it around.

Edit: should not be an issue, just don't keep settings.

Edit2: @farchord , this was really about testing the aforementioned, but as regards the image contents you may want to check targets/mvebu/generic/config.seed as to what is built. At any rate wrong thread.

farchord commented 7 years ago

@anomeome I'm running David's LEDE; can I just upgrade to it?

farchord commented 7 years ago

@anomeome Alright, I'm gonna swap the SFP module back to my 3200ACM, do a config backup (just in case) and flash your binary clean. Gonna take me some time to post the results (got to set up my VLANs again, etc.) but to quote Arnold:

"I'll be back"

farchord commented 7 years ago

Never mind, I can't. You don't have IGMP proxy available, so my IPTV wouldn't work.

EDIT: I'll try it just for the sake of seeing if the wifi springs a leak

farchord commented 7 years ago

I did some basic testing. Sorry about the off-topic posts, I'm a bit of a newb. But I did some tests on the image, and there doesn't seem to be any leak. I couldn't get on the internet though, so I just went on wifi and spammed the LEDE interface a bit.

anomeome commented 7 years ago

That is encouraging, and rather surprising actually. I am still a ways away from being able to turn on the radios and test. Not sure what the no WAN may be about, will put a post here with a thought.

Fr3DBr commented 7 years ago

So once this problem is fixed, both radios (2.4 GHz and 5 GHz) will work simultaneously without issues?

Chadster766 commented 7 years ago

IMO the driver is still using way too much memory, even though it does work well.

When the system gets down to about 53 MB of free memory, a temporary jitter occurs when streaming, which clears up when the free memory goes back up to normal.

I have not found any proof as to which component is using up the memory, since it's not listed by ps, top or htop. Nothing in the /proc or /sys/kernel/debug directories lists the culprit module.
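
Memory held by a kernel driver does not belong to any process, so ps, top and htop cannot attribute it. A rough way to bound it is to subtract the categories /proc/meminfo does report from MemTotal. What remains is memory taken straight from the page allocator, which can include driver buffers. This is only a sketch: the set of fields subtracted is an assumption, the result is an estimate rather than a per-module figure, `unaccounted_kb` is an illustrative name, and the sample values are invented round numbers, not measurements from this thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def unaccounted_kb(fields):
    """Estimate kB not explained by the common /proc/meminfo categories."""
    accounted = ("MemFree", "Buffers", "Cached", "Slab",
                 "AnonPages", "PageTables", "KernelStack")
    return fields["MemTotal"] - sum(fields.get(name, 0) for name in accounted)


# Hypothetical values for illustration only (not taken from any router).
sample = """\
MemTotal:         512000 kB
MemFree:          250000 kB
Buffers:            5000 kB
Cached:            70000 kB
AnonPages:         12000 kB
Slab:              30000 kB
KernelStack:         500 kB
PageTables:          500 kB
"""

print(unaccounted_kb(parse_meminfo(sample)), "kB not attributed to any category")
```

On the router itself the same function can be fed the contents of /proc/meminfo. Comparing the estimate with one radio enabled and with both enabled would show whether the extra memory sits outside the slab and user-space counters.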

yuhhaurlin commented 7 years ago

Please help to check 10.3.4.0-20170512. Thanks.

Chadster766 commented 7 years ago

Will do :smiley:

Chadster766 commented 7 years ago

Testing mwlwifi version 10.3.4.0-20170512:

WRT3200ACM in residential environment:

root@MCDEBIAN:~# uname -a
Linux MCDEBIAN 4.9.26 #25 SMP Sun May 7 15:58:27 CDT 2017 armv7l GNU/Linux
root@MCDEBIAN:~# modinfo mwlwifi
filename:       /lib/modules/4.9.26/kernel/drivers/mwlwifi/mwlwifi.ko
license:        GPL v2
author:         Marvell Semiconductor, Inc.
version:        10.3.4.0-20170512
description:    Marvell Mac80211 Wireless PCIE Network Driver
srcversion:     758DB3D70241D49D18F8603
alias:          pci:v000011ABd00002B40sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002B38sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A55sv*sd*bc*sc*i*
depends:
intree:         Y
vermagic:       4.9.26 SMP mod_unload ARMv7 p2v8
root@MCDEBIAN:~# hostapd -v
hostapd v2.6
User space daemon for IEEE 802.11 AP management,
IEEE 802.1X/WPA/WPA2/EAP/RADIUS Authenticator
Copyright (c) 2002-2016, Jouni Malinen <j@w1.fi> and contributors
root@MCDEBIAN:~# free
             total       used       free     shared    buffers     cached
Mem:        510400     263588     246812       9588       5420      71696
-/+ buffers/cache:     186472     323928
Swap:            0          0          0
root@MCDEBIAN:~#

WRT1900AC V1 in a commercial production environment:

root@MCDEBIAN:~# uname -a
Linux MCDEBIAN 4.9.26 #25 SMP Sun May 7 15:58:27 CDT 2017 armv7l GNU/Linux
root@MCDEBIAN:~# modinfo mwlwifi
filename:       /lib/modules/4.9.26/kernel/drivers/mwlwifi/mwlwifi.ko
license:        GPL v2
author:         Marvell Semiconductor, Inc.
version:        10.3.4.0-20170512
description:    Marvell Mac80211 Wireless PCIE Network Driver
srcversion:     758DB3D70241D49D18F8603
alias:          pci:v000011ABd00002B40sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002B38sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A55sv*sd*bc*sc*i*
depends:
intree:         Y
vermagic:       4.9.26 SMP mod_unload ARMv7 p2v8
root@MCDEBIAN:~# hostapd -v
hostapd v2.6
User space daemon for IEEE 802.11 AP management,
IEEE 802.1X/WPA/WPA2/EAP/RADIUS Authenticator
Copyright (c) 2002-2016, Jouni Malinen <j@w1.fi> and contributors
root@MCDEBIAN:~# free
             total       used       free     shared    buffers     cached
Mem:        250744      85856     164888       9504       4148      50212
-/+ buffers/cache:      31496     219248
Swap:            0          0          0
root@MCDEBIAN:~#

BrainSlayer commented 7 years ago

new testversion http://www.dd-wrt.com/wrt3200.zip based on kernel 4.9 this time

s-pimenta commented 7 years ago

Thanks @BrainSlayer ! I will test right now, on my WRT3200ACM.

I also just woke up! The time difference between Portugal and Germany is not too much (1 hour).

s-pimenta commented 7 years ago

Currently doing some tests on DD-WRT v3.0-r32014 std (05/12/17), on my WRT3200ACM, with the 10.3.4.0-20170512 mwlwifi driver

The first thing I noticed is that the eSATA light turned on (but nothing is connected to either the USB or eSATA ports). Maybe it is an issue related to DD-WRT?!

5GHz and 2.4GHz radios are ON.

Right now nothing conclusive or reproducible, but just to point out: the first time running iperf3, I got about 100 Mbit/s more on 5 GHz (80 MHz) than with the OEM firmware (about 440-450 Mbit/s), but after about 10 min it dropped to 0 Mbit/s. The 5 GHz radio did not stop working (other devices stayed connected on 5 GHz fine), and the router did not run out of memory.

I ran iperf3 again without reconnecting or rebooting the router and got about 330-350 Mbit/s (about the same speed as the OEM firmware). But after 20 min it dropped to 80 Mbit/s and stayed at that speed for 10 min.

-- Rebooted the router to check whether I could get the same speed as before (440 Mbit/s). With only one device connected (the laptop, 2x2 Wi-Fi AC) and nothing else, I got about 440-450 Mbit/s over a run of about 55 min.

RAM usage at bootup:

Total Available 98% 511852 kB / 524288 kB 
Free 63% 323204 kB / 511852 kB 
Used 37% 188648 kB / 511852 kB 
Buffers 2% 4372 kB / 188648 kB 
Cached 6% 10524 kB / 188648 kB 
Active 6% 10700 kB / 188648 kB 
Inactive 3% 6196 kB / 188648 kB

RAM usage after 55min of iperf3:

Memory
Total Available 98% 511852 kB / 524288 kB 
Free 62% 318832 kB / 511852 kB <----------- 1% difference
Used 38% 193020 kB / 511852 kB 
Buffers 2% 4432 kB / 193020 kB 
Cached 6% 10780 kB / 193020 kB 
Active 6% 11036 kB / 193020 kB 
Inactive 3% 6380 kB / 193020 kB
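
The two DD-WRT status tables above can be compared row by row to see how much memory actually moved during the 55 min run. A minimal sketch, with the rows copied from this post; `parse_rows` and `deltas` are illustrative helper names and the parsing assumes the "label, percent, value kB" layout shown above.

```python
def parse_rows(text):
    """Map each row label to its first kB value, e.g. 'Free' -> 323204."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if "kB" not in parts:
            continue
        kb_index = parts.index("kB")
        if kb_index < 3:
            continue
        # The label is everything before the percentage column.
        label = " ".join(parts[:kb_index - 2])
        rows[label] = int(parts[kb_index - 1])
    return rows


def deltas(before, after):
    """Return after - before in kB for every label present in both tables."""
    return {label: after[label] - before[label] for label in before if label in after}


boot = """\
Total Available 98% 511852 kB / 524288 kB
Free 63% 323204 kB / 511852 kB
Used 37% 188648 kB / 511852 kB
Buffers 2% 4372 kB / 188648 kB
Cached 6% 10524 kB / 188648 kB
Active 6% 10700 kB / 188648 kB
Inactive 3% 6196 kB / 188648 kB
"""

after_iperf = """\
Total Available 98% 511852 kB / 524288 kB
Free 62% 318832 kB / 511852 kB
Used 38% 193020 kB / 511852 kB
Buffers 2% 4432 kB / 193020 kB
Cached 6% 10780 kB / 193020 kB
Active 6% 11036 kB / 193020 kB
Inactive 3% 6380 kB / 193020 kB
"""

change = deltas(parse_rows(boot), parse_rows(after_iperf))
for label, kb in change.items():
    print(f"{label}: {kb:+d} kB")
```

Used memory grew by 4372 kB over the run, which is under 1% of the 511852 kB available, so the growth seen with the earlier driver does not show up in this test.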

I will continue the tests when I have time.

ValCher1961 commented 7 years ago

After an hour of testing the new driver on 5 GHz without seeing any change in memory, I plucked up courage and ran iperf in parallel on 2.4 GHz, and everything works fine. I am only attaching the screenshot; everything is visible there. Thanks, great job!

[image]

P.S. I have installed Debian, kernel-4.10.15

Chadster766 commented 7 years ago

Both routers are testing well. I would call this memory issue resolved :smiley: