raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/

Pi becomes sluggish/unresponsive using wired Ethernet port with Bravia TV #343

Closed mcayland closed 10 years ago

mcayland commented 11 years ago

I'm using a Raspberry Pi as a NAT router between wired and wireless networks, and I'm experiencing some serious performance problems with the Model B rev 2.0 on-board Ethernet. This is running the latest version of Raspbian downloaded directly from the main downloads page.

The setup is to allow a remote internet-enabled Sony TV without wireless adapter support to stream internet video from a local wireless router. The wireless adapter is an EdiMax WiFi dongle which works really well on its own; the problem is that when the TV is streaming video through the wired interface, the Pi becomes very unresponsive with ping times on the wireless side rising from around 15ms to 500-600ms. This causes the video stream to freeze every 10-20s when trying to watch streaming video.

Things I've checked:

1) Power problems - I've confirmed with a multimeter that the Pi is getting enough power when both wired and wireless network adapters are running; I've even temporarily replaced FS3 with a wire just to be sure. The power adapter is capable of supplying 1500mA and maintains good voltage, even when the Pi is apparently under load.

2) Latest updates - I've run rpi-update with BRANCH=fiq_split and that doesn't seem to make any difference.

3) Kernel options - I've been through all of the steps except increasing vm.min_free_kbytes listed at http://elinux.org/R-Pi_Troubleshooting#Networking and they haven't made any difference at all.

From reading around the forums, my general feeling is that there are still several issues with the USB drivers which are likely to be related to this problem - can anyone confirm the current status of the USB drivers?

Note I should just add that "top" shows no processes hogging the CPU, which would indicate to me that the excess time is being spent in kernel space rather than user space.

mcayland commented 11 years ago

I did another test whereby I connected my laptop to the Pi instead of the TV, and did a wget on a large file. This showed a sustained transfer speed of around 600K/s and the Pi remained responsive to pings on the wired port.

My thoughts are:

1) Does the wired Ethernet take priority over the wireless ethernet? This may explain why it was the wireless interface that was showing the higher ping times. I'd need another laptop to test the wireless interface though.

2) Could the Bravia IP stack be doing something that causes excessive traffic on the Ethernet port? I don't see how 600K/s should bring a Pi to its knees :/

popcornmix commented 11 years ago

fiq_split branch is now deprecated. The fiq_split code (and additional improvements) are now on the master branch. Any complaints in dmesg log?

mcayland commented 11 years ago

Thanks for the reply. I didn't realise fiq_split is deprecated - it might be worth someone adding a sticky page for USB to the Pi forums as there is so much out-of-date information related to USB problems and fiq_split on there.

dmesg doesn't show any related error messages at all. Another quick test shows that if I disconnect the cable from the wired interface then the ping times on the wireless interface immediately drop back to normal. I can post the complete dmesg output if required.

Also what speed does the Pi wired ethernet run at? mii-tool on Raspbian reports that it's running at 1000baseTx whereas other posts suggest that the port is only a 10/100baseTx?

popcornmix commented 11 years ago

The ethernet is 10/100. You should see a line in dmesg showing the speed, e.g.

[ 4.921464] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
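A minimal sketch for pulling the negotiated speed out of a dmesg log, assuming the smsc95xx "link up" message keeps the format quoted above. `link_speed` is a hypothetical helper name, not part of any existing tool.

```shell
# Print the most recent "link up" speed reported for an interface.
# $1: interface name (e.g. eth0); stdin: dmesg output.
link_speed() {
    sed -n "s/.* $1: link up, \([0-9]*Mbps\).*/\1/p" | tail -n 1
}

# On the Pi itself this could be run as:
#   dmesg | link_speed eth0
```

Fed the example line above, this prints `100Mbps`, which matches the 10/100 capability of the port rather than the 1000baseTx figure from mii-tool.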

mcayland commented 11 years ago

Ah I see - I think my mistake is that the Broadcom drivers support the new ethtool rather than the older mii-tool ioctl interface. I'll head over to the Pi and double-check in a bit.

Based on this I have another couple of questions:

1) Is 600K/s USB transfer enough to make a Pi completely unresponsive?

2) Is there a way to measure the impact/performance of USB transfers?

I've seen some posts mentioning that it is possible to count interrupts/sec from /proc, but I'm wondering if there are any further counters specific to the USB/ethernet hub, e.g. no. of retries that would indicate that the Pi is being overloaded.

FWIW if there are monitoring patches like this available, I could probably cross-compile a custom kernel and report back the values if that helps?
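Counting interrupts per second from /proc, as mentioned above, can be sketched roughly as follows. This assumes the USB controller's line in /proc/interrupts contains the string `dwc_otg`; the label may differ between kernels, and both helper names are hypothetical.

```shell
# Sum the per-CPU interrupt counts on every line matching a pattern.
# $1: pattern; stdin: text in /proc/interrupts format.
irq_total() {
    awk -v pat="$1" '
        $0 ~ pat {
            # Count columns follow the "NN:" label and stop at the first
            # non-numeric field (the controller/driver names).
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        }
        END { printf "%d\n", total }'
}

# Average interrupts per second for a pattern over an interval.
# $1: pattern; $2: interval in whole seconds.
irq_rate() {
    before=$(irq_total "$1" < /proc/interrupts)
    sleep "$2"
    after=$(irq_total "$1" < /proc/interrupts)
    echo $(( (after - before) / $2 ))
}

# Example on the Pi (assumed label): irq_rate dwc_otg 5
```

Comparing the rate while idle and while streaming would show whether the interrupt load changes much between the two states.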

popcornmix commented 11 years ago

1) No. The Pi can play BluRay level videos at ~50Mbit/s (> 6MB/s) through ethernet or USB mass storage.

2) When transferring data, what does top report as idle %?

P33M commented 11 years ago

I can replicate this. The problem is likely due to memory allocation thrash. In the extreme case we see stuff like #153 where we end up with actual failures, but in the cases that normally provoke this I assume that the kernel allocator has to do a lot of work to find a contiguous free block of the size requested.

I have yet to profile this and understand it fully - not strictly related to USB, but more likely to happen in conjunction with a USB device plus "bad" driver because most of the bytes in/out of a Pi are usually going to be through networks.

One thing to try: Can you repeat your test with a kernel compiled with the memory allocator set to SLUB?

mcayland commented 11 years ago

@popcornmix : I've just installed vnstat on the Pi itself, and while it is sluggish, it seems to be reporting ~1.1Mbps data rate on eth0 and wlan0 when streaming video, so that should be well within the parameters above.

Before streaming starts, top reports idle as being around 96% which drops to around 90-92% when trying to stream video.

mcayland commented 11 years ago

@P33M : glad that you can reproduce. As mentioned above, the incoming data rate from eth0 is about 1.1Mbps. Since most people report that they can use a Pi with no problems as a WiFi hotspot, I wonder if it is something in the smsc95xx driver which produces too many USB interrupts?

From the link you gave above, I tried increasing vm.min_free_kbytes to around 128Kb but that actually seemed to make the ping response times worse during streaming - the times went from around 400-500ms to 700-800ms.

In terms of a kernel to try, I've installed an ARM cross-compiler so if you can point me towards a github repository/branch to try (along with a suitable kernel .config), I'll have a go at trying the different allocator.

Another interesting thing: I've just tried switching the Bravia down to "Normal" resolution (down from its default of high), and the data rate stays at ~1.1Mbps - BUT now I get about 10s of 400-500ms ping times followed by about 10s of normal (~20ms) ping times which seems to be enough to keep the TV from buffering constantly. However the Pi still remains unresponsive during the periods of slow ping response. Does that information help at all?

P33M commented 11 years ago

Use this repo (rpi-3.6.y branch). Config is in arch/arm/configs/bcmrpi_defconfig.

Copy that to .config, then make ARCH=arm menuconfig and under General Setup, scroll down to Choose Slab Allocator -> SLUB.

Build in the normal way - plenty of guides floating around - copy both the modules and the kernel across.

Ferroin commented 11 years ago

@mcayland as has been stated in similar threads, vm.min_free_kbytes is a hint to the virtual memory system to try to keep that much memory free if at all possible; in effect, it's the limit below which the kernel starts swapping pages out to disk to try to free up more memory without a pending allocation. Increasing it above about 16384 (16MiB, because the value is in kiB) will actually decrease performance for almost all workloads on the Pi because the system will usually spend more time swapping to disk than is really necessary, which is probably why you were seeing such bad latency with it at 131072 (which is what I assume you mean by 128Kb, since it can't be less than 4096 on the Pi).
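The unit arithmetic behind those figures can be checked with a small sketch. `kib_to_mib` is a hypothetical helper name; the /proc path is the standard location of this sysctl.

```shell
# Convert a value in kiB (the unit vm.min_free_kbytes uses) to whole MiB.
kib_to_mib() {
    echo $(( $1 / 1024 ))
}

# On the Pi itself, the current setting could be shown in MiB with:
#   kib_to_mib "$(cat /proc/sys/vm/min_free_kbytes)"
```

With this, 16 MiB corresponds to a setting of 16384 and a setting of 131072 corresponds to 128 MiB.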

mcayland commented 11 years ago

@P33M okay, I'm just setting up the build environment now. I see there have been some recent USB commits, so should I try a standard kernel (without SLUB) first to use as a baseline?

mcayland commented 11 years ago

@Ferroin thanks for the clarification. I saw the hint about raising vm.min_free_kbytes in one of the Pi Forums which is why I tried it, but as you correctly point out increasing it is not going to help here. There is a lot of out of date/conflicting information around on the forums :/

mcayland commented 11 years ago

@P33M one other thought: has anyone got oprofile to work on the Pi yet? If so, I could try and grab a kernel profile when the Pi is being unresponsive.

mcayland commented 11 years ago

@P33M I left the build running overnight and it failed somewhere in the SCSI driver section :/ Just to double check, I just did a --depth 1 clone which gives me this:

build@kentang:~/src/linux/rpi/linux$ git clone https://github.com/raspberrypi/linux.git --depth 1
build@kentang:~/src/linux/rpi/linux$ git branch -a

....

build@kentang:~/src/linux/rpi/linux$ make ARCH=arm CROSS_COMPILE=arm-bcm2708hardfp-linux-gnueabi- O=../rel-linux/

Then this morning I found the following error:

CC [M] drivers/scsi/mvsas/mv_64xx.o
CC [M] drivers/scsi/mvsas/mv_94xx.o
LD [M] drivers/scsi/mvsas/mvsas.o
LD drivers/scsi/osd/built-in.o
CC [M] drivers/scsi/osd/osd_initiator.o
/home/build/src/linux/rpi/linux/drivers/scsi/osd/osd_initiator.c: In function 'build_test':
/home/build/src/linux/rpi/linux/drivers/scsi/osd/osd_initiator.c:68:2: error: size of unnamed array is negative
/home/build/src/linux/rpi/linux/drivers/scsi/osd/osd_initiator.c:69:2: error: size of unnamed array is negative
make[4]: *** [drivers/scsi/osd/osd_initiator.o] Error 1
make[3]: *** [drivers/scsi/osd] Error 2
make[2]: *** [drivers/scsi] Error 2
make[1]: *** [drivers] Error 2
make: *** [sub-make] Error 2
build@kentang:~/src/linux/rpi/linux$

Any thoughts? The only change I've made to .config is to switch the allocator to SLUB as mentioned above.

P33M commented 11 years ago

Something is broken with your build environment, or your CPU is flakey due to overclock/overheat. More likely your build environment.

I would prefer it if you did use the latest 3.6 branch for testing and comparison - but it's unlikely that any of the recent commits will have any great bearing on the likelihood of hitting the problem.

Ferroin commented 11 years ago

@P33M I don't know a huge amount about Linux's SCSI subsystem, but I am pretty sure that the Pi doesn't need the OSD Initiator module. I've heard generic reports from people on a wide variety of ARM platforms that there are problems with that module at compile time with some kernel configurations.

mcayland commented 11 years ago

@Ferroin @P33M well the good news is that I've managed to build the kernel with SLUB allocator :)

I ended up throwing away everything I had, re-cloning the entire repository and then rebuilding everything from scratch and this time it just worked. From memory the only things I can think I did differently:

1) I did a complete git clone, i.e. the whole repository rather than with --depth 1

2) I realised that I forgot to set CROSS_COMPILE when running menuconfig, i.e. this time I did:

make ARCH=arm CROSS_COMPILE=arm-bcm2708hardfp-linux-gnueabi- O=../rel-linux/ menuconfig

instead of:

make ARCH=arm O=../rel-linux/ menuconfig

when generating the configuration for the build. The build was run exactly the same both times:

make ARCH=arm CROSS_COMPILE=arm-bcm2708hardfp-linux-gnueabi- O=../rel-linux/

3) I ran the build in the background while I was using my laptop, rather than just leaving it overnight (maybe some kind of power-saving tried to kick in?)

So I've no idea what actually made the build work, but at least I now have something I can try and report back...

mcayland commented 11 years ago

@P33M I've just tried the new kernel/modules and it hasn't made any difference - ping times still around the 500ms mark :( What's the next step?

P33M commented 11 years ago

Lock s-foils in attack position.

With your rsync/laggy pi (in the process of being laggy, not "afterwards") can you do cat /proc/buddyinfo?

Also can you do top and tell me if kswapd is on the list, if so then what % cpu is it using?

mcayland commented 11 years ago

@P33M here is the output of /proc/buddyinfo whilst streaming:

Node 0, zone Normal 58 21 9 6 3 3 2 2 2 4 92

and here is the output when idle:

Node 0, zone Normal 56 19 10 5 2 4 2 3 2 4 92

In terms of threads, I don't see kswapd showing up at all - the next highest CPU processes (around 0.3% or so) are RTKTHREAD and kworker/0:2.
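For reading the buddyinfo lines above: each column is the number of free blocks of order 0, 1, 2, ... where an order-n block spans 2^n contiguous pages. A minimal sketch that totals the free memory, assuming the usual 4 kiB page size on the Pi (`buddy_free_kib` is a hypothetical helper name):

```shell
# Total free memory described by /proc/buddyinfo-style input, in kiB.
# $1: page size in kiB (assumed to be 4 if omitted); stdin: buddyinfo text.
buddy_free_kib() {
    awk -v page_kib="${1:-4}" '
        $1 == "Node" {
            block = 1                      # pages per block at order 0
            for (i = 5; i <= NF; i++) {    # counts start after "Node N, zone NAME"
                pages += $i * block
                block *= 2                 # each order doubles the block size
            }
        }
        END { printf "%d\n", pages * page_kib }'
}

# On the Pi itself: buddy_free_kib < /proc/buddyinfo
```

Under that page-size assumption the streaming line works out to 389920 kiB free and the idle line to 390456 kiB, and both show 92 free blocks at the largest order, so these two samples look almost identical.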

Ferroin commented 11 years ago

@mcayland do you have anything running that would be doing lots of I/O to the SD card? Because of the way the SD card controller in the CPU is designed, the CPU is pretty much useless during any actual access to the SD card. Along these lines, processes that are waiting on disk I/O are listed as using 0% of the CPU (even though the I/O itself uses a lot of the CPU time on the Pi). The easy way to tell is to look in the status column for processes that have a capital D.
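Looking for processes in state D can be sketched with a small filter over `ps` output. This assumes the procps `ps` shipped with Raspbian; `blocked_procs` is a hypothetical helper name.

```shell
# List PID and command name of processes in uninterruptible sleep (state D).
# stdin: output of "ps -eo state,pid,comm" (first line is the header).
blocked_procs() {
    awk 'NR > 1 && $1 ~ /^D/ { print $2, $3 }'
}

# On the Pi itself, while it is being sluggish:
#   ps -eo state,pid,comm | blocked_procs
```

An empty result while the Pi is lagging would suggest that processes are not stuck waiting on SD card I/O at that moment.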

mcayland commented 11 years ago

@Ferroin not as far as I can tell. the setup is really simple in that it's just the standard Raspbian image downloaded from the Pi Foundation website with 2 modifications:

1) I've added the wireless WPA credentials to /etc/network/interfaces

2) I've added a script that runs the following on startup:

iptables -t nat -A POSTROUTING -o wlan0 -j MASQUERADE

And that really is it - I've used raspi-config to ensure that we don't even start X, so that it boots direct to the console in order to try and maximise performance that way.

mcayland commented 11 years ago

@P33M thinking about this further, I'm still wondering if it's to do with the priority at which the hub driver processes incoming interrupts.

Most people use the Pi as a wireless hotspot, so if a wireless client demands data then it gets routed across to the cable to a remote device, the remote device then processes and sends back a reply which is then sent back to the wireless. So in other words, the demand is controlled by the data transfer across the wireless interface.

Now I'm doing this in reverse so that the wired ethernet port is connected to the client; so in this case the client will push data through to the onboard ethernet port which is then routed across to the wireless port. Hence the demand is controlled by the rate at which the Pi can respond over the wired port.

I know from testing with my laptop that the ping response from the wired interface seems fine when running under load, and I read somewhere that the Pi always processes USB data from the wired interface first. So could it be that the wired client is pushing data to the Pi at the fastest rate it can, and because of the higher data rate and the fact that if multiple interrupts occur then the wired port always gets serviced first, that the wireless port is being starved?

P33M commented 11 years ago

I thought for a while there we had a prime example of the issue I described earlier.

What happens if you monitor vmstat 2? Post a log of idle / laggy / idle again if you can. Also a full top trace (write output to a file, then kill it after a few seconds). It's possible there could be some sort of network layer starvation going on, but it's better to rule out the lower levels such as being CPU bound.
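One possible way to capture and summarise the traces requested above. The output file names are placeholders and `vmstat_avg_idle` is a hypothetical helper; it assumes the standard procps vmstat layout, with a header row that names an `id` (idle) column.

```shell
# Capture commands that could be run on the Pi (30 samples, 2 s apart):
#   vmstat 2 30 > vmstat-busy.txt
#   top -b -n 5 > top-busy.txt

# Average the CPU idle percentage over a vmstat log.
# stdin: vmstat output including its header lines.
vmstat_avg_idle() {
    awk '
        # Header row naming the columns: remember where "id" is.
        $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "id") col = i; next }
        # Data rows start with a number (the run-queue length).
        col && $1 ~ /^[0-9]+$/ { sum += $col; n++ }
        END { if (n) printf "%d\n", sum / n }'
}

# Example: vmstat_avg_idle < vmstat-busy.txt
```

A high average idle figure during the laggy period would point away from the Pi being CPU-bound.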

mcayland commented 11 years ago

@P33M thanks for the follow-up - I'm away now until next week, so I'll try the vmstat/top traces early next week and post them here.

mcayland commented 11 years ago

@P33M apologies for the delay getting back to you - the TV/Pi is actually at my gf's house and I forgot she was working nights this week!

Anyhow I've got the information you asked for and uploaded it to the following links:

vmstat 2 "idle" output http://www.ilande.co.uk/tmp/vmstat-idle.txt

vmstat 2 "busy" output http://www.ilande.co.uk/tmp/vmstat-busy.txt

top "busy" trace http://www.ilande.co.uk/tmp/top-busy.txt

mcayland commented 11 years ago

@P33M just a ping to see if these traces were of any use to you? Do they give any ideas as to what to try next?

P33M commented 11 years ago

It appears that the Pi isn't cpu-bound - idle time is at 98%. I would expect much higher times if there were memory allocation stress or even simple network I/O processing.

Can you substitute the wlan dongle for another? I have one where the radio is nearly fried and as such has terrible performance with not much showing for it.

mcayland commented 11 years ago

@P33M okay some progress - I borrowed a wireless USB dongle based on the zd1211rw driver, tried it and it seems to work! Ping times seem to hover around 10-30ms when streaming which is exactly what I expect. Now my first suspicion is the 8192cu driver: firstly there are quite a few options that can be potentially tweaked:

root@raspberrypi:/home/pi# modinfo -p 8192cu
rtw_ips_mode:The default IPS mode (int)
ifname: (charp)
rtw_initmac: (charp)
rtw_channel_plan: (int)
rtw_chip_version: (int)
rtw_rfintfs: (int)
rtw_lbkmode: (int)
rtw_network_mode: (int)
rtw_channel: (int)
rtw_mp_mode: (int)
rtw_wmm_enable: (int)
rtw_vrtl_carrier_sense: (int)
rtw_vcs_type: (int)
rtw_busy_thresh: (int)
rtw_ht_enable: (int)
rtw_cbw40_enable: (int)
rtw_ampdu_enable: (int)
rtw_rx_stbc: (int)
rtw_ampdu_amsdu: (int)
rtw_lowrate_two_xmit: (int)
rtw_rf_config: (int)
rtw_power_mgnt: (int)
rtw_low_power: (int)
rtw_wifi_spec: (int)
rtw_antdiv_cfg: (int)
rtw_enusbss: (int)
rtw_hwpdn_mode: (int)
rtw_hwpwrp_detect: (int)
rtw_max_roaming_times:The max roaming times to try (uint)
rtw_force_iol:Force to enable IOL (bool)
rtw_intel_class_mode:The intel class mode 0: off, 1: on
rtw_mc2u_disable: (int)

Secondly it looks as if the kernel team's rtl8192cu driver is starting to look more stable (see http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/drivers/net/wireless/rtlwifi?id=fd3930f70c8d14008f3377d51ce039806dfc542e). How easy would it be to try and port the USB dwc_otg driver from the rpi-3.10y branch to a reasonably current git kernel as a test?

mcayland commented 11 years ago

@P33M : after spending some time on this, I've confirmed that it's definitely a problem with the particular wireless driver/hardware combination - both the Realtek vendor driver and the kernel rtlwifi drivers have different issues which cause poor performance. On the plus side, I've been working with Larry Finger (the kernel rtlwifi/rtl8192cu maintainer) and we've gone a long way to improving the reliability of the kernel driver so hopefully it will be possible to switch at some point in the future.

In the meantime, given that I can happily stream ~1.3MB/s through the Pi with a properly working wireless driver without any issue then I am happy that this is not a Pi USB driver issue I am seeing - so thanks once again for your help, and feel free to close.

hydn79 commented 10 years ago

I have these exact issues with the rtl871xdrv driver. Where can I find a better or modified driver that would resolve the high ping latency?

Thanks.

mcayland commented 10 years ago

In the end, I spent a lot of time working with Larry Finger to improve rtlwifi (rather than Realtek's awful 8192cu) and have fixed a few bugs to the point where on recent kernels I now see the same performance between the rtlwifi and 8192cu drivers.

In my case we determined that the hardware rate-control algorithm struggles in my particular environment, and Larry was unable to reproduce the failure at his end so it came to a halt. My eventual fix was to switch wireless dongles, and during testing I found that both rt2800usb and ath9k_htc worked much better for me.