raspberrypi / firmware

This repository contains pre-compiled binaries of the current Raspberry Pi kernel and modules, userspace libraries, and bootloader/GPU firmware.

Interrupt collision between smsc95xx and USB storage drivers under heavy load #9

Closed benosteen closed 11 years ago

benosteen commented 12 years ago

Steps to reproduce:

1) Lots of files on a USB drive, plugged in and mounted.
2) Begin a download of a large file (100 MB+ is suggested) to that USB drive.
3) During the download, try to access large numbers of files (suggestions to follow).

At some indeterminate point this freezes the system, with kernel panics from the USB storage driver ("... not syncing: Fatal exception in interrupt") and kernel errors from the Ethernet driver ("kevent may have dropped the interrupt.").

Suggested means to replicate step 3)

If the rootfs is on USB, running apt-get install on a group of packages, apt-cache search, and so on are good ways to trigger this collision. Otherwise, searching or grepping through a reasonable number of files on the USB drive is enough, for example: find . | xargs grep -i "foo"
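For anyone trying to reproduce this, steps 2 and 3 can be combined roughly as follows. This is only a sketch: the function name `read_while_alive`, the mount point and the download URL are placeholders I made up, not anything from the original setup. It keeps grepping through the drive for as long as a background process (the download) is still running.

```shell
#!/bin/sh
# Sketch only: the function name, paths and URL below are assumptions.

# Keep reading every file under DIR while process PID is alive.
# Prints the number of complete read passes when PID has exited.
read_while_alive() {
    dir=$1
    pid=$2
    passes=0
    while kill -0 "$pid" 2>/dev/null; do
        # Same idea as: find . | xargs grep -i "foo"
        find "$dir" -type f -print0 | xargs -0 grep -il "foo" >/dev/null 2>&1
        passes=$((passes + 1))
    done
    echo "$passes"
}

# Example use (placeholder mount point and URL):
#   wget -q -O /mnt/usb/big.bin http://example.com/big.bin &
#   read_while_alive /mnt/usb "$!"
```

The loop relies on the shell reaping the finished download while it waits for each find/grep pipeline, so it should be run in the same shell that started the download.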

It is hard to capture this error, as the kern.log doesn't sync the errors to disc, and the errors flash by too fast on tty to see them with any clarity.

Recreated with latest kernel + UAS built in and new modules and with kernel modules from 13/04 - with rootfs on USB and with the stock rootfs on SD. Having the rootfs on SD makes it more difficult to simulate the type of storage demand required to replicate the bug however.

pepedog commented 12 years ago

If the kernel_debug doesn't have devtmpfs it may fail. Not sure, think config.txt has to have something set for debug, perhaps someone can advise. Just google devtmpfs udev

asb commented 12 years ago

@pepedog: thanks for the heads up regarding CONFIG_DEVTMPFS and udev. Even Debian sid is only using udev 175, so that requirement hadn't cropped up. It sounds like it would be worth enabling.

guisacouto commented 12 years ago

I'm not sure if this is the same issue or not. Please tell me if it is different.

I've been thinking a bit about this problem while downloading torrents (heavy network + USB storage), and I thought that giving transmission a higher nice value, so that it has a lower priority, might help. That way it wouldn't take all the CPU when it's needed by some system process.

This did kind of help. Now I'm not getting a kernel panic, and the system keeps running, but "usb-storage" crashes! The nice value is set to 19 (the lowest priority possible).
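For reference, lowering the priority of an already-running process can be sketched like this. The helper name `lower_priority` is made up for illustration, and `transmission-daemon` is an assumed process name that may differ on your system.

```shell
# Sketch only: set a running process to the lowest CPU priority (nice 19)
# and print the resulting nice value.
lower_priority() {
    pid=$1
    renice -n 19 -p "$pid" >/dev/null
    ps -o ni= -p "$pid" | tr -d ' '
}

# Example use (assumed process name):
#   lower_priority "$(pidof transmission-daemon)"
# To start a command at low priority instead:
#   nice -n 19 some-command
```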

The dmesg is here: http://pastebin.com/Y4mnP709

mgreeves commented 12 years ago

rewolff,

Your PL2303 issue is probably different. There are reports that the Prolific PL-2303X has the same vendor ID and product ID as the older PL-2303. I've seen lsusb report a PL-2303 with a MaxPacketSize of 64 and suspect it's a 2303X. Running an x64 3.1.10 kernel, I had a problem where long transfers experienced dropped data. Replacing the device with an FTDI FT232 eliminated the error.
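A quick way to look at the fields mentioned above is to filter the verbose lsusb output. This is a rough sketch: the helper name `usb_id_summary` is made up, and 067b:2303 is the vendor:product pair normally reported for the PL-2303 family, so check plain `lsusb` output for your own device first.

```shell
# Sketch only: keep the ID, revision and packet-size lines from
# `lsusb -v` output read on stdin.
usb_id_summary() {
    grep -E 'idVendor|idProduct|bcdDevice|MaxPacketSize'
}

# Example use:
#   sudo lsusb -v -d 067b:2303 | usb_id_summary
```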

guisacouto commented 12 years ago

I've been doing some research. smsc95xx is just an Ethernet chip, right? Was the fix that was submitted to the kernel for this collision between USB storage and high network usage in the smsc95xx driver, or somewhere else in the kernel? If it was just in the driver, that pretty much explains why it hasn't fixed my problem, since I'm only using wifi and no Ethernet. So I guess there is still a problem when USB devices are competing for the single USB bus.

guisacouto commented 12 years ago

After some updates, I'm getting a different kernel panic (with a lot shorter output) when downloading to USB storage. Here is a screenshot: http://img577.imageshack.us/img577/6830/20120610215808.jpg

rewolff commented 12 years ago

In that case, please check the following:


volpino commented 12 years ago

@guisacouto I'm getting the exact same issue with OpenELEC. I'm running transmission and XBMC, and I tried different versions of OpenELEC; I even tried recompiling the current one from git. I have a Logilink hub (it's in the working peripherals section on the Raspberry Pi wiki) with an external NTFS HDD, a wifi dongle and an MCE remote.

sdebruyn commented 12 years ago

More people are running into this issue, so am I. It was reported with some links to threads here: https://github.com/raspberrypi/linux/issues/56

volpino commented 12 years ago

I had the issue only when using wifi. I worked around it with an access point in client mode, which lets me connect the Raspberry Pi via Ethernet but still use wifi.

Dmole commented 12 years ago

Adding smsc95xx.turbo_mode=N to /boot/cmdline.txt fixes this problem!
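Since /boot/cmdline.txt has to stay a single line, the parameter goes on the end of the existing line rather than on a new one. A minimal sketch of that edit, assuming a helper name (`add_cmdline_param`) invented here; it writes a `.bak` copy first and does nothing if the parameter is already present. Whether the setting actually helps may depend on the setup.

```shell
# Sketch only: append a kernel parameter to the first (only) line of a
# cmdline file, unless it is already there. Keeps a .bak copy.
add_cmdline_param() {
    file=$1
    param=$2
    if grep -qF -- "$param" "$file"; then
        return 0
    fi
    cp "$file" "$file.bak"
    line=$(head -n 1 "$file")
    printf '%s %s\n' "$line" "$param" > "$file"
}

# Example use (needs root; takes effect after a reboot):
#   add_cmdline_param /boot/cmdline.txt smsc95xx.turbo_mode=N
```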

sdebruyn commented 12 years ago

It doesn't. It's still happening here with turbo mode disabled.

lorenzos commented 12 years ago

Just to report, I had exactly the same error and I solved with:

Before those edits, I experienced this issue two or more times every hour while using my Raspberry Pi for a lot of network data transfers (file sharing at about 250 KB/s) and very frequent SD file reads/writes. I never got a kernel panic, by the way.

After those edits, I have not experienced any problem at all for two days now.

popcornmix commented 11 years ago

@benosteen A lot has been fixed since this report. Is the panic still happening with latest (rpi-update) kernel?

benosteen commented 11 years ago

I can't comment on this bug as I'm not in a position to fire up a RasPi and monitor for any problems. I would say to close this - as it was such an early issue - and let someone reopen it in the unlikely event that the problem still persists.

On Saturday, 20 July 2013, popcornmix wrote:

@benosteen A lot has been fixed since this report. Is the panic still happening with latest (rpi-update) kernel?
