RobertCNelson / linux-dev

USB related system freeze in 4.1.22-ti-r59 #43

Open spiderkeys opened 8 years ago

spiderkeys commented 8 years ago

We're currently trying to track down a really nasty issue, seemingly related to the USB interface, on the Beaglebone Black. I'll detail the process below:

Kernel version: 4.1.22-ti-r59

On kernel 4.1.8, I was able to run the camera for several hours without any problem. This is the version that we currently have deployed in our production environment and we've never received any complaints about it.

To rule out the BeagleBone hardware itself, I ran the test process (start mjpg-streamer and wait until the system crashes) on a fresh, factory-flashed BBB with 4.1.8, and it did not crash after several hours. I then inserted a 4.1.22 SD card into the same device and ran the tests again. The system always froze after 2-20 minutes.

Because I had read about past problems concerning freezes as a result of how power is supplied, I have also tried using various methods for supplying power to the board (from the mini USB port, using a 5V-2A supply on the rails, and using a custom 5V supply driven by 18650 li-ion batteries). This didn't seem to have any impact on results.

Coincidentally, I noticed this patch today on the 4.9 kernel: @590da07 Any relation?

I've also read on various message boards that there could supposedly be a silicon defect with the TI AM335x that causes similar problems; any info to confirm or refute this?

Any ideas what might be causing this? I can see that the problem might exist in one of three places: musb, v4l2, or uvcvideo, but my main suspicion lies with musb. I'm currently running the test again on the same hardware using your 4.4.23-ti-r51 build and have not had any problems yet (40 minutes into the test).

Thanks much.

RobertCNelson commented 8 years ago

Hi @spiderkeys which camera are you using? I can add it to my test farm..

It's more than likely the musb layer. The am335x ES 1.x parts on the White had lots of fun errata, whereas the ES2.x used on the BeagleBone Black had things like DMA <-> musb fixed.

A couple of things you can test: disable pm on the usb (and cpuidle/freq), and if you're only using Blacks, enable MUSB_DMA instead of MUSB_PIO mode.
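For the power-management part of that test, the following is a minimal sketch of how the settings could be switched off on a running system, without a rebuild. It assumes the standard Linux sysfs knobs are present on the image: `power/control` for USB autosuspend, `scaling_governor` for cpufreq, and the per-state `disable` file for cpuidle. Whether these cover everything meant by "pm on the usb" above is an assumption, and the helper names (`write_knob`, `disable_power_management`) are hypothetical. The MUSB DMA/PIO choice is not covered here.

```python
import glob
import os


def write_knob(path, value):
    """Write value to one sysfs-style file; return True if the write succeeded."""
    try:
        with open(path, "w") as f:
            f.write(value)
        return True
    except OSError:
        # Missing permissions, or the kernel rejected the value
        # (for example, a governor that is not built in).
        return False


def disable_power_management(sysfs_root="/sys"):
    """Turn off USB autosuspend, cpufreq scaling and cpuidle states.

    Returns the list of files that were changed. The paths below are the
    usual sysfs locations; they are assumed, not verified on 4.1.22-ti-r59.
    """
    knobs = {
        # "on" keeps each USB device (and root hub) permanently powered.
        "bus/usb/devices/*/power/control": "on",
        # Pin the CPU at its highest frequency.
        "devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor": "performance",
        # Writing 1 disables the individual idle state.
        "devices/system/cpu/cpu[0-9]*/cpuidle/state*/disable": "1",
    }
    changed = []
    for pattern, value in knobs.items():
        for path in sorted(glob.glob(os.path.join(sysfs_root, pattern))):
            if write_knob(path, value):
                changed.append(path)
    return changed
```

Run as root, `disable_power_management()` returns the files it managed to change; an empty list means none of the knobs were found or writable. The settings do not persist across a reboot, so they would need to be reapplied before each test run.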

On mainline v4.9.x-rc there was a musb-pm rewrite pushed; those 4 patches from today are for a regression from that rewrite.

Regards,

spiderkeys commented 8 years ago

Hi @RobertCNelson

Thanks for the prompt reply. We are using the Genius f100 webcam.

I assume the suggestions for testing are parameters in the kernel's menuconfig, or are there ways to change that behavior on an existing system? I can test it either way, but if it is done via recompiling the drivers, do you have any recommendations for a good workflow for incorporating modifications to drivers/kernel modules in the omap-image-builder process?