raspberrypi / firmware

This repository contains pre-compiled binaries of the current Raspberry Pi kernel and modules, userspace libraries, and bootloader/GPU firmware.

Raspberry Pi freezes roughly after 1 day and 15 hours (on average) when playing audio files (*.wav) #247

Closed hid3nax closed 10 years ago

hid3nax commented 10 years ago

Raspberry Pi (running latest Raspbian with all updates) freezes roughly after 1 day and 15 hours (on average) when playing audio files (*.wav).

- The system is running headless.
- The system is running off a 3.5" HDD (powered separately).
- No devices are connected except the USB HDD and an Ethernet cable.
- The voltage across TP1 and TP2 is around 4.89 V.
- The analogue output is being used, routed to an amplifier.
- The 'ifplugd' package was removed/purged to save some interrupts.
- The system is fully up to date (apt-get update, upgrade).
- The kernel is the latest version at the moment: Linux safira 3.10.25+ #622 PREEMPT Fri Jan 3 18:41:00 GMT 2014 armv6l GNU/Linux

The mpd player is used to play *.wav files. Music is played only between 9 AM and 8 PM. The Pi hangs only while playing music, on the 2nd day of uptime.

What has been done to eliminate the issue (none of it helped):

- mpd is stopped and started nightly by a cron script.
- The music files were originally played from the USB HDD. I moved them to the SD card, but that didn't help.
- Swap space has been increased to 200 MB.
- min_free_kbytes in sysctl was increased to 16 MB.
- The following sysctl values were set: kernel.panic = 4 and kernel.panic_on_oops = 3. Unfortunately, the Pi does not reboot after the hang.
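One way to confirm that the sysctl settings described above are actually in effect is to read them back from /proc/sys, where each dot in a key corresponds to a directory level. The following is only a minimal sketch: the helper name `read_sysctl` and its `root` parameter are hypothetical, and the full key `vm.min_free_kbytes` is an assumption about which setting the post refers to.

```python
from pathlib import Path


def read_sysctl(key, root="/proc/sys"):
    """Return the current value of a sysctl key such as 'kernel.panic'.

    Each dot in the key corresponds to a directory level under /proc/sys.
    `root` is a hypothetical knob so the function can be tried on a copy.
    """
    path = Path(root).joinpath(*key.split("."))
    return path.read_text().strip()


if __name__ == "__main__":
    # The keys below are taken from the post; vm.min_free_kbytes is assumed.
    for key in ("kernel.panic", "kernel.panic_on_oops", "vm.min_free_kbytes"):
        try:
            print(f"{key} = {read_sysctl(key)}")
        except OSError as exc:
            print(f"{key}: not readable ({exc})")
```

Running this right after boot and again shortly before the expected hang would show whether the values survive a reboot, which matters because settings made with `sysctl -w` alone are not persistent.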

The only way to recover the Raspberry Pi again is to unplug and then plug the power.

After the reboot, neither the system logs (syslog, messages, kern.log, auth.log) nor mpd.log report anything.

The system is being graphed in Cacti at 5 minute intervals. The last values before the system hangs do not report anything unusual (memory/cpu/interrupts are in normal range).
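A 5-minute polling interval can miss a short spike right before a freeze. As a rough sketch of a finer-grained alternative, the script below appends a timestamped sample of the load average and free memory to a local file and forces it to disk, so the last line approximates when the hang occurred. The function names, the log path and the 10-second interval are all assumptions made for illustration.

```python
import os
import time


def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def format_sample(loadavg_text, meminfo_text, timestamp):
    """Build one log line: time, 1-minute load average, free memory in kB."""
    load1 = loadavg_text.split()[0]
    mem_free = parse_meminfo(meminfo_text).get("MemFree", -1)
    return f"{int(timestamp)} load1={load1} memfree_kb={mem_free}"


def append_durably(path, line):
    """Append a line and fsync it so it survives a freeze and power cycle."""
    with open(path, "a") as log:
        log.write(line + "\n")
        log.flush()
        os.fsync(log.fileno())


def run(path="/var/log/pi-heartbeat.log", interval=10):
    """Sample forever; path and interval are illustrative defaults."""
    while True:
        with open("/proc/loadavg") as f:
            loadavg = f.read()
        with open("/proc/meminfo") as f:
            meminfo = f.read()
        append_durably(path, format_sample(loadavg, meminfo, time.time()))
        time.sleep(interval)
```

Calling `run()` from a background service would be the intended use. Frequent fsync calls add write load to the SD card or disk, so the interval is a trade-off rather than a recommendation.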

If any further information or output is needed, please reply.

popcornmix commented 10 years ago

I'd be interested if you can test with different wav files. If you repeatedly play a 1 second wav file and it fails much more quickly, then it might suggest that something is being leaked on each play and we die when that is exhausted. Tracking the number of files played before it dies may be interesting (if it was a number like 512, then it may give a hint as to what is being exhausted).
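The play-counting idea above could be sketched roughly as follows. This is not how mpd tracks anything: it is a stand-alone loop that plays one short file with `aplay` and keeps a counter on disk that is written before the next play starts, so the number survives a hard freeze. The function names and the counter file are hypothetical.

```python
import os
import subprocess


def read_count(counter_path):
    """Return the number of plays recorded so far (0 if no file yet)."""
    try:
        with open(counter_path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0


def record_play(counter_path):
    """Increment the on-disk counter and fsync before returning."""
    count = read_count(counter_path) + 1
    tmp_path = counter_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(f"{count}\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, counter_path)  # atomic swap: never a half-written count
    return count


def play_forever(wav_path, counter_path, player=("aplay", "-q")):
    """Play one short file repeatedly, counting each completed play."""
    while True:
        subprocess.run([*player, wav_path], check=False)
        record_play(counter_path)
```

After a crash, the counter file holds the number of completed plays; a value close to a power of two would be the kind of hint mentioned above.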

hid3nax commented 10 years ago

What I discovered during today's hang up was that the wav files were placed on flash while the system itself was running off USB HDD. The Pi crashed today around midday.

This evening I drove to where the Pi is. Surprisingly, it was still playing music, but the system was inaccessible (no response to ping, etc.).

The cron job was supposed to stop the music at a certain time (it has never failed to do that before), so I decided to wait until then. As expected, the cron job didn't stop the music and it kept playing. I guess this means that the whole system has crashed or frozen while the mpd daemon (which resides in RAM) is still alive and playing wav files from flash, which might also still be accessible.

I guess this is something related to USB or Ethernet. Any hints on how to troubleshoot further?

FWIW, I'll also try to "count" the files played, as well as try playing a 1-second wav file in an endless loop.

Thanks.

popcornmix commented 10 years ago

Can you measure the voltage when the system is under load (perhaps when copying files over USB)? Any errors in dmesg after it's gone bad? Can you narrow it down, e.g. unplug the USB drive and run from the SD card?

dos1 commented 10 years ago

I had similar experiences with rootfs on a USB HDD a few months ago, without playing any music. When it happens while I'm logged in via ssh with htop running, htop starts to show massive load values until the system hangs completely.

I haven't noticed any consistency; it just happened around once per day. Sometimes it worked the whole day, but then I found it broken after waking up the next day. I managed to get a watchdog to restart the device when it hangs, but after that it's not able to mount rootfs until the power is unplugged and plugged back in, so after the automatic reboot it was stuck in a reboot loop.

Eventually I became tired of that, so now my Pi has been off for a while. I'll try to get some more details soon - I remember that there were some kernel messages on screen when I connected the monitor. Unfortunately, I haven't noted them down.
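For readers unfamiliar with the watchdog approach mentioned above, the Linux watchdog interface works by having a userland process write to /dev/watchdog at regular intervals; if the writes stop, the hardware resets the board. The sketch below assumes a watchdog driver is loaded and exposes /dev/watchdog. The function name and the `rounds` and `sleep` parameters are hypothetical and exist only so the loop can be dry-run against an ordinary file.

```python
import time


def pet_watchdog(device="/dev/watchdog", interval=5, rounds=None, sleep=time.sleep):
    """Keep the kernel watchdog from firing while this process is alive.

    Opening the device arms the timer; each write resets it. If userland
    freezes and the writes stop, the hardware resets the board.
    `rounds` and `sleep` are hypothetical knobs that exist only for dry runs.
    """
    done = 0
    with open(device, "wb", buffering=0) as wd:
        try:
            while rounds is None or done < rounds:
                wd.write(b"\0")
                done += 1
                sleep(interval)
        finally:
            # 'V' is the "magic close" character: it asks the driver to
            # disarm the timer on a clean exit (if the driver allows it).
            wd.write(b"V")
    return done
```

As the comment above describes, a watchdog reset only helps if the system can boot again afterwards; it does nothing for a USB root filesystem that needs a full power cycle to reappear.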

hid3nax commented 10 years ago

Thanks, I will measure the voltage when under load and report back.

Unfortunately, the system is headless and I can't see anything. SSH is unreachable after the hang. But I once happened to run dmesg just 5 minutes before a hang. It showed no errors. The last message in there was from the boot process.

dos1, it would be great if you'd find your Pi and test it. It's possible we're having the same issue.

hid3nax commented 10 years ago

4.84 V under load. A 1-second wav file played continuously for 1 day and 3 hours without a hang. I had to power off the Pi to swap in a 5 V 1 A PSU from an iPad.

popcornmix commented 10 years ago

We need the dmesg log from after the failure, not before. I think you'll need to attach a screen or UART.

hid3nax commented 10 years ago

Unfortunately I was unable to capture the dmesg output after the failure. The display did not respond after the RPi crashed. I have downgraded the kernel to this one from GitHub: Linux safira 3.6.11+ #462 PREEMPT Mon Jun 3 22:15:00 BST 2013 armv6l GNU/Linux

It has already been 3 days with no crashes so far, with the same PSU and otherwise the same configuration. It looks like the issue is with the latest firmware.

hid3nax commented 10 years ago

Confirming once again: this looks related to the firmware/kernel rather than the power supply or the hardware itself. The Pi has just passed 7 days and 20 hours of stable uptime. Hope it won't crash anymore.

popcornmix commented 10 years ago

We can't do anything without a dmesg log. Can you check /var/log/messages. I believe that will contain dmesg logs from previous boots. If it goes back far enough, it may have information on your last crash. Also /var/log/dmesg.X should contain dmesg logs from the last few boots.

If not, then updating to latest kernel, and looking in /var/log after the crash may shed some light.
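Searching the saved logs by hand is tedious, so a small filter can help. The sketch below scans log files for a handful of strings the kernel commonly prints when something goes badly wrong. The pattern list is an illustrative assumption, not an exhaustive catalogue, and the helper names are hypothetical.

```python
import re

# Markers the kernel commonly prints when something goes badly wrong.
# This list is an illustrative assumption, not an exhaustive catalogue.
SUSPICIOUS = re.compile(
    r"Oops|Kernel panic|BUG:|Unable to handle kernel|Out of memory"
    r"|hung_task|blocked for more than"
)


def find_suspicious_lines(lines):
    """Yield (line_number, text) for lines that look like kernel trouble."""
    for number, line in enumerate(lines, start=1):
        if SUSPICIOUS.search(line):
            yield number, line.rstrip("\n")


def scan_file(path):
    """Scan one log file, tolerating odd bytes in the log."""
    with open(path, errors="replace") as log:
        return list(find_suspicious_lines(log))


if __name__ == "__main__":
    import glob

    paths = ["/var/log/messages", "/var/log/kern.log"]
    paths += sorted(glob.glob("/var/log/dmesg*"))
    for path in paths:
        try:
            hits = scan_file(path)
        except OSError:
            continue  # missing or unreadable log: skip it
        for number, text in hits:
            print(f"{path}:{number}: {text}")
```

Rotated logs that have been compressed (names ending in .gz) are not decompressed by this sketch and would need to be unpacked first. An empty result is also informative: it suggests the kernel froze before it could write anything, which is when a UART or attached screen becomes necessary.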

trasferetti commented 10 years ago

Having the same problem on a system with a pair of relays, a temperature sensor and a USB Wi-Fi, besides the SD card, running the latest kernel. It seems to be a problem with the Broadcom BCM2835 SoC hanging due to an occasional lack of power. I'm testing this out: http://blog.ricardoarturocabral.com/2013/01/auto-reboot-hung-raspberry-pi-using-on.html

hid3nax commented 10 years ago

If you're interested, I solved the problem completely by downgrading the kernel/firmware to " 3.6.11+ #462 PREEMPT Mon Jun 3 22:15:00 BST 2013 armv6l GNU/Linux". No more crashes, no more headache, no more driving miles away from home every 2 days to reboot the Pi.

I was NOT able to capture any error messages on the original/community kernel.

FWIW, this is definitely NOT a lack-of-power related problem. I strongly suspect it's related to the USB core, modules or drivers.

popcornmix commented 10 years ago

I'm glad you've found a workaround. We can't do anything to help without any logs showing how it crashed. I'll close this issue. If anyone has a related problem and can supply dmesg logs when it crashes, then please open a new issue.

braincrash commented 10 years ago

@hid3nax, can you give me the string from rpi-firmware? So I can test it on mine? Thanks

J-e-f-f-A commented 10 years ago

I have built a phone screening system that will either lock up after a few days, or lose its USB accessories (audio, Wi-Fi and USB modem, which are all on a powered USB hub), and my program will cease to function since its interfaces to the 'real world' have gone away...

I can also consistently get it to lock up just by doing an 'aplay' of a wav file without forcing direct hardware access (i.e. without the "-D plughw:0,0" option, it locks up most of the time).
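To reproduce the two cases described above side by side, the aplay invocation can be built with and without the `-D` device option. This is only a sketch: the helper names are hypothetical, and `plughw:0,0` is simply the device string quoted in the comment, which may differ on other setups.

```python
import subprocess


def aplay_command(wav_path, device=None):
    """Build the argument list for aplay, optionally forcing an ALSA device."""
    command = ["aplay"]
    if device is not None:
        command += ["-D", device]
    command.append(wav_path)
    return command


def play(wav_path, device="plughw:0,0", timeout=None):
    """Run aplay and return its exit status.

    Raises subprocess.TimeoutExpired if playback stalls past `timeout`.
    """
    return subprocess.run(aplay_command(wav_path, device), timeout=timeout).returncode
```

Calling `play("test.wav", device=None)` exercises the default-device path that reportedly locks up, while the default arguments force the hardware device. The timeout only catches a stalled aplay process; it cannot help if the whole system freezes.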

My only preventative measure thus far has been to reboot it every couple of days to keep the system 'clean' and 'stable'.

I'm running Raspbian on a model B with the latest updates to Raspbian and the latest firmware (well, as of a week ago now).

This has been happening all along since I started this project a few months ago, and I just haven't had much time to troubleshoot it further. I had planned on trying a different Linux distribution, but haven't gotten around to that yet.

I'll try to force the issue (via aplay) and open a new ticket with my logs for analysis...

And perhaps after I create that ticket, I'll fall back to the kernel version mentioned by hid3nax so my system will be stable and reliable...

Jeff

hid3nax commented 10 years ago

@braincrash, what do you mean by a string from the firmware? At the moment I am running this firmware and it seems to be very stable (uptime reached 95 days at one point):

uname -a
Linux safira 3.6.11+ #462 PREEMPT Mon Jun 3 22:15:00 BST 2013 armv6l GNU/Linux

cat /boot/.firmware_revision
a1a99df049176671fdfd5b0f6629fc52e7c71d31
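When comparing reports like the ones in this thread, it helps to pull the kernel release and build number out of the `uname -a` line and to check that a firmware revision really is a full commit hash. The sketch below does both; the function names are hypothetical, and the pattern assumes the `Linux <host> <release> #<build> ...` layout seen in the output quoted above.

```python
import re

# Matches the start of a `uname -a` line such as the ones quoted in this thread.
UNAME_PATTERN = re.compile(r"^Linux (?P<host>\S+) (?P<release>\S+) #(?P<build>\d+) ")


def parse_uname(line):
    """Split a `uname -a` line into host, kernel release and build number."""
    match = UNAME_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised uname output: {line!r}")
    return {
        "host": match["host"],
        "release": match["release"],
        "build": int(match["build"]),
    }


def is_full_commit_hash(text):
    """True for a 40-character lowercase hexadecimal string (a git commit id)."""
    return re.fullmatch(r"[0-9a-f]{40}", text.strip()) is not None
```

With this, the stable setup reported here parses as release `3.6.11+`, build 462, and the contents of /boot/.firmware_revision pass the hash check.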

hid3nax commented 10 years ago

@J-e-f-f-A yes, this seems to be VERY related to USB. I once ran an experiment with the latest firmware: I played wav files from the SD card while the system was running from the USB drive. Surprisingly, the system crashed (no ping, no ssh, no nothing) but the music continued to play.

My suggestion for you would be to try to "downgrade" to this firmware revision: (I have pasted the code on pastie.org because GitHub converts it to a URL: http://pastie.org/private/ylyjiw2nb3ir17j8sedvmg )

The command to do that is 'rpi-update ' Then reboot.

Please report back whether that is stable for you! Thanks!

braincrash commented 10 years ago

@hid3nax, it's the commit hash. I think you have posted it above, will check it :) Thanks.

trasferetti commented 10 years ago

found some workarounds here:

http://iqjar.com/jar/raspberry-pi-rebooting-itself-when-it-becomes-unreachable-from-outside-networks/

hid3nax commented 10 years ago

That probably won't work, as the Pi's userland freezes completely: cron jobs do not execute, and you cannot access any running services on it (web, ssh, etc.).

J-e-f-f-A commented 10 years ago

My workaround is to hot-plug my USB devices every day (modem, wireless and sound card), then restart my program to reconnect to the modem after hot-plugging it. Note: hot-plugging the USB hub that they are on does NOT work for some reason. I have to hot-plug each device.

Also note that I haven't done further testing on the 'hard' lockups I get when playing audio without specifying the HARDWARE device name... This is a 'live' phone screening system, so I don't want to break it, lol.

Jeff

kprkpr commented 9 years ago

For anyone who wants to know, this is the latest commit of 3.6.11+ in Hexxeh's rpi-firmware. It took me a lot of time to find out. Maybe someone is searching the web for this ;P https://github.com/Hexxeh/rpi-firmware/tree/8234d5148aded657760e9ecd622f324d140ae891

sudo rpi-update 8234d5148aded657760e9ecd622f324d140ae891

uname -a
Linux raspberrypi 3.6.11+ #557 PREEMPT Wed Oct 2 18:49:09 BST 2013 armv6l GNU/Linux

pelwell commented 9 years ago

What are you saying about that commit? Is it good or bad?

kprkpr commented 9 years ago

Well, I'm sharing this because the comments above say that 3.6.11+ doesn't have these bugs, and I thought it would be easier to install with the commit and the command at hand.

Before that I tried the pastie link posted above, but it didn't work for me.

I'm trying it right now. I'm not sure that it works, but if this is a bug in the 3.10.y kernels, it may help. (Trying with mpd streaming; I have the same bug as described here.) Edit: wifi keeps shutting down and the rpi stops... ignore it.

ajohn370 commented 6 years ago

Having a similar issue with an RPi 3 B. The system runs smoothly, but then it freezes and I lose SSH and VNC access. The Pi is simply displaying a flash webpage that shows the status of one of our systems. It is not monitoring the system, just showing a webpage that displays the system's status. It should not be using a lot of CPU power.

I thought the freezing was due to overheating because I would see the temperature icon. The Pi is in a temperature-controlled room and I placed heat sinks on the chips. However, the Pi kept freezing: the webpage would be up but it would not actively update the stats, and I would still lose access to the Pi.

I underclocked the Pi down to 600 MHz. Now the Pi continuously updates the webpage, but I still lose access to it through SSH and VNC.

mirabilos commented 5 days ago

Same here, but without USB.

RPi 1B, only power and ethernet connected, running stock Debian/armel, but with upgraded raspi-firmware (1.20240424+ds-2).

On bullseye, it ran stable. After an upgrade to bookworm (because armel isn’t in LTS) I get occasional crashes… one when not actively doing anything (so perhaps a cronjob?), and now that I put AdGuard Home on it so the system is actually under load, about once every 1–2 days.