Closed The-Exnor closed 9 years ago
@f1vefour @CurlyMoo
So my test unit, which has Kodi stopped, has so far not crashed or panicked (3.18.8+, with all the updates as of yesterday). I notice some "lag" on the SSH console that was not there before this kernel, but so far it has passed the stress tests I made (reaching 48h without a panic).
The other unit (where I can still watch a movie) freezes/crashes Kodi when left alone (so typically I have to kill/start Kodi twice a day on this unit).
Can this be an issue with Kodi itself or any of the services supporting it?
Does this test unit run as download box with lots of IO?
@CurlyMoo No... it's idling with top running... but I did test moving big files over WLAN and it did not crash.
Can you do a IO / CPU stresstest somehow for an extended period of time?
Yes... how do you propose i do that? Besides Kodi what else can use the CPU to the max?
I would suggest googling for a nice script...
http://stackoverflow.com/questions/2925606/how-to-create-a-cpu-spike-with-a-bash-command
For example.
So this should work:
fulload() { dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null & }; fulload; read; killall dd
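For an unattended run, the same idea can be wrapped so that it stops by itself after a fixed time. This is only a sketch: the function name cpu_burn and its two arguments are made up for illustration, and it assumes nproc from GNU coreutils is available. Like the one-liner above, it only copies zeros through memory and does not touch the disk.

```shell
# Hypothetical helper (not from the thread): keep N workers busy for a
# fixed number of seconds, then stop them again.
# Usage: cpu_burn [seconds] [workers]
cpu_burn() {
    burn_seconds="${1:-60}"
    burn_workers="${2:-$(nproc)}"   # default: one worker per core
    burn_pids=""
    i=0
    while [ "$i" -lt "$burn_workers" ]; do
        # Each dd copies zeros to /dev/null: CPU and memory load only.
        dd if=/dev/zero of=/dev/null bs=1M 2>/dev/null &
        burn_pids="$burn_pids $!"
        i=$((i + 1))
    done
    sleep "$burn_seconds"
    # Word splitting of the PID list is intended here.
    kill $burn_pids 2>/dev/null || true
    wait $burn_pids 2>/dev/null || true
    echo "ran ${burn_workers} workers for ${burn_seconds}s"
}
```

Something like cpu_burn 3600 would load all cores for an hour without needing a keypress or killall at the end.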
@CurlyMoo just started Kodi on that unit and it crashed in less than 5 min... it started sluggish, then input became even slower and finally it froze...
@Smultie thanks mate, going to try that one... I want to see if this is related to Kodi itself.
The script from @smultie just tests memory IO, not disk IO.
It stresses all cores, which was the goal, right?
The goal is disk io :)
I quote (you) Can you do a IO / CPU stresstest somehow for an extended period of time?
It does 50% of that.
Why don't you just post a script so we can be sure we do the right thing?
Because i would have to search for it as well.
Fair enough ;)
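For the disk half of the test, a rough sketch could look like the following. The function name disk_burn and its arguments are hypothetical, not something posted in this thread. It relies on dd's conv=fsync, which flushes the written data to the device before dd exits, so the writes really reach the SD card or USB disk instead of stopping in the page cache.

```shell
# Hypothetical helper (not from the thread): repeatedly write a file of
# zeros and force it out to the storage device.
# Usage: disk_burn [directory] [MiB per round] [rounds]
disk_burn() {
    burn_dir="${1:-.}"
    burn_mib="${2:-256}"
    burn_rounds="${3:-10}"
    burn_file="$burn_dir/disk_burn.$$"
    r=0
    while [ "$r" -lt "$burn_rounds" ]; do
        # conv=fsync: physically write the data before dd returns.
        dd if=/dev/zero of="$burn_file" bs=1M count="$burn_mib" conv=fsync 2>/dev/null || return 1
        r=$((r + 1))
    done
    rm -f "$burn_file"
    echo "wrote ${burn_mib} MiB x ${burn_rounds} in ${burn_dir}"
}
```

One caveat: on a filesystem with transparent compression, blocks of zeros compress to almost nothing, so reading from /dev/urandom instead of /dev/zero may give a more realistic write load, at the cost of extra CPU time.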
Found a nice Linux program for that. It's called "stress".
Here is the output of 60 seconds of running (oh, and no crashes since I stopped Kodi):
stress -v -t 60 -c 1
stress: info: [3357] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: dbug: [3357] using backoff sleep of 3000us
stress: dbug: [3357] setting timeout to 60s
stress: dbug: [3357] --> hogcpu worker 1 [3358] forked
stress: dbug: [3357] <-- worker 3358 signalled normally
stress: info: [3357] successful run completed in 60s
Here is a 180 second test with IO and CPU:
stress: info: [3375] dispatching hogs: 1 cpu, 4 io, 0 vm, 0 hdd
stress: dbug: [3375] using backoff sleep of 15000us
stress: dbug: [3375] setting timeout to 180s
stress: dbug: [3375] --> hogcpu worker 1 [3376] forked
stress: dbug: [3375] --> hogio worker 4 [3377] forked
stress: dbug: [3375] using backoff sleep of 9000us
stress: dbug: [3375] setting timeout to 180s
stress: dbug: [3375] --> hogio worker 3 [3378] forked
stress: dbug: [3375] using backoff sleep of 6000us
stress: dbug: [3375] setting timeout to 180s
stress: dbug: [3375] --> hogio worker 2 [3379] forked
stress: dbug: [3375] using backoff sleep of 3000us
stress: dbug: [3375] setting timeout to 180s
stress: dbug: [3375] --> hogio worker 1 [3380] forked
stress: dbug: [3375] <-- worker 3376 signalled normally
stress: dbug: [3375] <-- worker 3378 signalled normally
stress: dbug: [3375] <-- worker 3380 signalled normally
stress: dbug: [3375] <-- worker 3377 signalled normally
stress: dbug: [3375] <-- worker 3379 signalled normally
stress: info: [3375] successful run completed in 180s
Some results from my side:
xbian@xbian ~ $ stress --cpu 4 --io 4 --vm 4 --hdd 4 --timeout 1m
stress: info: [7602] dispatching hogs: 4 cpu, 4 io, 4 vm, 4 hdd
stress: FAIL: 7602 <-- worker 7617 got signal 9
stress: WARN: 7602 now reaping child worker processes
stress: FAIL: 7602 <-- worker 7605 got signal 9
stress: WARN: 7602 now reaping child worker processes
stress: FAIL: 7602 failed run completed in 7s
xbian@xbian ~ $ stress --cpu 4 --timeout 1m
stress: info: [7627] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd
stress: info: [7627] successful run completed in 60s
xbian@xbian ~ $ stress --io 4 --timeout 1m
stress: info: [7659] dispatching hogs: 0 cpu, 4 io, 0 vm, 0 hdd
stress: info: [7659] successful run completed in 60s
xbian@xbian ~ $ stress --vm 4 --timeout 1m
stress: info: [7876] dispatching hogs: 0 cpu, 0 io, 4 vm, 0 hdd
stress: FAIL: 7876 <-- worker 7879 got signal 9
stress: WARN: 7876 now reaping child worker processes
stress: FAIL: 7876 <-- worker 7880 got signal 9
stress: WARN: 7876 now reaping child worker processes
stress: FAIL: 7876 failed run completed in 2s
xbian@xbian ~ $ stress --hdd 4 --timeout 1m
stress: info: [7884] dispatching hogs: 0 cpu, 0 io, 0 vm, 4 hdd
stress: info: [7884] successful run completed in 60s
xbian@xbian ~ $ stress --vm 4 --timeout 1m
stress: info: [7920] dispatching hogs: 0 cpu, 0 io, 4 vm, 0 hdd
stress: FAIL: 7920 <-- worker 7924 got signal 9
stress: WARN: 7920 now reaping child worker processes
stress: FAIL: 7920 failed run completed in 18s
xbian@xbian ~ $ stress --vm 3 --timeout 1m
stress: info: [7932] dispatching hogs: 0 cpu, 0 io, 3 vm, 0 hdd
stress: FAIL: 7932 <-- worker 7935 got signal 9
stress: WARN: 7932 now reaping child worker processes
stress: FAIL: 7932 failed run completed in 3s
xbian@xbian ~ $ stress --vm 2 --timeout 1m
stress: info: [7938] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
stress: info: [7938] successful run completed in 60s
Note that every run with --vm greater than 2 fails: the workers get killed with signal 9.
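Signal 9 on the --vm workers may simply be the kernel's out-of-memory killer rather than a real crash, since each stress vm worker allocates 256MB by default (see --vm-bytes). That is only a guess for this box, but it is easy to check in the kernel log. A small sketch follows; the function name oom_lines is made up, and the patterns are the usual wording of OOM-killer messages, not lines taken from this thread.

```shell
# Hypothetical helper: print the log lines that look like OOM-killer
# activity. Reads log text from stdin, so it works with dmesg output
# or with a saved log file.
oom_lines() {
    # grep exits non-zero when nothing matches; that is not an error here.
    grep -i -E 'oom-killer|out of memory|killed process' || true
}

# Typical use on the affected box, right after a failed stress run:
#   dmesg | oom_lines
```

If this prints lines naming stress, the --vm failures are most likely just memory exhaustion, and running with a smaller --vm-bytes value should let more workers survive.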
Well, the system only crashes when, and every time, I use Kodi... any ideas @CurlyMoo @f1vefour?
Can you try installing previous versions of Kodi? Preferably 13.2?
Ok... you mean XBMC ;). I don't think I need to go as far as that... before the update that started all this mess everything was fine. But I will try to install 13.2.
@CurlyMoo So I burned an old img I had as a backup (XBMC 13) and everything works fine. It does not crash/freeze or panic (kernel is 3.14, I think), but it's not updatable and almost all of the add-ons I need are not compatible anymore :/
Now I've tested again on the same RPi with the other SD card, and the current version still freezes Kodi after some time. BUT without Kodi running it stays stable now.
Since I've run all the stress tests I know, could it be that some of the services Kodi uses are making this happen? (Note that for this test I used no OC, so ARM @ 700 etc.)
And if you downgrade Kodi as i asked before?
@CurlyMoo well i didn't downgrade from the current img... How do i do that?
Searching is your friend: "Apt downgrade".
Done that... i need to know the name of the package to install...
dpkg --get-selections | grep xbian
@The-Exnor
try this apt-get install xbian-package-xbmc=14.1-1423177674
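To keep a later apt-get upgrade from pulling the newer package back in, the downgraded version can also be put on hold. A sketch of the sequence is below; the wrapper pin_version is hypothetical, while apt-cache policy, apt-get install pkg=version and apt-mark hold are standard apt commands. By default it only prints what it would run.

```shell
# Hypothetical helper: show (or run) the commands for going back to a
# known-good package version and keeping it there.
# Usage: pin_version <package> <version> [run]
pin_version() {
    pin_pkg="$1"
    pin_ver="$2"
    pin_mode="${3:-dry}"
    for pin_cmd in \
        "apt-cache policy $pin_pkg" \
        "apt-get install $pin_pkg=$pin_ver" \
        "apt-mark hold $pin_pkg"
    do
        if [ "$pin_mode" = "run" ]; then
            # Word splitting of the command string is intended here.
            sudo $pin_cmd || return 1
        else
            echo "$pin_cmd"
        fi
    done
}

# Dry run with the version mentioned above:
#   pin_version xbian-package-xbmc 14.1-1423177674
```

apt-cache policy lists the versions the repositories still offer, so it is worth checking its output before installing; apt-mark unhold releases the package again once a fixed version is out.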
I've had the same issues with the most recent version of Kodi over the last week or so, as well as CEC not responding properly (taking up to 10 minutes to start working after switching on the TV or switching to the Pi's source). It also crashes with a kernel panic, with both HD and SD content, locally or over SMB, if media has been paused by swapping source on the TV and then switching back to Kodi, or even if Kodi has been left idle on pause for half an hour or so. That was previously OK: until Kodi was updated, the box had been running for over 4 months with no issues.
I have now switched back to the older version of Kodi above and everything seems to be working fine for the last 24 hours, with no crashes or issues. I'm not 100% sure yet and will get back in a day or two, but I think it could be an issue with Kodi 14.2 doing something strange. I have a clone of the card I use for watching media on another Pi, running a download server with loads of I/O and quite a bit of CPU usage unpacking rar files, with Kodi disabled on startup, and it ran solidly with no crashes. Once I swapped the Pis round, started Kodi and disabled the download services on the spare, it crashed, and did the same again after a few hours.
@CurlyMoo going to try on one of the units
@bairdy Thanks mate, going to try this on the other unit, and yes, I noticed that the problems appear to be related to Kodi. With Kodi turned off the system can handle stress tests and even moving really big files over LAN without issues (even on the 3.18.8+ kernel).
My question to @CurlyMoo and @f1vefour is if its possible that one of the necessary services/programs that are included might be causing the instability.
It still tells me IO is a likely cause. A new version of Kodi might raise the IO more than others?
@f1vefour
tim, can you just try this config ? https://github.com/xbianonpi/xbian-package-kernel/raw/master/extra-files/rpi-3.18.y/.config.try
@bairdy
the "idle" -> "resume" crashing won't be related - this looks like a regression (or something new that leads to the same problem) in 14.x. After @Smultie asked, I tested on imx6 and see the same against vanilla NFS (Debian Linux, NFS3). In the 16 months I remember of imx6, this wasn't happening before.
(Actually, with my setup / devices I don't remember that even from the RPi.) The only proof that it is outside the kernel would be an RPi1 image from Oct/Nov last year running the kernel to be tested. That way we exclude two other significant factors, one being the firmware, which was for sure about 80% rewritten with the introduction of the RPi2.
raspberry has (and always had) HDMI, CEC, all this code in FW (as blobs).
A short update from my side. A week ago I installed the 'stable' version of XBian via the download tool. After that I've updated everything except for the kernel (which is now still 3.17.7-ck2+). I haven't had any complete freezes except that Kodi doesn't respond to CEC after a while (and sometimes freezes). But a restart of Kodi fixes that. In the mean time I was able to download and copy stuff around.
Is there anything I can test from my side with this configuration?
@CurlyMoo @f1vefour @mk01 @bairdy
Well, the revert to an older Kodi is working better so far... But it still gets sluggish sometimes (though not as bad as 14.2).
@CurlyMoo you say it must be related to I/O, but how can it be if I've tested all the stress scenarios I can think of and was unable to reproduce either a crash or a panic... furthermore, before the update that started all of this, neither of my units ever had any issues of this type, and I do push them hard. And which part of the IO are you referring to? CPU to RAM; SoC to SD card; IO access to the USB controller; data handling inside the CPU/GPU/logic of the SoC? If your theory is correct then I have 2 faulty RPi units from different batches... and it's quite a coincidence that they both got the same problem at the same time with the same software update.
My theory is that some part of the updated software is unintentionally creating this scenario. I don't think it is Kodi alone, because I still got random panics running BTRFS scrubs with Kodi not running (but not every time I ran one...). But since the day this situation occurred, some other software has been updated in the Raspbian repos and some in XBian, and now panics are extremely rare and so far only happen with Kodi running.
This is all very frustrating... Old imgs with XBMC (13.x) run without these issues, but if I stick with that I can't update the software...
I appreciate all the efforts you guys make. Sorry for the rant.
Because of all the users reporting issues, (almost) all used it as a download box as well. I, for example, never had issues, but I just use XBian for Kodi and have all files on an NFS share. So hardly any disk IO, just some network IO when I watch a movie.
@CurlyMoo
Ok... I also use XBian for Kodi alone (I do not use it as a download computer because I want it to be just an HTPC and that's it). I do use it for very high bit rate AVC files (.mkv container with an average 15Mbit/s AVC stream and an AAC, DTS or Dolby audio stream), and the units never ever (even with the update) crashed while playing a file over NFS shares or from a USB drive. The crashes/freezes and panics all happened with NO network or USB data transfer going on.
I also transferred a 4GiB file twice from my NAS to both units to test that part (as I mentioned in a previous post) and got no crash or panic...
When I'm not watching a movie/series the units stay on, doing nothing more than running Kodi at idle.
I wish i could debug this better...
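One cheap way to get more data out of a box that freezes is to leave a small logger running that writes a status line every few seconds and flushes it to disk, so the last line before the freeze survives the power cycle. The sketch below is only an illustration: the function name watchlog, the log format and the defaults are all made up, and it assumes a Linux /proc filesystem.

```shell
# Hypothetical helper: append one status line per interval and sync it
# to disk, so the state just before a freeze can be read after reboot.
# Usage: watchlog [file] [interval seconds] [count, 0 = until stopped]
watchlog() {
    wl_file="${1:-watch.log}"
    wl_interval="${2:-10}"
    wl_count="${3:-0}"
    wl_n=0
    while [ "$wl_count" -eq 0 ] || [ "$wl_n" -lt "$wl_count" ]; do
        # 1, 5 and 15 minute load averages, joined with slashes.
        wl_load=$(cut -d' ' -f1-3 /proc/loadavg 2>/dev/null | tr ' ' '/')
        wl_free=$(awk '/^MemFree:/ {print $2}' /proc/meminfo 2>/dev/null)
        echo "$(date '+%F %T') load=${wl_load:-?} memfree_kb=${wl_free:-?}" >> "$wl_file"
        sync    # push the line to storage before the next sleep
        wl_n=$((wl_n + 1))
        sleep "$wl_interval"
    done
}

# Example: log every 5 seconds in the background from an SSH session:
#   watchlog /home/xbian/watch.log 5 &
```

Comparing the last logged load and free memory of a run that froze with one that did not might at least show whether the box was starving before Kodi stopped responding.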
@bairdy "try this apt-get install xbian-package-xbmc=14.1-1423177674"
Mate thanks for this. i'm now using it for 2 days and no more kernel panics :)
Kodi still crashes and burns sometimes, but it restarts automatically (I assume this is a setting on the service?). Apart from that the OS is now, as far as I can tell, stable.
Note for all that this unit is running with LZO settings and 3.18.8+ Kernel.
@CurlyMoo @mk01 @f1vefour
I updated my other unit last night (22h UK time) and now it's stuck at boot, loading the X libraries... Any ideas on this?
@mk01 I am in the middle of moving and starting a new job, someone else will have to take this on for a while as I have no time. Sorry :(
@f1vefour are you leaving the project?
@The-Exnor "...take this on for a while" ....
He'll be back ;)
@bairdy @CurlyMoo @mk01 @f1vefour
The unit running xbian-package-xbmc=14.1-1423177674 has now been up for 3 days with ZERO Linux crashes/panics (Kodi's other issues still persist, but the OS problem appears to be gone) (kernel 3.18.8+, LZO on 2nd partition).
As of this morning I've also reverted to this Kodi version on my bedroom unit... let's see if it holds.
I'm not leaving, just have to take a bit of time off until things settle.
@f1vefour ok mate :)
So far no more Kernel Panics.
Tbh: I'm running the latest packages and haven't seen a kernel panic for ~ a week I think.
@CurlyMoo @f1vefour @Smultie @bairdy
Reporting ZERO panic or crashes from the OS so far. Still using Kodi 14.1. All other parts are updated.
@CurlyMoo @f1vefour and everyone else.
Reporting that there are still no more problems on the OS side... I think you can close this thread. Thanks for all the help, guys.
Again, I've updated everything. The only thing that freezes is Kodi; my RPi itself keeps running. I also have a dmesg which you can find here: http://pastebin.com/ntCweJQ6
Maybe it's good to know that I'm only using the RPi to watch movies which are stored on my USB HDD.
Kodi 14.1 does not freeze on me anymore, but its performance is erratic compared to 13 (XBMC).
For me the Kodi freezes seem related to the screensaver. Since I disabled the screensaver, no more Kodi problems. I had this issue http://forum.xbian.org/thread-2912.html but now (without screensaver) Kodi has been up for a couple of days.
What resolved your issue @The-Exnor?
@f1vefour I don't have any conclusive idea... 8 days ago (or was it more?) I reinstalled both units using LZO compression for the OS partition, disabled some services I don't use (LIRC, Avahi), reverted Kodi to the one suggested above (14.1.x) and fully updated to this day, and no more panics...
I speculate that the problem was never in the 3.18.8+ kernel but in some process running in the background (not Kodi, because as I stated the panics occurred even with Kodi not running). Since the original problem a lot of software has been updated on a regular basis, and I can only think that one of the updated programs/services/whatever was in fact the culprit. Still, Kodi is randomly sluggish compared to XBMC... and as of today I'm running 14.2 with the same issues.
@Fabio72 yep, removing the screensaver was my 1st action back when I got the first few freezes... Kodi does not freeze/crash anymore, but its performance is way below what XBMC (13) had. I get random "slowdowns" and sluggish input and I can't pinpoint the source of the problem... The truth is that it only happens when Kodi is running (the system gets slow even over SSH with Kodi on).
@The-Exnor yes, Kodi has some performance issues for me too: slowdowns or short freezes when navigating menus or browsing NFS. Playback is still fine. The only things I could see in dmesg are:
hrtimer: interrupt took 54000 ns
but that happens once a week, not more. And:
Apr 4 09:47:16 xbian kernel: [262645.634151] INFO: rcu_preempt detected stalls on CPUs/tasks: {} (detected by 0, t=21002 jiffies, g=6630648, c=6630647, q=1052)
Apr 4 09:47:16 xbian kernel: [262645.634187] INFO: Stall ended before state dump start
but that happened only once.
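To see whether these warnings really are as rare as they look, the kernel log can be counted instead of eyeballed. A small sketch, with the caveat that the function name stall_summary is made up and that it only matches the two message types quoted above:

```shell
# Hypothetical helper: count RCU stall reports and hrtimer warnings in
# kernel log text read from stdin.
stall_summary() {
    awk '
        /rcu_preempt detected stalls/ { rcu++ }
        /hrtimer: interrupt took/     { hr++ }
        END { printf "rcu_stalls=%d hrtimer_warnings=%d\n", rcu, hr }
    '
}

# Typical use:
#   dmesg | stall_summary
#   stall_summary < /var/log/kern.log   # log path may differ per system
```

Running it once a day and comparing the numbers with the times Kodi felt sluggish would show whether the two are related at all.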
As requested by f1vefour, here is a picture of the kernel panic that keeps recurring on 2 units (http://forum.xbian.org/thread-2783-page-3.html) and (http://forum.xbian.org/thread-2827.html).
Picture (http://s10.postimg.org/y6r0r3mnd/Kernel.jpg)
Do you have any idea, @mk01?
Thanks.