Closed GoogleCodeExporter closed 9 years ago
Which version of mp3 codec? Can you reproduce? Is the file local and is the
LocalPlayer plugin installed?
Original comment by trio...@btinternet.com
on 19 Feb 2013 at 9:29
It's looping currently... Can't reproduce at will but seems to happen more
frequently than with rc2 (luck?)
The mp3 codec is libmpg123-0 version 1.12.1-3.2ubuntu1 (latest)
file is local (on usb disk ntfs formatted)
I don't know what the LocalPlayer plugin is so I'm going to say no :)
I'm using Logitech Media Server Version: 7.7.2 - r33893
Here is the output of pstack (if that helps):
5814: /opt/squeezelite/squeezelite-i386 -z -o hw:CARD=D1,DEV=0 127.0.0.1
(No symbols found in )
(No symbols found in /usr/lib/i386-linux-gnu/libasound.so.2)
(No symbols found in /lib/i386-linux-gnu/libdl.so.2)
(No symbols found in /lib/i386-linux-gnu/librt.so.1)
(No symbols found in /lib/i386-linux-gnu/libc.so.6)
(No symbols found in /lib/i386-linux-gnu/libm.so.6)
(No symbols found in /lib/ld-linux.so.2)
(No symbols found in /usr/lib/i386-linux-gnu/libpulse.so.0)
(No symbols found in /usr/lib/i386-linux-gnu/libjson.so.0)
(No symbols found in /usr/lib/i386-linux-gnu/libpulsecommon-1.1.so)
(No symbols found in /lib/i386-linux-gnu/libdbus-1.so.3)
(No symbols found in /usr/lib/i386-linux-gnu/libxcb.so.1)
(No symbols found in /lib/i386-linux-gnu/libwrap.so.0)
(No symbols found in /usr/lib/i386-linux-gnu/libsndfile.so.1)
(No symbols found in /usr/lib/i386-linux-gnu/libasyncns.so.0)
(No symbols found in /usr/lib/i386-linux-gnu/libXau.so.6)
(No symbols found in /usr/lib/i386-linux-gnu/libXdmcp.so.6)
(No symbols found in /lib/i386-linux-gnu/libnsl.so.1)
(No symbols found in /usr/lib/i386-linux-gnu/libFLAC.so.8)
(No symbols found in /usr/lib/i386-linux-gnu/libvorbisenc.so.2)
(No symbols found in /usr/lib/i386-linux-gnu/libvorbis.so.0)
(No symbols found in /usr/lib/i386-linux-gnu/libogg.so.0)
(No symbols found in /lib/i386-linux-gnu/libresolv.so.2)
(No symbols found in /usr/lib/libfaad.so.2)
(No symbols found in /usr/lib/i386-linux-gnu/libvorbisfile.so.3)
(No symbols found in /usr/lib/i386-linux-gnu/libmad.so.0)
0x0087d416: _fini + 0x266936
Tried a "kill -5 <pid>" to generate a code but it doesn't seem like it worked...
Let me know if you want me to run with debugs
Original comment by danielbo...@gmail.com
on 19 Feb 2013 at 10:37
...generate a coRe
Next time I will try gcore
Original comment by danielbo...@gmail.com
on 19 Feb 2013 at 10:42
If you can reproduce, please try with -d all=sdebug (will produce lots of text,
but should show where the error is)
Original comment by trio...@btinternet.com
on 19 Feb 2013 at 10:48
Ok, running with debugs on until it loops again and will try to generate a core
next time.
Original comment by danielbo...@gmail.com
on 19 Feb 2013 at 11:01
Let it play all day yesterday with debugs on. No lock-up.
Trying today with no debugs to see if I can reproduce.
When it locked up, I might have been playing with the -p option. BTW, it
doesn't seem to change anything. No matter what value I specify, all 4 threads
seem to always be at 20 (according to htop).
Original comment by danielbo...@gmail.com
on 21 Feb 2013 at 3:28
-p option sets the real time priorty which will show up if the user you run it
as is allowed to use the real time scheduler class. This requires explicit
permissions configuration or for you to run it as root. You can see whether
the -p option worked by starting with -d output=debug
Original comment by trio...@btinternet.com
on 21 Feb 2013 at 6:23
Running with debugs did not produce any lock ups but running without did.
And unless I'm mistaking, the executable was on "pause" when it happened, this
time.
Plus, two threads were spinning: the first and third (by increasing order of
pid).
I managed to run strace and gcore on the two pids. Would you like me to upload
the cores anywhere?
The first strace generated this:
futex(0x8060618, FUTEX_LOCK_PI_PRIVATE, 1 <unfinished ...>
The second one, generated nothing.
I will re-enable the debugs now and try harder to get a lock up with them on.
Original comment by danielbo...@gmail.com
on 21 Feb 2013 at 7:50
Were you able to tell which thread was spinning at high rate? Probably the one
with no system calls? I would expect 4 threads to all be active at low rate
normally. Are you saying the other two were stopped? if so which system call
are they stopped in - is this a futex call? I wonder if the decode library is
not returning and hence holding a mutex permanently.
Please upload the core anyway - can't promise to look until the weekend.
Original comment by trio...@btinternet.com
on 21 Feb 2013 at 7:58
Out of the 4 threads, said pids 2276-2279. Only pids 2276 and 2278 were at 100%
each (dual core). The other two appeared "normal", say 0-10% cpu.
As mentioned in previous message, only thread 2276 had anything coming out of
strace:
futex(0x8060618, FUTEX_LOCK_PI_PRIVATE, 1 <unfinished ...>
The other thread (2278) returned a 0 byte file...
I will gzip the two cores and try to attach here. If it fails due to size, I'll
find a suitable place to upload them and supply the link.
No pressure, take a look when you can. I'm very happy to have squeezelite as it
stands. If I can help make it better, then good for us both (and all the others
too!)
Original comment by danielbo...@gmail.com
on 21 Feb 2013 at 9:20
Attachments:
Here is the second core (the one with the empty strace output).
Thanks,
Daniel
Original comment by danielbo...@gmail.com
on 21 Feb 2013 at 9:26
Attachments:
It just started looping with debugs on...
The file is 105MB in size but probably only the last two lines are relevant...
Here are the last lines of the file:
[17:06:50.364538] output_thread:1199 wrote 440 frames
[17:06:50.374368] output_thread:742 avail: 484 frames: 484 silence: 1
[17:06:50.374624] output_thread:1199 wrote 484 frames
[17:06:50.386075] output_thread:742 avail: 440 frames: 440 silence: 1
[17:06:50.387076] output_thread:1199 wrote 440 frames
[17:06:50.395327] output_thread:742 avail: 440 frames: 440 silence: 1
[17:06:50.395491] output_thread:1199 wrote 440 frames
[17:06:50.405468] output_thread:742 avail: 440 frames: 440 silence: 1
[17:06:50.405902] output_thread:1199 wrote 440 frames
[17:06:50.430343] output_thread:742 avail: -32 frames: -32 silence: 1
[17:12:31.325375] slimproto_stop:745 slimproto stop
Yes, I just stopped the playback and it started looping right away.
Let me know if you want more of the debug file.
Original comment by danielbo...@gmail.com
on 21 Feb 2013 at 10:21
That actually looks ok - its writing silence to the output and there are small
gaps between the timestamps which suggests its not running at 100%. Can you
look back earlier in the file to see any differnces?
Original comment by trio...@btinternet.com
on 21 Feb 2013 at 10:24
Ok, obviously I don't know what's important and what's not so I might as well
upload the whole mess to you :)
I hope you have a good storage limit for attached files... I split the file in
two parts and gzipped them. I will be able to upload each part separately (10MB
limit per post). so here goes...
Original comment by danielbo...@gmail.com
on 21 Feb 2013 at 10:39
Attachments:
...and the second one...
Original comment by danielbo...@gmail.com
on 21 Feb 2013 at 10:44
Attachments:
Just some more data for you:
I've been listening to music all day with debugs turned off (figured I gave you
enough for now!).
Everything was fine until it started playing a new song (about 5 seconds into
it) when squeezelite froze again (2 threads at 100% and 2 at 0%):
2467: 100% cpu
strace: 1 line: futex(0x8060618, FUTEX_LOCK_PI_PRIVATE, 1{waiting}
2468: 0% cpu
strace:
Process 2468 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
nanosleep({0, 100000000}, NULL) = 0
{last line repeated continuously}
2469: 100% cpu
strace: no output
2470: 0% cpu
strace: 1 line: futex(0x8060618, FUTEX_LOCK_PI_PRIVATE, 1{waiting}
When I killed squeezelite, there was a loud CLICK in the headphones.
Thanks,
Daniel
Original comment by danielbo...@gmail.com
on 25 Feb 2013 at 8:39
More data points (let me know if you want me to stop :)
Been listening to music all day, no lockups.
Just now, opening a terminal window and entering the command "man pidof" would
make squeezelite lock up in the usual manner. Got it to do it several times in
a row (5-6).
Going in Firefox and just clicking between a few tabs, the sound would
sometimes have a bit of a split-second pause and after a few tries, I got it to
lock up there too.
I swear it's the truth... :-)
Original comment by danielbo...@gmail.com
on 26 Feb 2013 at 9:10
Still can't see what it wrong - something is running in a hard loop and
producing no debug output. Are all the files mp3s? Could you try using
libmpg123 rather than libmad?
Original comment by trio...@btinternet.com
on 26 Feb 2013 at 9:40
Most of my files are in fact mp3s but I also have some flac files but I've
never had a lockup on a flac file (could just be the laws of statistics
though). I have both libmpg123 and libmad loaded. I will have to see how I can
use one instead of the other. I am now trying to increase the priority of the
playback thread so see if it could be related.
If you want me to try anything different (a version with increased debugs, a
special test version, etc.) please let me know. I will also try without the usb
dac to see if this changes anything. (I will figure out how to use libmpg123
instead of libmad first).
Original comment by danielbo...@gmail.com
on 27 Feb 2013 at 1:20
Switched to libmpg123 and noticed right away that %cpu was at 0-3% on all 4
threads instead to 2-8% with libmad.
Unfortunately, I got another lockup using libmpg123. Started noticing that the
lockups usually happen in the afternoon, after the computer's been on for 5+
hours. Looked at cron jobs, looked at logs and found nothing except for this in
syslog:
Feb 27 13:19:43 daniel-laptop pulseaudio[1828]: [alsa-sink] alsa-sink.c: Error
opening PCM device front:1: Device or resource busy
Feb 27 13:19:43 daniel-laptop pulseaudio[1828]: [pulseaudio] sink-input.c:
Failed to create sink input: sink is suspended.
I got these debugs today at 11:13, at 13:19 and at 13:51 (over and over for
about 2 seconds) and the lockup happened around 13:45 or so. Will try to
increase the output thread's priority as a next step. Unless you have better
suggestions.
Original comment by danielbo...@gmail.com
on 27 Feb 2013 at 6:55
I didn't realise you are using pulse audio - is there any option you have to
try without it. Others have had some problems with pulse - would be good to
understand if this is causing the problem in some way.
Original comment by trio...@btinternet.com
on 27 Feb 2013 at 7:01
Well, this is a standard install of Linux Mint13 so it comes with alsa and
pulse installed but I thought using -o "hw:CARD=D1,DEV=0" would circumvent most
of it. My normal system sounds are still coming out the laptop's speakers and
only the output of squeezelite is coming through the USB DAC. Perhaps these
error messages are related to the normal output? Doesn't seem to be at all
related to any squeezelite lockup.
Original comment by danielbo...@gmail.com
on 27 Feb 2013 at 8:47
Afraid I still can't correlate your data. The log files shows logging from all
threads at reasonable intervals. However you say two threads are running at
100% - if so with logging on doesn't strace produce something for those threads?
Original comment by trio...@btinternet.com
on 27 Feb 2013 at 9:53
I think that all logging simply grinds to a halt along with the music playback,
once the threads go into 100%/0% CPU. strace always shows what I've mentioned
in post #16.
I could try with increased priority and/or through the default sound output
(without the USB DAC) to try and isolate the problem?
I see in a slimdevices.com thread dated nov 2012 that some people (Mnyb) had
problems somewhat similar but not exactly with threads locking up (He saw
FUTEX_LOCK and FUTEX_UNLOCK whereas I only see the LOCK and it seems to hang
there).
I'll keep on using squeezelite and try to isolate cases where it doesn't lock
up and perhaps this will help you see what's going on...
Probably means nothing but when it's looping, a single "kill -TERM xx" will not
do it, I have to do it twice to get it to die.
If you think of anything for me to try, please let me know.
Thanks,
Daniel
Original comment by danielbo...@gmail.com
on 27 Feb 2013 at 10:11
FUTEX_LOCK should result in 0% cpu - it means that the calling thread as tried
to set a mutex, but it is already held by someone else. That's normal if
something else has the lock, what's not normal is this causing the thread to be
at 100% cpu - it should be 0....
Re post #16 can you post what the repeating strace entry is for the nanosleep
entry - is this once every 100ms or something different?
Original comment by trio...@btinternet.com
on 27 Feb 2013 at 10:30
Makes sense... I'm a programmer by trade (25 years) but I have little knowledge
(read none) of multi-thread programming...
It's hard to time the output of strace for the nanosleeps but I'd say that once
every 100ms sounds about right...
At one point, if you prefer we take this off-line, let me know and we can
communicate directly by email. You can use born_daniel at yahoo dot com if
you'd like.
Original comment by danielbo...@gmail.com
on 27 Feb 2013 at 10:40
I'm currently running squeezelite with output to default. It's so different
than when outputting to the USB DAC... First of all, 5 threads instead of 4 and
a cpu usage much higher (13+7+6+0+0=26% cpu) either paused or playing music!?
Just curious, what is your usual mode of utilization? what would be most
people's? (if you know). Am I using a rare configuration?
Maybe I'll try squeezelite on a different computer with the USB DAC...
If you feel like sharing some of your thoughts with me, I'd like to help
perhaps more than just supplying debugs. I'm used to debugging real-time,
multi-process complex systems (just not multi-threaded).
Original comment by danielbo...@gmail.com
on 28 Feb 2013 at 1:37
It's been over a week now of continuous use without a single lockup, whereas it
used to happen at least once a day previously.
I can only attribute it to a kernel upgrade (3.2.0-25 to 3.2.0-38).
Anyway, I think it is safe to simply close this issue now.
Thanks,
Daniel
Original comment by danielbo...@gmail.com
on 13 Mar 2013 at 12:27
Closing as discussed - appears to be specific issue with problem linux kernel
Original comment by trio...@btinternet.com
on 13 Mar 2013 at 8:36
Original issue reported on code.google.com by
danielbo...@gmail.com
on 19 Feb 2013 at 9:10