PromyLOPh / pianobar

Console-based pandora.com player
http://6xq.net/pianobar/
Other
1.74k stars 321 forks source link

Frequent hangs #657

Closed exarkun closed 6 years ago

exarkun commented 6 years ago

I see there are a number of issues in the tracker similar to this one. I don't know if my issue has the same underlying cause as any of them so I'm opening a new issue instead of attaching my comments/details to an existing issue.

Using pianobar on Ubuntu 17.10, either the packaged version (reports as 2017.08.30 - Ubuntu version 2017.08.30-1) or a debug-symbols-enabled version from source control (2e51a13fe816c0c0b02f7d7a19a4c739dcb66119), playback and the UI frequently hang.

Often Ctrl-C will get things moving again, but not always.

Here are stack traces from one recent hang of the custom-built version (gdb attached after the hang):

Thread 8 (Thread 0x7fb93b7fe700 (LWP 9121)):                                                                                                                  
#0  0x00007fb961370951 in __GI___poll (fds=0x7fb934001800, nfds=3, timeout=1499) at ../sysdeps/unix/sysv/linux/poll.c:29                                      
#1  0x00007fb949552451 in ?? () from /usr/lib/x86_64-linux-gnu/libpulse.so.0   
#2  0x00007fb949543e10 in pa_mainloop_poll () from /usr/lib/x86_64-linux-gnu/libpulse.so.0                                                                    
#3  0x00007fb9495444a0 in pa_mainloop_iterate () from /usr/lib/x86_64-linux-gnu/libpulse.so.0                                                                 
#4  0x00007fb949544530 in pa_mainloop_run () from /usr/lib/x86_64-linux-gnu/libpulse.so.0
#5  0x00007fb949552399 in ?? () from /usr/lib/x86_64-linux-gnu/libpulse.so.0                                                                                  
#6  0x00007fb9492f3ed8 in ?? () from /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-10.0.so                                                              
#7  0x00007fb9642537fc in start_thread (arg=0x7fb93b7fe700) at pthread_create.c:465
#8  0x00007fb96137cb5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95                                                                            

Thread 7 (Thread 0x7fb93ffff700 (LWP 9120)):                                                                                                                  
#0  0x00007fb96425a072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb9400e8b18)                                              
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88                                                                                                         
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7fb9400e8b20, cond=0x7fb9400e8af0) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7fb9400e8af0, mutex=0x7fb9400e8b20) at pthread_cond_wait.c:655
#3  0x00007fb9620bac49 in ?? () from /usr/lib/x86_64-linux-gnu/libavfilter.so.6
#4  0x00007fb9642537fc in start_thread (arg=0x7fb93ffff700) at pthread_create.c:465
#5  0x00007fb96137cb5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7fb944e20700 (LWP 9119)):
#0  0x00007fb96425a072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb9400e8b18)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7fb9400e8b20, cond=0x7fb9400e8af0) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7fb9400e8af0, mutex=0x7fb9400e8b20) at pthread_cond_wait.c:655
#3  0x00007fb9620bac49 in ?? () from /usr/lib/x86_64-linux-gnu/libavfilter.so.6
#4  0x00007fb9642537fc in start_thread (arg=0x7fb944e20700) at pthread_create.c:465
#5  0x00007fb96137cb5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7fb945621700 (LWP 9118)):
#0  0x00007fb96425a072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb9400e8b18)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7fb9400e8b20, cond=0x7fb9400e8af0) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7fb9400e8af0, mutex=0x7fb9400e8b20) at pthread_cond_wait.c:655
#3  0x00007fb9620bac49 in ?? () from /usr/lib/x86_64-linux-gnu/libavfilter.so.6
#4  0x00007fb9642537fc in start_thread (arg=0x7fb945621700) at pthread_create.c:465
---Type <return> to continue, or q <return> to quit---
#5  0x00007fb96137cb5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7fb946623700 (LWP 9117)):
#0  0x00007fb96425a072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb9400e8b18)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7fb9400e8b20, cond=0x7fb9400e8af0) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7fb9400e8af0, mutex=0x7fb9400e8b20) at pthread_cond_wait.c:655
#3  0x00007fb9620bac49 in ?? () from /usr/lib/x86_64-linux-gnu/libavfilter.so.6
#4  0x00007fb9642537fc in start_thread (arg=0x7fb946623700) at pthread_create.c:465
#5  0x00007fb96137cb5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7fb945e22700 (LWP 9116)):
#0  0x00007fb96425a072 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb9400e8b18)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7fb9400e8b20, cond=0x7fb9400e8af0) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x7fb9400e8af0, mutex=0x7fb9400e8b20) at pthread_cond_wait.c:655
#3  0x00007fb9620bac49 in ?? () from /usr/lib/x86_64-linux-gnu/libavfilter.so.6
#4  0x00007fb9642537fc in start_thread (arg=0x7fb945e22700) at pthread_create.c:465
#5  0x00007fb96137cb5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7fb947440700 (LWP 9113)):
#0  0x00007fb961370951 in __GI___poll (fds=0x7fb94743f500, nfds=1, timeout=100) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007fb96280aabc in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#2  0x00007fb96280ab42 in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#3  0x00007fb96286ec67 in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#4  0x00007fb962740a84 in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#5  0x00007fb96278506c in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#6  0x00007fb962788dde in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#7  0x00007fb962740a84 in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#8  0x00007fb962743e5d in avio_read () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#9  0x00007fb962879290 in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#10 0x00007fb9627c6e87 in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#11 0x00007fb96287e12a in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#12 0x00007fb96287ed9c in ?? () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#13 0x00007fb96287fc85 in av_read_frame () from /usr/lib/x86_64-linux-gnu/libavformat.so.57
#14 0x000055b43e26d262 in play (player=0x55b43e483188 <app+104>) at src/player.c:307
#15 0x000055b43e26d6b3 in BarPlayerThread (data=0x55b43e483188 <app+104>) at src/player.c:415
---Type <return> to continue, or q <return> to quit---
#16 0x00007fb9642537fc in start_thread (arg=0x7fb947440700) at pthread_create.c:465
#17 0x00007fb96137cb5f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7fb96461c300 (LWP 3623)):
#0  0x00007fb9613728c0 in __GI___select (nfds=1, readfds=0x7ffd54f9b1a0, writefds=0x0, exceptfds=0x0, timeout=0x7ffd54f9b190)
    at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x000055b43e274f19 in BarReadline (buf=0x7ffd54f9b256 "", bufSize=2, mask=0x0, input=0x55b43e483400 <app+736>,
    flags=(BAR_RL_FULLRETURN | BAR_RL_NOECHO | BAR_RL_NOINT), timeout=1) at src/ui_readline.c:88
#2  0x000055b43e26b589 in BarMainHandleUserInput (app=0x55b43e483120 <app>) at src/main.c:203
#3  0x000055b43e26be68 in BarMainLoop (app=0x55b43e483120 <app>) at src/main.c:389
#4  0x000055b43e26c36c in main (argc=1, argv=0x7ffd54f9b458) at src/main.c:491
PromyLOPh commented 6 years ago

If I’m reading the stack traces correctly there’s two possible scenarios here:

1) Some kind of deadlock inside ffmpeg, see threads 3–7 spawned by ffmpeg waiting inside pthread_cond_wait. 2) Connectivity issues, i.e. network connection dropped and ffmpeg is waiting indefinitely for data (thread 2)

We might be able to eliminate option 2 by reintroducing timeouts. Can you apply this patch on top of 2e51a13 please? https://gist.github.com/2ce9b9a3ed9d575b8b986b99f64db181

exarkun commented 6 years ago

My network is prone to a lot of mini-disruptions. I've applied the patch, I'll see how it goes and report back.

exarkun commented 6 years ago

This morning, with the suggested patch applied, I had a run of pianobar which didn't hang for around 50 songs. This run without hangs is significantly longer than I've experienced with 2017.08.30. My network had an obvious hiccup somewhere around the 50 song mark. pianobar appeared to recover from this after a couple seconds. The next time it went to retrieve a playlist, though, it hung at that point and did not recover. Perhaps this is an unrelated bug? I hit Ctrl-C to try to make it recover to no effect. There's not much in the way of stack trace to report:

(gdb) thread apply all bt

Thread 1 (Thread 0x7f969defd300 (LWP 22901)):
#0  0x00007f969ac538c0 in __GI___select (nfds=1, readfds=0x7fff67a51b60, writefds=0x0, exceptfds=0x0, timeout=0x7fff67a51b50)
    at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x00005621b10c6fb5 in BarReadline (buf=0x7fff67a51c16 "", bufSize=2, mask=0x0, input=0x5621b12d5400 <app+736>, 
    flags=(BAR_RL_FULLRETURN | BAR_RL_NOECHO | BAR_RL_NOINT), timeout=1) at src/ui_readline.c:88
#2  0x00005621b10bd5d9 in BarMainHandleUserInput (app=0x5621b12d5120 <app>) at src/main.c:203
#3  0x00005621b10bdeb8 in BarMainLoop (app=0x5621b12d5120 <app>) at src/main.c:389
#4  0x00005621b10be3bc in main (argc=1, argv=0x7fff67a51e18) at src/main.c:491
(gdb) 
PromyLOPh commented 6 years ago

Hm, it’s odd that pianobar doesn’t react to ^C. I guess we could try to restore timeouts for CURL as well and see how that goes: https://gist.github.com/efaceeefc4b9d0a27e9dde3735546c5e

exarkun commented 6 years ago

Okay. Trying that patch today.

exarkun commented 6 years ago

Early in my pianobar use today, with the above patch applied, it hung like this:

(i) Receiving new playlist... Network error: Timeout was reached               

Just one thread again:

(gdb) thread apply all bt

Thread 1 (Thread 0x7fe9d541c300 (LWP 20020)):
#0  0x00007fe9d21728c0 in __GI___select (nfds=1, readfds=0x7ffde39d7190, writefds=0x0, exceptfds=0x0, timeout=0x7ffde39d7180)
    at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x000055cef8893108 in BarReadline (buf=0x7ffde39d7246 "", bufSize=2, mask=0x0, input=0x55cef8aa1400 <app+736>, 
    flags=(BAR_RL_FULLRETURN | BAR_RL_NOECHO | BAR_RL_NOINT), timeout=1) at src/ui_readline.c:88
#2  0x000055cef888a6a9 in BarMainHandleUserInput (app=0x55cef8aa1120 <app>) at src/main.c:203
#3  0x000055cef888af88 in BarMainLoop (app=0x55cef8aa1120 <app>) at src/main.c:389
#4  0x000055cef888b48c in main (argc=1, argv=0x7ffde39d7448) at src/main.c:491
(gdb) 
PromyLOPh commented 6 years ago

Well, that error is expected if you add timeouts. But does that means you were not able to interact with pianobar (i.e. select different station) afterwards?

exarkun commented 6 years ago

Yea, makes sense that sometimes the playlist fetch would fail with a timeout now. What's the expected behavior after this? Is it expected that pianobar will re-try the playlist download? Or, as it sounds like you're suggesting, does it just mean pianobar remains useable and will start playing tracks from a channel if I just use th UI to select one?

Unfortunately I can't recall for sure exactly what I did when that session hung. I may have hit Ctrl-C since I've observed that to "unstick" pianobar in the past. Or, just as likely, I may have left it alone to be sure to have a chance to attach gdb. So, sorry about that, I'll be more clear about what interactions are possible in the future.

PromyLOPh commented 6 years ago

What's the expected behavior after this? Is it expected that pianobar will re-try the playlist download? Or, as it sounds like you're suggesting, does it just mean pianobar remains useable and will start playing tracks from a channel if I just use th UI to select one? If a playlist fetch fails pianobar will just idle right now. You should be able to select a new (or the same) station and play that. Obviously a real patch needs some kind of retry mechanism.

exarkun commented 6 years ago

Encountered a playlist download timeout this morning. The UI did remain responsive and I was able to re-select the station that was playing and play resumed.

PromyLOPh commented 6 years ago

The branch “timeout” contains the (hopefully) final fix, including retries. Can you give it a try before I merge it into master?

exarkun commented 6 years ago

Thanks. I've used the branch for a couple days now and it seems to be behaving well with respect to timeouts and play-resumption.

PromyLOPh commented 6 years ago

Merged into master. Thanks again!

Referencing a few issues that might be duplicates of this one: #658, #654, #623, #610

ghost commented 6 years ago

Pianobar is awesome, thanks! Do you think we could get a new release with these fixes?

PromyLOPh commented 6 years ago

Sure, tagged 2018.06.22.