sentriz / gonic

music streaming server / free-software subsonic server API implementation
ircs://irc.libera.chat/#gonic
GNU General Public License v3.0
1.5k stars 105 forks source link

jukebox locks and shuffling #411

Closed niawag closed 7 months ago

niawag commented 8 months ago

gonic version: 0.16.1 if from docker, docker tag: latest arm64

Hi, I'm using gonic to stream music to DSub and to play music locally with the jukebox and it's really awesome ! I'm having an issue with the jukebox mode: after some time (don't know how much) my phone lose the connection to the server. I can still stream songs to my phone but I can't control the server. If it's playing songs it keeps playing them till the end of the playlist. It seems it occurs when a big playlist (a few hundred songs) is played but I may not have noticed it other times.

Gonic logs (docker logs gonic) are empty but working (the logs are populated if i run a scan from the webui). I activated mpv logs and can't find anythin relevant, as I've said, the songs keep being played locally ant mpv logs show that.

Here is my docker compose setup:

gonic:
    container_name: gonic
    platform: linux/arm64 #necessary because on coreelec, the os is 32 bits but only arm64 images works
    image: sentriz/gonic:latest
    environment:
      - TZ=Europe/Paris
      - GONIC_HTTP_LOG=false
      - GONIC_JUKEBOX_ENABLED=true
      - GONIC_PLAYLISTS_PATH=/playlists
      - GONIC_MULTI_VALUE_ALBUM_ARTIST=multi
      - GONIC_JUKEBOX_MPV_EXTRA_ARGS="--log-file=/data/mpv_logs"
    ports:
      - 3080:80
    volumes:
      - ./gonic/data:/data
      - /storage/mnt/multimedia/mp3/_Playlists/:/playlists
      - /storage/mnt/multimedia/mp3/:/music:ro
      - ./gonic/cache:/cache 
      - /run/pulse/:/run/pulse/

I don't recall having this issue with gonic 0.15, I'm willing to test it but I don't know how to store the logs as they seem to disappear from docker logs after a bit of time. Do you know what I could do ?

sentriz commented 8 months ago

strange . are you running behind a reverse proxy?

niawag commented 8 months ago

Yes, gonic is behind a traefik proxy running on another server. The config is pretty simple:

http:
    routers:
      gonic-rtr:
        rule: "Host(`gonichostname.wtf`)"
        entryPoints:
          - https
        service: gonic-svc
        tls:
          certResolver: dnsovh
    services:
      gonic-svc:
        loadBalancer:
          servers:
            - url: "http://192.168.0.99:3080"

Le dim. 12 nov. 2023, 19:23, Senan Kelly @.***> a écrit :

strange . are you running behind a reverse proxy?

— Reply to this email directly, view it on GitHub https://github.com/sentriz/gonic/issues/411#issuecomment-1807203045, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABLLWLQPN4ER5PYKDGQOZQ3YEEHZ3AVCNFSM6AAAAAA7IFQLASVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMBXGIYDGMBUGU . You are receiving this because you authored the thread.Message ID: @.***>

sentriz commented 8 months ago

@lxea any ideas on this guy?

niawag commented 7 months ago

Hi, I'm trying to debut it on my side but it's a bit hard as I don't know how long it may take before DSub loses the connection to the jukebox. I haven't noticed anything strange in mpv logs (no mention of error, bug, warn...) but there is a lot of info in the logs and it's like looking for a needle in a haystack. I noticed that, once bugged, DSub keeps displaying the same song with an impossible timer. A song lasting 3 minutes can display a play time over 30 minutes (and counting). I'll keep trying to understand the cause of that bug, I thought it was missing items in the playlist but no.

sentriz commented 7 months ago

when you notice that DSub has a counter which is way too big, maybe you can try ask gonic what the current jukebox status is. in a browser, try vist

http://GONIC_IP:PORT/rest/jukeboxControl.view?u=USERNAME&p=PASSWORD&c=browser

you should see a "position" parameter

image

and this is the value DSub uses, maybe that will be insightful

niawag commented 7 months ago

The rest api gives correct results when everything is ok but stop responding when connection to jukebox is gone. It weirdly doesn't seem to timeout (test with hostname, ip:port on browser and ip:port with curl). Here is the results from curl:

curl -vvv "http://192.168.0.99:3080/rest/jukeboxControl.view?u=niawag&p=mypasswordredacted&c=browser"
*   Trying 192.168.0.99:3080...
* Connected to 192.168.0.99 (192.168.0.99) port 3080 (#0)
> GET /rest/jukeboxControl.view?u=niawag&p=mypasswordredacted&c=browser HTTP/1.1
> Host: 192.168.0.99:3080
> User-Agent: curl/7.81.0
> Accept: */*
>

Absence of timeout seems normal for curl (https://unix.stackexchange.com/questions/94604/does-curl-have-a-timeout): connection is ok, and it's stuck at waiting for an answer. The rest (ahah) of the rest api seems ok as I can trigger a scan using .../rest/startScan.view...

The playlist I'm getting (with echo '{"command": ["get_property", "playlist"]}' | /opt/bin/socat - "$jukebox_socket" | sed 's/{/\'$'\n{/g' > mpv.playlist) is normal, even when jukebox connection is gone. The list is ok and the song playing is the right one. I can also play/pause/prev/next with the ipc socket so mpv seems to be ok.

I also tried to change the playlist position of the song that where on when the jukebox lost its connection (after a container restart). It played it without issue, no jukebox connection loss.

Is there a way to get more information about the status of jukeboxControl.view as it seems that it might be related to my issue ? Is there a list of rest API commands that I could try out ?

niawag commented 7 months ago

Two times in a row it lost the connection after the 26th track of the playlist (same playlist but suffled each time). I went directly to the 25th track and it didn't break after the 26th so it might be after a certain amount of tracks.

sentriz commented 7 months ago

hmm maybe try strace the gonic PID when its not responding. see if it's locked up or trying to connect to a socket and not responding

or I can try reproduce it and do the same too

niawag commented 7 months ago

Thanks I'll try that and report the results back. Just to be sure I'm not doing anything wrong:

niawag commented 7 months ago

The bug does not seem to be intercepted by strace, this is the result of strace -p 7 -t -o data/strace_gonic_pid_bug.log (before and after the bug, no change): 11:26:13 futex(0x1266828, FUTEX_WAIT_PRIVATE, 0, NULL <detached ...>

sentriz commented 7 months ago

hey i just pushed two commits related to this. could you try again with the :nightly tag?

niawag commented 7 months ago

Thanks grabbed and testing. A bit soon to know if the problem is fixed but another rose: the tracklist in dsub is not in sync with the tracks played on the server when it's shuffled. The tracklist is one track ahead of the server. And if you shuffle the tracklist on one phone and open dsub on another phone, the tracklist is shuffled completely differently. If you click on a track from phone 2, the track that'll played is the one with the same position on phone 1.

niawag commented 7 months ago

The problem is still there, connection with jukebox is not possible after a bit of time. jukeboxControl.view is not responding.

sentriz commented 7 months ago

I can reproduce the shuffling issue. that should be an easy fix

the locking is really annoying though. i was only able to reproduce it once and it took like 30 mins

I might try adding some mechanism to print the stacktraces of each goroutine when the jukebox is unresponsive. to see where it's stuck. I thought it was the jukebox mutex but maybe its the mpvipc library

niawag commented 7 months ago

Good for the shuffling problem.

About the ipc socket I know that it still work because I can play/pause (and other controls ) using it with a remote or bash scripts. I don't know if it's enough to discard a problem on the ipcmpv library though.

sentriz commented 7 months ago

shuffling problem should be fixed. or at least using it with dsub is a little more sane now. nightly release is building

as for the locks, i think the next time your or i can reproduce it, pressing Ctrl+\ in the console should print the state of all current goroutines which should help a lot with debugging

not sure if that's the case with docker though

niawag commented 7 months ago

Hi, the issue is still there at the 25-26th song, I've send the SIGQUIT signal to the container to emulate ctrl+\ key using this command and it works: docker kill --signal="SIGQUIT" gonic Here are the generated logs: gonic.log

sentriz commented 7 months ago

nice thanks for the log, I'll check it out. is the shuffling fixed?

niawag commented 7 months ago

Well, I'm thanking you for putting this much effort to solve this!

I can't know about the suffling issue for now as I'm not home, I can debug it remotely as dsub shows a track position of 14:38/2:40 when the connection is lost but I can't listen to the song remotely. I should be able to test it tonight.

niawag commented 7 months ago

The shuffling issue is gone, thanks!

Edit : the shuffling issue is still there but it occurred after I added an album to the playtrack and shuffled again

sentriz commented 7 months ago

this shuffling stuff is a little strange. i have question posted about it here https://github.com/opensubsonic/open-subsonic-api/discussions/71

niawag commented 7 months ago

Yeah, it's strange! I'm not completely sure about it but I think that versions prior v16 were OK. No loss of jukebox connection nor shuffling problems. If you want I can test it again to help out.

sentriz commented 7 months ago

very strange indeed, there were no jukebox changes from v0.15.2->v0.16.0

sentriz commented 7 months ago

ok i think i finally have the shuffling sorted. it seems to work in all the cases i've tried. would you mind trying again? thanks!

https://github.com/sentriz/gonic/actions/runs/6985093069

niawag commented 7 months ago

Waiting for the build to finish and test it. I thought the ipc socket changed between v15 and 16, was it earlier? In v15 the socket was a random name, it's always the same name since v16

sentriz commented 7 months ago

youre right the socket name changed, but no functionality that i know of changed

niawag commented 7 months ago

Testing the last build now and it seems to be working, I'll test it more tomorrow

sentriz commented 7 months ago

ah i think i figured it out. when the jukebox starts, it creates an event listener so that it can know when mpv says it's okay to seek through a track

but it looks like these listeners as supposed to be throwaway, not long lived. each one comes with a buffer size of 256 messages. when the next track in a playlist is played, it emits about 10 events (1 of which is hopefully the seekable event). and 256/10 is 25.6! to it makes sense it would hang after 25 tracks

i've pushed a commit that makes these listeners on-demand, and throws them away shortly after. hopefully this is the fix

let me know how it goes (and about the suffling too) thanks! https://github.com/sentriz/gonic/actions/runs/6990989776

niawag commented 7 months ago

A huge thanks, everything seems to be working great, jukebox connection and shuffling. That was some nasty bug, congrats on finding and solving it!

sentriz commented 7 months ago

very nice! and thanks very much for the $10, and also your help and patience in debugging this - much appreciated :sunglasses: :+1: closing then for now, but feel free to reopen if you notice any other weird behaviour