LMS-Community / slimserver

Server for Squeezebox and compatible players. This server is also called Lyrion Music Server.
https://lyrion.org
Other
1.2k stars 297 forks source link

CometD: stalled subscription notifications #530

Closed timbell closed 4 days ago

timbell commented 3 years ago

Testing CometD subcriptions to server status and player status notifications using libcometd.js from the material-skin plugin.

When this code path is followed on the server long-polled notifications are stalled until the next long polling timeout (60s) occurs. This is due to the remove_connection call in sendHTTPResponse.

Reproducing the problem is a bit tricky. I've managed to do it reasonably reliably using the attached test web page as follows:

NB the page refreshing is crucial to getting transport('long-polling') to be set on the $httpClient used to service the /slim/subscribe player status call.

From what I can see, the problem could be fixed in a few different ways:

The problem was reproduced with LMS running on Windows 10 (version 8.1) and Linux (version 7.9.1).

michaelherger commented 3 years ago

Is this something @CDrummond has seen, too? Are you able to reproduce this with any browser?

CDrummond commented 3 years ago

Can't say I have noticed this issue, and I do refresh the page quite often to ensure the skin reloads. I clear cache and do a full refresh/reload, so perhaps this is why I do not see it?

I have noticed, but probably not related to this issue, that notifications can be 'slow' (i.e. take a few seconds to show), which is why Material manually requests a status update after most actions.

timbell commented 3 years ago

Reproduced with Chrome + Firefox on Windows and Chrome on Android. I'm pretty sure it's not a client issue - the page refreshing just seems to be a reasonably reliable way to get the $delayedResponse = 0 code path to be followed on the server.

What I think is happening is that the $httpClient instances persist across requests and each one is associated with a particular socket. (I could be way off here - new to Perl.) When the page is opened we get, say, socket 1 used for /meta/connect long polling requests and socket 2 used for other requests (/slim/subscribe). Refreshing the page and connecting again can lead to socket 2 being used for /meta/connect long polling requests and socket 1 being used for /slim/subscribe requests.

So, the $httpClient that was initially being used for the long-polling requests (and so has transport('long-polling') set) is now being used for the /slim/subscribe requests. And that's how we get to execute the $delayedResponse = 0 code path. (There could be other ways to get there too.)

michaelherger commented 3 years ago

Thanks for your test files, @timbell. I believe I have been able to reproduce the issue: sometimes after reloading a web UI using long-polling, it would take a time out on the long-poll before the connection would re-establish. Therefore after a page reload we can have up to 60 seconds before the notifications are reaching the web UI. Is this correct? Are there any other limitations I haven't been able to experience?

timbell commented 3 years ago

Yes, @michaelherger, that's right. (There may be other ways that can cause the $delayedResponse = 0 code path in Slim::Web::Cometd::handler to be followed, however.)

A simple fix might be to clear the transport in Slim::Web::Cometd::handler like this:

    # reset transport (especially for the long-polling case)
    $conn->[HTTP_CLIENT]->transport(  ' '  );

    my $clid;
    my $events = [];
    ...
github-actions[bot] commented 2 weeks ago

:warning: This issue is stale because it has been open for 720 days with no activity. Please chime in if you want to keep it alive.

github-actions[bot] commented 4 days ago

:no_entry: This issue was closed because it has been inactive for 14 days since being marked as stale.