nextcloud / spreed

🗨️ Nextcloud Talk – chat, video & audio calls for Nextcloud
https://nextcloud.com/talk
GNU Affero General Public License v3.0
1.63k stars 436 forks source link

Multiple sessions from different devices broken #8416

Closed SystemKeeper closed 1 year ago

SystemKeeper commented 1 year ago

How to use GitHub


Steps to reproduce

  1. Join "Conversation 1" in Web
  2. Join "Conversation 1" in Talk-iOS
  3. Web gets redirected to "Conversation not found"

Expected behaviour

Both should be able to join.

Actual behaviour

The web session is redirected to "Conversation not found" image

Talk app

Tested on sermo.

nickvergessen commented 1 year ago

Tried reverting #8265 on sermo but that did not fix it.

nickvergessen commented 1 year ago

Reverting https://github.com/nextcloud/server/pull/34934 seems to fix it 🙊

cc @juliushaertl any clue about potential side effects of your PR?

nickvergessen commented 1 year ago

PS: only reverting the first commit is enough to fix the problem. 😿

SystemKeeper commented 1 year ago

That PR also got backported to 25.0.2 RC2, so the error should be seen on c.nc.c as well, shouldn’t it?

juliusknorr commented 1 year ago

I still see this happening with the patch reverted (just that the original bug of https://github.com/nextcloud/spreed/pull/8265 happens more often and before that) and at least currently can't imagine what kind of side effect that might be.

One thing I noticed when stepping through the session joining with multiple browsers was that it seems to delete the oc_talk_session entry once the second browser joins the session.

This seems to be triggered by https://github.com/nextcloud/spreed/blob/3d49db3b713b7fd6379b288a379f30401400cd88/lib/Controller/RoomController.php#LL1355C49-L1355C49 returning null, which then fetches any session when getting the previous participant/session in https://github.com/nextcloud/spreed/blob/3d49db3b713b7fd6379b288a379f30401400cd88/lib/Controller/RoomController.php#L1369.

According to PHPdoc null is trying to load any session, which I think is probably wrong here.

Small patch that seemed to work on my testing, but I'm not fully sure if that makes sense:

diff --git a/lib/Controller/RoomController.php b/lib/Controller/RoomController.php
index 958cbb23e..35086ad1c 100644
--- a/lib/Controller/RoomController.php
+++ b/lib/Controller/RoomController.php
@@ -1345,6 +1345,7 @@ class RoomController extends AEnvironmentAwareController {
        /**
         * @PublicPage
         * @BruteForceProtection(action=talkRoomPassword)
         *
         * @param string $token
         * @param string $password
@@ -1365,7 +1366,7 @@ class RoomController extends AEnvironmentAwareController {
                $previousSession = null;
                if ($this->userId !== null) {
                        try {
-                               $previousParticipant = $this->participantService->getParticipant($room, $this->userId, $sessionId);
+                               $previousParticipant = $this->participantService->getParticipant($room, $this->userId, $sessionId ?? '');
                                $previousSession = $previousParticipant->getSession();
                        } catch (ParticipantNotFoundException $e) {
                        }
juliusknorr commented 1 year ago

In addition it might make sense to re-add the @UseSession annotation to the joinRoom and leaveRoom controller methods to ensure that the session is locked during those operations. This might help with keeping the session state consistent between multiple concurrent requests.

The original root cause of this was resolved in the server PR.

The only scenario I could reproduce still was that when quickly switching between rooms in one browser window sometimes the signaling API request was still sent with the old room id even though the leave and join room requests already happened, but that sounds more like a frontend bug:

Screenshot 2022-12-02 at 10 28 13

This then leads to a "The conversation does not exist" page.

nickvergessen commented 1 year ago

Did something improve with the latest server changes @SystemKeeper @mahibi ?

SystemKeeper commented 1 year ago

Did something improve with the latest server changes @SystemKeeper @mahibi ?

Is sermo running on latest master? Then no, still getting User or session was removed from the conversation, redirecting.

SystemKeeper commented 1 year ago

For me it looks like it's this commit: https://github.com/nextcloud/spreed/commit/5f060687c8a0c77f1a61bf9a3462d4bfa4fe055f

When I run aac298148e1cac88c1437e016f2ffe4f7e24d8d0, multisession is fine, but with 5f060687c8a0c77f1a61bf9a3462d4bfa4fe055f I get duplicated session error.

nickvergessen commented 1 year ago

That would be interesting. Thanks fur debugging, I will have a look again. Maybe we did a mistake on the room/participant object

SystemKeeper commented 1 year ago

Before https://github.com/nextcloud/spreed/commit/5f060687c8a0c77f1a61bf9a3462d4bfa4fe055f this check succeeded and returned a participant (the sessionId was null, so was the sessionId of the participant) image

But after https://github.com/nextcloud/spreed/commit/5f060687c8a0c77f1a61bf9a3462d4bfa4fe055f this check fails (because this->userCache is not initialized at that point) and results in a query, which then leads to the duplicate session problem.

nickvergessen commented 1 year ago

The problem actually seems to be the room controller. It tries to get participants when it shouldn't. I fixed it now with some explanation in https://github.com/nextcloud/spreed/pull/8592

So yeah, the difference is that Room::getParticipant was using the cached version from themanager->getRoomForUserByToken call in stable25, while in master participantService->getParticipant was loading a new object with sessionId === null which meant any previous session would be used and then removed.

That being said. We should potentially try to write the participant from manager->getRoomForUserByToken into the userCache of participantService to save work afterwards.