nextcloud / spreed

🗨️ Nextcloud Talk – chat, video & audio calls for Nextcloud
https://nextcloud.com/talk
GNU Affero General Public License v3.0
1.63k stars 436 forks source link

Signaling error does not disappear in some case, although joined #11216

Open SystemKeeper opened 10 months ago

SystemKeeper commented 10 months ago

How to use GitHub


Steps to reproduce

  1. Something happens that triggers a signaling error
  2. Join a conversation

Expected behaviour

At least after successful joining the error should disappear

Actual behaviour

image

I can see

image

but not sure if this related.

Talk app

Talk app version: 18 on c.nc.c

bentuna commented 9 months ago

I think we have the same issue here in NC 28.0.1 + Talk 18.0.1. In our case the Signaling Error is shown when the browser tab with talk is inactive for too long (multiple hours). When you then click or do anything in talk, the error does not disappear and you cannot start a call. The only thing that makes it disappear and work normal again, is to reload the page.

fancycode commented 9 months ago

Maybe similar to https://github.com/nextcloud/spreed/issues/9382?

SystemKeeper commented 9 months ago

Maybe similar to #9382?

That indeed sounds very similar, might be the same or same root cause?

Today I had the same error again and could investigate a bit. When being in that state the properties of signaling.js look like this:

image

What can be observed is that socket is null, this most probably is set in the reconnect method:

https://github.com/nextcloud/spreed/blob/20f4b76d28c475246e0a350279171c494bb94312/src/utils/signaling.js#L676.

Given that the setTimeout to reconnect was indeed executed, the only thing that can happen where we don't create a new socket is when we are waiting for a settings update, e.g. _pendingUpdateSettingsPromise is set:

https://github.com/nextcloud/spreed/blob/20f4b76d28c475246e0a350279171c494bb94312/src/utils/signaling.js#L690-L701

And you can observe in the first screenshot, that there's indeed a promise which looks like it will never be fulfilled and therefore we never try to connect to the external signaling server again.

The only part where we set the promise is at:

https://github.com/nextcloud/spreed/blob/20f4b76d28c475246e0a350279171c494bb94312/src/utils/signaling.js#L1529-L1544

This triggers an updateSettings call at:

https://github.com/nextcloud/spreed/blob/20f4b76d28c475246e0a350279171c494bb94312/src/utils/webrtc/index.js#L134-L138

So something must have gone wrong when getting the signaling settings. Looking back at the console I can see the following:

image

So what happens:

  1. Token expired, so we want to reconnect after updating the settings
  2. Update settings is triggered but this fails
  3. Since we never resolve the promise for updating the settings (=> never try to retry updating the settings), we never try to reconnect to the signaling server
fancycode commented 9 months ago

Yep, that could be the reason. Nice analysis!

So it should be sufficient to retry getting the settings if it failed? This should probably also handle the case where the user is no longer logged into Nextcloud and then redirect to the login page.

SystemKeeper commented 9 months ago

So it should be sufficient to retry getting the settings if it failed?

I guess that would make the most sense, yes. I did not check that, but it could also work that we just resolve the promise in all cases. That should trigger a reconnect, we should again have an expired token, which should trigger a settings update? 🤔