lynckia / licode

Open Source Communication Provider based on WebRTC and Cloud technologies
http://lynckia.com/licode
MIT License
3.1k stars 1.01k forks source link

Handsome random delay in subscription #1502

Open ashar02 opened 4 years ago

ashar02 commented 4 years ago

Description @jcague I am using Ubuntu 16 OS. Took the code from #1489 commit. I am facing a problem where remote subscription takes handsome time to appear. Some time subscription event comes quite late or sometime after offer sdp; answer sdp takes quite sometime. I am able to compare it with some previous code like commit #1417 and there is not such issue there and subscription is quite quick to appear.

Steps to reproduce the issue:

  1. Join the room with first participant.
  2. Join the room with second participant and switch the browser tab to first participant. And you will see this issue.
  3. Even to make it more prominent, you may increase the participants.

Describe the results you received: Subscribed audio/video appears quite late. If participant increases then it may even take more than 30 seconds to appear.

Describe the results you expected: It should appear on remote in less then 3-4 seconds after successful publishing.

Additional information you deem important (e.g. issue happens only occasionally): It happens 7 times out of 10.

Licode commit/release where the issue is happening Used the commit #1489.

Additional environment details (Local, AWS, Docker, etc.): Tested it on local VM as well as IBM / Liquidweb servers.

Let me know if you need additional information to investigate it?

ashar02 commented 4 years ago

@jcague are you able to look into this so far?

jcague commented 4 years ago

I've tried to reproduce it, but it does not happen to me. I just tested it locally and in remote servers.

ashar02 commented 4 years ago

@jcague It comes on local and remote both. Point to note here is that it produces on the tab where you are already publishing not where you recently joined.

ashar02 commented 4 years ago

@jcague with above hint, are you able to produce it? Should I make a video and share with you as evidence?

ashar02 commented 4 years ago

Here is the comparison of latest code vs older.

Latest code is running on local machine. See the last column and time difference between start of subscription and when it was ready.

Screenshot 2019-12-05 at 12 42 32 PM Screenshot 2019-12-05 at 11 12 22 AM

Older code running on internet. See the last column and time difference between start of subscription and when it was ready.

Screenshot 2019-12-05 at 11 15 22 AM

In new code it is quite noticeable time when subscriber comes in room. Can you please look into this? @jcague @lodoyun

ashar02 commented 4 years ago

@jcague is #1527 related to this problem or some thing else?

jcague commented 4 years ago

it might help reducing this issue in the future once we start sending SDP offers from Erizo, which we are now experimenting with.

ashar02 commented 4 years ago

@jcague I checked licode from head #1543 on Ubuntu 16. Delay problem still exists. The new user who joins the room see the subscriptions immediately but the old users on the room sees the new joiner after random delay of 3 - 7 seconds.

vpoddubchak commented 4 years ago

@ashar02 did you try to test without nICEr ? It was switched ON by default in #1455. So, if you use #1489, you can switch back to libnice and test. (In versions after #1505 you can't do it )

lodoyun commented 4 years ago

@ashar02 Can you try #1548 ? We're still trying to get to the root of the issue but that PR greatly reduces the chances of the delay in my setup.

ashar02 commented 4 years ago

@lodoyun okay, let me try and get back to you shortly!!

ashar02 commented 4 years ago

@lodoyun okay, let me try and get back to you shortly!!

ashar02 commented 4 years ago

@lodoyun I tried #1548 with head of master on ubuntu 16. But at my end there is no effect!!

lodoyun commented 4 years ago

Ok, I can see and reproduce this now. While I was more or less on the right track, I misread part of the original issue. It looks like connections can take longer to complete sometimes. A big contributor to this delay seems to be the delay introduced when getting callback from promises in ErizoJS that are resolved in ErizoAPI. We will have to dig deeper into that. A think you can try is to use SinglePC, we only support it on Chrome at the moment but it should solve the issue since we're only creating one connection per client as opposed to one connection per stream. In order to do that you have to set

config.erizoController.allowSinglePC = true;  // default value: false
config.erizoController.maxErizosUsedByRoom = 1;  // default value: 100

in your licode_config.js file and add ?singlePC= 1 in the Basic example url. Let me know if that fixes the problem.

ashar02 commented 4 years ago

@lodoyun yes I tried it just now. And yes it works 85% of the times correctly with no delay. But few times it takes (same 4- 6 seconds) time. Great to know it!! Thanks.

ashar02 commented 4 years ago

@lodoyun I tested singlepc=1 in details. When 5 or more participants joins a room then singlepc effect a lot on video quality (freezes or pixelation) as compared to when singlepc=0. Have you found any fix of this problem using singlepc=0?

vpoddubchak commented 4 years ago

I deployed basic example and open it in 1 tab (just 1 publish) Connection takes about 5 sec. Here is a log:

2020-04-17 17:18:31.644  - DEBUG: ErizoController - Channel - message: token received
2020-04-17 17:18:31.654  - INFO: RPCPublic - message: consumed token, tokenId: 5e99bab7cbba2b377dad54f8, roomId: 5e8440faca1c8424e8ddf0ba, serviceId: 5e843e0be46c52be2761e3e0
2020-04-17 17:18:31.656  - INFO: ErizoController - message: getting singlePC configuration, singlePc: false, global: 
2020-04-17 17:18:31.656  - DEBUG: ErizoController - Client - message: Adding listeners to socket events, client.id: cbe42aa2-d290-4c09-9236-75fd1b263b8b
2020-04-17 17:18:31.656  - INFO: ErizoController - message: client connected, clientId: cbe42aa2-d290-4c09-9236-75fd1b263b8b, singlePC: false
2020-04-17 17:18:31.672  - INFO: ErizoController - Client - message: addPublisher requested,  streamId: 106616982584734960, clientId: cbe42aa2-d290-4c09-9236-75fd1b263b8b state: erizo, data: true, audio: true, video: true, label: DzH1GJ5xqzF9glQ0E7kiTfvVvMBh7AmT05IS, screen: , attributes: [Object], metadata: [Object], muteStream: [Object], minVideoBW: 0, maxVideoBW: 300, mediaConfiguration: default, singlePC: false 
2020-04-17 17:18:31.672  - INFO: RoomController - message: addPublisher,  clientId cbe42aa2-d290-4c09-9236-75fd1b263b8b,  streamId: 106616982584734960 state: erizo, data: true, audio: true, video: true, label: DzH1GJ5xqzF9glQ0E7kiTfvVvMBh7AmT05IS, screen: , attributes: [Object], metadata: [Object], muteStream: [Object], minVideoBW: 0, maxVideoBW: 300, mediaConfiguration: default, singlePC: false type: publisher
2020-04-17 17:18:31.672  - INFO: EcCloudHandler - message: createErizoJS, agentId: ErizoAgent
2020-04-17 17:18:31.675  - INFO: EcCloudHandler - message: createErizoJS success, erizoId: a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8, agentId: 6950198f-610e-2c76-a8dd-be2c684ed6a0, internalId: 0
2020-04-17 17:18:31.676  - INFO: RoomController - message: addPublisher erizoJs assigned,  erizoId: a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8, streamId: 106616982584734960 type: publisher
[erizo-a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8] 2020-04-17 17:18:31.683  - INFO: ErizoJSController - Start checking uptime, interval 1800
[erizo-a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8] 2020-04-17 17:18:31.684  - INFO: ErizoJSController - addPublisher, clientId cbe42aa2-d290-4c09-9236-75fd1b263b8b streamId 106616982584734960
[erizo-a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8] 2020-04-17 17:18:31.684  - INFO: Client - Constructor Client cbe42aa2-d290-4c09-9236-75fd1b263b8b
[erizo-a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8] 2020-04-17 17:18:31.684  - INFO: Client - message: getOrCreateConnection, clientId: cbe42aa2-d290-4c09-9236-75fd1b263b8b, singlePC: false
2020-04-17 17:18:31.684  - INFO: Connection - message: constructor, id: cbe42aa2-d290-4c09-9236-75fd1b263b8b_a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8_1
2020-04-17 17:18:31,685  - INFO [0x7f3c607e4780] WebRtcConnection - id: cbe42aa2-d290-4c09-9236-75fd1b263b8b_a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8_1,  message: constructor, stunserver: , stunPort: 0, minPort: 0, maxPort: 0
2020-04-17 17:18:31.685  - INFO: Client - message: Adding connection to Client, clientId: cbe42aa2-d290-4c09-9236-75fd1b263b8b, connectionId: cbe42aa2-d290-4c09-9236-75fd1b263b8b_a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8_1
2020-04-17 17:18:31.685  - INFO: ErizoJSController - message: Adding publisher,  clientId: cbe42aa2-d290-4c09-9236-75fd1b263b8b,  streamId: 106616982584734960 state: erizo, data: true, audio: true, video: true, label: DzH1GJ5xqzF9glQ0E7kiTfvVvMBh7AmT05IS, screen: , attributes: [Object], metadata: [Object], muteStream: [Object], minVideoBW: 0, maxVideoBW: 300, mediaConfiguration: default, singlePC: false, publicIP: 192.168.1.216, privateRegexp: [Object] type: publisher
[erizo-a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8] 2020-04-17 17:18:31.686  - INFO: Connection - message: addMediaStream, connectionId: cbe42aa2-d290-4c09-9236-75fd1b263b8b_a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8_1, mediaStreamId: 106616982584734960
[erizo-a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8] 2020-04-17 17:18:31,687  - DEBUG [0x7f3c607e4780] ErizoAPI.WebRtcConnection - id:cbe42aa2-d290-4c09-9236-75fd1b263b8b_a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8_1, message: promiseResolver
2020-04-17 17:18:31,687  - DEBUG [0x7f3c607e4780] ErizoAPI.WebRtcConnection - id:cbe42aa2-d290-4c09-9236-75fd1b263b8b_a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8_1, message: promiseResolver finished
2020-04-17 17:18:33.012  - WARN: RoomController - message: addPublisher ErizoJS timeout,  streamId: 579716768919730400,  erizoId: ErizoJS_undefined,  retries: 0,  type: publisher
2020-04-17 17:18:33.012  - WARN: ErizoController - Client - message: addPublisher of removed publisher, messageType: undefined,label: rx3enWQwAisnfrhnNMuV2KvTlDZmV0LyGuRI, clientId: 81550d24-abe7-4486-848b-7ee90e38e1ad, streamId: 579716768919730400
2020-04-17 17:18:33.012  - WARN: RoomController - message: addPublisher already in progress, clientId: 81550d24-abe7-4486-848b-7ee90e38e1ad, streamId: 579716768919730400  type: publisher
2020-04-17 17:18:34.872  - INFO: EcCloudHandler - message: deleting erizoJS, erizoId: dd72b6a9-9f1c-0499-c50e-fb86462f6cf9
[erizo-a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8] 2020-04-17 17:18:36,660  - DEBUG [0x7f3c607e4780] ErizoAPI.WebRtcConnection - id:cbe42aa2-d290-4c09-9236-75fd1b263b8b_a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8_1, message: eventsCallback
[erizo-a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8] 2020-04-17 17:18:36.660  - INFO: Connection - message: WebRtcConnection status update,  id: cbe42aa2-d290-4c09-9236-75fd1b263b8b_a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8_1, status: 101 type: publisher
[erizo-a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8] 2020-04-17 17:18:36,662  - DEBUG [0x7f3c607e4780] ErizoAPI.WebRtcConnection - id:cbe42aa2-d290-4c09-9236-75fd1b263b8b_a8fd26e5-7e40-cfa7-0485-938d8a8e5eb8_1, message: eventsCallback finished
2020-04-17 17:18:36.664  - INFO: ErizoController - Client - message: addPublisher, label: DzH1GJ5xqzF9glQ0E7kiTfvVvMBh7AmT05IS, state: PUBLISHER_INITIAL, clientId: cbe42aa2-d290-4c09-9236-75fd1b263b8b, streamId: 106616982584734960

I think something wrong start happening there:

WARN: RoomController - message: addPublisher ErizoJS timeout, streamId: 579716768919730400, erizoId: ErizoJS_undefined, retries: 0, type: publisher

vpoddubchak commented 4 years ago

I did additional tests and those my results:

  1. When I do first publish in the room, it takes ~5 sec between messages [publish.... and [23423423423423443, ... But when I do publish by second peer it takes less than 1 sec.

  2. When I have 1 publisher and do enter by second with publish and subscribe (by default), it is less than 1 sec between messages [subscribe.... and [true, .... But when I subscribe with "onlySubscribe=true" then there is 3-5 sec between those messages.

I think something wrong with message queue on server side. Any ideas?

vpoddubchak commented 4 years ago

A bit more information: When I use single ErizoJS (config.erizoAgent.maxProcesses = 1;) First publish in room takes ~5 sec, All next less 2 sec When I use multiple ErizoJS (config.erizoAgent.maxProcesses = 10;) Each publish in room takes ~9 seconds When I use multiple ErizoJS with all pre-runed (config.erizoAgent.maxProcesses = 10; config.erizoAgent.prerunProcesses = 10;) Each publish in room takes ~5 sec

@ashar02 make sense for your case ?

ashar02 commented 4 years ago

@vpoddubchak For the time being I am using singlePC true and maxErizo to 1. So publish and subscription takes around 3 to 5 seconds in all cases which is much better than taking 10+ seconds for subscriptions.