mganeko / browser_mcu

Browser MCU Series, node.js Web/Signaling Server, and headless browser as WebRTC MCU

Video lag issue, when there are around 40 participants. #1

Open zaidiqbalsoftech opened 2 years ago

zaidiqbalsoftech commented 2 years ago

Hi,

Thanks for the idea. I implemented it on my side and used it for around 6 months. It was working quite well until we discovered a very disturbing issue that forced us to abandon this architecture and move to an open source SFU.

We faced a video lag issue when there are around 40 participants. I debugged it for around 10 days and found some root causes that made me think we can't move forward with this architecture. Kindly share your thoughts on this.

This issue exists in the demo application as well.

One of the main issues I found is that the call runs in a tab in the headless browser. That tab is a process, and it runs on one CPU core. As the number of videos increases, the amount of canvas drawing increases too, and eventually the tab's core hits its maximum and everything starts to cause issues like video lag.

I tried various things to tackle it.

Dividing the call across multiple tabs, to use more than one CPU core and get the full potential of the CPU. One way was to open a child tab, but the child tab is not a separate process; it shares the same CPU core as its parent, so it didn't help. Another way was to open the child tab so that it runs in a separate process, but then it can't access the parent tab's data: the child can't access the streams in the parent tab, and the parent can't access the streams in the child tab. I also tried sharing streams between parent and child by making local peer connections between the tabs. That made sharing possible, but the structure was very fragile: if anything happened to a local peer connection, the whole thing would start causing issues, so I didn't pursue it. In short, the parent/child tab approach didn't help.

Using web workers to achieve a kind of multithreading. This had the same issue: a worker can't access the tab's streams. So I tried moving the canvas rendering to the workers, which meant I had to send the stream data to the worker somehow. I used one of the browser's APIs to share data from the tab to the web worker: I captured frames from each video and sent them to the worker. But this added a lot of overhead, because there were many streams and each stream has many frames, and it caused a lot of issues. So that didn't work either.
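To give a feel for why copying frames to a worker is expensive, here is a rough back-of-the-envelope sketch. The resolution, frame rate, and the assumption that each frame is copied as uncompressed RGBA are illustrative guesses, not measurements from the app described above, and `rawFrameBytesPerSecond` is a hypothetical helper.

```typescript
// Rough estimate of the raw pixel data moved per second when every video
// frame is captured as RGBA and sent from the tab to a worker.
// All numbers passed in below are illustrative assumptions.
function rawFrameBytesPerSecond(
  streams: number,
  width: number,
  height: number,
  fps: number,
): number {
  const bytesPerPixel = 4; // RGBA, one byte per channel
  return streams * width * height * bytesPerPixel * fps;
}

// Hypothetical load: 40 participants at 640x480, 30 frames per second.
const bytesPerSecond = rawFrameBytesPerSecond(40, 640, 480, 30);
console.log(`${(bytesPerSecond / 1e9).toFixed(2)} GB/s of raw pixels`);
```

Under these assumptions the tab would be handling about 1.47 GB of pixel data per second before any drawing happens, which is consistent with the overhead described above.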

So in the end I had to move away from this architecture and use something else. It was a really good idea to use a headless browser to achieve the MCU concept in our own tech stack, but I think browsers have their own limitations.

Kindly share your thoughts on this, thanks @mganeko

Here are the server specs:

OS: Linux
Cores: 32

mganeko commented 2 years ago

Thank you for trying this repo. MCU is a heavy task, so I do not recommend using it for more than 25 participants. I think an SFU is a good option.

zaidiqbalsoftech commented 2 years ago

OK, I think this should be mentioned in the docs somewhere. I actually tried it on a big application; it worked for around 2 months, and after some time we got stuck on this issue.

It even causes issues with 5 participants, not in the example but in my app. The reason is that the example uses only one canvas, but in my app I create n canvases based on the participants' screen dimensions.

MCU is a heavy task, and the best solutions for it are C++-based. Browsers are definitely not a good choice for this. Like you said, it can be used to some extent for a few participants, but the main purpose of an MCU is to support a large number of participants, which can't be achieved with this approach.
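A small sketch of why per-participant canvases are so much heavier than one shared canvas. It assumes a simplified model in which every canvas redraws every participant's video on each frame; the 30 fps figure and the `drawCallsPerSecond` helper are hypothetical, and the real app may draw differently.

```typescript
// Counts video draw operations per second on the mixing tab.
// Simplified model (an assumption): each canvas redraws every
// participant's video once per frame.
function drawCallsPerSecond(
  participants: number,
  canvases: number,
  fps: number,
): number {
  return participants * canvases * fps;
}

const assumedFps = 30; // illustrative frame rate
const participantCount = 5;

// One shared canvas for everyone, as in the example app.
const sharedCanvas = drawCallsPerSecond(participantCount, 1, assumedFps);
// One canvas per participant, as in the n-canvas setup.
const perParticipant = drawCallsPerSecond(
  participantCount,
  participantCount,
  assumedFps,
);
console.log({ sharedCanvas, perParticipant });
```

In this model the shared canvas costs 150 draws per second for 5 participants while the per-participant setup costs 750, and the gap widens with the square of the participant count.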

zaidiqbalsoftech commented 2 years ago

Regarding SFU: I tried an SFU with the headless approach as well, but the CPU usage was too high, a lot higher than with the MCU, and the reason was the encoding/decoding of streams on the server.

In an SFU the number of outgoing streams from the server grows quadratically with the number of participants (each of N participants receives N - 1 streams), so the encoding/decoding of streams eats CPU resources at the same rate.

So even for an SFU the best approach is a C++-based solution. They have customized WebRTC so that the server doesn't encode/decode the stream packets; it just routes them as they are.
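The stream counts behind this comparison can be sketched as follows. This is a simplified model, not a measurement: it assumes one incoming stream per participant, that a browser-based server decodes every incoming stream and encodes every outgoing one separately, and that a routing SFU only forwards packets. The type and function names are hypothetical.

```typescript
// Per-server work for N participants under a simplified model.
interface ServerLoad {
  decodes: number;  // incoming streams the server must decode
  encodes: number;  // outgoing streams the server must encode
  forwards: number; // outgoing streams relayed without re-encoding
}

// Headless-browser MCU: decode N inputs, send one mixed stream to each peer.
function headlessMcuLoad(n: number): ServerLoad {
  return { decodes: n, encodes: n, forwards: 0 };
}

// Headless-browser SFU: each participant receives the other N - 1 streams,
// and the browser re-encodes each outgoing stream.
function headlessSfuLoad(n: number): ServerLoad {
  return { decodes: n, encodes: n * (n - 1), forwards: 0 };
}

// Packet-routing SFU: same fan-out, but packets are relayed as they are.
function routingSfuLoad(n: number): ServerLoad {
  return { decodes: 0, encodes: 0, forwards: n * (n - 1) };
}

for (const n of [5, 25, 40]) {
  console.log(n, headlessMcuLoad(n), headlessSfuLoad(n), routingSfuLoad(n));
}
```

For 40 participants this model gives 40 encodes for the headless MCU against 1560 for the headless SFU, while a routing SFU forwards the same 1560 streams without encoding any of them.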