invisible-college / tawk.space

Social video chats
https://tawk.space
Apache License 2.0

Tawk should combine video streams on the server #15

Closed: karth295 closed this issue 6 years ago

karth295 commented 7 years ago

This is more of a long term issue, but it'll be good to keep it in the back of our minds.

Essentially we need a real time version of this: https://trac.ffmpeg.org/wiki/Create%20a%20mosaic%20out%20of%20several%20input%20videos

Right now every client downloads every stream separately, which is really inefficient in terms of bandwidth. We should package several (or all) streams together.
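The linked ffmpeg wiki page builds a mosaic from file inputs; a real-time version would swap in live sources. As a sketch of what the filter graph looks like, here's a hypothetical helper that assembles (but does not run) an ffmpeg command tiling four inputs into a 2x2 grid with the `xstack` filter. Input/output names and tile sizes are placeholders:

```python
def mosaic_cmd(inputs, tile_w=320, tile_h=240, out="mosaic.mp4"):
    """Build an ffmpeg command that tiles four video inputs 2x2."""
    assert len(inputs) == 4, "this sketch hard-codes a 2x2 grid"
    # Scale each input to a uniform tile size...
    scaled = "".join(
        f"[{i}:v]scale={tile_w}:{tile_h}[v{i}];" for i in range(4)
    )
    # ...then stack them: layout positions are top-left, top-right,
    # bottom-left, bottom-right.
    stack = "[v0][v1][v2][v3]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[out]"
    cmd = ["ffmpeg"]
    for src in inputs:
        cmd += ["-i", src]
    cmd += ["-filter_complex", scaled + stack, "-map", "[out]", out]
    return cmd
```

A real-time server would feed this graph from the live WebRTC streams rather than files, which is the hard part this issue is about.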

toomim commented 7 years ago

Is it actually less efficient to send multiple streams? If there are 4 streams, wouldn't a combined stream be 4x the size of each individual stream?

karth295 commented 7 years ago

That probably depends on whether you keep the original quality of the video streams (which you don't need to do).

Regardless, I bet:

  1. You'd significantly decrease the CPU load of clients decoding streams. Rather than decoding N streams, you would decode 1.
  2. We'd reduce bizarre issues like #11.
  3. The server could theoretically offer streams at different bitrates, so a subscriber with a bad connection doesn't tank the quality for everybody else (I think that's a problem right now).
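Rough numbers for the bandwidth trade-off, as a sketch assuming bitrate scales roughly with pixel area (all constants are illustrative, not measured):

```python
def per_client_download_kbps(n_peers, per_tile_kbps=500):
    """Per-client download: N separate streams vs. one composite."""
    # Today: each client downloads one stream per other peer.
    separate = (n_peers - 1) * per_tile_kbps
    # A composite at full quality carries roughly the same pixels,
    # hence roughly the same bits:
    composite_full = separate
    # But if the server downsamples each tile to half width and half
    # height, pixel area (and roughly bitrate) drops 4x:
    composite_downsampled = separate // 4
    return separate, composite_full, composite_downsampled
```

On these assumptions the bandwidth win comes from downsampling on the server, not from multiplexing by itself, which matches the "you don't need to keep original quality" caveat.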

toomim commented 7 years ago

I think dynamically adjusting the quality of video streams would be rad!

This would be especially useful for putting a video chat on a public website, where you don't want to stream the high-res version to everyone who just opens the page... and definitely not to google robots... so being able to downsample the video/audio on the server would be great.

I'm kinda uncertain about these reasons though:

  1. I don't know that decoding N small streams is any more CPU load than decoding one N-sized stream. That presumes there's a lot of overhead per stream, rather than per pixel, and I'd expect the overhead to be mostly per pixel.
  2. It might solve issue #11... or it might not. We don't know what issue #11 is caused by yet.
  3. Really? I haven't seen a subscriber with a bad connection tank the quality for everyone else on the current tawk.space. The server sends each subscriber its own packets; if one subscriber's packets drop, that shouldn't affect what anyone else receives, as far as I can see.
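The per-stream vs. per-pixel argument in point 1 can be written as a toy cost model; every constant here is invented for illustration:

```python
def decode_cost(n_streams, pixels_per_stream,
                per_stream_overhead=1_000_000, per_pixel_cost=50):
    """Toy model: decode cost = fixed cost per stream + cost per pixel."""
    return (n_streams * per_stream_overhead
            + n_streams * pixels_per_stream * per_pixel_cost)
```

Combining 4 tiles into 1 stream of the same total area saves exactly 3 stream-overheads while the per-pixel term is unchanged, so the saving only matters if per-stream overhead is large relative to per-pixel work. That would be the thing to measure.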

So overall: dynamically adjusting the video streams sounds great, but multiplexing doesn't seem necessary. Maybe it's more efficient, but we'd have to test it first; we run the risk of implementing something that isn't actually more efficient.

toomim commented 7 years ago

This is called an MCU (Multipoint Control Unit): https://en.wikipedia.org/wiki/Multipoint_control_unit

They use more CPU: https://bloggeek.me/webrtc-multiparty-video-alternatives/

We probably need one with downsampling in order to scale up to 100s of people: https://webrtcglossary.com/mcu/

karth295 commented 7 years ago

For 100s of people we should spread the work across multiple machines. We can have clients upload their stream to 1 server, and everybody download a stream (containing multiple videos) from each server.
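A quick sketch of the fan-out this sharding scheme implies, under the assumption above that each client uploads to exactly one server and downloads one composite from each server (numbers hypothetical):

```python
import math

def per_server_load(n_clients, n_servers):
    """Rough per-server load under the one-upload, S-downloads scheme."""
    # Uploads this server receives and has to mix into its composite:
    streams_to_mix = math.ceil(n_clients / n_servers)
    # Composites this server sends out: one to every subscriber.
    composites_out = n_clients
    return streams_to_mix, composites_out
```

So sharding spreads the mixing (encoding) work, which is the expensive part of an MCU, while egress still scales with the total audience on every server.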

I wonder if GPUs will be more efficient for all this video processing.