alangecker / bigbluebutton-docker

merged into https://github.com/bigbluebutton/docker
GNU Lesser General Public License v3.0

Which components are stateless and can be autoscaled? #38

Closed congthang1 closed 4 years ago

congthang1 commented 4 years ago

It's really great to have the components separated into Docker containers. I wonder if each of them can be put behind an autoscaling tool like an AWS ASG. Does anyone have an idea on this? Thanks

alangecker commented 4 years ago

I have no clue about autoscaling tools, how they work, or what requirements they place on the containers.

The only two containers which cause any significant load (and are therefore worth scaling/load balancing) are freeswitch and kurento, and both maintain some state.

Does that mean autoscaling is already out of the question?

congthang1 commented 4 years ago

Yes, to be able to use autoscaling, the app needs to keep its state somewhere external, like a database or Redis. Take a web server: its state lives in a database, so when it autoscales to two servers connected to the same database, clients still see the same thing. For freeswitch and kurento, if we found some way to store their state externally, even on disk (they can share a volume over NFS), then they could be autoscaled: two freeswitch servers could run at the same time against state kept outside of them. Under high demand more freeswitch instances would be added without interrupting the conference, and they would be removed again when demand drops. If this worked we could save a lot of money, since we wouldn't need a big server from the start, and a server that mostly sits idle waiting for conferences costs less. AWS ECS has a system to autoscale containers (Fargate or an ECS cluster).
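As a minimal docker-compose sketch of that idea (service names and images are made up for illustration, with nginx standing in for any stateless app tier), two replicas share one Redis as the external state store:

```yaml
# Hypothetical sketch: two stateless web replicas sharing one external
# Redis, so either replica can serve any client without losing state.
version: "3.8"
services:
  web:
    image: nginx:alpine            # stand-in for any stateless app tier
    deploy:
      replicas: 2                  # can be scaled out/in freely
    environment:
      REDIS_URL: redis://state:6379
    depends_on:
      - state
  state:
    image: redis:7-alpine          # the single shared state store
    volumes:
      - state-data:/data
volumes:
  state-data:
```

Something like `docker compose up -d --scale web=2` would then start both replicas against the same Redis instance.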

alangecker commented 4 years ago

Hmm, to me this doesn't seem feasible...

  1. Extracting the state from freeswitch and kurento into an external database would be a huge task. It would involve conceptual redesigns, rewriting large parts of the C/C++ code, and adding stream exchange between the instances, so that for example a person connected to a conference on freeswitch container A can also hear a person connected on freeswitch container B.
  2. The problem of distributing/redirecting UDP ports to the individual containers must be solved somehow.
  3. There is a reason why it is recommended to use bare metal servers instead of virtualized cloud services / VMs: the latency can't be guaranteed.

Do you see it differently? Otherwise I would close the issue. And could you maybe imagine working on that yourself?

cjhille commented 4 years ago

We probably aren't the first ones to think of this. Unfortunately I have not found a proper discussion about it yet. But to @alangecker's points:

Regarding 1: Apparently Freeswitch can already be configured in a way that it runs on Kubernetes and is horizontally scalable (multiple pods behind a load balancer). See: https://stackoverflow.com/questions/45462821/running-freeswitch-on-google-container-engine For Kurento I have only been able to find a proof of concept showing that it can run on K8s, however without autoscaling. See: https://github.com/Kurento/Kubernetes/blob/master/kurento-media-server.yaml#L11
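As a rough sketch (not taken from the linked POC manifest; the image tag, replica count and port are only illustrative), a horizontally scaled media-server Deployment might look roughly like this:

```yaml
# Illustrative only: a Deployment running several kurento pods.
# A Service/ingress in front would still have to solve the per-session
# media routing discussed under point 2.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kurento-media-server
spec:
  replicas: 3                      # horizontal scaling: more pods under load
  selector:
    matchLabels:
      app: kurento
  template:
    metadata:
      labels:
        app: kurento
    spec:
      containers:
        - name: kurento
          image: kurento/kurento-media-server:6.18.0   # example tag
          ports:
            - containerPort: 8888                      # Kurento control websocket
```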

Regarding 2: I can't speak to AWS, but with Kubernetes the distribution of traffic is achieved by a "service" abstraction that routes the traffic accordingly. Session affinity for UDP routing can also be configured, so packets arrive at the proper instances. Maybe it's even possible for the instances to share the data of their ingress traffic, depending on where it is stored (to be transcoded later by multiple instances).
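A hedged illustration of that service idea (labels and the port are placeholders; a real media server uses a large RTP port range, which a plain Service cannot express in one entry):

```yaml
# Illustrative only: a Service routing UDP traffic with client-IP session
# affinity, so packets from one client keep hitting the same pod.
apiVersion: v1
kind: Service
metadata:
  name: freeswitch-media
spec:
  type: LoadBalancer
  selector:
    app: freeswitch
  sessionAffinity: ClientIP        # keep a client's packets on one pod
  ports:
    - name: rtp-example
      protocol: UDP
      port: 16384
      targetPort: 16384
```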

Regarding 3: Increased latency (introduced by the additional load balancing and routing) may be the biggest downside. However, you can add bare-metal worker nodes to a K8s cluster and avoid the virtualization overhead. I'm currently running the BBB-docker setup on an ESXi-virtualized Ubuntu VM (4 cores, 16 GB RAM) and haven't come across noticeable latency issues. Maybe slightly worse latency is even acceptable in a scalable system that can handle 100+ video streams, as opposed to the entire system breaking down under load.

IMHO it would be a huge achievement to enable an autoscaled BBB setup, since even the minimal server requirements are comparatively high (compared to e.g. Jitsi) and grow even higher with a larger user base. If the system were able to autoscale (e.g. with K8s), the baseline cost could be as low as $30-$40 per month for a single-node K8s system, yet it could still scale to support 100+ concurrent users if need be (using a cluster-autoscaler). I'm thinking of educational institutions or larger NGOs on a low budget.
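To sketch the autoscaling piece under those assumptions (the target name just refers to the illustrative Deployment above, and the thresholds are arbitrary):

```yaml
# Sketch: an HPA that grows/shrinks the hypothetical kurento Deployment
# with CPU load; a cluster-autoscaler would then add or remove worker
# nodes so the extra pods actually fit somewhere.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kurento-media-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kurento-media-server
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```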

congthang1 commented 4 years ago

It seems this is not a simple task, but the separation of components gives me a chance to deploy a new BBB server faster. Say I give the user 2 minutes to schedule a conference; in that time, the script must finish its setup and make the conference available. On AWS Fargate (similar to K8s) we just need to pull an image and run it, without worrying about capacity.