microsoftgraph / microsoft-graph-comms-samples

Microsoft Graph Communications Samples
MIT License

Application Hosted Media Bots VM autoscaling #368

Open franciscojgonzalez opened 3 years ago

franciscojgonzalez commented 3 years ago

Describe the issue: We have successfully developed and deployed an application-hosted bot within an Azure Service Fabric cluster. Our bot consumes audio/video events in a similar way to PolicyRecordingBot. We are now starting to analyse how to set up an autoscaling policy. Our concern is related to the nature of call meetings. If I am not wrong, once a call has been set up, it is handled by the same VM until the call ends, but the resources consumed during a call might vary. Let me share an example:

Imagine we set up a meeting with X participants. The call starts, the bot joins, and none of the participants are sharing video. Everything is fine at that moment. But at some point people start to share video, and CPU usage increases. The more people sharing, the more CPU is used, and we could reach the limit supported by that single VM. At that point we have experienced different issues related to video/audio frame loss, probably because that instance type cannot handle so many events arriving at once. On the other hand, Service Fabric may auto-scale at that point and add a new VM, but the call would remain on the initial VM, which is not able to handle it (actually, any VM of that type might handle the call).

Expected behavior: Cluster VMs should scale in order to be able to handle any type of call.

Additional context: Is there any way of scaling a single VM in terms of CPU/memory while a service is running in order to avoid this kind of issue, or should we just be aware of the limitations of the node types configured for our cluster before setting up calls?
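
The second option in the question (knowing a node type's limits before a call is set up) can be pictured as a worst-case admission check. The following is only a minimal Python sketch under stated assumptions: the per-stream CPU costs, the 80% budget, and the names `Call` and `can_admit` are hypothetical and are not part of the Graph Communications SDK or Service Fabric. Real costs would have to be measured on the actual node type.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cost figures; real values must be measured on the target node type.
AUDIO_CPU_PCT = 2.0             # assumed CPU % for the audio stream of one call
VIDEO_CPU_PCT_PER_STREAM = 8.0  # assumed CPU % per received video stream


@dataclass
class Call:
    call_id: str
    max_video_streams: int  # worst case: every subscribed video stream is active

    def worst_case_cpu(self) -> float:
        """CPU % this call could need if all its video streams become active."""
        return AUDIO_CPU_PCT + self.max_video_streams * VIDEO_CPU_PCT_PER_STREAM


def can_admit(node_calls: List[Call], new_call: Call, cpu_budget_pct: float = 80.0) -> bool:
    """Accept the call only if the node could cope with every call at its worst case."""
    reserved = sum(c.worst_case_cpu() for c in node_calls)
    return reserved + new_call.worst_case_cpu() <= cpu_budget_pct
```

Because a call stays on the node that accepted it, reserving the worst case up front trades some idle capacity for protection against the mid-call CPU spike described above.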

mosoftwareenterprises commented 2 years ago

@franciscojgonzalez did you ever find a way to deal with this?

franciscojgonzalez commented 2 years ago

No. I mean, we set up an autoscaling policy (programmatically, with our own criteria) in terms of adding or removing nodes, but once a call starts being handled on one node, we assume that call will live with the resources of that node.
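
A policy of this kind (add or remove nodes, never move a running call) could look roughly like the sketch below. It is not the policy used in this thread, whose criteria are not given. The function names, the 80% CPU budget, and the "keep one idle spare node" rule are all assumptions made for illustration.

```python
from typing import Dict, Optional


def pick_node(reserved_cpu: Dict[str, float], call_cost: float,
              budget: float = 80.0) -> Optional[str]:
    """Return the least-loaded node that can fit a new call, or None if none can."""
    fits = [n for n, used in reserved_cpu.items() if used + call_cost <= budget]
    return min(fits, key=lambda n: reserved_cpu[n]) if fits else None


def scaling_decision(reserved_cpu: Dict[str, float], active_calls: Dict[str, int],
                     call_cost: float, budget: float = 80.0) -> str:
    """Decide on node count only; calls already running stay where they are."""
    if pick_node(reserved_cpu, call_cost, budget) is None:
        return "scale_out"  # no node has headroom for one more worst-case call
    idle = [n for n, calls in active_calls.items() if calls == 0]
    if len(idle) > 1:
        return "scale_in"   # assumed rule: keep one idle spare, remove the extras
    return "hold"
```

Only nodes with zero active calls are candidates for removal, which matches the assumption that a call lives and dies on the node where it started.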