How to perform load testing in PolicyRecordingBot?

kokosda commented 1 year ago

Hi PolicyRecordingBot experts,

What is the best way to approach load testing the bot? As far as I understand the bot's design, it receives webhooks and data from MS Teams on two endpoints, HTTP and net.tcp. From a test automation perspective, we would have to buy MS Teams subscriptions for hundreds of users, create those users, and write Selenium code that simulates users calling each other. That sounds like overkill, doesn't it?

Is there a way to work around all of this and, for instance, send a media stream to the TCP endpoint and measure actual CPU utilization? What issues would arise with this approach, if it is achievable at all?

InDieTasten commented 1 year ago

For a test that's true to reality, that's the way, and even then you are most likely violating some terms with your Selenium automation.

I have a different approach I can suggest. Give your application registration permission to join calls on its own (rather than receiving calls via Policy), and have it join a meeting.

You can join as many bots as Teams allows you to do. If one meeting hits the participant limit, create a second meeting.

Generally, you can load test the stress induced by audio, participant video and VBSS quite well using this approach.

However, when it comes to processing call notifications, there is a difference between one large meeting and many small ones. For every instance of your bot in the meeting, your bot will receive each event occurring in the meeting.

E.g.: 100 bots are in the meeting. 1 participant mutes themselves. 100 requests are sent to your bot in an instant. Even just joining 100 bots is a challenge in itself, as every bot that joins causes all existing instances to be notified about the new participant. You can work out the math yourself. In reality, you rarely have that many bots in the same call, though it can happen.
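To make that math concrete, here is a minimal sketch. It assumes the simplest possible model: the k-th bot to join triggers exactly one notification to each of the k - 1 bots already present, and one roster event (such as a mute) triggers one request per bot instance. The function names are made up for illustration and are not part of any SDK.

```python
def join_notifications(bot_count: int) -> int:
    """Total notifications received while bot_count bots join one meeting.

    Assumption: bots join one at a time, and each join sends one
    notification to every bot instance that is already in the meeting.
    """
    # The first bot notifies 0 instances, the second 1, ... the last bot_count - 1.
    return sum(range(bot_count))


def event_notifications(bot_count: int, events: int = 1) -> int:
    """Requests caused by roster events (e.g. a mute) once all bots are present."""
    return bot_count * events


for n in (10, 50, 100):
    print(n, join_notifications(n), event_notifications(n))
```

Under these assumptions, joining 100 bots produces 4950 notifications before any participant has done anything, and the total grows roughly with the square of the bot count.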

Let me know if you need further advice.

kokosda commented 1 year ago

@InDieTasten, thank you. I want everything to be automated. If I go with MS Teams, I have to do some manual work each time I want to run a load test. For instance, I have to actually log in to the MS Teams client and make a call. That does not work for my case.

If I could just stream some audio file to the net.tcp endpoint, it would be much easier to check and scale and wouldn't require any interaction with MS Teams client.

Another approach I see is to create a Microsoft Graph client in unit tests, create calls via the Graph API, and try to stream that audio file from within the unit test. I still don't know whether this is possible, as the documentation is vague and cumbersome. In theory, as I read it, it is possible to stream data to a call created manually via the Graph API, but I'm still investigating how to do it in practice.

InDieTasten commented 1 year ago

@kokosda You want to mock the Microsoft Teams Platform, then. That's an idea I always disregarded as too much effort, but given that it would be a clean solution without hundreds of licensed test users, maybe it would be fair to give it a try.

I will think about what that could look like in detail.

If you make any progress yourself, please keep me posted.

mosoftwareenterprises commented 1 year ago

What specs are you expecting your bot to achieve? How many concurrent calls? Audio & Video? What spec machine?

I would love to come up with a way to do this load testing, but I haven't yet.

InDieTasten commented 1 year ago

@mosoftwareenterprises Media streaming really isn't that big of a deal, since VBSS and audio streams are well compressed. Participant video / webcam takes most of the bandwidth.

From my experience it's not so much the media that's causing trouble. Much more often, it's the JSON parsing for notifications coming in, when you have particularly large meetings with hundreds of people joining in a short time frame.

Azure F-Series vCores can handle between 1 and 15 calls each, depending on what's happening on the call. I know it's rough, but that's how it works. The way Teams meetings are used has a huge impact on performance, which makes it really difficult to predict load patterns.
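As a rough sizing aid, the 1 to 15 calls per vCore range can be turned into a best-case and worst-case vCore count for a target number of concurrent calls. This is only arithmetic on the figures above; the function names and the 400-call target are assumptions for illustration.

```python
import math


def vcores_needed(concurrent_calls: int, calls_per_vcore: int) -> int:
    """vCores required if every vCore sustains calls_per_vcore calls."""
    return math.ceil(concurrent_calls / calls_per_vcore)


def vcore_range(concurrent_calls: int, best: int = 15, worst: int = 1) -> tuple:
    """(best case, worst case) vCore counts for the 1-15 calls/vCore range."""
    return (
        vcores_needed(concurrent_calls, best),
        vcores_needed(concurrent_calls, worst),
    )


print(vcore_range(400))
```

The width of that range is the point: without knowing what happens on the calls, the estimate spans more than an order of magnitude.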

InDieTasten commented 1 year ago

I feel like getting all the Graph API stuff mocked is doable. However, I'm not sure how the SDK works together with the Windows Media Platform. AFAIK, there is a lot of proprietary media processing going on for the media streams, and mocking that might be difficult.

Any ideas on this, @1fabi0?

InDieTasten commented 1 year ago

@kokosda How far did you get with your plans of automating tests so far?

mosoftwareenterprises commented 1 year ago

I have been looking into mocking the API with https://github.com/microsoft/m365-developer-proxy, which gives me some of what I want. As you say, @InDieTasten, it seems to be the participant parsing that is the slowest part; hopefully I can prove this soon.
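One way to start checking the parsing claim in isolation is a micro-benchmark that times JSON parsing of synthetic roster notifications of increasing size. The sketch below is only an illustration of the approach: the payload shape and field names are invented, and Python's json module is not the parser the .NET bot uses, so the absolute numbers say nothing about the real bot.

```python
import json
import time


def make_notification(participant_count: int) -> str:
    """Build a synthetic roster notification (hypothetical field names)."""
    participants = [
        {"id": f"participant-{i}", "displayName": f"User {i}", "isMuted": i % 2 == 0}
        for i in range(participant_count)
    ]
    return json.dumps({"changeType": "updated", "participants": participants})


def time_parsing(participant_count: int, repetitions: int) -> float:
    """Seconds spent parsing the same notification `repetitions` times."""
    payload = make_notification(participant_count)
    start = time.perf_counter()
    for _ in range(repetitions):
        json.loads(payload)
    return time.perf_counter() - start


if __name__ == "__main__":
    for size in (10, 100, 500):
        elapsed = time_parsing(size, repetitions=1000)
        print(f"{size} participants: {elapsed / 1000 * 1e6:.1f} microseconds per notification")
```

The same idea carries over to the bot itself: capture or synthesize notifications of different roster sizes and time only the deserialization step.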

@kokosda did you make any progress with the testing ideas?

oxygennik2009 commented 1 year ago

Hi @InDieTasten, I am trying to do the load test right now and I wonder if you could give me some advice. I'm using a D4v2 with 8 vCPUs, and ssulzer states here (https://github.com/microsoftgraph/microsoft-graph-comms-samples/issues/222#issuecomment-625407757) that the bot can handle at least 50 audio sessions per vCPU, which would mean 8*50 = 400 audio sessions.

My current bot does this: 1) Receiving notification 2) Processing notification 3) Answering the call 4) Subscribing to AudioMediaReceived event -> here I just Dispose() the stream and that's it.

Setup: 100 users in the policy

It takes around 50-60 calls in one meeting (1 bot per call id and thus 1 call id per user) to kill the machine. The CPU load reaches around 70% and then Azure starts dropping the calls.

Is this expected? Or am I doing something wrong, and should I be able to join 400 users (each of them with a dedicated bot) in one large meeting?

Should meetings be handled via the grouping mode? E.g. 1 bot instance per N users?

InDieTasten commented 1 year ago

@oxygennik2009

> So I'm using D4v2 with 8 vCpu

I mentioned the F-Series (compute-optimized) family because these machines are built for CPU-intensive tasks, and CPU is probably the limiting factor when it comes to Teams bots.

> ssulzer here (https://github.com/microsoftgraph/microsoft-graph-comms-samples/issues/222#issuecomment-625407757) states that the bot can handle at least 50 audio sessions per vCPU, means 8*50 = 400 audio sessions.

I think 50 calls per core is pretty optimistic and, as mentioned earlier in this thread, dependent on how many notifications are being sent to these calls. If it's just audio and there are no state changes in the calls (like mutes, unmutes, participants joining/leaving, participants turning webcams on/off, and so on), maybe you can reach 200-400 calls with your setup, provided you let them connect slowly one after the other and don't do any further processing.
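Letting calls connect slowly one after the other can be sketched as a simple ramp-up loop in the load driver. The code below assumes a hypothetical `join_call` coroutine that triggers one join and returns once it has completed; `fake_join` is a placeholder, not a real Teams or Graph call.

```python
import asyncio
from typing import Awaitable, Callable, List


async def ramp_up(
    join_call: Callable[[int], Awaitable[None]],
    total_calls: int,
    interval_seconds: float,
) -> List[int]:
    """Start calls strictly one after the other, pausing between joins."""
    joined: List[int] = []
    for index in range(total_calls):
        await join_call(index)  # wait until this join has completed
        joined.append(index)
        if index < total_calls - 1:
            await asyncio.sleep(interval_seconds)  # give the bot time to settle
    return joined


async def fake_join(index: int) -> None:
    """Placeholder join; a real driver would trigger an actual call here."""
    await asyncio.sleep(0)


if __name__ == "__main__":
    print(asyncio.run(ramp_up(fake_join, total_calls=5, interval_seconds=0.1)))
```

Raising `total_calls` step by step while watching CPU on the bot VM shows at which point the machine stops keeping up, instead of hitting it with every join at once.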

> Azure starts dropping the calls.

I don't think it's Azure dropping the calls. It's more likely that the Teams Platform is dropping call legs because your VM is saturated and can no longer keep up with events and new calls.

> It takes me around 50-60 calls in one meeting (1 bot per call id and thus 1 call id per user) to kill the machine. The CPU load gets around 70% and then Azure starts dropping the calls.

Every call instance of your bot acts as a participant in that meeting. Every time a new participant or bot joins the meeting, all previously present participants are notified about that new participant. This puts rapidly growing load on your bot (roughly quadratic in the number of bot instances in the meeting), as your bot is bombarded with notifications. E.g.: 100 users with policy in the meeting + 100 policy recording applications in the meeting. If another participant + application connects, your application receives 200 requests. A participant mutes themselves: 100 requests.

> Should meetings be handled via the grouping mode? E.g. 1 bot instance per N users?

Depends on your needs. If you are dealing with large meetings, it could be cost-effective to use the participant capacity and record all users in the meeting with a single call on the bot end. But maybe you want a separate recording for each participant under policy, with at least approximately the same call start and end timestamps. Dividing an "aggregate" recording back into the parts relevant to individual participants will take extra thought and engineering.
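The effect of grouping on notification load can be estimated with a small model. It assumes that every roster change is delivered once per bot instance in the meeting and that grouped users share a single bot call; the function names and the group size of 25 are made up for illustration.

```python
import math


def bot_instances(policy_users: int, users_per_bot: int) -> int:
    """Bot calls in the meeting when users_per_bot users share one bot call."""
    return math.ceil(policy_users / users_per_bot)


def requests_per_event(policy_users: int, users_per_bot: int, roster_changes: int = 1) -> int:
    """Requests hitting the bot for one event, assuming one request per bot
    instance per roster change."""
    return bot_instances(policy_users, users_per_bot) * roster_changes


# One bot per user, 100 users under policy: matches the example above.
print(requests_per_event(100, 1))                    # a participant mutes
print(requests_per_event(100, 1, roster_changes=2))  # participant + application join

# Hypothetical grouping of 25 users per bot call.
print(requests_per_event(100, 25))
```

With one bot per user, this reproduces the 100 and 200 requests from the example; with 25 users per bot call, the same mute costs 4 requests, at the price of having to split the aggregate recording afterwards.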

oxygennik2009 commented 1 year ago

@InDieTasten Thank you for the help!

oxygennik2009 commented 1 year ago

@InDieTasten Is it possible to unsubscribe from mute/unmute events?

1fabi0 commented 1 year ago

@oxygennik2009 short answer: no

microsoftgraph / microsoft-graph-comms-samples

How to perform load testing in PolicyRecordingBot? #653