microsoftgraph / microsoft-graph-comms-samples

Microsoft Graph Communications Samples
MIT License
211 stars 237 forks source link

CommsSerializer high CPU usage #739

Open ROMICO-GmbH opened 4 months ago

ROMICO-GmbH commented 4 months ago

Hello, we are using CommsSerializer to deserialize notifications for compliance recording bot. This method however can take up to a few seconds. Can we do something to reduce this time or optimize? When there is a lot of calls it results in quite high cpu usage.

private CommsNotifications GetGraphNotifications(string rawData)
{
    CommsNotifications result = null;
    try
    {
        #region "Initialization"
        if (string.IsNullOrEmpty(rawData))
        {
            return result;
        }
        #endregion

        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();

        result = NotificationProcessor.ExtractNotifications(rawData, new CommsSerializer());

        stopwatch.Stop();
        _logger.LogDebug("GetGraphNotifications execution time: {0} ms", stopwatch.ElapsedMilliseconds);
    }
    catch (Exception ex)
    {
        ErrorAction(GetCurrentMethod(), ex);
    }
    return result;
}
InDieTasten commented 4 months ago

I believe the .NET 6 version has improved performance characteristics for parsing notifications. I'm pretty sure it already uses the assembly info cache to reduce / eliminate reflection lookups, but apart from upgrading to .NET 6, you really can only throw more CPU at the problem. If you are interested in horizontal scaling and .NET 8, I recommend looking at our sample.

Trying to build a custom parser is not easy. If you find a way to parse the schema efficiently, do let me know though. The couple of attempts we made ended up failing due to the schema being so difficult to parse and lots of hardcoding fragile to future updates in the schema.

1fabi0 commented 4 months ago

The root cause of the Problem is the vague build CommsNotification-Object, the use of the object purely relies on AdditionalData, but other objects do also have Additional Data. The Additional Data object fields causes the Serealizer to first read the type field of the json objects in the notifications, then dynamically search for the class and create an instance of it. To improve this situation, the notifications would need to be more type-static, field types in the json should be inferred from field names and the type fields should not be included/needed for parsing notifications. In practice, this means replacing AdditionalData with actual possible properties. While you can do this yourself, it makes your code very vulnerable to updates from Microsoft, which may rename, add, update, or change fields by simply sending you other requests without warning, because the additional data is flexible in responding to these changes.