Bug: Metrics thread safety issue

jessepoulton-mydeal commented 3 months ago

Expected Behaviour

I am able to add metrics from multiple threads without issue.

Current Behaviour

Metrics throws an exception:

System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
   at System.Collections.Generic.List`1.Enumerator.MoveNext()
   at System.Linq.Enumerable.TryGetFirst[TSource](IEnumerable`1 source, Func`2 predicate, Boolean& found)
   at System.Linq.Enumerable.FirstOrDefault[TSource](IEnumerable`1 source, Func`2 predicate)
   at AWS.Lambda.Powertools.Metrics.Metrics.AWS.Lambda.Powertools.Metrics.IMetrics.AddMetric(String key, Double value, MetricUnit unit, MetricResolution metricResolution) in C:\Users\jesse.poulton\repos\powertools-lambda-dotnet\libraries\src\AWS.Lambda.Powertools.Metrics\Metrics.cs:line 103
   at AWS.Lambda.Powertools.Metrics.Metrics.AddMetric(String key, Double value, MetricUnit unit, MetricResolution metricResolution) in C:\Users\jesse.poulton\repos\powertools-lambda-dotnet\libraries\src\AWS.Lambda.Powertools.Metrics\Metrics.cs:line 302
   at AWS.Lambda.Powertools.Metrics.Tests.Handlers.FunctionHandler.<>c.<<HandleMultipleThreads>b__2_0>d.MoveNext() in C:\Users\jesse.poulton\repos\powertools-lambda-dotnet\libraries\tests\AWS.Lambda.Powertools.Metrics.Tests\Handlers\FunctionHandler.cs:line 51
--- End of stack trace from previous location ---
   at System.Threading.Tasks.Parallel.<>c__53`1.<<ForEachAsync>b__53_0>d.MoveNext()
--- End of stack trace from previous location ---
   at AWS.Lambda.Powertools.Metrics.Tests.Handlers.FunctionHandler.HandleMultipleThreads(String input) in C:\Users\jesse.poulton\repos\powertools-lambda-dotnet\libraries\tests\AWS.Lambda.Powertools.Metrics.Tests\Handlers\FunctionHandler.cs:line 49
   at AWS.Lambda.Powertools.Common.MethodAspectAttribute.WrapAsync[T](Func`2 target, Object[] args, AspectEventArgs eventArgs) in C:\Users\jesse.poulton\repos\powertools-lambda-dotnet\libraries\src\AWS.Lambda.Powertools.Common\Aspects\MethodAspectAttribute.cs:line 90
   at AWS.Lambda.Powertools.Metrics.MetricsAspectHandler.OnException(AspectEventArgs eventArgs, Exception exception) in C:\Users\jesse.poulton\repos\powertools-lambda-dotnet\libraries\src\AWS.Lambda.Powertools.Metrics\Internal\MetricsAspectHandler.cs:line 133
   at AWS.Lambda.Powertools.Common.MethodAspectAttribute.WrapAsync[T](Func`2 target, Object[] args, AspectEventArgs eventArgs) in C:\Users\jesse.poulton\repos\powertools-lambda-dotnet\libraries\src\AWS.Lambda.Powertools.Common\Aspects\MethodAspectAttribute.cs:line 96
   at Xunit.Record.ExceptionAsync(Func`1 testCode) in /_/src/xunit.core/Record.cs:line 76

Code snippet

To make this test fail reliably you need to set the PowertoolsConfigurations.MaxMetrics to a low number like 1

Add this method to the FunctionHandler.cs in the metrics tests project:

    [Metrics(Namespace = "ns", Service = "svc")]
    public async Task<string> HandleMultipleThreads(string input)
    {
        await Parallel.ForEachAsync(Enumerable.Range(0, Environment.ProcessorCount * 2), async (x, _) =>
        {
            Metrics.AddMetric("MyMetric", 1);
            await Task.Delay(1);
        });

        return input.ToUpper(CultureInfo.InvariantCulture);
    }

and this test to FunctionHandlerTests.cs:

    [Fact]
    public async Task When_Metrics_Add_Metadata_FromMultipleThread_Should_Not_Throw_Exception()
    {
        // Arrange
        Metrics.ResetForTest();
        var handler = new FunctionHandler();

        // Act
        var exception = await Record.ExceptionAsync(() => handler.HandleMultipleThreads("whatever"));
        Assert.Null(exception);
    }

Possible Solution

You could either add additional locks to the Metrics.AddMetric method:

void IMetrics.AddMetric(string key, double value, MetricUnit unit, MetricResolution metricResolution)
{
    if (string.IsNullOrWhiteSpace(key))
        throw new ArgumentNullException(
            nameof(key), "'AddMetric' method requires a valid metrics key. 'Null' or empty values are not allowed.");

    if (value < 0) {
        throw new ArgumentException(
            "'AddMetric' method requires a valid metrics value. Value must be >= 0.", nameof(value));
    }

    lock (_lockObj)
    {
        var metrics = _context.GetMetrics();

        if (metrics.Count > 0 &&
            (metrics.Count == PowertoolsConfigurations.MaxMetrics ||
             metrics.FirstOrDefault(x => x.Name == key)
                 ?.Values.Count == PowertoolsConfigurations.MaxMetrics))
        {
            _instance.Flush(true);
        }

        _context.AddMetric(key, value, unit, metricResolution);
    }
}

Or you could possibly use concurrent collections under the hood and the Interlocked class to handle incrementing counts and swapping objects when a flush is requried.

Steps to Reproduce

Unit test that shows the issue is in the Code Snippet section. Ensure the MaxMetrics is set to a low value like 1, otherwise the test will inconsistently pass/fail.

Powertools for AWS Lambda (.NET) version

latest

AWS Lambda function runtime

dotnet6

Debugging logs

No response

boring-cyborg[bot] commented 3 months ago

Thanks for opening your first issue here! We'll come back to you as soon as we can. In the meantime, check out the #dotnet channel on our Powertools for AWS Lambda Discord: Invite link

hjgraca commented 3 months ago

Hey @jessepoulton-mydeal thank you for reporting the issue and the possible solution. I will be looking at this today and keep you posted. If you have the time and want to contribute feel free to create a pull request.

hjgraca commented 3 months ago

@jessepoulton-mydeal Fix released. Metrics 1.6.2

aws-powertools / powertools-lambda-dotnet