aws / aws-lambda-dotnet

Libraries, samples and tools to help .NET Core developers develop AWS Lambda functions.
Apache License 2.0
1.58k stars 477 forks source link

Upgrade of Amazon.Lambda.Serialization.SystemTextJson from 2.2.0 to 2.4.2 breaks deserialization of base64 to byte[] #1728

Closed brendonparker closed 7 months ago

brendonparker commented 7 months ago

Describe the bug

In upgrading from dotnet6 to dotnet8 I upgraded the Amazon.Lambda.Serialization.SystemTextJson from 2.2.0 to 2.4.2

After the upgrade, my lambdas which deserialized payloads that were deserializing base64 strings to byte[] started to fail.

Expected Behavior

Should be able to deserialize base64 string as a byte[]. Or at least document on how to configure around this. No breaking changes.

Current Behavior

Amazon.Lambda.Serialization.SystemTextJson.JsonSerializerException: Error converting the Lambda event JSON payload to type Company.Product.CloudWatchLogEventWrapper: The JSON value could not be converted to byte[].
 ---> System.Text.Json.JsonException: The JSON value could not be converted to byte[].
   at Amazon.Lambda.Serialization.SystemTextJson.Converters.ByteArrayConverter.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options)
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value, Boolean& isPopulatedValue)
   at System.Text.Json.Serialization.Metadata.JsonPropertyInfo`1.ReadJsonAndSetMember(Object obj, ReadStack& state, Utf8JsonReader& reader)
   at System.Text.Json.Serialization.Converters.ObjectDefaultConverter`1.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value, Boolean& isPopulatedValue)
   at System.Text.Json.Serialization.Metadata.JsonPropertyInfo`1.ReadJsonAndSetMember(Object obj, ReadStack& state, Utf8JsonReader& reader)
   at System.Text.Json.Serialization.Converters.ObjectDefaultConverter`1.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value, Boolean& isPopulatedValue)
   at System.Text.Json.Serialization.JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadFromSpan[TValue](ReadOnlySpan`1 utf8Json, JsonTypeInfo`1 jsonTypeInfo, Nullable`1 actualByteCount)
   at System.Text.Json.JsonSerializer.Deserialize[TValue](ReadOnlySpan`1 utf8Json, JsonSerializerOptions options)
   at Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer.InternalDeserialize[T](Byte[] utf8Json)
   at Amazon.Lambda.Serialization.SystemTextJson.AbstractLambdaJsonSerializer.Deserialize[T](Stream requestStream)
   --- End of inner exception stack trace ---
   at Amazon.Lambda.Serialization.SystemTextJson.AbstractLambdaJsonSerializer.Deserialize[T](Stream requestStream)
   at lambda_method1(Closure, Stream, ILambdaContext, Stream)
   at Amazon.Lambda.RuntimeSupport.HandlerWrapper.<>c__DisplayClass8_0.<GetHandlerWrapper>b__0(InvocationRequest invocation) in /src/Repo/Libraries/src/Amazon.Lambda.RuntimeSupport/Bootstrap/HandlerWrapper.cs:line 54
   at Amazon.Lambda.RuntimeSupport.LambdaBootstrap.InvokeOnceAsync(CancellationToken cancellationToken) in /src/Repo/Libraries/src/Amazon.Lambda.RuntimeSupport/Bootstrap/LambdaBootstrap.cs:line 185

Reproduction Steps

In my case I am subscribing to CloudWatch log events and parsing logs.

POCOs

public class CloudWatchLogEventWrapper
{
    [JsonPropertyName("awslogs")]
    public AwsLogs AwsLogs { get; set; }
}

public class AwsLogs
{
    [JsonPropertyName("data")]
    public byte[] Data { get; set; }

    public CloudWatchLogEvent ToCloudWatchLogEvent()
    {
        if (Data == null || Data.Length == 0) return null;

        using var ms = new MemoryStream(Data);
        using var decompress = new System.IO.Compression.GZipStream(ms, System.IO.Compression.CompressionMode.Decompress);
        using var rs = new MemoryStream();
        decompress.CopyTo(rs);

        var jsonString = Encoding.UTF8.GetString(rs.ToArray());
        return JsonSerializer.Deserialize<CloudWatchLogEvent>(jsonString);
    }
}

Lambda

public class PayloadCollector
{
    public async Task Process(CloudWatchLogEventWrapper evnt, ILambdaContext context)
    {
        // This should have a CloudWatch Logs trigger hooked up with the filter pattern:
        // { $.fields.PayloadType != "" }
        // More help: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/FilterAndPatternSyntax.html

        var payload = evnt.AwsLogs.ToCloudWatchLogEvent();
        if (payload == null) return;

        // Do stuff here
    }
}

Payload

{"awslogs":{"data":"H4sIAAAAAAAACr2Wy2rbQBSG93mKYLoMeM6cObfuDHWzaSHB2YVQbEsNBjs2kt0QQt69v9Skzap2MbVlpLlp5tM/5zLPZ+f4DZbr+8tmvdsMPp4PhtPHdricrmbVdLhp1tX1rt7VV816Xrftuhlc/H5jsm3q6ap7JadchqkMKQ9vP3wZ3YwnN3csNqvmM4/acylVPavrWYUSJam+55i/m2j8o37Ytpjotm967u9956LqpmenCC9u6qpsbMbkRbMzhYRliuRmlMktmaoTMamYvC7Rz7RdrOp2O111n0gYGyUVJS78bgxGtNP7ulty0Le+XOwFihSpSORInCinJElQdjA6gFLKVhgVSkRKBwBJORbIJIqTQh8xokCHsAKTc7BHMc/mIalkbMUhQHZCIPr/QCYSkRVILh6cFLUAJB7oIBECo3Troa5lP5AeuWX/BnSAUZ8WSE8BpJnEUoYnucG3MCVcnLkEvMydJERUzXJSjgNsSEWOAsKimQhhqJeFJXnp7Rg2LoYRWCSJMIdpRnE/UDCfEIhOAlQYBISAj1CIp2XM6ilh9xAHECULqRcxuL34IUDH2ZBDG2Vc4drFaSSS4jCrKBlgRaFXQVbpGjXZfoWQbI5UyBQKGCH0WUinFmJhn8koGxWYvBZsI0gF//1u333KsUAQxJghUJfPiiHUJjieSydbigyXo0KQixAaDgHSI4EsOQykc/WgwsLi0AmBKWOIgo4KQMkFOSUdpNDfU0d/v3s9l7x23zxt+iGfRjejb1/Hk8nocvx2dFk/PtRN1+luwp23hfNbZ7ubtfNmsdku1g+fF8tt3fw54Qwm15Pzq+nTcj2t2l+L3529/AQ2x6aWjAkAADc4OTUwNDI0MTEiLAogICAgICAgICJ0aW1lc3RhbXAiOiAxNzEyOTQwNDYxMzU3LAogICAgICAgICJtZXNzYWdlIjogIiIKICAgIH0sCiAgICB7CiAgICAgICAgImlkIjogIjM4MTk5ODQ4Nzc1NTk5MjY5NDg4NTg5MzA2OTkyOTg5Nzk5MjU5OTE1NTExMTk1MTcxMjI1OTY0IiwKICAgICAgICAidGltZXN0YW1wIjogMTcxMjk0MDQ2MTY1NCwKICAgICAgICAibWVzc2FnZSI6ICIiCiAgICB9LAogICAgewogICAgICAgICJpZCI6ICIzODE5OTg0ODc3NTU5OTI2OTQ4ODU4OTMwNjk5Mjk4OTc5OTI1OTkxNTUxMTE5NTE3MTIyNTk2NSIsCiAgICAgICAgInRpbWVzdGFtcCI6IDE3MTI5NDA0NjE2NTQsCiAgICAgICAgIm1lc3NhZ2UiOiAiIgogICAgfSwKICAgIHsKICAgICAgICAiaWQiOiAiMzgxOTk4NDg3NzU1OTkyNjk0ODg1ODkzMDY5OTI5ODk3OTkyNTk5MTU1MTExOTUxNzEyMjU5NjYiLAogICAgICAgICJ0aW1lc3RhbXAiOiAxNzEyOTQwNDYxNjU0LAogICAgICAgICJtZXNzYWdlIjogIiIKICAgIH0sCiAgICB7CiAgICAgICAgImlkIjogIjM4MTk5ODQ4Nzc1NjIxNTcwMjMzNzg3ODM3NjE2MTMxMzM0OTc4MTg4MTU5NTU2Njc3MjA2MzkxIiwKICAgICAgICAidGltZXN0YW1wIjogMTcxMjk0MDQ2MTY1NSwKICAgICAgICAibWVzc2FnZSI6ICIiCiAgICB9LAogICAgewogICAgICAgICJpZCI6ICIzODE5OTg0ODc4MTgyMTE3NzM5ODk3OTM1MDg0OTQ3ODI2NDY1Nzk4NDQwNDA1NTMzOTc2MjA0MCIsCiAgICAgICAgInRpbWVzdGFtcCI6IDE3MTI5NDA0NjE5MzMsCiAgICAgICAgIm1lc3NhZ2UiOiAiIgogICAgfSwKICAgIHsKICAgICAgICAiaWQiOiAiMzgxOTk4NDg3ODE4MjExNzczOTg5NzkzNTA4NDk0NzgyNjQ2NTc5ODQ0MDQwNTUzMzk3NjIwNDEiLAogICAgICAgICJ0aW1lc3RhbXAiOiAxNzEyOTQwNDYxOTMzLAogICAgICAgICJtZXNzYWdlIjogIiIKICAgIH0sCiAgICB7CiAgICAgICAgImlkIjogIjM4MTk5ODQ4NzgxODQzNDc4MTQ0MTc3ODgxNDcyNjE5ODAwMzc2MjU3MDUyNDE2ODQ1NzQyNDU4IiwKICAgICAgICAidGltZXN0YW1wIjogMTcxMjk0MDQ2MTkzNCwKICAgICAgICAibWVzc2FnZSI6ICIiCiAgICB9LAogICAgewogICAgICAgICJpZCI6ICIzODE5OTg0ODc4NzM5NjM2MzY5ODYxMjAwNjYzNDg2MjE5NDIyNjE0NjQ5NDQzMTgzNDg2NjA3MSIsCiAgICAgICAgInRpbWVzdGFtcCI6IDE3MTI5NDA0NjIxODMsCiAgICAgICAgIm1lc3NhZ2UiOiAiIgogICAgfSwKICAgIHsKICAgICAgICAiaWQiOiAiMzgxOTk4NDg3ODc2NDE2NzE4OTU3OTU4NDM0ODk0MTkwODcxMjcxNDU2MjY0MDg0MDA2NTA2NTUiLAogICAgICAgICJ0aW1lc3RhbXAiOiAxNzEyOTQwNDYyMTk0LAogICAgICAgICJtZXNzYWdlIjogIiIKICAgIH0sCiAgICB7CiAgICAgICAgImlkIjogIjM4MTk5ODQ4Nzg3Njg2MjczMzg2MTkyOTA0NzM1NzAyMTU4NTYzNjkwOTIzMTMxNDEyNjExNDg4IiwKICAgICAgICAidGltZXN0YW1wIjogMTcxMjk0MDQ2MjE5NiwKICAgICAgICAibWVzc2FnZSI6ICIiCiAgICB9LAogICAgewogICAgICAgICJpZCI6ICIzODE5OTg0ODc4NzcwODU3NDEzMTM5MTQzNTM1ODg0MzY5NDI4MTk2MzU3MTQ5MjkxODU5MTkwNSIsCiAgICAgICAgInRpbWVzdGFtcCI6IDE3MTI5NDA0NjIxOTcsCiAgICAgICAgIm1lc3NhZ2UiOiAiIgogICAgfQogICAgXSwKICAgICJtZXNzYWdlVHlwZSI6ICJEQVRBX01FU1NBR0UiLAogICAgIm93bmVyIjogIjg4NzUzODI2NDk4MyIsCiAgICAic3Vic2NyaXB0aW9uRmlsdGVycyI6IFsKICAgICJTUVMgUGF5bG9hZHMiCiAgICBdCn0="}}

Possible Solution

A. Enhance the ByteArrayConverter to handle/detect base64 encoded string rather than only allow array of numbers. B. Allow for configuring the Default lambda serializer, so this converter can be removed.

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

Amazon.Lambda.Serialization.SystemTextJson 2.4.2

Targeted .NET Platform

.NET 8

Operating System and version

AmazonLinux / Lambda

brendonparker commented 7 months ago

I'm able to work around this by changing my byte[] to string. But not ideal that it may break someone else.

public class AwsLogs
{
    [JsonPropertyName("data")]
    public string Data { get; set; } // This used to be byte[]

    public async Task<CloudWatchLogEvent> ToCloudWatchLogEventAsync()
    {
        if (Data == null || Data.Length == 0) return null;

        using var ms = new MemoryStream(Convert.FromBase64String(Data));
        using var decompress = new System.IO.Compression.GZipStream(ms, System.IO.Compression.CompressionMode.Decompress);

        return await JsonSerializer.DeserializeAsync<CloudWatchLogEvent>(decompress);
    }
}
brendonparker commented 7 months ago

So at this point, I don't know if there is anything that you want to do to "fix" :) Seems like people may have varying ways they want to handle this behavior

IMO - it should use the default (not add a special ByteArrayConverter) and require people that want otherwise to specify JsonConverter attributes on their props that are byte[]

brendonparker commented 7 months ago

Another work around was to use a custom Json Serializer:

class CustomLambdaJsonSerializer : DefaultLambdaJsonSerializer
{
    public CustomLambdaJsonSerializer() : base(CreateCustomizer) { }

    private static void CreateCustomizer(JsonSerializerOptions options)
    {
        foreach (var converter in options.Converters.OfType<ByteArrayConverter>().ToArray())
        {
            options.Converters.Remove(converter);
        }
    }
}
ashishdhingra commented 7 months ago

@brendonparker Good afternoon. Thanks for reporting the issue. Looks like the issue is with your custom implementation of CloudWatchLogEvent POCO class. Per Using Lambda with CloudWatch Logs, the event returns awslogs:data as encoded string. Hence the POCO class Amazon.Lambda.CloudWatchLogsEvents.CloudWatchLogsEvent is modeled accordingly, with a helper method DecodeData() to return decoded data. The decoded string should essentially be in the format specified in Using Lambda with CloudWatch Logs, so do you have a custom class for decided CloudWatchEvent? I'm unsure what needs to be changed in this library as you also mentioned variable use cases for different users.

Thanks, Ashish

brendonparker commented 7 months ago

@ashishdhingra The issue is not with my custom implementation.

The issue is that the System.Text.Json serializer natively will decode a json payload containing a property with a base64 string to a byte[] property.

However, since a custom ByteArrayConverter has been added, it breaks when it runs into the base64 string. This is a new/breaking change that was introduced at some point. The previous version handled this conversion. The new version throws an exception.

ashishdhingra commented 7 months ago

@brendonparker ByteArrayConverter was added in commit b49dfae373538aad2811340cd2a03b478cb59318 3 years ago. Just as a note, AWS documentation mentions awslogs:data as encoded string.

Since you mentioned it worked previously and it caused issues after the introduction of ByteArrayConverter, I will review this with the team.

Thanks, Ashish

normj commented 7 months ago

Hey @brendonparker since the ByteArrayConverter was added 3 years ago we can't take it now because that would also be breaking change. My guess is we could potentially make the ByteArrayConverter more flexible and handle the incoming JSON value being a string going to a byte[] and perform the base 64 decoding. Not sure when we would get to that work but if you are interested in contributing a PR I be happy to review it.

brendonparker commented 7 months ago

Yes. That approach should work.

I’m largely unimpeded as my work around is that I can change my type from byte[] to string and handle the conversion. But I can work on a PR so that other later don’t get bit by this.

ashishdhingra commented 7 months ago

The changes have been released in Amazon.Lambda.Serialization.SystemTextJson version 2.4.3. Thanks for your contribution.

github-actions[bot] commented 7 months ago

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.