aws / aws-sdk-net

The official AWS SDK for .NET. For more information on the AWS SDK for .NET, see our web site:
http://aws.amazon.com/sdkfornet/
Apache License 2.0
2.07k stars 862 forks source link

DynamoDB Object persistence Model does not respect `required` keyword #3276

Open mstern02 opened 7 months ago

mstern02 commented 7 months ago

Describe the bug

C# 11 introduced the required keyword to signal to the compiler that class members must be initialized using an object initializer.

This is also used by System.Text.Json.JsonSerializer to perform data validation at runtime.

using System;
using System.Text.Json;

var json = """{"Email": "onefish@example.com", "DisplayName": "Fish One"}""";

// This will fail if Email or DisplayName are missing above
var user = JsonSerializer.Deserialize<User>(json);

if (user is null) {
    return;
}

// No complaints about dereferencing a possible null-reference
Console.WriteLine(user.DisplayName.ToUpper());

// Class has two properties, Email and DisplayName, that can not be null.
class User {
    public required string Email {get; set;}
    public required string DisplayName {get; set;}
}

When retrieving a POCO from DynamoDB via DynamoDBContext this modifier is ignored, making it possible to create invalid objects accidentally.

Take the following DynamoDB table storing users with their email and a display name.

Email (pk, string) DisplayName (string)
onefish@example.com Fish One
twofish@example.com null
redfish@example.com empty
var client = new AmazonDynamoDBClient();
var context = new DynamoDBContext(client);

var users = await context.ScanAsync<User>([]).GetRemainingAsync();

// It *should* be safe to assume that DisplayName is not null and .ToUpper() will not fail with a NullReferenceException
Console.WriteLine(string.Join('\n', users.Select(user => user.DisplayName.ToUpper()));)

// Same as in previous example
class User {
    [DynamoDBHashKey]
    public required string Email {get; set;}
    public required string DisplayName {get; set;}
}

Creating a User instance via DynamoDBContext.ScanAsync<User>([]) can create an invalid object if a property marked required does not exist on the corresponding item in DynamoDB, or is null, thus breaking expected guarantees.

Expected Behavior

In the code snippet above I would expect the DynamoDB SDK to fail fast when encountering invalid values, instead of passing them along, which can cause the kind of unexpected behavior that required / #nullable is supposed to guard against.

Current Behavior

The SDK creates invalid objects at runtime.

Reproduction Steps

  1. Create the following DynamoDB Table:
Email (string, pk) DisplayName (string)
onefish@example.com Fish One
twofish@example.com null
redfish@example.com no value
  1. Apply the following patch to the main branch using git apply:
diff --git a/testapp/Program.cs b/testapp/Program.cs
new file mode 100644
index 00000000000..c12fd3b4a30
--- /dev/null
+++ b/testapp/Program.cs
@@ -0,0 +1,16 @@
+using Amazon.DynamoDBv2;
+using Amazon.DynamoDBv2.DataModel;
+
+var client = new AmazonDynamoDBClient();
+var context = new DynamoDBContext(client);
+
+var users = await context.ScanAsync<User>([], new() {OverrideTableName = Environment.GetEnvironmentVariable("DDB_TABLE")}).GetRemainingAsync();
+
+Console.WriteLine(string.Join('\n', users.Select(user => user.DisplayName.ToUpper())));
+
+// Class has two properties, Email and DisplayName, that should never be null.
+class User {
+    [DynamoDBHashKey]
+    public required string Email {get; set;}
+    public required string DisplayName {get; set;}
+}
\ No newline at end of file
diff --git a/testapp/testapp.csproj b/testapp/testapp.csproj
new file mode 100644
index 00000000000..db7d421377c
--- /dev/null
+++ b/testapp/testapp.csproj
@@ -0,0 +1,20 @@
+<Project Sdk="Microsoft.NET.Sdk">
+
+  <ItemGroup>
+    <ProjectReference Include="..\sdk\src\Services\S3\AWSSDK.S3.NetStandard.csproj" />
+    <ProjectReference Include="..\sdk\src\Services\DynamoDBv2\AWSSDK.DynamoDBv2.NetStandard.csproj" />
+  </ItemGroup>
+
+  <ItemGroup>
+    <PackageReference Include="AWSSDK.IdentityManagement" Version="3.7.300.58" />
+    <PackageReference Include="AWSSDK.SecurityToken" Version="3.7.300.59" />
+  </ItemGroup>
+
+  <PropertyGroup>
+    <OutputType>Exe</OutputType>
+    <TargetFramework>net8.0</TargetFramework>
+    <ImplicitUsings>enable</ImplicitUsings>
+    <Nullable>enable</Nullable>
+  </PropertyGroup>
+
+</Project>
  1. Run the example
cd ./testapp
DDB_TABLE="<name of table>" dotnet run

It will fail when trying to invoke ToUpper() on a null value.

Possible Solution

3277

Additional Information/Context

I consider this a bug, since System.Text.Json.JsonSerializer upholds these guarantees at runtime, whereas Amazon.DynamoDBv2.DataModel.DynamoDBContext does not.

AWS .NET SDK and/or Package version used

AWSSDK.DynamoDBv2 Git commit 23b74540be451eda8bb789980b4863bd8cc4241f (HEAD at time of writing)

Targeted .NET Platform

.NET 8

Operating System and version

Windows 11 22H2

mstern02 commented 7 months ago

Further testing has also revealed another bug the associated PR would fix:

When trying to read a bool or number from DDB when it does not exist will cause the dafault value (false / 0) to be returned.

Email (string, pk) MFARequired (bool) LoginAttemps (number)
onefish@example.com true 2
twofish@example.com empty empty
public class UserAuthConfig {
  [DynamoDBHashKey]
  public required string Email {get; set;}
  public required bool MFARequired {get; set;}
  public required int LoginAttempts {get; set;}
}

In this example the current implementation will read {"Email": "twofish@example.com", "MFARequired": false, "LoginAttempts": 0}, instead of throwing an error, which seems like it could cause unforeseen bugs 😄

Dreamescaper commented 7 months ago

Required modifier does not disallow setting null (for nullable types or if nullable reference types are disabled).

Here's a perfectly valid code setting nulls:

var tst1 = new TestNullable { Value = null };
var tst2 = new TestNonNullable { Value = null };

class TestNullable
{
    public required string? Value {get; set; }
}

#nullable disable
class TestNonNullable
{
    public required string Value {get; set; }
}
mstern02 commented 7 months ago

Required modifier does not disallow setting null (for nullable types or if nullable reference types are disabled).

That is true (and good to know 😉 ), but required still makes a difference when using System.Text.Json.JsonSerializer.Deserialize:

using System;
using System.Text.Json;

var json = """{}""";

// Deserialized: User{ Email = null }
var optUser = JsonSerializer.Deserialize<OptionalUser>(json);

// Deserialization Error due to missing required property
var reqUser = JsonSerializer.Deserialize<RequiredUser>(json);

Console.WriteLine("{0}, {1}", optUser.Email, reqUser.Email);

#nullable disable

class OptionalUser {
    public string Email {get;set;}
}

class RequiredUser {
    public required string Email {get;set;}
}

It just seems unfortunate that the standard JsonSerializer upholds the assumption that required properties are not null, while DynamoDBContext does not 🤔

Dreamescaper commented 7 months ago

the standard JsonSerializer upholds the assumption that required properties are not null

It doesn't. It fails because Email property is missing altogether. But it works fine if you pass the following json: var json = """{"Email": null}""";

Anyway, my point is that it only makes sense to implement this feature along with handling nullable annotations as well.

Also, that would be a breaking change (unless it's opt-in by some config flag). So probably woudn't be approved until the major version bump (although not up to me to decide 😉 ).

ashishdhingra commented 7 months ago

Needs reproduction. This is more likely a feature request since it's requesting to support a new language feature.