dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.58k stars 10.06k forks source link

Collections of non-nullable types are not checked for null in ASP.NET Core APIs #57234

Open martinmine opened 3 months ago

martinmine commented 3 months ago

Is there an existing issue for this?

Describe the bug

My ASP.NET Core 8 Web API has a controller with a POST endpoint, it accepts the following DTO:

public class RequestDtoObject
{
    public List<string> Names { get; set; }
    public string Name { get; set; }
}

This is set up in an action that just returns the object:

[HttpPost]
public IActionResult EchoResponse([FromBody] RequestDtoObject requestDto)
{
    return Ok(requestDto);
}

When Name or Names is set to null, the API returns a bad request indicating which field is null, which makes sense as these properties are non-nullable (the project has Nullable enabled). However, if I send in a null in the Names list, the Names property will contain a null value despite the fact it is a list of non-nullable strings. The following request object is accepted:

{
  "names": ["string", null],
  "name": "string"
}

Expected Behavior

I expect a bad request response when requests has a list of non-nullable types with a null-value in them, similar to how validation works on other non-nullable properties.

Steps To Reproduce

https://github.com/martinmine/NotNullApi

Exceptions (if any)

No response

.NET Version

8.0.303

Anything else?

No response

martinmine commented 3 months ago

Could this behavior be explained by the fact that nullability metadata isn't accessible for reference types when used with generics? From https://github.com/dotnet/runtime/pull/54985:

The nullability for generic type T should be tracked by the user, a field declaration List<string?> list or List list will have type parameter nullability info but nullability for other List API calls should be tracked by the user. For example for list.Add(T item) the nullability of item will be evaluated as follows:

T is Nullable if the declaration has ? i.e. GenericType<T?> (even if T is value type it is nullable value type) T is Nullable for GenericType when nullability context enabled and the concrete type of T is ref type or nullable value type. (In this case the nullability of parameter T of List.Add(T item) will be Nullable for both List and List<string?> instance, we might want 4th nullability state for this scenario) CC @terrajobst T is NotNull for GenericType when concrete T is non-nullable value type T is Unknown for GenericType when nullability context disabled and the concrete type of T is ref type For open generics the behavior is the same as ref type of T

terrajobst commented 3 months ago

I believe in ASP.NET Core you need to change the JSON serializer options to opt-into this behavior by using the recently added JsonSerializerOptions.RespectNullableAnnotations .

builder.Services.AddControllers().AddJsonOptions(options =>
{
    options.JsonSerializerOptions.RespectNullableAnnotations = true;
});
martinmine commented 3 months ago

I tried out RespectNullableAnnotations and it had no effect on dtos with null-values in List<string?>. I guess that configuration applies to the JSON serializer itself while the model validation is as-is.

terrajobst commented 3 months ago

@eiriktsarpalis Is this not supported?

/cc @dotnet/area-system-text-json

eiriktsarpalis commented 3 months ago

It isn't, unfortunately.

RespectNullableAnnotations only works for properties, fields and constructor parameters because these are the only locations where nullability annotations are representable in IL (compare with the APIs available in the System.Reflection.NullabilityInfoContext class). Fundamentally, there is no way to distinguish between the List<string>, List<string>? and List<string?> types in isolation using reflection. Some more information can be found on the remarks section for that property.

terrajobst commented 3 months ago

I'm confused by this. When we designed NullabilityInfo we made it possible to retrieve nullability information from nested contexts (e.g. array elements and generic parameters), which includes the string? argument to List<T> in your example.

This sounds more like a limitation in JSON where we don't plumb this through?

terrajobst commented 3 months ago

Just read the "Risks" section in the API review item:

Because of restrictions on how NRT's are implemented and the architecture used by JsonSerializer (contracts keyed on System.Type which is nullable-agnostic), it is currently not possible to determine nullability annotations for root-level types, generic properties or collection elements. For example, it is not possible to distinguish between Person and Person? or List and List<Person?> as root-level values. This could lead to user confusion so we should try to document this concern as clearly as possible.

I now vaguely remember that I was in support of this feature regardless "because it still handles most of the common cases". Maybe I was wrong?

eiriktsarpalis commented 3 months ago

When we designed NullabilityInfo we made it possible to https://github.com/dotnet/runtime/issues/29723 (e.g. array elements and generic parameters), which includes the string? argument to List in your example.

I don't think that's how NullabilityInfoContext works. The type functions as a cache for resolving nullability information for nested members (because nullability data is encoded contextually and require reading information from the parent type, however that's merely implementation detail of C# codegen).

This sounds more like a limitation in JSON where we don't plumb this through?

It's not an issue of STJ or NullabilityInfoContext per se, but more fundamentally related to how NRTs are represented in runtime reflection. List<string> and List<string?> are equal types, and if you're passing either as a type parameter into a generic method there is no recoverable nullability metadata. It's for this reason that there is no NullabilityInfoContext.Create overload accepting Type parameters.

Assuming we were building a serializer that was 100% source generated we would be having a different conversation (Roslyn can distinguish between nullable and non-nullable type symbols), however STJ is hybrid and consistency between the two worlds is an explicitly stated goal of the library.

I now vaguely remember that I was in support of this feature regardless "because it still handles most of the common cases". Maybe I was wrong?

Nullability information is still preserved for most contextual data points. Yes, it will miss the cases where the top-level value is null but the nullability annotations in JsonSerializer already account for this case. I think the real value of this feature lies in validating individual properties.

This leaves out collection elements as the odd outlier. One possibility I was considering at the time was exposing an API that explicitly declares, say, the string in List<string> to be non-nullable. However such a feature would necessarily also apply the same enforcement to List<string?> instances, so I figured out that it's orthogonal to the concept of "respecting nullability annotations".

terrajobst commented 3 months ago

@eiriktsarpalis

I don't think that's how NullabilityInfoContext works.

NullabilityInfo supports retrieving nested nullability information but you need to walk the root nullability. NullabilityInfoContext is merely a cache.

I think the problem for STJ is that we associate serialization handling on a per type basis. Since List<string> and List<string?> are the same type, there is no way to handle this differently. In the reflection-based handler the system would need to retrieve the nullability info for a property and then recursively walk that while it walks the signature and align them, which is what I meant by "plumbing it through".

We could logically do something similar to the compiler for the source generator: we'd have to store a set of flat values per-property that describes the nullability of the signature and plumb that through as additional context.

But yes, I can see that being somewhat non-trivial.

Assuming we were building a serializer that was 100% source generated we would be having a different conversation (Roslyn can distinguish between nullable and non-nullable type symbols), however STJ is hybrid and consistency between the two worlds is an explicitly stated goal of the library.

I don't think that this part is a problem. The information is available in both cases. It's just in the Roslyn case their symbol APIs make it a bit more convenient to retrieve because in reflection the nullability state is extrinsic to the type and needs to plumbed through as a secondary parameter.

Maybe we could help that case by offering to construct a type from a given type + nullability info. It would be a separate type universe that wraps the underlying System.Type and adorns it with nullability info. This makes plumbing it through easier, by introducing a similar problem to the Roslyn APIs: you can no longer compare types using ==/reference equality because not only would string? and string be different instances of System.Type, we probably wouldn't even intern types based on their nullability state so the same nullability state would likely return different instances.

eiriktsarpalis commented 3 months ago

I don't think that's how NullabilityInfoContext works.

NullabilityInfo supports retrieving nested nullability information but you need to walk the root nullability. NullabilityInfoContext is merely a cache.

Right, but the context can only root off a property, field or parameter. It can't be used if you try to call something like

JsonSerializer.Deserialize<List<string>>("[\"x\",\"y\",null]");

Assuming we were building a serializer that was 100% source generated we would be having a different conversation (Roslyn can distinguish between nullable and non-nullable type symbols), however STJ is hybrid and consistency between the two worlds is an explicitly stated goal of the library.

I don't think that this part is a problem. The information is available in both cases.

I don't think that's quite true. In the earlier snippet that fact that string in List<string> is non-nullable is non-recoverable information from the perspective of run-time reflection. In fact it's fully erased from IL.

terrajobst commented 3 months ago

I think the limitation of not supporting nullability info when calling Deserialize with a constructed type is probably fine.

There might be cases where we'd need this for infrastructure (such as ASP.NET's model binding from a parameter) but I'd imagine if we solve nullability info plumbing that we could make this work by allowing that state to be passed in from the higher level as a secondary parameter.

The fundamental problem isn't that nullability information isn't available at runtime (it is) it's that it's not a function of the type but of the property, field, or parameter. This makes certain operation (such as the generic method example) insufficient, but I think if we care to solve this it would merely require additional state to be passed in.

The only thing we can't make work is purely constructed generics that were never persisted in a signature of a property, field, or parameter. Which I think is fine.

terrajobst commented 3 months ago

(To be clear, I'm not trying to complain that STJ doesn't support this; I'm mostly trying to distill what is technically feasible and at what cost. I think it's still fine to say "we understand what it would take and we believe the added complexity isn't worth it. That's where I'm at right now, TBH).

eiriktsarpalis commented 3 months ago

I think the limitation of not supporting nullability info when calling Deserialize with a constructed type is probably fine.

Sure, but I think we'd want serialization to be consistent for List<string?> regardless of whether it sits at the root or is a nested property. You are correct that the nullablity of the element can be determined in the latter case, however we can't meaningfully use this because STJ keys all its converter caches on System.Type (something we probably won't ever be able to change also due to backward compatibility concerns).

terrajobst commented 3 months ago

Sure, but I think we'd want serialization to be consistent for List<string?> regardless of whether it sits at the root or is a nested property.

If it were technically possible to have that consistency, I'd definitely prefer that. But since we can't, I'd prefer the feature would work for nested annotations when possible -- assuming we're willing to fund this work. Since that's not how it works today we'd likely end up with an additional property to opt-in (to avoid breaking changes); so to some extent users opting into this have to accept that discrepancy, just like they have to now where the enforcement handles top-level nullability but not nested nullability.

However, I think it is fair to say that nullability information is a bit of a bolt-on in the CLR/C# type system. And in features like this it shows. That means we'll have to accept that there are cases where we can't provide a 100% fidelity with the user's expectations. We'll have to decide whether honoring the attributes is valuable enough to accept the visible seams.

eiriktsarpalis commented 3 months ago

I'd prefer the feature would work for nested annotations when possible -- assuming we're willing to fund this work.

It's something we could look into for .NET 10, but it requires a fundamentally different composition model for converters. Currently when serializing a property of type List<string> we look up the converter for typeof(List<string>) -- we would need to replace this with converters scoped to a particular property and this could nest to arbitrary depth (think supporting properties of complex generic type such as List<(string?, Dictionary<string, string?[]>)>).

I suspect exposing a flag simply declaring that List<string> has a non-nullable element type is going to be a solution that has greater mechanical sympathy to the existing edifice.

And in features like this it shows. That means we'll have to accept that there are cases where we can't provide a 100% fidelity with the user's expectations. We'll have to decide whether honoring the attributes is valuable enough to accept the visible seams.

There are gaps in what is supported to be sure, and we need to explicitly document these. At the end of the day though there is a lot of value coming from the supported scenaria: it's already paid dividends in JSON schema, the motivating use case for this work.

amcasey commented 3 months ago

@eiriktsarpalis Should this issue be in dotnet/runtime?