Expose top-level nullability information from reflection

terrajobst commented 5 years ago

With C# 8, developers will be able to express whether a given reference type can be null:

public void M(string? nullable, string notNull, IEnumerable<string?> nonNullCollectionOfPotentiallyNullEntries);

(Please note that existing code that wasn't compiled using C# 8 and nullable turned on is considered to be unknown.)

This information isn't only useful for the compiler but also attractive for reflection-based tools to provide a better experience. For example:

MVC
- Provides a way to automatically deserialize inputs to controller methods ("model binding")
- Would like to provide model validation so that the existing pattern would allow code to bail early
- Without it, customers would have to apply a custom attribute, such as [Required], or resort to additional null-checks
- Only needs top-level annotations, i.e. string? but not nested, such as IEnumerable<string?>
EF
- Provides a way to generate database schemas from user classes ("code first")
- Would like to use nullable information to infer whether columns should be null or non-null (they already do that for nullable value types).
- Without it, customers would have to apply a custom attribute to repeat that information.
- Also only needs top-level annotations

The nullable information is persisted in metadata using custom attributes. In principle, any interested party can already read the custom attributes without additional work from the BCL. However, this is not ideal because the encoding is somewhat non-trivial:

- Custom attribute might be generated. The custom attribute might have been generated (meaning it is embedded in the user's assembly) or might use the to-be-provided attribute.
- Encoded as a byte array. The tri-state is encoded as a linearized version of the constructed generic type.
- Compressed. Right now, each member will have the attribute when nullability is turned on, but this causes metadata bloat. We're working on a proposal that allows the containing member, type, or assembly to state a default to reduce the number of attribute applications.

It's tempting to think of nullable information as additional information on System.Type. However, we can't just expose an additional property on Type because at runtime there is no difference between string (unknown), string? (nullable), and string (non-null). So we'd have to expose some sort of API that allows consumers to walk the type structure and get that information.
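To make the point above concrete, here is a minimal sketch showing that the two annotated properties report the same `System.Type`, and that the only trace of the annotation is a compiler-emitted attribute that has to be matched by name. `Person`, `RuntimeTypeDemo`, and `HasNullableAttribute` are hypothetical names used only for illustration; whether an individual member carries its own attribute depends on how the compiler compressed the metadata, so the last two lines print whatever the compiler happened to emit.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Reflection;

public class Person
{
    public string Name { get; set; } = "";
    public string? Nickname { get; set; }
}

public static class RuntimeTypeDemo
{
    // Hypothetical helper: true if a NullableAttribute is attached directly to the member.
    // The attribute is matched by name because the compiler may embed it in each assembly.
    public static bool HasNullableAttribute(MemberInfo member) =>
        member.CustomAttributes.Any(a =>
            a.AttributeType.FullName == "System.Runtime.CompilerServices.NullableAttribute");

    public static void Main()
    {
        PropertyInfo name = typeof(Person).GetProperty(nameof(Person.Name))!;
        PropertyInfo nickname = typeof(Person).GetProperty(nameof(Person.Nickname))!;

        // string and string? are the same System.Type at runtime.
        Console.WriteLine(name.PropertyType == nickname.PropertyType); // True

        // The result here depends on the compiler's compression of the metadata.
        Console.WriteLine($"Name has own attribute: {HasNullableAttribute(name)}");
        Console.WriteLine($"Nickname has own attribute: {HasNullableAttribute(nickname)}");
    }
}
```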

Unifying nullable-value types and nullable-reference types

It was suggested that these APIs also return NullableState.MaybeNull for nullable value types, which seems desirable indeed. Boxing a nullable value type causes the non-nullable representation to be boxed. Which also means you can always cast a boxed non-nullable value type to its nullable representation. Since the reflection API surface is exclusively around object it seems logical to unify these two models. For customers that want to differentiate the two, they can trivially check the top-level type to see whether it's a reference type or not.
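The boxing behavior this argument relies on can be checked with a short sketch; `BoxingDemo` is a hypothetical name, and the code uses only standard runtime behavior rather than anything from the proposed API.

```csharp
#nullable enable
using System;

public static class BoxingDemo
{
    public static void Main()
    {
        int? some = 42;
        int? none = null;

        object? boxedSome = some; // boxes the underlying int, not Nullable<int>
        object? boxedNone = none; // an empty Nullable<int> boxes to a null reference

        Console.WriteLine(boxedSome!.GetType()); // System.Int32
        Console.WriteLine(boxedNone is null);    // True

        // A boxed non-nullable value can always be unboxed to its nullable form.
        object boxedInt = 7;
        int? roundTrip = (int?)boxedInt;
        Console.WriteLine(roundTrip);            // 7

        // Telling the two models apart only needs a look at the top-level type.
        Console.WriteLine(Nullable.GetUnderlyingType(typeof(int?)) != null); // True
        Console.WriteLine(typeof(string).IsValueType);                       // False
    }
}
```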

API proposal

namespace System.Reflection
{
    public sealed class NullabilityInfoContext
    {
        public NullabilityInfo Create(ParameterInfo parameterInfo);
        public NullabilityInfo Create(PropertyInfo propertyInfo);
        public NullabilityInfo Create(EventInfo eventInfo);
        public NullabilityInfo Create(FieldInfo fieldInfo);
    }

    public sealed class NullabilityInfo
    {
        public Type Type { get; }
        public NullableState ReadState { get; }
        public NullableState WriteState { get; }
        public NullabilityInfo? ElementType { get; }
        public ReadOnlyCollection<NullabilityInfo>? GenericTypeArguments { get; }
    }

    public enum NullableState
    {
        Unknown,
        NotNull,
        MaybeNull
    }
}

Sample usage

Getting top-level nullability information

private NullabilityInfoContext _nullabilityContext = new NullabilityInfoContext();

private void DeserializePropertyValue(PropertyInfo p, object instance, object? value)
{
    if (value == null)
    {
        var nullabilityInfo = _nullabilityContext.Create(p);
        var allowsNull = nullabilityInfo.WriteState != NullableState.NotNull;
        if (!allowsNull)
            throw new MySerializerException($"Property '{p.DeclaringType?.Name}.{p.Name}' cannot be set to null.");
    }

    p.SetValue(instance, value);
}

Getting nested nullability information

class Data
{
    public string?[] ArrayField;
    public (string?, object) TupleField;
}
private void Print()
{
    Type type = typeof(Data);
    FieldInfo arrayField = type.GetField("ArrayField")!;
    FieldInfo tupleField = type.GetField("TupleField")!;

    NullabilityInfoContext context = new();

    NullabilityInfo arrayInfo = context.Create(arrayField);
    Console.WriteLine(arrayInfo.ReadState);              // NotNull
    Console.WriteLine(arrayInfo.ElementType!.ReadState); // MaybeNull

    NullabilityInfo tupleInfo = context.Create(tupleField);
    Console.WriteLine(tupleInfo.ReadState);                          // NotNull
    Console.WriteLine(tupleInfo.GenericTypeArguments![0].ReadState); // MaybeNull
    Console.WriteLine(tupleInfo.GenericTypeArguments![1].ReadState); // NotNull
}

Custom Attributes

The following custom attributes in System.Diagnostics.CodeAnalysis are processed and combined with type information:

[AllowNull]
[DisallowNull]
[MaybeNull]
[NotNull]

The following attributes aren't processed because they don't annotate static state but information related to dataflow:

[DoesNotReturn]
[DoesNotReturnIf]
[MaybeNullWhen]
[MemberNotNull]
[MemberNotNullWhen]
[NotNullIfNotNull]
[NotNullWhen]

@dotnet/nullablefc @dotnet/ldm @dotnet/fxdc @rynowak @divega @ajcvickers @roji @steveharter

terrajobst commented 3 years ago

> The linker will warn if there's a typeof(NullableAttribute) in the code needed by the app (I assume this code here would do that). It's basically to guard against the cases where it wants to remove an attribute which the app might need.

No, this code checks by name because the framework doesn't define NullableAttribute nor NullableContextAttribute, they are emitted by the compiler and embedded in each assembly.

I was told that by default we don't trim user code, so presumably that would mean that the types as well as the attribute applications in user code would remain intact; that should address the 90% case for ASP.NET/EF scenarios.

When a user opts into trimming of their own assemblies, presumably there is a way to configure what to root?

vitek-karas commented 3 years ago

It's true that for now we default to not trim app code, only our frameworks. So it could still be an issue if the code ever needs this information on a framework type.

But we need to solve this going forward since we want to be able to fully trim applications - including the app's code.

safern commented 3 years ago

A reminder: last I checked, the mono linker removes nullability annotations when trimming assemblies. Since the functionality proposed here operates over runtime assemblies rather than reference assemblies, we'll need to figure out how to reconcile this. I don't know if there are any tracking issues for this on the mono side.

Also we set the flag for the compiler to not emit nullable metadata for APIs that aren't visible outside the assembly; I don't know how interesting it would be for apps to figure out nullability for internal/private APIs of the framework, I guess that is a non-goal for this API but something to add on the docs?

GrabYourPitchforks commented 3 years ago

> I don't know how interesting it would be for apps to figure out nullability for internal/private APIs of the framework

From what I saw, even nullability annotations on public APIs are being trimmed from System.Private.CoreLib and other assemblies.

vitek-karas commented 3 years ago

All nullable annotations are removed - because so far almost nothing needed it - and it's a notable size improvement doing this (there's SO MANY nullable annotations everywhere). Maybe we could only remove them on non-public items, but that would probably still not work correctly for the app's code itself.

safern commented 3 years ago

> Maybe we could only remove them on non-public items

Right, but that wouldn't be the linker's job. The compiler has a switch to not emit metadata for non-public items, which we currently use for our assemblies.

terrajobst commented 3 years ago

@safern

> I don't know how interesting it would be for apps to figure out nullability for internal/private APIs of the framework, I guess that is a non-goal for this API but something to add on the docs?

Frankly, I don't care how important that is for app authors -- my opinion is that we should do nothing to help support this :-)

I could understand the desire for public annotations, but even that feels fringe TBH.

@vitek-karas

> All nullable annotations are removed - because so far almost nothing needed it - and it's a notable size improvement doing this (there's SO MANY nullable annotations everywhere). Maybe we could only remove them on non-public items, but that would probably still not work correctly for the app's code itself.

When you say "all" you mean including user code? That would be problematic, but I guess similar to serialization the areas where this matters could be isolated.

safern commented 3 years ago

> my opinion is that we should do nothing to help support this :-)

Completely agreed. Just something worth including in the docs?

terrajobst commented 3 years ago

> > my opinion is that we should do nothing to help support this :-)
>
> Completely agreed. Just something worth including in the docs?

We support the APIs we're documenting and we don't document private APIs, so I don't think we need to call this out explicitly. However, the linker docs should clarify what information is stripped from public APIs, such as custom attributes and nullable annotations, because it's at least counterintuitive that these annotations are sometimes there and sometimes missing.

MichalStrehovsky commented 3 years ago

> A reminder: last I checked, the mono linker removes nullability annotations when trimming assemblies.

More specifically, it only removes the attributes if the runtime is Mono, because the instructions to remove the attributes are embedded in Mono's CoreLib. We really need to address #48217 and have consistency between runtimes in this respect.

JamesNK commented 3 years ago

terrajobst commented 3 years ago

I have updated the proposal to address the feedback.

buyaa-n commented 3 years ago

> Also we set the flag for the compiler to not emit nullable metadata for APIs that aren't visible outside the assembly; I don't know how interesting it would be for apps to figure out nullability for internal/private APIs of the framework, I guess that is a non-goal for this API but something to add on the docs?

If the member is private or internal we could check if the module has the NullablePublicOnlyAttribute set and return NullableState.Unknown if the attribute is set
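A rough sketch of that check, assuming the compiler-emitted `System.Runtime.CompilerServices.NullablePublicOnlyAttribute` is applied at module level and therefore has to be matched by name. `PublicOnlyCheck`, `IsPublicOnly`, and `ShouldReportUnknown` are hypothetical names, and the visibility test is deliberately simplified: it only looks at the field's own accessibility, not at the accessibility of its declaring type.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Reflection;

public static class PublicOnlyCheck
{
    private const string PublicOnlyAttributeName =
        "System.Runtime.CompilerServices.NullablePublicOnlyAttribute";

    // True if the module is marked as carrying nullable metadata
    // only for members that are visible outside the assembly.
    public static bool IsPublicOnly(Module module) =>
        module.CustomAttributes.Any(a => a.AttributeType.FullName == PublicOnlyAttributeName);

    // Hypothetical policy: a field that is not visible outside its assembly
    // and lives in a "public only" module has no reliable annotation.
    public static bool ShouldReportUnknown(FieldInfo field)
    {
        bool visibleOutside = field.IsPublic || field.IsFamily || field.IsFamilyOrAssembly;
        return !visibleOutside && IsPublicOnly(field.Module);
    }

    public static void Main()
    {
        // The results depend on how each assembly was compiled.
        Console.WriteLine(IsPublicOnly(typeof(object).Module));
        Console.WriteLine(IsPublicOnly(typeof(PublicOnlyCheck).Module));
    }
}
```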

> I have updated the proposal to address the feedback.

Thanks @terrajobst, I would like to propose a few updates to the proposal:


    public sealed class NullabilityInfoContext
    {
        public NullabilityInfo Create(ParameterInfo parameterInfo); // existing APIs
        ...
        public NullabilityInfo Create(MethodBase methodBase); // add this overload for parsing nullability of a method return value
    }

    public enum NullableState
    {
        Undefined,   // to me this sounds better than Unknown :)
        NonNullable, // NotNull is easily confused with the `System.Diagnostics.CodeAnalysis.NotNull` attribute
        Nullable,    // MaybeNull is easily confused with the `System.Diagnostics.CodeAnalysis.MaybeNull` attribute

        // probably add the states below for nullability that depends on other attributes
        MaybeNullWhen,   // result depends on MaybeNullWhenAttribute within CustomAttributes
        NotNullWhen,     // result depends on NotNullWhenAttribute in CustomAttributes
        NotNullIfNotNull // this one is probably redundant; result depends on NotNullIfNotNullAttribute in CustomAttributes
    }
    }

jzabroski commented 3 years ago

The latest proposal Immo edited at the top looks good to me.

bartonjs commented 3 years ago

Video

- [MaybeNullWhen] should count as a "maybe null" return answer.
- We changed NullabilityInfo.GenericTypeArguments to a non-nullable array to match standard reflection API practices.
  - We discussed the caching strategy, and since NullabilityInfo objects are expected to be freshly returned each time there's no parallel-caller mutation concern.
- We renamed the MaybeNull member to Nullable.
- Once we renamed the member the type name for the enum seemed wrong, so we renamed that to NullabilityState.

namespace System.Reflection
{
    public sealed class NullabilityInfoContext
    {
        public NullabilityInfo Create(ParameterInfo parameterInfo);
        public NullabilityInfo Create(PropertyInfo propertyInfo);
        public NullabilityInfo Create(EventInfo eventInfo);
        public NullabilityInfo Create(FieldInfo fieldInfo);
    }

    public sealed class NullabilityInfo
    {
        public Type Type { get; }
        public NullabilityState ReadState { get; }
        public NullabilityState WriteState { get; }
        public NullabilityInfo? ElementType { get; }
        public NullabilityInfo[] GenericTypeArguments { get; }
    }

    public enum NullabilityState
    {
        Unknown,
        NotNull,
        Nullable
    }
}
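
For illustration, here is a minimal sketch of how the approved shape might be consumed for a method, assuming a runtime that includes these types. `Lookup`, `Find`, and `MethodNullabilityDemo` are hypothetical names. The return value is read through `MethodInfo.ReturnParameter`, which is a `ParameterInfo`, so the `Create(ParameterInfo)` overload already covers it.

```csharp
#nullable enable
using System;
using System.Reflection;

public static class Lookup
{
    // Hypothetical method: nullable return, one non-null and one nullable parameter.
    public static string? Find(string key, string? fallback) => fallback;
}

public static class MethodNullabilityDemo
{
    public static void Main()
    {
        MethodInfo method = typeof(Lookup).GetMethod(nameof(Lookup.Find))!;
        var context = new NullabilityInfoContext();

        // The return value is described by a ParameterInfo as well.
        NullabilityInfo returnInfo = context.Create(method.ReturnParameter);
        Console.WriteLine($"return: read={returnInfo.ReadState}");

        foreach (ParameterInfo parameter in method.GetParameters())
        {
            NullabilityInfo info = context.Create(parameter);
            Console.WriteLine($"{parameter.Name}: read={info.ReadState}, write={info.WriteState}");
        }
    }
}
```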

danmoseley commented 3 years ago

Nit, it still says NullableState above.

RikkiGibson commented 3 years ago

I'm listening to the recording and I feel compelled to mention that if you want to handle this in combination with Nullable<T>, then the scenario [NotNull] int? Prop { get; } should give "not null" for its ReadState and [DisallowNull] int? Prop { get; set; } should give "not null" for its WriteState.
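
A small sketch of those two cases, assuming the approved API shape and a runtime that includes it; `Sample`, `AlwaysSet`, `Limit`, and `AttributeDemo` are hypothetical names. No expected output is listed because the result depends on how the implementation combines the attributes with `Nullable<T>`, which is the behavior under discussion here.

```csharp
#nullable enable
using System;
using System.Diagnostics.CodeAnalysis;
using System.Reflection;

public class Sample
{
    // Declared as int?, but annotated as never returning null.
    [NotNull]
    public int? AlwaysSet { get; } = 1;

    // Declared as int?, but callers are not allowed to assign null.
    [DisallowNull]
    public int? Limit { get; set; }
}

public static class AttributeDemo
{
    public static void Main()
    {
        var context = new NullabilityInfoContext();

        foreach (PropertyInfo property in typeof(Sample).GetProperties())
        {
            NullabilityInfo info = context.Create(property);
            Console.WriteLine(
                $"{property.Name} ({info.Type}): read={info.ReadState}, write={info.WriteState}");
        }
    }
}
```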

terrajobst commented 3 years ago

> Nit, it still says NullableState above.

Fixed, thanks!

davidfowl commented 3 years ago

Is this happening in .NET 6? We'd like to take a dependency on it.

jeffhandley commented 3 years ago

Yes, #54985 is intended to be included in Preview 7.
