Expose top-level nullability information from reflection

terrajobst commented 5 years ago

With C# 8, developers will be able to express whether a given reference type can be null:

public void M(string? nullable, string notNull, IEnumerable<string?> nonNullCollectionOfPotentiallyNullEntries);

(Please note that existing code that wasn't compiled using C# 8 and nullable turned on is considered to be unknown.)

This information isn't only useful for the compiler but also attractive for reflection-based tools to provide a better experience. For example:

MVC
- Provides a way to automatically deserialize inputs to controller methods ("model binding")
- Would like to provide model validation so that the existing pattern would allow code to bail early
- Without it, customers would have to apply a custom attribute, such as [Required], or resort to additional null-checks
- Only needs top-level annotations, i.e. string? but not nested, such as IEnumerable<string?>
EF
- Provides a way to generate database schemas from user classes ("code first")
- Would like use nullable information to infer whether columns should be null or non-null (they already do that for nullable value types).
- Without it, customers would have to apply a custom attribute to repeat that information.
- Also only needs top-level annotations

The nullable information is persisted in metadata using custom attributes. In principle, any interested party can already read the custom attributes without additional work from the BCL. However, this is not ideal because the encoding is somewhat non-trivial:

Custom attribute might be generated. The custom attribute might have been generated (meaning is embedded in the user's assembly) or might use the to-be-provided attribute.
Encoded as a byte array. The tri-state is encoded as a linearized version of the constructed generic type.
Compressed. Right now, each member will have the attribute when nullability is turned on but this causes metadata bloat. We're working on a proposal that allows the containing member, type, or assembly to state a default to reduce the number of attribute applications.

It's tempting to think of nullable information as additional information on System.Type. However, we can't just expose an additional property on Type because at runtime there is no difference between string (unknown), string? (nullable), and string (non-null). So we'd have to expose some sort of API that allows consumers to walk the type structure and getting information.

Unifying nullable-value types and nullable-reference types

It was suggested that these APIs also return NullableState.MaybeNull for nullable value types, which seems desirable indeed. Boxing a nullable value type causes the non-nullable representation to be boxed. Which also means you can always cast a boxed non-nullable value type to its nullable representation. Since the reflection API surface is exclusively around object it seems logical to unify these two models. For customers that want to differentiate the two, they can trivially check the top-level type to see whether it's a reference type or not.

API proposal

namespace System.Reflection
{
    public sealed class NullabilityInfoContext
    {
        public NullabilityInfo Create(ParameterInfo parameterInfo);
        public NullabilityInfo Create(PropertyInfo propertyInfo);
        public NullabilityInfo Create(EventInfo eventInfo);
        public NullabilityInfo Create(FieldInfo parameterInfo);
    }

    public sealed class NullabilityInfo
    {
        public Type Type { get; }
        public NullableState ReadState { get; }
        public NullableState WriteState { get; }
        public NullabilityInfo? ElementType { get; }
        public ReadOnlyCollection<NullabilityInfo>? GenericTypeArguments { get; }
    }

    public enum NullableState
    {
        Unknown,
        NotNull,
        MaybeNull
    }
}

Sample usage

Getting top-level nullability information

private NullabilityInfoContext _nullabilityContext = new NullabilityInfoContext();

private void DeserializePropertyValue(PropertyInfo p, object instance, object? value)
{
    if (value == null)
    {
        var nullabilityInfo = _nullabilityContext.Create(p);
        var allowsNull = nullabilityInfo.WriteState != NullableState.NotNull;
        if (!allowsNull)
            throw new MySerializerException($"Property '{p.GetType().Name}.{p.Name}'' cannot be set to null.");
    }

    p.SetValue(instance, value);
}

Getting nested nullability information

class Data
{
    public string?[] ArrayField;
    public (string?, object) TupleField;
}
private void Print()
{
    Type type = typeof(Data);
    FieldInfo arrayField = type.GetField("ArrayField");
    FieldInfo tupleField = type.GetField("TupleField");

    NullabilityInfoContext context = new ();

    NullabilityInfo arrayInfo = context.Create(arrayField);
    Console.WriteLine(arrayInfo.ReadState);         // NotNull
    Console.WriteLine(arrayInfo.Element.ReadState); // MayBeNull

    NullabilityInfo tupleInfo = context.Create(tupleField);
    Console.WriteLine(tupleInfo.ReadState);                        // NotNull
    Console.WriteLine(tupleInfo.GenericTypeArgument[0].ReadState); // MayBeNull
    Console.WriteLine(tupleInfo.GenericTypeArgument[1].ReadState); // NotNull
}

Custom Attributes

The following custom attributes in System.Diagnostics.CodeAnalysis are processed and combined with type information:

[AllowNull]
[DisallowNull]
[MaybeNull]
[NotNull]

The following attributes aren't processed because they don't annotate static state but information related to dataflow:

[DoesNotReturn]
[DoesNotReturnIf]
[MaybeNullWhen]
[MemberNotNull]
[MemberNotNullWhen]
[NotNullIfNotNull]
[NotNullWhen]

@dotnet/nullablefc @dotnet/ldm @dotnet/fxdc @rynowak @divega @ajcvickers @roji @steveharter

rynowak commented 5 years ago

What's the behaviour with a value type? Does this include reflecting the effect of Nullable<> as well?

mattwar commented 5 years ago

Are these really static methods? Shouldn't they have a parameter then?

safern commented 5 years ago

For, PropertyInfo, we can have a NullableState on the setter and on the getter depending if we have an attribute. I.e

[NotNull] string? Property { get; set; }

That will have a nullable input but a non-null output.

Also what about generic constraints or interface implementations? Those can also have nullability information and to when getting the generic constraints or interface implementations we get an array of Type.

terrajobst commented 5 years ago

@rynowak

What's the behaviour with a value type? Does this include reflecting the effect of Nullable<> as well?

They would return Unknown right now. We could add an enum member NotApplicable that we return for non-reference types or throw InvalidOperationException. We could also handle those, but this feels wrong because nullable value types are actually different types.

@mattwar

Are these really static methods? Shouldn't they have a parameter then?

Sigh. No, I wrote up a sample project to test this and I used extension methods and then I didn't fix it when copy and pasting. Fixed.

@safern

We need to think about the higher-level attributes. I only thought about NullableAttribute so far.

For, PropertyInfo, we can have a NullableState on the setter and on the getter depending if we have an attribute.

This wouldn't change the API shape. People would have to go to PropertyInfo.GetGetMethod() and ask the corresponding MethodInfo directly.

Also what about generic constraints or interface implementations? Those can also have nullability information and to when getting the generic constraints or interface implementations we get an array of Type.

See intro. I've deliberately scoped them out.

chucker commented 5 years ago

We could also handle those, but this feels wrong because nullable value types are actually different types.

They are, but on a source level, C# abstracts the difference away. Maybe introduce something like ValueNull and ValueNotNull as additional enum values. That way, the distinction is still prominent, but it’s also not a runtime surprise (a.k.a. exception) that these won’t be handled.

MichalStrehovsky commented 5 years ago

Why not just make these extension methods and stash them away somewhere in the System.Diagnostics.CodeAnalysis namespace? Or better yet, in a NuGet package shipped out of the Roslyn repo - since their meaning is defined by the C# compiler?

These annotations don't have a meaning to the runtime and reflection itself doesn't respect them in MethodBase.Invoke or FieldInfo.SetValue - it feels odd to burn knowledge of custom attributes that are only meaningful to the C# compiler into the runtime reflection stack. These are not that different from e.g. the custom attributes for System.ComponentModel - they're just guidelines.

E.g. we added a IsByRefLike property to System.Type, for ref struct because byref-like types have a meaning to the runtime. We didn't add any API for unmanaged struct, because that constraint only means something to the C# compiler.

roji commented 5 years ago

For, PropertyInfo, we can have a NullableState on the setter and on the getter depending if we have an attribute.

See https://github.com/dotnet/csharplang/issues/2384 for this at the language level. @terrajobst users could call PropertyInfo.GetGetMethod() and check there, but what would be the result for a call to GetNullableState() directly on a PropertyInfo which has different nullability settings for getter and setter?

it feels odd to burn knowledge of custom attributes that are only meaningful to the C# compiler into the runtime reflection stack.

I think the idea here is that these attributes are meaningful to various reflection-based consumers at runtime (EF, ASP.NET, 3rd-party products). If it were only for the compiler an API would probably not be needed at all.

These are not that different from e.g. the custom attributes for System.ComponentModel - they're just guidelines.

They are different in that they are emitted by the compiler and interpreted by it. They are also potentially much more complex to interpret (a context-based scheme may be introduced to combat the bloat they introduce), hence the desire to encapsulate that interpretation logic.

tannergooding commented 5 years ago

These annotations don't have a meaning to the runtime and reflection itself doesn't respect them in MethodBase.Invoke or FieldInfo.SetValue - it feels odd to burn knowledge of custom attributes that are only meaningful to the C# compiler into the runtime reflection stack. These are not that different from e.g. the custom attributes for System.ComponentModel - they're just guidelines.

These attributes are not only going to be respected by the C# compiler, they are also going to be understood and consumed by other nullable aware languages (such as F#).

I would also say that, regardless of whether or not the runtime treats these attributes specially, the overarching ecosystem ultimately will, which I believe is the important point here. These attributes (and others like the unmanaged constraint) convey additional metadata that is generally (but not always) enforced by some user code in the method and it is useful to have an easy/logical way of querying it so that you can perform the appropriate checks when/if needed.

We are already burning this information into all of our assemblies, because of its perceived benefit and exposing that metadata in a well-defined way in the reflection type system will only help fulfill that story and make it easier for users who want to utilize them.

MichalStrehovsky commented 5 years ago

I would also say that, regardless of whether or not the runtime treats these attributes specially, the overarching ecosystem ultimately will, which I believe is the important point here

To be clear, my question is not whether we need an API that provides this functionality - but whether such API should be on ParameterInfo and friends. Since this is adding non-virtual methods (not properties), my expectation is that this API is not going to do any caching - so it doesn't really need to be an instance method.

We could easily stash this API away in e.g. System.Reflection.TypeExtensions and even make it available on lower versions of NetStandard (doesn't have to be exclusive to .NET Core 3.0).

tannergooding commented 5 years ago

We could easily stash this API away in e.g. System.Reflection.TypeExtensions and even make it available on lower versions of NetStandard (doesn't have to be exclusive to .NET Core 3.0).

Nullable reference types is a C# 8 feature which is only officially supported on .NET Standard 2.1 and higher.

I think discussion on where in the reflection stack it goes is reasonable, although I think most people would expect and look for it to be an instance (or extension) method

MichalStrehovsky commented 5 years ago

Nullable reference types is a C# 8 feature which is only officially supported on .NET Standard 2.1 and higher.

Not that I want to make this my main argument (my main argument is to not pollute the instance member surface area with convenience APIs), but runtime reflection is not the only flavor of reflection. A .NET Standard 2.1 assembly can be loaded for inspection in all sorts of ways on downlevel runtimes (reflection only load is one way, and things like our own System.Reflection.MetadataLoadContext is another). Some people might find this API in the form of an extension method that runs downlevel still valuable.

terrajobst commented 5 years ago

@chucker

They are, but on a source level, C# abstracts the difference away. Maybe introduce something like ValueNull and ValueNotNull as additional enum values. That way, the distinction is still prominent, but it’s also not a runtime surprise (a.k.a. exception) that these won’t be handled.

Syntactically, you could make the argment that they look similar. But C# doesn't abstract them away. int? and int are different types and this has real observable language behavior. For example, this is valid:

public void M(int x) {}
public void M(int? x) {}

While this is not:

public void M(string x) {}
public void M(string? x) {}

In other words, you can't overload based on nullability of reference types (because they are fundamentally the same type at runtime). Also, the types have different members. So no, I don't think we should make the argument that "C# abstracts the differences away".

That being said, that doesn't mean we cannot expose the nullability information for nullable types. Boxing a nullable value type causes the non-nullable representation to be boxed. Which also means you can always cast a boxed non-nullable value type to its nullable representation. Given that the reflection API surface is exclusively around object, this might actually work just fine.

So I think I'm coming around agreeing with you 😄

terrajobst commented 5 years ago

@MichalStrehovsky

Why not just make these extension methods and stash them away somewhere in the System.Diagnostics.CodeAnalysis namespace? Or better yet, in a NuGet package shipped out of the Roslyn repo - since their meaning is defined by the C# compiler?

But they aren't specific to C#. In order for the C in CLR to be meaningful, other languages, so they support nullability, should use the same underlying encoding. Otherwise exchanging types no longer works. Hence, the reflection API can be considered general purpose.

These annotations don't have a meaning to the runtime and reflection itself doesn't respect them in MethodBase.Invoke or FieldInfo.SetValue - it feels odd to burn knowledge of custom attributes that are only meaningful to the C# compiler into the runtime reflection stack. These are not that different from e.g. the custom attributes for System.ComponentModel - they're just guidelines.

Reflection is about inspecting the program at runtime. While I understand that there good reasons to separate concerns and C# semantics and runtime semantics are different, it totally makes sense to expose static information as people have an expectation that they can access that at runtime. And they already can, but it's error prone. As my examples above show, I'd argue these aren't contrived scenarios but fairly compelling and very much in line with expectation for the .NET development experience.

I don't like shipping core functionality like this on NuGet as this is yet another thing that can go out of sync between the various pieces. Also, given our strategy to stronger align C# with the release cycle of the .NET Core platform, this is also against what we want to achieve as a product.

To be clear, my question is not whether we need an API that provides this functionality - but whether such API should be on ParameterInfo and friends. Since this is adding non-virtual methods (not properties), my expectation is that this API is not going to do any caching - so it doesn't really need to be an instance method.

That is a good point -- I should probably mark these APIs as virtual so that other implementations (such as the new metadata based reflection layer) can provide specialized implementations.

We could easily stash this API away in e.g. System.Reflection.TypeExtensions and even make it available on lower versions of NetStandard (doesn't have to be exclusive to .NET Core 3.0).

I'd rather not. We don't want to optimize for out-of-band or downlevel. We want to foremost optimize for what's right for the .NET platform moving forward. I'd argue making them part of the reflection model to encapsulate them in a central way is the right thing to do.

Should we also see a need for this functionality downlevel, we could also ship a bridge pack that defines them as extension methods. But let's not default to frankendesigns just because.

Not that I want to make this my main argument (my main argument is to not pollute the instance member surface area with convenience APIs), but runtime reflection is not the only flavor of reflection

I'd argue that the reverse is true today: the reflection APIs are polluted with a ton of plumbing APIs when the vast majority of the consumers want simple methods to get stuff done. And moving forward I think reasoning about declared nullability will be important to many consumers of reflection.

terrajobst commented 5 years ago

@roji

For, PropertyInfo, we can have a NullableState on the setter and on the getter depending if we have an attribute.

See dotnet/csharplang#2384 for this at the language level. @terrajobst users could call PropertyInfo.GetGetMethod() and check there, but what would be the result for a call to GetNullableState() directly on a PropertyInfo which has different nullability settings for getter and setter?

Right now, this issue is still open so there is no spec. In general, at the metadata level the property itself has a signature (and thus a return type). Similar, the C# compiler emits the nullable attribute for the property itself, as well as for all accessors:

// public string? Name { get; set; }

[Nullable(1)]
public string Name
{
    [return: Nullable(1)]
    get
    {
        return this.<Name>k__BackingField;
    }
    [param: Nullable(1)]
    set
    {
        this.<Name>k__BackingField = value;
    }
}

So in other words I don't think exposing this on the property is a bad idea; depending on context it might just not be sufficient for the consumer.

chucker commented 5 years ago

you can't overload based on nullability of reference types (because they are fundamentally the same type at runtime)

That’s an interesting edge case, yes.

But I stand by my assertion — the syntax of string? wasn’t just picked because other languages do that, but also because, by and large, it expresses a sufficiently similar intent to int?. And if someone were to design .NET today, those subtle behavioral differences would have been avoided. They exist by and large for legacy reasons.

(IANA C# language designer, though.)

I'd argue that the reverse is true today: the reflection APIs are polluted with a ton of plumbing APIs when the vast majority of the consumers want simple methods to get stuff done.

Yup. I think higher-level additions to Reflection are welcome.

terrajobst commented 5 years ago

But I stand by my assertion — the syntax of string? wasn’t just picked because other languages do that, but also because, by and large, it expresses a sufficiently similar intent to int?

I know. I was in the LDM meeting when the feature was discussed and many times when it was actively designed. Originally, we also considered doing string and string! where string meant nullable and string! meant non-nullable. The lack of symmetry with nullable value types was urking us, but the higher order bit was that we all felt nullable would be the wrong default. At the same time, it was also clear from the beginning that we can't actually make string and string! or string? different types. Which is very different from how many other languages have designed it. So yeah, the result is a compromise between what is doable and what introduces the least amount of concepts.

MichalStrehovsky commented 5 years ago

But they aren't specific to C#. In order for the C in CLR to be meaningful, other languages, so they support nullability, should use the same underlying encoding.

Nullability as it was implemented is a tooling concern - the CLR is not involved. CLR and the metadata format provide a general purpose way to encode metadata and the tools can have an agreement on what it means. E.g. JetBrains has their own set of attributes that deal with nullability and they're in the same boat as far as the runtime is concerned. So does PostSharp or Entity framework. These are all things that already exist in the ecosystem.

It would be a different discussion if we gave these new annotations a meaning within the runtime, but as far as the runtime is concerned, the annotations that the C# compiler uses are no different from the JetBrains annotations.

The runtime reflection API surface area should ideally not be opinionated about things that don't have a meaning to the runtime. A way to achieve that is to put the various opinions into extension methods that people can opt into, depending on what they're doing.

chucker commented 5 years ago

The runtime reflection API surface area should ideally not be opinionated about things that don't have a meaning to the runtime.

As @terrajobst said, "Reflection is about inspecting the program at runtime."

Here's MS introducing the feature:

The classes in the System.Reflection namespace, together with System.Type, enable you to obtain information about loaded assemblies and the types defined within them, such as classes, interfaces, and value types.

It's not (just) about what's meaningful to the runtime. It's also about what's meaningful to you, the developer, during runtime. This is already the case with MemberInfo.GetCustomAttributes() — those are largely meaningless to the runtime, but may be very meaningful to you (for example, to help with serialization).

terrajobst commented 5 years ago

I think @MichalStrehovsky isnt' wrong; there is a balance to strike. If we put every little language convention into the reflection model, we'll create a mess. Also, not every language feature is relevant at runtime. However, I think we can agree that these things hold:

Nullable reference types will be a popular feature
There are sensible scenarios using the nullable annotations at runtime to make decisions
Getting them right is hard

Given all three things I'd argue that it is a sensible addition to the reflection API surface.

roji commented 5 years ago

+1 on having the new API recognize nullable value types. It's true the nullable reference and value types are completely different things under the hood, but it seems like a good idea not to surface that distinction where not strictly necessary. Ideally users shouldn't need to care more about this than absolutely necessary (see https://github.com/dotnet/csharplang/issues/1865 for an example issue for making nullable value types behave more like NRTs).

Also +1 on including the nullability interpretation logic in the reflection API as opposed to some extension method in a nuget package. This is where people will be looking for it, and if the representation ever changes it makes sense to couple this to the version of .NET Core.

Right now, this issue is still open so there is no spec. In general, at the metadata level the property itself has a signature (and thus a return type). Similar, the C# compiler emits the nullable attribute for the property itself, as well as for all accessors

Makes sense, although this could get very confusing to users, e.g. what does the property's nullability annotation even mean when it's at odds with the nullable on its getter/setter - but I guess that a language question rather than a question for this API. It's still probably a good idea to understand where the language design is heading and make sure the API aligns with that.

@MichalStrehovsky

It would be a different discussion if we gave these new annotations a meaning within the runtime, but as far as the runtime is concerned, the annotations that the C# compiler uses are no different from the JetBrains annotations.

I'd expect (or rather hope) that the Jetbrains annotations would eventually be replaced by the C# 8.0 ones. There's no real justification for the ecosystem to have multiple ways to say "this should not/cannot contain null", and had nullability been part of the original langauge the Jetbrains (and other) ways would probably not have existed. This is what justifies the reflection API being opinionated and exposing this specific representation.

davkean commented 5 years ago

The runtime reflection API surface area should ideally not be opinionated about things that don't have a meaning to the runtime. A way to achieve that is to put the various opinions into extension methods that people can opt into, depending on what they're doing

How far do you take that? Take optional values, params, constant fields, properties or events; no meaning to the runtime and only a tooling concern, but we still represent these.

terrajobst commented 5 years ago

Great point @davkean!

MichalStrehovsky commented 5 years ago

How far do you take that? Take optional values, params, constant fields, properties or events; no meaning to the runtime and only a tooling concern, but we still represent these.

Those are also represented with unique structures at the file format level. If we take those apis away, there's no way to get to them. Nullable annotations are custom attributes like any other. That's my line.

It's very common to look for the CompilerGeneratedAttribute to see if something was compiler generated. Would we put a top level IsCompilerGenerated property on FieldInfo and friends to make it discoverable and optimizable as well? Where do you draw the line?

roji commented 5 years ago

Where do you draw the line?

I think a key point - and what drove this issue in the first place - is that the current scheme of placing an attribute on every single member is causing significant bloat, and the compiler team is looking into a more elaborate and space-saving scheme (e.g. assembly-level and/or type-level defaults). That would make interpretation much more complicated, hence the need of an API.

Basically, if this were a simple boolean attribute on a member I don't think we'd be having this discussion.

I also think that if we end up going with a context-based interpretation, then some form of caching could be beneficial. For example, if we allow specifying default nullability at the type level for its members (to reduce repetition for its members), that information could be cached on Type to avoid performing constant expensive lookups. This would be another reason to put the methods in the reflection API rather than as extension methods.

terrajobst commented 5 years ago

@MichalStrehovsky

Those are also represented with unique structures at the file format level. If we take those apis away, there's no way to get to them. Nullable annotations are custom attributes like any other. That's my line.

That's a fair point.

It's very common to look for the CompilerGeneratedAttribute to see if something was compiler generated. Would we put a top level IsCompilerGenerated property on FieldInfo and friends to make it discoverable and optimizable as well? Where do you draw the line?

I don't think it's that common for runtime reflection to ask that question. It's more common in static analysis tools and AFAIK Roslyn does expose an API for that. If it were a common question to ask at runtime to, then yes, I would consider exposing a property on the reflection APIs as well.

I've outlined my bar above:

If it's popular feature

If there are sensible scenarios at runtime to make decisions

If reading the attributes can be error prone

During ProjectN we've learned the hard way that reflection is a very popular feature among .NET libraries and applications. I'm somewhat surprised that we're having the conversation if there is value in exposing a convenience API for things we expect many developers wanting to do. Having these APIs is the very reason why .NET is considered productive by developers.

In general, if the question could be answered by a one liner, I don't think we need a reflection API. But it's really not. Below is the code I wrote. The code currently assumes that the nullable state is encoded on the member directly, which won't be the case once we ship. Also, I'm sure you can find at least one bug in it, too 😉

Implementing GetNullableState()

```C# namespace System.Reflection { public enum NullableState { Unknown, NotNull, MaybeNull } public static class NullableExtensions { public static NullableState GetNullableState(this FieldInfo f) { return GetTopLevelNullableState(f.GetCustomAttributesData()); } public static NullableState GetNullableState(this PropertyInfo p) { return GetTopLevelNullableState(p.GetCustomAttributesData()); } public static NullableState GetNullableState(this EventInfo e) { return GetTopLevelNullableState(e.GetCustomAttributesData()); } public static NullableState GetReturnTypeNullableState(this MethodInfo m) { return GetTopLevelNullableState(m.ReturnTypeCustomAttributes.GetCustomAttributes(false).OfType()); } public static NullableState GetNullableState(this ParameterInfo p) { return GetTopLevelNullableState(p.GetCustomAttributesData()); } private static NullableState GetTopLevelNullableState(IEnumerable customAttributes) { foreach (var x in customAttributes) { if (x.AttributeType.FullName == "System.Runtime.CompilerServices.NullableAttribute" && x.ConstructorArguments.Count == 1) return TranslateObject(x.ConstructorArguments[0].Value); } return NullableState.Unknown; } private static NullableState GetTopLevelNullableState(IEnumerable customAttributes) { foreach (var x in customAttributes) { if (x.GetType().FullName == "System.Runtime.CompilerServices.NullableAttribute") { var field = x.GetType().GetField("NullableFlags"); if (field != null) return TranslateObject(field.GetValue(x)); } } return NullableState.Unknown; } private static NullableState TranslateObject(object o) { if (o is byte[] bs && bs.Length > 0) return TranslateByte(bs[0]); else if (o is byte b) return TranslateByte(b); else if (o is ReadOnlyCollection args && args.Count > 0) return TranslateObject(args[0].Value); else return NullableState.Unknown; } private static NullableState TranslateByte(byte singleValue) { switch (singleValue) { case 1: return NullableState.NotNull; case 2: return NullableState.MaybeNull; default: return NullableState.Unknown; } } } } ```

MichalStrehovsky commented 5 years ago

I don't think it's that common for runtime reflection to ask that question.

Here it's used in ASP.NET Core, here in EF, here in Newtonsoft JSON. There are also hits in CoreFX and CoreLib. Maybe mine and your definitions of "common" differ.

During ProjectN we've learned the hard way that reflection is a very popular feature among .NET libraries and applications. I'm somewhat surprised that we're having the conversation if there is value in exposing a convenience API for things we expect many developers wanting to do

To be clear, my concern is only about polluting the member surface area with something that could be in the System.Diagnostics.CodeAnalysis namespace. There are plenty other context-specific convenience methods we could turn into instance methods to make them discoverable (the one-liners that just check for the presence of an attribute), but we didn't, because asking on StackOverflow is discoverable enough. It seems like instead of focusing on that, the arguments here focus on "Michal, but this is a useful concept to expose", and now even "Michal, but reflection is useful". I'm not arguing on those.

davkean commented 5 years ago

Maybe mine and your definitions of "common" differ.

Good point, we should expose IsCompilerGenerated too.

It is not argument that Reflection does not represent the languages that consume them. Binder itself tries to implement C#/VB binding rules. GetCustomAttributes takes into account overridden "properties" - a concept that doesn't even exist at runtime, nor represented in metadata.

GSPP commented 5 years ago

Binder itself tries to implement C#/VB binding rules

I find this to be an egregious layering violation and a .NET 1.0 mistake. The reflection system has no business in doing binding.

steveharter commented 5 years ago

I'd like to see reflection be "policy free" as much as reasonable, so inferring "nullability" from NullableAttribute and\or Nullable<> seems like it should be a higher-level API than reflection. Or at least until the implementation and adoption has proven to be stable.

terrajobst commented 5 years ago

@steveharter

I'd like to see reflection be "policy free" as much as reasonable, so inferring "nullability" from NullableAttribute and\or Nullable<>

The proposal here isn't inference -- it is reading the attribute that the compiler put in the metadata. The point is to get the result correct and agree with the compiler semantics. And we're in a much better boat to make that efficient & correct than higher layers.

seems like it should be a higher-level API than reflection.

That layer doesn't exist today. In my book creating such a layer, meaning a new assembly and a new set of types has much higher bar than adding the very targeted methods I'm proposing here. How many features would we have to make such a component necessary and warranted? There isn't enough data to make that case. And I'm asserting that waiting for that data isn't actionable.

Or at least until the implementation and adoption has proven to be stable.

We've put in quite a bit of effort to expose this information in the BCL. That includes the attributes in System.Runtime.CompilerServices or the annotations in the public API surface. We've taken the bet as a platform and once we shipped we can't easily remove any of it. I don't see a reason why we'd suddenly draw the line at five additional methods and one enum. We'll just expose the platform's state.

jkotas commented 5 years ago

I think these should be static state-free decoder methods.

The point is to get the result correct and agree with the compiler semantics.

It is fine for these to be part of the platform, in System.Runtime if we wish, be kept in sync with the compilers, etc.

That layer doesn't exist today

These domain-specific layers do exist today, all over the place. For example, there is bool Marshal.IsTypeVisibleFromCom that reads a bunch of COM related attributes and gives you an answer.

One of the problems not mentioned here yet with stuffing everything possible into the reflection object model itself are hard to get right performance characteristics. Should the API implementations provide fast cache for the result - it means that every user of reflection has to pay because of the space for the cache needs to be allocated; or should the API implementations not cache the result and always crack it from scratch - it means that every other place that calls these APIs will end up caching this? There is no good answer.

When these APIs are state-free static decoders, it is pretty obvious what their performance is going to be and that one should better cache the result.

terrajobst commented 5 years ago

It is fine for these to be part of the platform, in System.Runtime if we wish, be kept in sync with the compilers, etc.

👍

Should the API implementations provide fast cache for the result - it means that every user of reflection has to pay because of the space for the cache needs to be allocated; or should the API implementations not cache the result and always crack it from scratch - it means that every other place that calls these APIs will end up caching this?

The idiomatic pattern in .NET for indicating uncached and potentially expensive operations is exposing them as a methods, which is why I didn't make them properties.

Should these APIs be called a lot, we may find ourselves in the position where we still want to cache the operation in order to speed up the rest of the stack. However, in that case I don't see how the proposed location makes makes that harder. The (internal) cache could still live wherever we want. Making them instance methods opens up the path for them to be stored as fields, but it doesn't have to be. It could still, for example, be a conditional weak table on the side. We could decide to make these non-virtual, in order to have more wiggle room.

jkotas commented 5 years ago

Should these APIs be called a lot, we may find ourselves in the position where we still want to cache

The performance characteritics should be part of the API design. When we fundamentally change the API performance characteristics, it just causes confusion. If certain callers end up calling these APIs a lot, they should be responsible for inventing their own scheme for caching.

the 80% scenario is just answering the question whether the given return value or parameter can be null as a whole

If this is going to be popular, I am pretty sure somebody comes up with good reasons why they want to answer the fine grained questions. What would the object model for answering the fine grained questions be? We should do an exercise what the object model for that would look like, and make sure there is natural transition from the 80% object model to the 99% object model. Are we going to add N more Nullable related methods to all reflection *Info to support the 99% object model?

terrajobst commented 5 years ago

The performance characteritics should be part of the API design.

Agreed, which is why I made them methods as I expect them to be not cached.

What would the object model for answering the fine grained questions be? We should do an exercise what the object model for that would look like, and make sure there is natural transition from the 80% object model to the 99% object model. Are we going to add N more Nullable related methods to all reflection *Info to support the 99% object model?

Also agreed; I have a design in my head. I'll update the issue description with more details. The basic idea would be to introduce a new concept in System.Reflection (e.g. TypeWithAnnotations) that can be created from the *info classes that allows you to walk a type signature and expose information that is provided by custom attributes (I don't think it would be an actual System.Type but a class that holds a type and the custom attributes from the original member info). It would have a property that exposes the nullability state. If we were to expose other information (such as dynamic or tuple names) they could just become properties on this new type too. This type would also cache the answers, but since the creation of the type itself won't be it wouldn't create any overhead for the core reflection model. But as I said, I don't expect to land this design for 3.0.

steveharter commented 5 years ago

I'd like to see reflection be "policy free" as much as reasonable, so inferring "nullability" from NullableAttribute and\or Nullable<>

The proposal here isn't inference -- it is reading the attribute that the compiler put in the metadata. The point is to get the result correct and agree with the compiler semantics. And we're in a much better boat to make that efficient & correct than higher layers.

Does the proposed API attempt to also tie in Nullable<> semantics?

jkotas commented 5 years ago

Is MethodBase.GetReturnTypeNullableState() even necessary? It is just a shortcut for MethodInfo.ReturnParameter.GetNullableState().

danmoseley commented 5 years ago

@steveharter @GrabYourPitchforks are either of you driving this (as owners of area-Reflection)? Do you need help here to complete this by 7/1?

jkotas commented 5 years ago

I do not think we should be trying to get these half-baked APIs in at the last minute. It should be fine for the few places that need to have the nullability decoder ring to use a local private copy of it for 3.0.

terrajobst commented 5 years ago

Video

Given that we don't have all the features in-place, we believe this feature doesn't make the cut for 3.0
What about the custom attributes, such as NullIfNotNullAttribute?
What about generic type parameters (e.g. for tuples)
Namespace choice: should this be in the System.Reflection namespace or should this go into a nested namespace?
There is the general question whether MVC or EF should even handle nullable annotations.
How do we square making runtime decisions with being able to change annotations
We should consider generalizing the feature to handle dynamic and tuple names.

terrajobst commented 5 years ago

@steveharter

Does the proposed API attempt to also tie in Nullable<> semantics?

Yes, that was the idea. See section "Unifying nullable-value types and nullable-reference types".

@jkotas

Is MethodBase.GetReturnTypeNullableState() even necessary? It is just a shortcut for MethodInfo.ReturnParameter.GetNullableState().

That's a good point. That didn't occur to me.

GrabYourPitchforks commented 4 years ago

Since the nullability annotations are still in flux, and since it's likely that consumers will want to consume this from downlevel frameworks, we should start this work in corefxlab and see where things go from there. That also allows us to iterate rapidly without waiting for a formal review to take place.

Edit: I also don't think System.Reflection is the right place for this because it's not really a runtime or a reflection concern. It's really something meant for compiler consumption.

roji commented 4 years ago

Edit: I also don't think System.Reflection is the right place for this because it's not really a runtime or a reflection concern. It's really something meant for compiler consumption.

Note that the original need for this arose from packages such as EF and MVC. In that sense it's just another piece of information retrieved about members, next to other things retrieved via reflection.

terrajobst commented 4 years ago

@GrabYourPitchforks

Since the nullability annotations are still in flux, and since it's likely that consumers will want to consume this from downlevel frameworks, we should start this work in corefxlab and see where things go from there. That also allows us to iterate rapidly without waiting for a formal review to take place.

We should design the API/feature for in-box first and then figure out what downlevel looks like.

Edit: I also don't think System.Reflection is the right place for this because it's not really a runtime or a reflection concern. It's really something meant for compiler consumption.

I disagree. Reflection supports runtime as well as design-time metadata via MetadataLoadContext. The nullable annotations will be an important part of our public API surface, both for the platform as well as the larger ecosystem. While the annotations are in flux the encoding itself is stable at this point and we should centralize the handling for all consumers.

davkean commented 4 years ago

Edit: I also don't think System.Reflection is the right place for this because it's not really a runtime or a reflection concern. It's really something meant for compiler consumption.

Reflection has lots of concepts that aren't runtime concerns; static method inheritance, default values, VB/C#-like binding (params), concepts of overridden properties. All needed for compilers.

jkotas commented 4 years ago

Number of these - VB/C#-like binding in particular that is largely outdated and unusable by now - are design mistakes of the past that we should not repeat.

terrajobst commented 4 years ago

On the binding site I very much agree, but I see nullable annotations as largely language independent. Yes, only C# has support for it right now but there is no reason why it can't be used by other languages or static analyzers. For example, R# has used side car files for years and lights up warnings in both C# and VB code. They could just read the metadata as well.

jkotas commented 4 years ago

there is no reason why it can't be used by other languages or static analyzers.

Agree. The reflection object model is largely unusable for building language compilers or static analyzers, so this API won't solve their problem.

There are number of C# language features that are in similar category: named tuple items, readonly, unmanaged constrains, ... . We need a principle that we are going to use to decide whether given C# language feature should be exposed in reflection object model. What should be this principle?

terrajobst commented 4 years ago

We need a principle that we are going to use to decide whether given C# language feature should be exposed in reflection object model. What should be this principle?

To me, it's a combination of "how hard is the decoding" and "how likely is it that someone needs to answer that question using reflection APIs". Today, most tools that read metadata don't use reflection because it was tied to runtime loading. With MetadataLoadContext my hope would be that more tools could use the (familiar) reflection APIs, so I wouldn't want to focus reflection on runtime-only scenarios.

Regardless, in this case we have both satisfied IMHO: decoding is non-trivial and there are runtime reflection-scenarios for serializers, OR mappers, and model binder. So I'd say that makes the bar.

RicoSuter commented 4 years ago

Most of the listed features and more (NRT, named tuples, xml docs) need to be reflected by serializers or schema generators (EF, json schema or open api). This is why i had to implement this myself (NRT, xml docs): https://github.com/RicoSuter/Namotion.Reflection

Ideally this would be provided by .NET so I dont have to maintain this myself. However it would be hard to add these APIs to existing types in a nice, because for example Type is not context aware and the same object for every occurrence (is this correct?) - where should the context specific NRT be stored? A sample: ASP.NET Core api explorer only provides the Type of the response - with only this object and no context (i.e. the return parameter or attributes) it’s impossible to retrieve the NRT info which is needed for the openapi spec.

I think these advanced reflection apis should be provided as a nuget package (not part of the .net core/std api) so that they can easily be updated/improved/fixed and are not tight to the runtime version (C#/NRT can also be used in a .NET Standard 1.0 or .NET Core 2.1 library).

roji commented 4 years ago

FYI just ran into the non-trivial problem of calculating nullability fo generic type parameters in EF Core (https://github.com/dotnet/efcore/issues/20718). ASP.NET currently ignores this case (https://github.com/dotnet/aspnetcore/blob/master/src/Mvc/Mvc.DataAnnotations/src/DataAnnotationsMetadataProvider.cs#L492), just to reiterate the need for this feature.

/cc @rynowak

dotnet / runtime