Syntactic sugar for late-initialization non-nullable auto-properties

roji commented 5 years ago

This attempts to solve the same issues as #2328, but via a far simpler mechanism.

Users have raised concerned with the warning produced by the compiler when an NRT field is uninitialized (e.g. https://github.com/dotnet/csharplang/issues/36#issuecomment-471470105, #2411). This happens in many late initialization scenarios where a non-nullable property may be nullable until it is initialized at some point after construction. In a nutshell, it seems that NRT uninitialized warnings are making it very difficult/verbose to write DTOs and similar late-initialized objects in the NRT world, to the point where people are likely to just turn nullability off.

Examples:

Late initialization scenarios, deserialization, non-constructor dependency injection. Users may not (and possibly cannot) access properties before they are injected, but once they are injected they are guaranteed to be non-nullable.
Explicit loading from a database. For example, a certain property may be loaded from the database - in which case it is never null - but accessing the property without first loading it is an error. This case is typical in Entity Framework Core.

These properties should never return null: if accessed without proper initialization/loading, they should typically cause InvalidOperationException to be thrown (it could be said that premature access is against the type's API contract).

It's currently possible to get around this problem via the following:

private Foo? _foo;

public Foo Foo
{
    get => _foo ?? throw new InvalidOperationException($"The property has not been initialized");
    set => _foo = value;
}

But duplicating this again and again gets old really quickly.

We could have syntactic sugar for auto-properties that produces the same thing:

[LateInitialization]
public Foo Foo { get; set; }

The attribute would:

Suppress the compiler's uninitialized NRT warning
Cause the compiler to generate the above getter (with the null check and exception).
Possibly accept the exception string to be passed to InvalidOperationException
Possibly be allowed on types, applying to all members of the type (i.e. for pure DTO types where all members are late-initialized). In that case it may make sense to also have a member-level opt-out (proposed by @eyalsk)
Hopefully have a name that everyone likes

This could also be a partial solution to the object initializer problem, although an ideal solution wouldn't involve anything at the definition, only proper initialization at the call site (this would of course require a complete redefinition of what object initialization is).

Note: as the discussion in this thread is long, here is a summary below from my point of view

roji commented 5 years ago

/cc @divega @ajcvickers

ajcvickers commented 5 years ago

Thanks @roji. I suspect that this kind of sugar will be heavily requested by EF Core users once they start using the feature.

iam3yal commented 5 years ago

@roji Might be a good idea to allow this at the class level in addition to individual properties and then in cases you want to opt out you could do the opposite [LateInitialization(false)] I mean, I imagine that people would have more properties that require the attribute than not at least for DTOs.

SilentSin commented 5 years ago

[LateInitialization] could be taken to imply an if (null) make new object; return pattern rather than an exception. Perhaps something like [InvalidUntilAssigned] or [InvalidIfNull] would be clearer?

Should it also put an exception in the setter if you try to set the value to null? Perhaps enabled with an optional parameter in the attribute?

roji commented 5 years ago

@eyalsk

@roji Might be a good idea to allow this at the class level in addition to individual properties and then in cases you want to opt out you could do the opposite [LateInitialization(false)] I mean, I imagine that people would have more properties that require the attribute than not at least for DTOs.

Makes sense to me... Maybe a member-level opting out of a type-level opt-in is a bit much, but a type-level opt-in definitely makes sense for pure DTO types.

roji commented 5 years ago

@SilentSin

[LateInitialization] could be taken to imply an if (null) make new object; return pattern rather than an exception. Perhaps something like [InvalidUntilAssigned] or [InvalidIfNull] would be clearer?

I think you're worried that "late" would be construed as "lazy"... I personally don't feel there's much risk of that - I don't recall having seen "late initialization" mean that in other contexts - but naming is always the hardest part :)

Should it also put an exception in the setter if you try to set the value to null? Perhaps enabled with an optional parameter in the attribute?

I don't think this makes sense - there's no implicit runtime checking for normal NRT properties, why should there be one for late-initialized ones? The general approach of the NRT feature is to rely on the compiler to avoid nulls where they don't belong.

Logerfo commented 5 years ago

I wouldn't expect the compiler team to develop something like that. Instead, I'd develop myself a Fody "weaver" to achieve the requested behavior. It's been helping me a lot since I found it.

roji commented 5 years ago

Fody and weaving are great, but the issue this is trying to address is likely to affect many, many users (anyone using DTOs or various late initialization scenarios), so it seems reasonable to expect a solution from the compiler, rather than rely on an external weaving solution.

ymassad commented 5 years ago

It seems to me that this feature simply replaces the infamous NullReferenceException with InvalidOperationException.

Instead of doing this, I think that frameworks such EF should support entities that don't have a parameterless constructor and that work with constructors to initialize values.

YairHalberstadt commented 5 years ago

@ymassad The advantage is that the exception is thrown early, and lets you know exactly where and how it occured.

iam3yal commented 5 years ago

@ymassad The difference between NullReferenceException and InvalidOperationException (or any other exception that isn't thrown by the CLR for that matter) is that the former isn't explicitly thrown by the framework's code but the platform whereas the latter is explicit and is thrown only by the framework or your own code which makes a big difference because it's closer to the source of the problem.

roji commented 5 years ago

It seems to me that this feature simply replaces the infamous NullReferenceException with InvalidOperationException.

Agree very much with what @YairHalberstadt and @eyalsk wrote above - an early InvalidOperationException from the correct place is much better than a NullReferenceException thrown from some random place in the code where a null inadvertently ended up.

Instead of doing this, I think that frameworks such EF should support entities that don't have a parameterless constructor and that work with constructors to initialize values.

That simply doesn't always work. For example EF sometimes have to load two entities which reference each other - there's simply no way to initialize that kind of circular graph via constructors.

Another scenario is an object that has some I/O operation as part of its initialization. This can't happen in the constructor (since constructors can't be async), so another initialization methods is needed. It seems wrong to make the relevant fields nullable just because they're initialized late, and force all access to perform null checks, because we know they'll never be null (assuming the type was properly initialized).

gulshan commented 5 years ago

I think, C# should change the meaning of Object initializer syntax and lift it from a syntactic sugar to an actual object constructor. This will be a breaking change, but breakage can be minimized if designed carefully. There can be a flag to enable previous behavior. And it will actually solve this problem, instead of some makeshift solution.

roji commented 5 years ago

@gulshan this proposal is more about DTO types and other late initialization scenarios, and not only about object initializer syntax. However, I agree in general that the current object initializer feature doesn't intersect well with the new nullability feature (see the last paragraph in my original issue description).

HaloFour commented 5 years ago

@gulshan

This will be a breaking change, but breakage can be minimized if designed carefully.

This will break massive amounts of code, thus there's no chance it would be considered.

There can be a flag to enable previous behavior.

The team has been quite vocal about being against introducing flags/dialects to the language. The barrier for acceptance there is extraordinarily high.

HaloFour commented 5 years ago

I'm a little suspicious of this proposal. It feels more like a library concern than a language concern, although I'm not sure how it could be solved as such.

But the motivation seems to be around the NRT feature, and one of the design goals of that feature was to not affect the behavior of the code at all. So having a related feature which would change the behavior seems inappropriate.

gulshan commented 5 years ago

@HaloFour As object initialization became the default .net way of initializing objects, surely any change in the meaning of object initializer syntax will affect a massive amount of code. But I think, lifting it to valid object constrctor will break a lot less code. Any common example code that will break? I think at least we should explore and discuss.

HaloFour commented 5 years ago

@gulshan

But I think, lifting it to valid object constrctor will break a lot less code. Any common example code that will break?

So, if there's a constructor that smells sorta like the initializer it will automatically convert to calling that? What if only some of the properties match? What if there are multiple overloads? How is the compiler going to do the matching at all, especially given that the naming convention between properties and parameters is different.

In the end, at best, it results in silently changing the meaning of a lot of existing code, which is a very bad thing. It would be very difficult to reason if, when and how property initializers are promoted to constructor parameters vs. remaining property setters.

I think at least we should explore and discuss.

As long as it changes the meaning of existing code there's really nothing to discuss.

roji commented 5 years ago

@HaloFour

I'm a little suspicious of this proposal. It feels more like a library concern than a language concern, although I'm not sure how it could be solved as such.

I'm not sure what you mean by library concern...

This proposal comes from my experience after both annotating actual codebases and working on EF Core's support for nullability. My concern is that the uninitialized warnings introduced in the new nullability feature makes it very difficult/verbose to write DTOs or other late-initialized types - a very common scenario. Current options are to either manually wrap a nullable field as I showed above, or to suppress the uninitialized warning - which is very undesirable (could very easily lead to suppressing the warning on other fields which are not late-initialized, etc.).

But the motivation seems to be around the NRT feature, and one of the design goals of that feature was to not affect the behavior of the code at all. So having a related feature which would change the behavior seems inappropriate.

This is a completely opt-in feature (developers would need to use the new attribute), so a change in behavior seems to be acceptable. It's also not part of the NRT feature - more like a supporting feature designed to make null-aware code easier and less verbose to write.

Regardless, how do you see this problem? Do you agree that there's an issue here to be solved, and if so, do you have another solution in mind?

HaloFour commented 5 years ago

@roji

I agree there's a problem.

For normal late-initialized types I think the problem needs to be solved through better flow analysis that can take property assignment into consideration. The compiler needs to keep track of which properties must be initialized with a non-null value and which properties have been initialized and warn if the object in another manner or escapes before all properties are initialized.

As for frameworks like EF or JSON.NET that construct these objects via reflection or other means this definitely gets a lot trickier. Ultimately I think those libraries should take into account NRTs and should throw if it can't properly initialize the type. This is akin to model validation with MVC, for instance. Of course that would take time to percolate out, and would likely need to be an opt-in feature at least initially.

In either case, the NRTs warnings within the POCO is a problem that I think needs to be addressed without requiring additional language changes.

roji commented 5 years ago

For normal late-initialized types I think the problem needs to be solved through better flow analysis that can take property assignment into consideration. The compiler needs to keep track of which properties must be initialized with a non-null value and which properties have been initialized and warn if the object in another manner or escapes before all properties are initialized.

It sounds like you're thinking of something that's a bit similar to #2328, where a non-nullable property is somehow allowed to remain null within the scope of a single method as long it's properly assigned within that method. There seem to be quite a few issues with that approach: as I wrote in #2328, it's very common for instantiation and population to happen in different methods, and I don't see how that could be tracked by the compiler.

This seems to also imply that the compiler would never warn for non-nullable properties which aren't initialized in a constructor - which seems like a pretty vital part of the NRT feature.

As for frameworks like EF or JSON.NET that construct these objects via reflection or other means this definitely gets a lot trickier. Ultimately I think those libraries should take into account NRTs and should throw if it can't properly initialize the type. This is akin to model validation with MVC, for instance. Of course that would take time to percolate out, and would likely need to be an opt-in feature at least initially.

I'm not sure we're talking about the same thing... There's no actual problem with those libraries - the question is how ordinary users are supposed to write the types which EF or JSON.NET populate.

In either case, the NRTs warnings within the POCO is a problem that I think needs to be addressed without requiring additional language changes.

But what would that mean, unless you're proposing to drop those warnings altogether somehow?

HaloFour commented 5 years ago

@roji

it's very common for instantiation and population to happen in different methods, and I don't see how that could be tracked by the compiler.

I assume that there are going to be tons of edge cases with NRTs in general where the compiler can't track nullability, especially across a boundary like a method call. Only thing I can suggest for that is yet-more-attributes, e.g. one that would denote that a method can accept a partially-initialized object and will result in that object being initialized. But I don't think that the compiler will ever be able to solve all of these cases and we're going to have to adopt new patterns going forward.

This seems to also imply that the compiler would never warn for non-nullable properties which aren't initialized in a constructor - which seems like a pretty vital part of the NRT feature.

I'm not implying that would apply universally. Perhaps only for POCOs that have parameterless constructors, or POCOs that are specifically annotated with an attribute to indicate that they are late-initialized.

There's no actual problem with those libraries - the question is how ordinary users are supposed to write the types which EF or JSON.NET populate.

Yes there is: those libraries haven't been updated to support NRTs. If they were, and could enforce the validation of the types as they were being created, there'd be no need for this proposal at all.

Those libraries are the better place for this as they can manage the policy. For example, how do you propose this feature work with EF POCO properties that are optionally update on create? They'd be nullable on initialization and allowed to be nullable when passed to EF. If EF tries to read that property it would throw that exception. But those properties would always be non-nullable when read from EF.

But what would that mean, unless you're proposing to drop those warnings altogether somehow?

It would mean that the warnings would move from the constructor to the code that is constructing/initializing that type.

Joe4evr commented 5 years ago

it's very common for instantiation and population to happen in different methods

Do you have anything to back this claim, because I sure have never seen that.

svick commented 5 years ago

@HaloFour

But the motivation seems to be around the NRT feature, and one of the design goals of that feature was to not affect the behavior of the code at all. So having a related feature which would change the behavior seems inappropriate.

Wouldn't the same objection apply to Simplified parameter null validation code (https://github.com/dotnet/csharplang/issues/2145), which is a championed proposal?

HaloFour commented 5 years ago

@svick

That seems orthogonal to NRTs, and the team seems to stress that it doesn't affect how the compiler interprets the type, only that value. Theoretically a parameter declared object? foo! would be perfectly legal. But sure, you could say that they're related at least in spirit.

roji commented 5 years ago

Only thing I can suggest for that is yet-more-attributes, e.g. one that would denote that a method can accept a partially-initialized object and will result in that object being initialized.

This is quite close to the discussion in #2328. Like I wrote there, I really can't see how attributes can help here in the general case.

To me, late initialization is a property of a property, not of a type - some properties may be late-initialized on a type, while others not. So you'd have to express which properties are initialized in a given type by a given method which returns that type. That's starting to get a bit much, I think - but maybe I'm wrong.

I'm not implying that would apply universally. Perhaps only for POCOs that have parameterless constructors, or POCOs that are specifically annotated with an attribute to indicate that they are late-initialized.

So once again - in my mind this isn't all all-or-nothing setting on a type, but rather something that may need to be specified on a member-by-member basis (and it definitely can't be assumed just because a type has a parameterless constructor).

Yes there is: those libraries haven't been updated to support NRTs. If they were, and could enforce the validation of the types as they were being created, there'd be no need for this proposal at all.

So you're writing with the assumption of some solution being already present, i.e. with the warnings having moved from the definition site to the construction site (as you wrote below).

But what would that mean, unless you're proposing to drop those warnings altogether somehow?

It would mean that the warnings would move from the constructor to the code that is constructing/initializing that type.

Fair enough. I think that would depend on some cross-method boundary solution, which may make this unfeasable, but I may be wrong.

roji commented 5 years ago

@Joe4evr,

it's very common for instantiation and population to happen in different methods

Do you have anything to back this claim, because I sure have never seen that.

I took a quick look at JSON.NET and found the following: https://github.com/JamesNK/Newtonsoft.Json/blob/b34a3ccca7ff6535c0c9e7981479ee1abf169a94/Src/Newtonsoft.Json/Serialization/JsonSerializerInternalReader.cs#L476

I don't know the codebase at all so may be totally mistaken, but there seems to be a CreateNewObject() followed by a PopulateObject(). I think a solution is needed that wouldn't force merging these two together.

I've also written various deserialization/materialization routines in my life, and this kind of separation seemed to be frequent: instantiation and population are typically different operations with different concerns.

HaloFour commented 5 years ago

@roji

To me, late initialization is a property of a property, not of a type - some properties may be late-initialized on a type, while others not.

I'm not saying it's specific to a type. It can be specific to given properties. But whether or not an instance is initialized or not depends on whether the compiler determines that the required properties have been initialized for that instance.

So you're writing with the assumption of some solution being already present, i.e. with the warnings having moved from the definition site to the construction site (as you wrote below).

I'm writing this under the assumption that many people have reported that NRTs with POCOs is useless and thus the team will work on a solution to address that.

I took a quick look at JSON.NET

In my opinion what JSON.NET does under the hood is pretty irrelevant. As a serialization library of generic types NRTs are largely not going to apply, at least to how those internals are written.

roji commented 5 years ago

I'm not saying it's specific to a type. It can be specific to given properties. But whether or not an instance is initialized or not depends on whether the compiler determines that the required properties have been initialized for that instance.

The point was that trying to attack the problem of the method boundary via attributes seems like an extremely heavy approach for this, assuming you want the method to declare, for each property of its returned type, whether it has been initialized or not.

I'm writing this under the assumption that many people have reported that NRTs with POCOs is useless and thus the team will work on a solution to address that.

Of course, and that solution is exactly what we're discussing here.

To attempt a quick summary:

We agree there's currently an issue around NRTs and late initialization (of which POCO deserialization is an example).
This problem can be attacked via a runtime approach (this is what this proposal is about), where a runtime check ensures that null is never returned and the uninitialization warning is suppressed.
Or it can be attacked via a compile-time approach, which attempts to figure out at the construction site where everything has been properly initialized.
The compile-time option can be implemented within a single method, or across method boundaries, probably requiring a pretty elaborate attribute system which IMHO would be too much.
In theory, the runtime solution can be implemented quickly as a first step, with compile time checks added at a future point over the same attribute.

In my opinion what JSON.NET does under the hood is pretty irrelevant. As a serialization library of generic types NRTs are largely not going to apply, at least to how those internals are written.

Why is that? Isn't it a goal to allow libraries such as JSON.NET (and similar) to be written with NRTs?

Regardless, we've been concentrating on POCOs and materialization, but late initialization is something that occurs in many other scenarios.

svick commented 5 years ago

In my opinion what JSON.NET does under the hood is pretty irrelevant. As a serialization library of generic types NRTs are largely not going to apply, at least to how those internals are written.

Why is that? Isn't it a goal to allow libraries such as JSON.NET (and similar) to be written with NRTs?

The type of the targetObject variable is object, so the compiler won't be able to perform any kind of analysis regarding the initialization of its properties, no matter what.

roji commented 5 years ago

The type of the targetObject variable is object, so the compiler won't be able to perform any kind of analysis regarding the initialization of its properties, no matter what.

Fair enough, although doesn't that show a problematic aspect of moving the uninitialized warnings from the definition to the construction? Any instance that gets cast to object loses any tracked information about which properties have been initialized, etc.

But more generally, I was trying to say that late initialization goes beyond deserializers.

Basically it would be good to hear something from the language team about the directions being considered here (or at least that the issue is under consideration). Any approach that removes the uninitialized warnings from the definition removes the immediate pain point, but opens up other problems.

YairHalberstadt commented 5 years ago

@roji

Any instance that gets cast to object loses any tracked information about which properties have been initialized, etc.

When constructing an instance of a type you always know the type (excluding via reflection). Usages of that instance (including casting it) before it has been initialised would produce a warning.

roji commented 5 years ago

It seems like we're going around in circles.

If I'm the only person who thinks that initialization/population can occur across several methods (and even types), then fine - I'll drop it. In that case, a compile-time approach within a single method should be sufficient, while eliminating warnings from the definition.

But if that's the case, let's be super clear about it, and there's no need to be discussing any attribute-based approach to the problem - because the problem is by definition limited to the scope of a single method. This ambiguity in the discussion has been part of the confusion from the start IMHO.

YairHalberstadt commented 5 years ago

Personally, I think the 99% case outside of reflection based creation and initialisation is to initialise something in a single method. An attribute based DSL might be useful to help with some of the remaining 1%, but it's not strictly necessary.

Of course your experience in entity Framework is likely to be very different.

ajcvickers commented 5 years ago

To clarify again the case we want considered by the compiler:

private Foo? _foo;

public Foo Foo
{
    get => _foo ?? throw new InvalidOperationException($"The property has not been initialized");
    set => _foo = value;
}

The compiler does not and cannot know when set is going to be called or from where.

spydacarnage commented 5 years ago

I thought this was about non-nullable properties?

As it stands at the moment, any POCO I write that, for example, gets instantiated by Dapper from a database, either lists a property as:

public string? PropName { get; set; }

Or:

public string PropName { get; set; } = string. Empty;

I'd like to have null options, because the database has them, but I'd rather not have to pre-initialize every non-null, non-value-type property...

On Mon, 22 Apr 2019, 22:52 Arthur Vickers, notifications@github.com wrote:

To clarify again the case we want considered by the compiler:

private Foo? _foo; public Foo Foo { get => _foo ?? throw new InvalidOperationException($"The property has not been initialized"); set => _foo = value; }

The compiler does not and cannot know when set is going to be called or from where.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/dotnet/csharplang/issues/2452#issuecomment-485565302, or mute the thread https://github.com/notifications/unsubscribe-auth/ADIEDQOUV55VTLLWNWFTEGLPRYXRBANCNFSM4HGXYFVA .

HaloFour commented 5 years ago

@roji

If I'm the only person who thinks that initialization/population can occur across several methods

I can't say that I've ever seen initialization spread all over the place like that, at least not in code constructing concrete types. Maybe that's common for serialization libraries where a feature like this doesn't seem particularly useful.

This problem can be attacked via a runtime approach

I don't like a runtime approach to this, not the least of which one espoused by the compiler. There is too much policy here and too little control as to what is actually emitted. If source generators were a thing yet this would very clearly fall under their domain, especially since the suggestion here is to use attributes. Furthermore I can see this causing issues with serializers and ORMs that would otherwise handle a null value just fine but now have no choice but to handle a completely unexpected exception. This also seems like a stopgap for that (hopefully) brief period of pain after NRTs ship but before major libraries like EF, JSON.NET, etc. explicitly support them.

Or it can be attacked via a compile-time approach, which attempts to figure out at the construction site where everything has been properly initialized.

This is what is currently being designed for NRTs. I expect that the POCO situation will be resolved in some way for the majority of use cases. That very well may not account for initialization spread out all over the place. It's tough to say what the non-POCO late initialization pass might look like, if it's considered at all. The problem with NRTs will be the same as the problem with this approach: the compiler has no idea which of the properties may have been initialized by any given constructor.

In theory, the runtime solution can be implemented quickly as a first step, with compile time checks added at a future point over the same attribute.

This is quite backwards to the approach that the team is taking with NRTs.

roji commented 5 years ago

Furthermore I can see this causing issues with serializers and ORMs that would otherwise handle a null value just fine but now have no choice but to handle a completely unexpected exception.

I don't really understand this... Why serializers and ORMs (or anyone) be reading a property which hasn't yet been initialized? The whole point of NRTs is to make sure that a null isn't observed on non-nullable members, so how is it better for them to return null?

This also seems like a stopgap for that (hopefully) brief period of pain after NRTs ship but before major libraries like EF, JSON.NET, etc. explicitly support them.

Once again, as I already wrote, late initialization is something more general than just deserialization of POCOs. An object may contain some properties that need to be populated from some other source, without this being doable from their constructor (e.g. because it requires async I/O) - but once properly initialized, will never contain null. Another possible example is two objects which need to contain references to one another.

My intuition is that method-scoped compile-time analysis is not going to be enough in the general case, but I may be wrong.

In theory, the runtime solution can be implemented quickly as a first step, with compile time checks added at a future point over the same attribute.

This is quite backwards to the approach that the team is taking with NRTs.

It's definitely not ideal, just a proposal attempting to be pragmatic. It seems like we're no longer in the early days of C# 8, and for now this problem hasn't been addressed - and it would be better to have a runtime solution than nothing at all.

In any case, it seems like what's needed next in this discussion is some input from the language folks, which have undoubtedly discussed this internally.

HaloFour commented 5 years ago

@roji

Why serializers and ORMs (or anyone) be reading a property which hasn't yet been initialized?

Because not all operations are symmetrical. Properties which are initialized or updated via the ORM itself do not require initialization, or may allow optional initialization with a database default. Those properties would never be null on read, and reads would constitute the vast majority of operations, so it makes sense to mark the properties as not null. Maybe those aren't situations where you envision this feature being useful, but I can see another developer falling into the pit of failure of thinking that it should apply here.

My intuition is that method-scoped compile-time analysis is not going to be enough in the general case, but I may be wrong.

I disagree. I think that scoping such initialization within a method would cover the majority of use cases of late-initialized objects. I think that the spreading of such initialization across multiple methods would be an edge case and one that I seriously doubt that NRTs will be able to solve to a degree that you'd likely find satisfactory.

It seems like we're no longer in the early days of C# 8, and for now this problem hasn't been addressed - and it would be better to have a runtime solution than nothing at all.

C# 8.0 hasn't shipped yet and the team is still working on NRTs based on feedback. I fully expect that the team will address late-initialization before C# 8.0 goes live.

You have a runtime solution today. It's just more verbose than you'd prefer. You could already make it less verbose through a helper method. Either way you're not blocked.

But sure, I'd also like to hear the language team triage this.

roji commented 5 years ago

Why serializers and ORMs (or anyone) be reading a property which hasn't yet been initialized?

Because not all operations are symmetrical. Properties which are initialized or updated via the ORM itself do not require initialization, or may allow optional initialization with a database default. Those properties would never be null on read, and reads would constitute the vast majority of operations, so it makes sense to mark the properties as not null. Maybe those aren't situations where you envision this feature being useful, but I can see another developer falling into the pit of failure of thinking that it should apply here.

I think we have a disconnect here... If a property may contain null as an actual value (i.e. in the database), then obviously it should be nullable in C# and this whole discussion doesn't apply .

On the other hand, if null isn't a valid data value, but needs to be temporarily allowed because of late initialization, then it should be a programming error for anyone to read it before it has been initialized, and therefore an InvalidOperationException makes sense, no?

You have a runtime solution today. It's just more verbose than you'd prefer. [...] Either way you're not blocked.

That's true. But seeing what the repetition of the verbose pattern does to a codebase makes me fear that users (e.g. of EF Core) would prefer to abandon the NRT feature rather than going down that road.

You could already make it less verbose through a helper method.

How could a helper method help here? We're talking about the definition of the property as backed by a nullable field, no?

One more point on this... The current alternative proposals seem to concentrate on removing the uninitialized warnings from the definition site, and instead relying on compile-time analysis at the initialization sites. While that may be good for late initialization, it seems important to keep the uninitialized warnings for properties which aren't late-initialized. If that is so, then users still need a means of distinguishing late-initialized properties from regular properties, i.e. a [LateInitialization] attribute. To me this seems required, regardless of whether that attribute has a compile-time or runtime meaning.

HaloFour commented 5 years ago

@roji

On the other hand, if null isn't a valid data value, but needs to be temporarily allowed because of late initialization, then it should be a programming error for anyone to read it before it has been initialized, and therefore an InvalidOperationException makes sense, no?

You can't differentiate between a programmer and some library doing reflection-based interrogation of that instance, such as a serializer or an ORM. null may be valid on creation, but never when reading the value.

That's true. But seeing what the repetition of the verbose pattern does to a codebase makes me fear that users (e.g. of EF Core) would prefer to abandon the NRT feature rather than going down that road.

It's the same pattern that has to be employed by anyone using properties for anything other than auto-properties. There are other proposals that are trying to look into making them less verbose for simple cases which may apply here and would be significantly more useful than a one off. Source generators would make all proposals like this accomplishable via plugins and would require zero language changes. There are probably a half-dozen other similar proposals involving sticking an attribute on an auto-property to have the compiler automatically emit some kind of code within the accessors all of which have been put in the Source Generator bucket, for better or worse.

As for users of EF Core, or any other ORM/deserializer, that responsibility should be built into the library. It should detect that it can't initialize a type that has required properties and throw immediately rather than leak out some half-initialized type that won't throw until some indeterminate time later.

How could a helper method help here? We're talking about the definition of the property as backed by a nullable field, no?

You could use a helper method to return or throw from the property getter. It doesn't eliminate the property boilerplate, but it makes the getter more concise.

While that may be good for late initialization, it seems important to keep the uninitialized warnings for properties which aren't late-initialized.

That is the point of those proposals. If you don't late-initialize the required properties during, the compiler will emit a warning. I do agree that either way an attribute is likely required here so that the compiler can tell which types are actually late-initialized vs. ones that assign their required properties in a parameterless constructor.

Either way, given NRTs are shipping with C# 8.0 and they're actively working on that now I expect that a resolution for late-initialized types will come as a part of that effort. I personally wouldn't expect new work to be considered/designed/implemented within the C# 8.0 timeframe.

roji commented 5 years ago

I admit this conversation has made see that the value of a runtime check here would be more limited than I originally thought. As there seems to be general resistance to the idea, I'll close this.

Here's a quick summary with the hope it helps.

Late-initialized NRT properties currently present a problem, as the compiler emits uninitialized warnings (CS8618). Users are typically forced to force-initialize them to null (with the null-forgiving operator, as suggested here.
This has the disadvantage that if a property isn't properly initialized, it will leak a null which will trigger an exception at some indeterminate place in the program.
Optional mitigation: instead of initializing the property to null, have the NRT property wrap a nullable field, and having the getter throw InvalidOperationException on null. This produces a clear, early exception. The disadvantage is the verbosity of this pattern.
A potential improvement in the compiler would be to perform compile-time initialization checks. This would mean removing the uninitialized warnings from the definition site, and instead having the compiler check at the instantiation site that all NRT properties are properly initialized.
- An attribute would probably be needed to allow the user to indicate which properties are late-initialized, and therefore should not produce CS8618 (e.g. [LateInitialization]), since normal NRT properties should continue to produce them. It may be useful to be able to specify the attribute on the entire type.
- Such compile-time checking would probably be limited to a single-method scope, not allowing an object that hasn't been fully initialized to escape. Compile-time initialization tracking across method boundaries (e.g. via attributes) seems like it would introduce a lot of new attribute logic everywhere and would may be too much. Hopefully method-scoped checking would be enough for most cases.
If compile-time checking isn't introduced for some reason, it may be useful to consider syntactic sugar to generate the nullable-backed NRT property described above, to make this verbose pattern more accessible to regular users.

dotnet / csharplang

Syntactic sugar for late-initialization non-nullable auto-properties #2452