dotnet / csharplang

The official repo for the design of the C# programming language
11.48k stars 1.02k forks source link

Indicate that a non-nullable Field must be initialized where the type is constructed #2328

Closed YairHalberstadt closed 4 years ago

YairHalberstadt commented 5 years ago

Problem

Consider the following three outstanding issues with NRTs

1. Object Initializers

Object initializers are greatly loved by many C# programmers. However they don't interact very well with NRTs.

Consider this for example:

public class C
{
    public string Str {get; set;}
}
...
var c = new C { Str = "Hello World" };

If Str is to be non-nullable, then when enabling nullability we are forced to completely change all this code:

public class C
{
    public C(string str) => Str = str;
    public string Str {get;}
}
...
var c = new C("Hello World");

Making object initializers usable only be nullable properties and fields.

2. DTOs

Many DTOs are initialized by ORM/Serialization frameworks. As a result they often contain only property getters and setters, without any constructors.

This of course causes warnings when NRTs are enabled, as it doesn't know the properties are always initialized by the ORM/Serialization framework.

3. Structs

Structs always have a default constructor, which cannot be overridden.

As a result whenever a struct containing a field of a non-nullable reference type is constructed using the default constructor, the field will initially be null. This leaves a big hole in the nullable-type-system:

public struct S
{
    public string Str { get; set; } //no warning
}
...
var s = new S(); //no warning
var l = s.Str.Length; // kabam! NullReferenceException! - but no warning

Solution

Based on a suggestion by @Joe4evr https://github.com/dotnet/csharplang/issues/36#issuecomment-471470105

The real issue here is that present mechanisms only allow us to guarantee a field is initialized inside the constructor, but currently many types rely on their invariants being upheld by whoever constructs the type.

It should be possible to mark that a field/auto-property must be initialized at the point it is constructed, before it is used. This could be done via an attribute for example:

public class C
{
    [MustInitialise]
    public string Str {get; set;}
}
...
public struct S
{
    [MustInitialise]
    public string Str { get; set; }
}

Then when an instance of the type is constructed, it should be a warning to use the type before all such fields are initialised:

public class C
{
    [MustInitialise]
    public string Str {get; set;}
}

...

var c = new C { Str = "Hello World" };
var l = c.Str.Length; //no warning
c = new C();
var x = c.Str; //warning
c.Str = "Hello"
var l = c.Str.Length;  //no warning

...

public struct S
{
    [MustInitialise]
    public string Str { get; set; }
}

...

var s = new S { Str = "Hello World" };
var l = s.Str.Length; //no warning
s = new S();
var x = s.Str; //warning
s.Str = "Hello"
var l = s.Str.Length;  //no warning

All structs should have this attribute on all their fields implicitly.

Further issues

Consider this struct:

public struct S
{
    public S(string str) => Str =str;

    public string Str { get; set; }
}

Then it would be a warning not to initialize S.Str when the the new S(string) constructor was called, even though S.Str was initialized by the constructor!

I think the solution to this would be for the compiler to detect which constructors initialize a field, and implicitly add parameters to the MustInitialize attribute which indicate which constructors did not initialize the struct.

ie. the compiler should generate code equivalent to the following:

public struct S
{
    public S(string str) => Str =str;

    [MustInitialize("DefaultConstructor")]
    public string Str { get; set; }
}

then, when the default constructor is called, the compiler knows it must make sure S.Str is initialized.

Kukkimonsuta commented 5 years ago

image

This seem to be already the case for structs, however follwing is an issue right now:

static void Foo(S s)
{
    var l = s.Str.Length; // kabam! NullReferenceException! - but no warning
}

static void Main(string[] args)
{
    var s = new S();
    Foo(s);
}
GeirGrusom commented 5 years ago

In Kotlin this feature is called lateinit and is a modifier applied to properties.

class C {
  lateinit val str : String;
}

Basically it says that the property is not properly assigned by the constructor but will be assigned at a later point.

DarthVella commented 5 years ago

Great idea, but there's some discussion points that should be addressed quickly.

What of internal initialization methods?

class C
{
    [MustInitialize("DefaultConstructor")]
    public string Str { get; private set; }

    public C() { }

    [Initializes("Str")]
    public void Init(CHelper helper)
    {
        Str = helper.GetHello();
    }
}

Similarly, what of external methods that might allow or disallow uninitialized values?

class C
{
    [MustInitialize("DefaultConstructor")]
    public string Str { get; set; }
}

static void Main()
{
    C c = new C();

    UseString(c); //warning?
    SetString(c); //should not be a warning?
    UseString(c); //no warning?
}

[Initializes("c.Str")]
static void SetString(C c)
{
    c.Str = "Hello!";
}

[MustBeInitialized("c.Str")]
static void UseString(C c)
{
    Console.WriteLine("String length is: " + C.Str.Length);
}

For the record, I would assume that this proposal strictly deals with immediate initialization by the caller, and that these considerations are not within the intended scope.

YairHalberstadt commented 5 years ago

@DarthVella In the long term some sort of attribute language might be necessary, but in the short term I believe this proposal represents a minimum viable product.

DarthVella commented 5 years ago

I should have mentioned that the attributes were only to show the intent, not to suggest that that level of user input was necessary.

Joe4evr commented 5 years ago

I'm generally supportive of the case to have finer granularity over what fields should get initialized by the caller, but I think there's a lot to be said to also have an attribute that can simply cover the whole type in one go. It could be the same, if an appropriate name is available, but my point is that code looks tidier with less attributes all over the place.

EDIT: I just realized there's one more caveat that would need to be filled for that to work smoothly: Externally generated values. A typical EF entity would have the ID generated when added to the database, and it'd be quite annoying to see warnings for those. :thinking:

Joe4evr commented 5 years ago

Brainstorming another potential name for this attribute: CallerInitialize?

ZeroUltimax commented 5 years ago

All examples so far were shown using a String field, but I'm certain this could very well apply to value types as well. It seems to me that, since value types have an automatic default constructor, this would ensure proper initialization of nested members.

Ex.:

public struct S
{
    public string A {get; set;}
}

public class C
{
    [MustInitialise]
    public S Data {get; set;}

    [MustInitialise]
    public int MoreData {get; set;}
}

Does this seem like a proper usage? It seems to fit in with the spirit of annotating what must be initialized by the caller.

YairHalberstadt commented 5 years ago

@ZeroUltimax Related: #146

ZeroUltimax commented 5 years ago

146 is about the usage of default() for all instances of the structure. Here, I'm suggesting that in some cases, you might want to enforce initialization a member, which happens to be a value type.

For example, int MoreData could be an application specific code that shouldn't be left to default. In that case, #146 doesn't apply.

YairHalberstadt commented 5 years ago

@ZeroUltimax I understand. I was just linking another issue which involved treating the default state of a struct as null.

alexk8 commented 5 years ago

I think I'm going to close my #2411 (as a duplicate). Therefore I'm copying here the general idea from there: [MustInitialize] would be quite messy (used everywhere), this behaviour should be default by design IMHO. My suggestion:

class A{
  public string str; //do not warn "not initialized"
}

void Main(){
  A a1 = new A(); //do warn: "str is not initialized"
  A a2 = new A{str="str"}; //ok
}

that becomes a bit tricky with constructors, I suggest this

class A{
  public string str1;
  public string str2;
  public A(){  
    this.str1="123";
  } //warn: "str2 is also must be initialized"
}

Just a suggestion

roji commented 5 years ago

The fact that this proposal tries to address several different problems (DTOs, object initializers, struct fields) may make things more complicated than they need to be...

It should be possible to mark that a field/auto-property must be initialized at the point it is constructed, before it is used.

Tracking whether a field has been initialized or not seems like an extremely heavy ask... While it's relatively easy to imagine this happening within a single method, once an object starts crossing method (and assembly!) boundaries it's hard to imagine how this would be done. If I receive some instance from a method somewhere, how is the compiler supposed to know which fields on it have been initialized, and which haven't?

In other words, while this could solve the object initializer problem (because initialization occurs within a single function), it's hard to imagine how this could address the general problem of deserialized DTOs, or knowing that some field on a struct is uninitialized.

I've opened #2452 as an alternative proposal to address the same problems, but hopefully in a simpler way. In a nutshell, this would be a syntactic sugar attribute on auto-properties which produces the following:

private Foo? _foo;

public Foo Foo
{
    get => _foo ?? throw new InvalidOperationException($"The property has not been initialized");
    set => _foo = value;
}

This makes the above 3 scenarios work without any additional compiler tracking, but of course doesn't catch cases where a field is accessed prematurely at compile-time. This seems acceptable to me, considering that prematurely accessing a late-initialized field goes against the API contract.

YairHalberstadt commented 5 years ago

@roji

Tracking whether a field has been initialized or not seems like an extremely heavy ask.

Definite assignment already does this, and the legwork is literally already in place. I'm not going to say extending definite assignment analysis to definite initialistion analysis is trivial, but it's not a huge issue.

once an object starts crossing method (and assembly!) boundaries it's hard to imagine how this would be done. If I receive some instance from a method somewhere, how is the compiler supposed to know which fields on it have been initialized, and which haven't?

In the long run this could be done by attributes if necessary. However I don't think that's necessarily the point here. The idea is to give you more safety than you currently have. It allows you to safely say these fields won't be initialised by the producer, and must be initialised by the consumer. In 95% of cases that will be done immediately upon creation of the object. The other 5%? That's what the ! operator is for.

it's hard to imagine how this could address the general problem of deserialized DTOs

The idea is that certain objects must always be initialised by the person who creates the objects. However the serialisation framework used may not work with constructors, and so object Initialisers must be used. The serialisation framework itself works through reflection, and so these warnings are irrelevant to them.

I've opened #2452 as an alternative proposal to address the same problems

I don't think it does. That's about providing runtime safety. The NRT feature (and this proposal) is about providing compile time safety.

roji commented 5 years ago

Tracking whether a field has been initialized or not seems like an extremely heavy ask.

Definite assignment already does this, and the legwork is literally already in place. I'm not going to say extending definite assignment analysis to definite initialistion analysis is trivial, but it's not a huge issue.

You mean within the same method, right? If so then you're right, it's essentially already being done.

once an object starts crossing method (and assembly!) boundaries it's hard to imagine how this would be done. If I receive some instance from a method somewhere, how is the compiler supposed to know which fields on it have been initialized, and which haven't?

In the long run this could be done by attributes if necessary.

How? We don't know at compile-time which fields are going to be initialized - that may be conditional on some runtime behavior...

However I don't think that's necessarily the point here. The idea is to give you more safety than you currently have. It allows you to safely say these fields won't be initialised by the producer, and must be initialised by the consumer. In 95% of cases that will be done immediately upon creation of the object. The other 5%? That's what the ! operator is for.

I do agree that something is needed that's better than what we currently have, and I think it makes sense to be able to say "this will be initialized later" (as I also proposed in #2452). However, trying to enforce or detect that by the compiler is a different business...

I don't think see how the ! operator could be relevant here - the way I understand it, we're discussing only non-nullable fields here, and not nullable ones. We're just trying to manage the initialization of those non-nullable fields better.

However, looking at it another way, it seems like it's possible to first introduce an attribute as in #2452, without any compile-time tracking, and at some point in the future to add the tracking you propose as an additional improvement (although again, I don't see this working outside of the scope of a single method).

it's hard to imagine how this could address the general problem of deserialized DTOs

The idea is that certain objects must always be initialised by the person who creates the objects. However the serialisation framework used may not work with constructors, and so object Initialisers must be used. The serialisation framework itself works through reflection, and so these warnings are irrelevant to them.

I understand the idea, but in reality it seems difficult to define "the person who creates the objects". Typical deserialization/materialization logic instantiates the object in one method, passes it to another one for populating the properties... I believe that initialization tracking that stops at the method boundary is likely to be of little value in many cases.

More importantly, I'm not sure that there's that much value in tracking this in compile-time. The end user is supposed to already receive instances which have been fully initialized - so they don't really need our help - and the people writing the deserialization code are less likely to need our help tracking this.

I've opened #2452 as an alternative proposal to address the same problems

I don't think it does. That's about providing runtime safety. The NRT feature (and this proposal) is about providing compile time safety.

I understand what you're saying, and #2452 wouldn't provide an ideal situation. But it's worth keeping in mind that NRT is supposed to provide compile-time safety around nullability, not around initialization - that's a very important distinction.

I'm also trying to be pragmatic here. The chance of a full-fledged compile-time initialization-tracking being introduced in the near future is, well, quite low (and again, I don't believe it could be meaningful as initialization can depend on runtime behavior). #2452 could be implemented pretty easily, and would it make it just as easy to write DTOs as your proposal.

I'm concerned less with providing absolute compile time safety, but rather with the fact that the NRT uninitialized warnings make it very difficult/verbose to write DTOs and similar objects in the NRT world, to the point where people are likely to just turn nullability off.

roji commented 5 years ago

One more note about your proposal to use attributes to express the initialization status of fields on returned types... The NullableAttribute introduced for NRTs exists on the property of a type. The initialization status, however, would have to be defined by each and every method on the types they return, which is something very different (and much more verbose). So Foo() could return type X with property P initialized, while Bar() could return X with P uninitialized...

YairHalberstadt commented 5 years ago

The initialization status, however, would have to be defined by each and every method on the types they return, which is something very different (and much more verbose)

I don't imagine anything as complicated as that.

As I stated above, in 95% of cases we initialise an object in the method in which it is declared. In most of the other 5% we may pass it to one or two helpers two initialise it. We can mark the specific helpers with an attribute indicating they initialise the struct manually.

roji commented 5 years ago

As I stated above, in 95% of cases we initialise an object in the method in which it is declared.

That may be true of object initializers (in fact, in that case it's even 100%), but I strongly disagree that 95% of all relevant initialization cases happen within a single method. Once again, DTOs and various other types are frequently instantiated here, populated there, etc. etc.

In general, the value of providing compile-time warnings about accessing uninitialized fields only within the same method simply seems pretty low to me.

In most of the other 5% we may pass it to one or two helpers two initialise it. We can mark the specific helpers with an attribute indicating they initialise the struct manually.

You seem to have a very specific/restricted scenario in mind... For a feature that goes so deep into the language/compiler, it feels like a more complete/comprehensive proposal is appropriate.

Finally, as I wrote above, there's nothing wrong with introducing the attribute as a syntactic sugar for runtime checks and warning suppression (#2452) - this solves the immediate difficulty of writing DTOs and similar objects in the NRE world. We can then investigate other compile-time options for that same attribute in the future...

chucker commented 5 years ago

Brainstorming another potential name for this attribute: CallerInitialize?

I like the caller part because it makes it clearer who is supposed to initialize it, but to bikeshed this further, I'd go with CallerMustInitialize. (Or possibly CallerInitialized.)

However, this is probably not that important as long as the associated warning message is sufficiently clear, e.g.:

Warning | CS8618 | Non-nullable property 'Bar' is uninitialized. It must be initialized by the caller.

(This probably shouldn't get the same warning code CS8618, as the expectation of the developer is different.)

jamesfoster commented 5 years ago

To clarify an earlier comment. The lateinit keyword from Kotlin effectively disables the warning you get when you fail to initialise a non-nullable field during construction. You're effectively disabling the compile-time checking at that point. However, when dereferencing it you don't get a NRE, you get an UninitializedPropertyAccessException which clearly identifies the field being accessed and that it hasn't been initialised.

I found myself using this a lot during unit testing and in rare occasions where a DI container could only initialise my types through property injection. The case for using object initialisers over calling the constructor is weak, however, I can see its usefulness in not breaking existing code.

lateinit is like saying "I promise to initialise this before I access it".

class C
{
  private lateinit string S { get; set; }
}

effectively becomes

class C
{
#pragma warning disable nullable
  public string S { get; set; }
#pragma warning enable nullable
}

new C().S.Length; // Throw a richer exception here than just NRE.
YairHalberstadt commented 5 years ago

@jamesfoster

That sounds more like https://github.com/dotnet/csharplang/issues/2452

charlesroddie commented 5 years ago

Please don't do this! Initialization results in many smelly C# libraries and NRT is a huge opportunity to clean this up. It would be a missed opportunity if these workarounds are introduced.

@YairHalberstadt

public class C
{
    public string Str {get; set;}
}
...
var c = new C { Str = "Hello World" };

... when enabling nullability we are forced to completely change all this code:

public class C
{
    public C(string str) => Str = str;
    public string Str {get;}
}
...
var c = new C("Hello World");

So much better! The first example is unsafe and it's undiscoverable how to construct a valid object. The second is done right. As a result NRTs will improve APIs across the .Net ecosystem.

Many DTOs are initialized by ORM/Serialization frameworks. As a result they often contain only property getters and setters, without any constructors.

This of course causes warnings when NRTs are enabled, as it doesn't know the properties are always initialized by the ORM/Serialization framework.

These ORMs/Serializers are badly structured. But that doesn't matter. What matters is they require a class with a default constructor which doesn't construct a valid object, and after doing various mutations they may guarantee that they have produced a valid object. This is OK. The ORM/Serializers are the ones who face the warnings. The warnings don't seep into user code unless the user is using the default constructor, which should generate an NRT warning (see below).

public struct S
{
    public string Str { get; set; } //no warning
}
...
var s = new S(); //no warning
var l = s.Str.Length; // kabam! NullReferenceException! - but no warning

This is indeed a problem but the solution is to warn on using the default constructor. This would then encourage the author of the struct to give a constructor which creates a valid object.

chucker commented 5 years ago

As a result NRTs will improve APIs across the .Net ecosystem.

NRTs will improve either way. This proposal by Yair means you'll receive a warning at compile time.

Suppose the library has:

public class C
{
    [MustInitialize]
    public string Str1 {get; set;}

    [MustInitialize]
    public string Str2 {get; set;}
}

And your code does: var c = new C { Str1 = "foo" };, then you get the warning that you failed to initialize Str2. It'll still compile; if you don't want that, you could always include that particular warning in treat warnings as errors.

YairHalberstadt commented 5 years ago

@charlesroddie

The LDC have been very clear all this time that they do not intend to boil the ocean. How existing APIs work is fixed, and there is no intention to change them, or to warn on existing legitimate usage of these APIs.

Therefore it's impossible for the APIs to start warning when you use a default constructor.

At least with the features suggested here, it will be possible to warn if the default constructor is used incorrectly.

Richiban commented 5 years ago

I'd be a bit concerned that this feature introduces too many ways to do the same thing... C# already has the concept of "Must Initialise"; it's a constructor.

What's the difference, really, between these two classes?

public class C
{
    [MustInitialize]
    public string Str1 { get; set; }

    [MustInitialize]
    public string Str2 { get; set; }
}

and:

public class D
{
    public D(string str1, string str2)
    {
        Str1 = str1;
        Str2 = str2;
    }

    public string Str1 { get; set; }

    public string Str2 { get; set; }
}
YairHalberstadt commented 5 years ago

@Richiban:

  1. This is useful for structs (may be mitigated by no-arg constructor)

  2. This is useful for a builder style pattern. You can assign all the fields bit by bit, so long as you don't use the object till they are all assigned.

  3. Some serialisation frameworks don't support constructors.

  4. Adding a new property here only would generate a warning. Adding a new field to a constructor is source breaking.

But yes, ideally everything would go through constructors. Unfortunately legacy code doesn't look like that, so this is necessary to allow them to benefit from NRTs.

spydacarnage commented 5 years ago

As an addition to 3, the ASP.NET Core MVC model binder also doesn't support constructors, so this class produces NRT-based warnings:

public class DemoModel
{
    [Required]
    public string Username { get; set; }
}
RikkiGibson commented 5 years ago

With an object initializer pattern there is no concern with needing to preserve the existing signatures of constructors for binary compatibility. I also speculate that object initializers can make it easier to distinguish the "no value was specified" scenario from "null/default was explicitly specified" scenario, which can matter sometimes.

andre-ss6 commented 4 years ago

How will the runtime guarantee that a MustInitialize property has really been set before access?

jnm2 commented 4 years ago

@andre-ss6 There is no runtime guarantee that null is not passed as a non-nullable reference type (or about any part of NRTs), so it would be consistent for there to be no runtime guarantees with MustInitialize.

andre-ss6 commented 4 years ago

Well, then this proposal basically consists of bringing a bit of the old unsafety back? If you can't guarantee that a property will be initialized, then maybe use ? or ??. [MustInitialize] without runtime guarantee is asking for NRE, especially in deserialization/ORM (DTO, entity models and Options) scenarios, which are mentioned in the OP.

jnm2 commented 4 years ago

NRTs do everything through compile-time warnings, not runtime guarantees. [MustInitialize] with compile-time warnings would be just as useful as any NRT annotations currently are, e.g. string? or [AllowNull].

andre-ss6 commented 4 years ago

NRTs do everything through compile-time warnings, not runtime guarantees.

Sure, I'm aware.

[MustInitialize] with compile-time warnings would be just as useful as any NRT annotations currently are, e.g. string?

I disagree though. With string? you're acknowledging that the string may be null, and thus the compiler will warn you for trying to dereference it without a guard. [MustInitialize], on the other hand, will act like !, but implicitly, which is bad.

I completely understand the motivation and proposed solution and I think it's especially important for OP's scenarios 1 and 3. However, I think it's important to consider the tradeoff here.

Maybe MustInitialize should be restricted to properties of nullable types only?

jnm2 commented 4 years ago

[MustInitialize] would move the warning off the field and into the code that is actually responsible for initializing it. That's way more helpful, I think. It's the same place the warning would show up if the constructor was requiring it.

andre-ss6 commented 4 years ago

into the code that is actually responsible for initializing it

That's not always true. In the cases I mentioned (which is OP's scenario 2), I think the responsibility is in the library, not the consumer.

class Foo (FooOptions options) { options.BarMustInitialize.Baz(); /* Who guarantees BarMustInitialize has really been initialized? */ }

It's the same place the warning would show up if the constructor was requiring it.

It's not the same thing. First off, remember that there are runtime guarantees to constructors. You can't simply choose to not honor a constructor and hand out incomplete objects around. Secondly, you're not saying that BarMustInitialize is necessary to be set in order for an instance of this class to exist. You're only saying that it's not null. And through a compiler trick, add to that. And by using MustInitialize, you're defeating its purpose, because the thing may in fact be null. By marking it as nullable, you're putting the responsibility in the consumer (programmer) to check, which is what they should be doing anyway if they want to have correct code.

jnm2 commented 4 years ago

/* Who guarantees BarMustInitialize has really been initialized? */

Whoever created the instance that is being passed to options. That's where the warning belongs.

First off, remember that there are runtime guarantees to constructors. You can't simply choose to not honor a constructor and hand out incomplete objects around.

I do remember. Runtime guarantees aren't necessary in order for NRT checks to be effective. You can simply choose to pass nulls and silence warnings everywhere with NRTs, and MustInitialize should be no different in this regard.

You're only saying that it's not null.

This is the case with everything in NRTs and yet they are super useful. I'm not sure what else I can say to point out all the equivalences with how nullability warnings already work.

andre-ss6 commented 4 years ago

Whoever created the instance that is being passed to options. That's where the warning belongs.

I'm not sure if you're following. I'm talking about deserialization/ORM scenarios. In the previous example, the FooOptions instance has been injected and probably deserialized from a settings file. So where is "that" where the warning belongs? The runtime? /jk

You can simply choose to pass nulls and silence warnings everywhere with NRTs

Mhm, and that's why I say that this proposal is trying to bring a bit of the old unsafety back. The reason why you can pass nulls around even with NRTs is not because somehow we don't think they're important or because it's safe to do so. The reason is strictly due to technical and design limitations. I don't see [yet] any of these limitations here that make it necessary to bring more unsafety back. The road forward should be paved with more safety, not less.

You're only saying that it's not null.

This is the case with everything in NRTs and yet they are super useful. I'm not sure what else I can say to point out all the equivalences with how nullability warnings already work.

Hey, cmon, don't quote out of context. I used that phrase in contrast to how constructors work, which is a equivalence you suggested:

It's the same place the warning would show up if the constructor was requiring it.

It's not the same thing. [...] you're not saying that BarMustInitialize is necessary to be set in order for an instance of this class to exist. You're only saying that it's not null.


I'm fully and very much well aware that NRTs are compile-time only. There's no need to lecture me on that.

HaloFour commented 4 years ago

@andre-ss6

The road forward should be paved with more safety, not less.

Adoption requires being able to adapt to existing patterns. Huge numbers of APIs rely on objects that are initialized via initializers instead of constructors. NRTs works exceptionally poorly for these scenarios, both because the objects themselves warn in their constructor and because there is zero NRT assistance during initialization for the consumer. The current situation is already very unsafe. This proposal seeks to close the existing gap and make it safer to define and consume those APIs. In many cases moving to constructors is simply not an option.

andre-ss6 commented 4 years ago

Huge numbers of APIs rely on objects that are initialized via initializers instead of constructors. NRTs works exceptionally poorly for these scenarios,

Sure, and quoting myself here:

I completely understand the motivation and proposed solution and I think it's especially important for OP's scenarios 1 and 3. However, I think it's important to consider the tradeoff here.

Maybe MustInitialize should be restricted to properties of nullable types only?

jnm2 commented 4 years ago

I'm not sure if you're following. I'm talking about deserialization/ORM scenarios. In the previous example, the FooOptions instance has been injected and probably deserialized from a settings file. So where is "that" where the warning belongs? The runtime? /jk

Yes. The deserializer is responsible. It should throw an exception or use a default value if null is not an acceptable value, just the same as it should when it sets a value-typed property. All a deserializer is doing is automating handwritten code. This would be a way to handle the nullability annotation that would be consistent to what you'd do to avoid the warning in handwritten code.

Hey, cmon, don't quote out of context. I used that phrase in contrast to how constructors work, which is a equivalence you suggested:

I apologize. I'm doing my best. I'll try to explain the equivalence I meant. If you were initializing via a constructor, the warning for initializing using a nullable expression would be at the constructor call site and not a burden on all the users of the field or property. Ideally, MustInitialize would be consistent with the constructor scenario in this specific way: that the burden of checking for null should belong to the code that is doing the initializing, not the users of the field or property. The type which declares the field or property should never see or speak about a null value.

(To be clear, "should never see or speak about a null value" is talking about a normal compile-time check. Having a runtime check as well sounds fine so long as it doesn't delay getting the compile-time fix of moving the warning.)

If leaving the burden of checking for null with each access of the field or property sounds better in a certain scenario than leaving the burden with the code doing the initialization, you'd make the field or property nullable. But that's not great for most of the scenarios I've come across.

andre-ss6 commented 4 years ago

Yes. The deserializer is responsible. It should throw an exception

Okay, I understand now. I was thinking about situations where deserializers would just slap a null there or just leave it uninitialized when the property wasn't present in the settings file. And of course old libraries will do that, but then of course they won't be NRT aware at all anyways, so absolutely nothing is stopping them from slapping null anywhere anyway.

But will libraries do that though (check for MustInitialize)? Will it be reasonable for System.Text.Json, for example, to check for MustInitializeAttribute in every property in order to see if any is missing from the incoming payload? (honest question, haven't thought deeply about the consequences) That's why I said it's important to consider the tradeoff here. OP's scenario 2 is the most complicated one wrt NRT and I think some thought must be given to it.

HaloFour commented 4 years ago

@andre-ss6

Maybe MustInitialize should be restricted to properties of nullable types only?

The gap with serializers already exists as they can already skip initialization of non-nullable properties, or explicitly set them to null, and the language is none-the-wiser. So I don't think such a limitation makes sense.

Hopefully in time deserialization libraries will take NRT attributes into consideration and throw if that property cannot be initialized to a non-null value.

jnm2 commented 4 years ago

@andre-ss6

But will libraries do that though (check for MustInitialize)? Will it be reasonable for System.Text.Json, for example, to check for MustInitializeAttribute in every property in order to see if any is missing from the incoming payload?

I think so. Entity Framework already respects nullability annotations on reference-typed properties to determine whether they are optional or required. It seems like the right direction to take. Entity Framework recommends setting non-nullable properties to null! when you can't require constructor initialization: https://docs.microsoft.com/en-us/ef/core/miscellaneous/nullable-reference-types#non-nullable-properties-and-initialization But if MustInitialize existed, that would be preferable.

Joe4evr commented 4 years ago

Maybe MustInitialize should be restricted to properties of nullable types only?

I disagree, I find myself recently writing objects with strongly typed IDs (that are structs), and I'd like to see a warning that those should also be initialized just to make it more clear when the caller is supposed to set a value.

public class Foo
{
    [CallerInitialize]
    public FooId Id { get; set; }
    [CallerInitialize]
    public string Bar { get; set; }
}
// ....
public void AddNewFoo(string b)
{
    var foo = new Foo { Bar = b };
    _db.Foos.Add(foo); // oops, foo.Id is still all-0's but applying the attribute makes this callsite aware it should've done something
    _db.SaveChanges();
}
andre-ss6 commented 4 years ago

and I'd like to see a warning that those should also be initialized just to make it more clear when the caller is supposed to set a value.

And you would, that is the point of MustInitialize. My suggestion would not change that.


Overall, though I think the reason I was uncomfortable with the proposal is that, philosophically, I think that if you're unsure that something may be initialized by a library, then just be honest with the compiler (and yourself) and either default initialize it or say that it's nullable. As I said in a previous reply, though, I understand that there isn't much point in pursuing this as non NRT aware libraries will do all sort of things anyway. So, unless there is something I'm not aware (maybe performance implications) that will prevent libraries like System.Text.Json from honoring those attributes, at least from me I see no point in continuing this discussion.

pablocar80 commented 4 years ago

I recently migrated a 100K+ code base to nullable and things didn't go too well. I was hoping to start using compiler warnings to detect null dereferences and get a cleaner, more maintainable code. In practice though, I ended up with code that mostly replaces the standard null exceptions behind the scenes with my own custom exceptions, as in #2452. There was little gain at the cost of a more bloated and verbose code.

Everything would be fine if I could simply pass all required variables to the constructor of a class. However, sometimes there are too many members, and calling a constructor with 15 parameters isn't nearly as readable as using the nice object initialization syntax:

// This isn't readable. But does the job of initializing references.
var myclass = new MyClass(param1, param2 ... param15); // bad

// This is readable. But forces me to turn off nullability checks, either with ? or null!
var myclass = new MyClass
{
  MyValue1 = param1,
  MyValue2 = param2,
  ...
};

So if you could extend the nullability functionality to apply to the syntax in the 2nd case, that would be truly amazing.

GeirGrusom commented 4 years ago
var myclass = new MyClass
(
  myValue1: param1,
  myValue2: param2,
  ...
);
sanjuroo commented 4 years ago

I love non-nullable references, but this is currently biggest drawback, this should be absolute top priority.

chucker commented 4 years ago

Another use case for what I believe is the same scenario is Blazor (Razor Component) parameters.

@functions {
    [Parameter]
    public string Name { get; set; }
}

This yields a warning. I can make it string?, but now I have to deal with a property that I don't actually want to ever be null. Instead, what I really want to do is surface the warning for the consumer of the component. E.g.,:

@functions {
    [Parameter, CallerInitialize]
    public string Name { get; set; }
}

So that:

<MyComponent /> <!-- warns: Name isn't set -->
<MyComponent Name="My name" /> <!-- doesn't warn -->

(This may overlap with https://github.com/dotnet/aspnetcore/issues/11815)

pablocar80 commented 4 years ago

This issue seems to be fixed in C# 9 with the init accesor