dotnet / csharplang

The official repo for the design of the C# programming language
11.48k stars 1.02k forks source link

`field` keyword nullability #8360

Closed RikkiGibson closed 1 month ago

RikkiGibson commented 2 months ago

Related proposal: https://github.com/dotnet/csharplang/blob/main/proposals/field-keyword.md PR containing the working version of this text: #8346

Goals

Ensure a reasonable level of null-safety for various usage patterns of the field keyword feature. Avoid requiring the user to apply [field: AllowNull, MaybeNull] or similar attributes to the property in order to express common patterns around lazy initialization, etc.

One of the key scenarios we would like to "just work", if possible, is the little-l lazy property scenario:

public class C
{
    public C() { } // it would be undesirable to warn about 'Prop' being uninitialized here

    string Prop => field ??= GetPropValue();
}

Terms

A property is a field-backed property if it meets any of the following conditions:

  1. It contains any automatically implemented accessors (get;/set;/init;).
  2. It uses the field keyword within any accessor bodies.

Some examples of field-backed properties include:

public string Prop1 { get; set; }
public string Prop2 { get => field; set; }
public string? Prop3 { get => field; }

The variable denoted by the field keyword in a property's accessors is the backing field of that property.

Nullability of the backing field

The backing field has the same type as the property. However, its nullable annotation may differ from the property. To determine this nullable annotation, we introduce the concept of null-resiliency. Null-resiliency intuitively means that the property's get accessor preserves null-safety even when the field contains the default value for its type.

A field-backed property is determined to be null-resilient or not by performing a special nullable analysis of its get accessor.

The nullability of the backing field is determined as follows:

Constructor analysis

Currently, an auto-property is treated very similarly to an ordinary field in nullable constructor analysis. We extend this treatment to field-backed properties, by treating every field-backed property as a proxy to its backing field.

We update the following spec language from An alternative approach (TODO rename that section?) to accomplish this:

At each explicit or implicit 'return' in a constructor, we give a warning for each member whose flow state is incompatible with its annotations and nullability attributes. If the member is a field-backed property, the nullable annotation of the backing field is used for this check. Otherwise, the nullable annotation of the member itself is used. A reasonable proxy for this is: if assigning the member to itself at the return point would produce a nullability warning, then a nullability warning will be produced at the return point.

Note that this is essentially a constrained interprocedural analysis. We anticipate that in order to analyze a constructor, it will be necessary to do binding and "null-resiliency" analysis on all applicable get accessors in the same type, which use the field contextual keyword and have not-annotated nullability. We speculate that this is not prohibitively expensive because getter bodies are usually not very complex, and that the "null-resiliency" analysis only needs to be performed once regardless of how many constructors are in the type.

Setter analysis

For simplicity, we use the terms "setter" and "set accessor" to refer to either a set or init accessor.

There is a need to check that setters of field-backed properties actually initialize the backing field.

class C
{
    string Prop
    {
        get => field;

        // getter is not null-resilient, so `field` is not-annotated.
        // We should warn here that `field` may be null when exiting.
        set { }
    }

    public C()
    {
        F = "a"; // ok
    }

    public static void Main()
    {
        new C().F.ToString(); // NRE at runtime
    }
}

The initial flow state of the backing field in the setter of a field-backed property is determined as follows:

At each explicit or implicit 'return' in the setter, a warning is reported if the flow state of the backing field is incompatible with its annotations and nullability attributes.

Remarks

This formulation is intentionally very similar to ordinary fields in constructors. Essentially, because only the property accessors can actually refer to the backing field, the setter is treated as a "mini-constructor" for the backing field.

Much like with ordinary fields, we usually know the property was initialized in the constructor because it was set, but not necessarily. Simply returning within a branch where Prop != null was true is also good enough for our constructor analysis, since we understand that untracked mechanisms may have been used to set the property.

Alternatives

In addition to the null-resilience approach outlined above, the working group suggests the following alternatives for the LDM's consideration:

Do nothing

We could introduce no special behavior at all here. In effect:

Note that this would result in nuisance warnings for "lazy property" scenarios, in which case users would likely need to assign null! or similar to silence constructor warnings.
A "sub-alternative" we can consider is to also completely ignore properties using field keyword for nullable constructor analysis. In that case, there would be no warnings anywhere about the user needing to initialize anything, but also no nuisance for the user, regardless of what initialization pattern they may be using.

Because we are only planning to ship the field keyword feature under the Preview LangVersion in .NET 9, we expect to have some ability to change the nullable behavior for the feature in .NET 10. Therefore, we could consider adopting a "lower-cost" solution like this one in the short term, and growing up to one of the more complex solutions in the long term.

field-targeted nullability attributes

We could introduce the following defaults, achieving a reasonable level of null safety, without involving any interprocedural analysis at all:

  1. The field variable always has the same nullable annotation as the property.
  2. Nullability attributes [field: MaybeNull, AllowNull] etc. can be used to customize the nullability of the backing field.
  3. field-backed properties are checked for initialization in constructors based on the field's nullable annotation and attributes.
  4. setters in field-backed properties check for initialization of field similarly to constructors.

This would mean the "little-l lazy scenario" would look like this instead:

class C
{
    public C() { } // no need to warn about initializing C.Prop, as the backing field is marked nullable using attributes.

    [field: AllowNull, MaybeNull]
    public string Prop => field ??= GetPropValue();
}

One reason we shied away from using nullability attributes here is that the ones we have are really oriented around describing inputs and outputs of signatures. They are cumbersome to use to describe the nullability of long-lived variables.

jnm2 commented 1 month ago

This has now been merged into https://github.com/dotnet/csharplang/blob/main/proposals/field-keyword.md#nullability. (https://github.com/dotnet/csharplang/pull/8390)