dotnet / csharplang

The official repo for the design of the C# programming language
11.37k stars 1.02k forks source link

Proposal: Adding nullable reference type features to nullable value types (16.3, Core 3) #1865

Open MadsTorgersen opened 5 years ago

MadsTorgersen commented 5 years ago

Augmenting Nullable Value Types

The upcoming nullable reference types (NRTs) feature builds on the existing nullable value types (NVTs) for syntax and intuition, but differs in several ways.

One big difference is that NVTs have a different runtime representation that their nonnullable counterparts, whereas NRTs are indistinguishable from nonnullable ones at the runtime level, and are only differentiated at compile time, in source and metadata. This is a fundamental difference that is a result of deliberate design decisions, and can't really be remedied.

However, some of the novel aspects of NRTs might be "backported" to NVTs, diminishing the feature gap between them, and providing useful expressiveness.

Tracking null state for NVTs

A key feature of NRTs is that we track null state for variables of reference type through a flow analysis, so that we can warn at points of dereference if the variable might be null:

void M(Person? p)
{
    WriteLine(p.Name); // warning: p might be null
    if (p != null) { WriteLine(p.Name); } // No warning
    WriteLine(p?.Name ?? ""); // No warning
    if (p == null) return; WriteLine(p.Name); // No warning
}

Similarly, we could track null state for NVTs. The null state would come into play when boxing a NVT to a NRT, or when accessing the Value of a NVT:

void M(int? i)
{
    IComparable c = i; // Warning: i might be null
    IComparable? n = i; // null state: n may be null
    WriteLine(n.ToString()); // Warning: n might be null
    if (i != null) 
    { 
        c = i; // No warning
        n = i; // null state: n is not null
        WriteLine(n.ToString()); // No warning
    }
    var n = (IComparable)i; // Warning: casting away nullness
    int x = i.Value; // Warning: i might be null
    x = (int)i; // Warning: casting away nullness
}

Just like with NRTs, ! can be used to suppress warnings, and to change the null state of a nullable value:

void M(int? i)
{
    WriteLine(i!.Value); // No warning
    int x = i!; // No warning, null state: x is not null
}

Special considerations for NVTs

The analysis should recognize null checks using HasValue as well as ones involving the null literal (which are generally translated to uses of HasValue by the compiler).

int M(int? i)
{
    if (i.HasValue) return i.Value; // No warning
    else return -1;
}

Also, the analysis should account for the semantics of "lifted" operators. For every operator (intrinsic or user-defined) over non-nullable value types, there's a corresponding language-provided lifted operator, that works over the corresponding NVTs. The lifted operator returns null if either operand is null.

int? a, b;
(a, b) = (7, 9);
int x = a + b; // no warning; the null state of the + result is not null
a = null;
x = a + b; // warning on assignment

Pros:

Cons:

Using nullable values as nonnullable values

Because of the null state tracking, NRTs allow use of the "underlying" value when the flow analysis says that the nullable reference isn't null.

NVTs, on the other hand, are a completely separate value, and even when you just checked for null, you still cannot use them as the underlying value directly; you have to get at it first with .Value or .GetValueOrDefault() or a cast.

If we do null state tracking for NVTs as proposed above, could we allow direct use as the underlying value when a nullable value is known not to be null?

One immediate obstacle is that NVTs are types in their own right, with their own members and a separate type identity wrt. overload resolution. For member access and conversions, they already have semantics, and any new semantics where they "pose" as the underlying type would have to be strictly additional and non-breaking. So they would kick in only in places where you'd get an error today.

For simplicity we should probably allow the additional member accesses and conversions regardless of whether the value is null or not, but then warn on it when they are applied to a nullable value that has the "may be null" null state. Allowing it regardless of null state also helps maintain the design principle from NRTs that the null state should never affect semantics, only whether warnings are yielded.

Member access

The proposal is that instance members of the underlying type become available on the nullable type, with a warning when the value "may be null".

For back compat, members that are defined on Nullable<T> should always shadow corresponding members on the underlying type. That is a pretty short list, though, and many of them (ToString, Equals, GetHashCode) work by calling through to the underlying type when the value is non-null, so only very few members of the underlying value would be effectively shadowed in the sense that the behavior is different. Those are members directly implementing the contract of the NVT (Value, HasValue, GetValueOrDefault), as well as reflection (GetType), which must acknowledge that a nullable value is different at runtime from its underlying value.

Other than those, members of the underlying type could be offered, and would imply an implicit indirection through the Value property.

void M(int? i)
{
    string s = i.ToString(); // int?.ToString
    int x = i.CompareTo(7); // int.CompareTo + warning
    if (i != null)
    {
        s = i.ToString(); // STILL int?.ToString
        x = i.CompareTo(7); // no warning
    }
}

Pros:

Cons:

Conversion and overload resolution

Ideally we would straightforwardly allow a NVT to be implicitly converted to its underlying value type, with a warning if it "may be null". However, that would be a big breaking change, since it would make new overloads applicable, leading to ambiguities or silent changes of behavior. In today's betterness algorithm, non-nullable wins over nullable

Instead we'd need sort of a "Hail Mary" pass, where if overload resolution/assignment would otherwise fail, we add these conversions and try again. Thus, you'd get the following behavior:

void N(int x);
void O(int? x); // 1
void O(int x); // 2

void M(int? n, int i)
{
    N(n); // warning
    N(i); // fine
    O(n); // overload 1
    O(i); // overload 2
    if (n != null) 
    {
        N(n); // no warning
        O(n); // STILL overload 1
        O((int)n); // overload 2
    }
}

Pros:

Cons:

Alternative: Lifting

There is already a language-level approach to making NVTs work smoothly for operators: lifting. For each operator over non-null value types, there is automatically a corresponding one over the corresponding NVTs (unless one already exists): What it does is to yield null if either operand is null, and the result of the underlying operator otherwise.

This is similar in functionality and typing to how the ?. operator works with respect to a NVT receiver. In a sense, ?. is an explicit lifting of the . operator.

Could we address the scenario by doing implicit lifting in more scenarios? It would look something like this:

For members, this really just means making ?. implicit on NVTs.

FOr methods, the benefit is that the old overloads would be "better" than the new lifted ones, because non-nullable types in the signature are better than nullable ones. So existing methods would continue to bind the same way.

Mostly. There are still some breaking changes possible. For instance, if there is an overload with a reference type, then it could get ambiguous with a lifted NVT overload on the null literal:

M(int i);
M(string s);

M(null); // ambiguous with M(int?) now?

We would need a tie breaker rule to make lifted methods take a backseat to non-lifted ones. That's doable.

Worse, though, there are still cases where we'd pick a different overload than before:

M(int i);
M(object? o);

M(null); // M(int?) instead of M(object?) now?

The issue is that the NVT in the lifted method may still be better than a reference type in an existing overload.

In short, lifted methods still need to be treated specially, taking more of a backseat in overload resolution. Lifting, then, still requires a special "second phase", but would use "extra overloads" instead of the "extra conversions" proposed above.

Pros:

Cons:

Recommendations

I would like to see us do null-tracking for NVTs. I don't think the downsides are significant.

I wish we could allow NVTs to be used as their underlying types, with a warning when they might be null. Doing this for member access is significantly less complicated than for conversions, but it would also seem inconsistent to only do the easy part.

I would like us to drill more on how best to solve the conversion case. If we can come up with something relatively elegant, without sacrificing back compat, then I think the whole package may be well worth considering. Otherwise I'd probably leave all of the new semantics for another day, and just do the null tracking.

LDM history:

CyrusNajmabadi commented 5 years ago

A major concern i have here is how the compiler (not language) team feels about warning here and how it expects to evolve them in this space wrt BackCompat concerns. Fundamentally to me, this entire space is a large set of heuristic and patterns that are matched and used to nudge you in the right direction. i.e. warnings at launch for:

  1. looks like you may be doing something bad with null.
  2. potential warnings saying: you're being too aggressive with silencing checks that wont' fire here. i.e. unnecessary usages of !

It seems to me highly likely that we will not get these rules "right" at launch. They'll be good, but will likely miss important patterns, or will get some things wrong in cases that users end up caring about. This will be compounded the more places we expose this analysis (i.e. for NVTs on top of NRTs).

My major concern is that we will come out with a set of rules that people later report having issues with, but which we are then unable to tweak/change (despite them being heuristics/patterns) because of potential warnings in some direction.

For example, if we lift members up to NVTs that we think are non-null, then we risk breaking that if we ever think our heuristic was too permissive. But, invariably, i think people are going to run into cases where our rules are wrong and they do get a null-ref exception and they do want to tell us to catch those cases.

Do we have a reasonable story thought through on how we'll be able to evolve this space which is, by tis very nature, squishy and best-effort?

svick commented 5 years ago

For every method overload M that takes at least one parameter of non-nullable value type , we introduce an overload where all non-nullable value types in the signature are replaced with their corresponding NVTs

Do I understand it correctly that with this alternative, the following code would compile fine with no warnings and when passed null length, it would do nothing?

void Write(Stream stream, byte[] buffer, int? length = null)
{
    stream.Write(buffer, 0, length);
    // should have been:
    // stream.Write(buffer, 0, length ?? buffer.Length);
}

I don't think that would be a good idea, since it would make it too easy to write buggy code. The difference between this method lifting and the existing operator lifting is that operators are almost always used for their value and don't have side-effects. But that does not apply to methods.

jnm2 commented 5 years ago

For back compat, members that are defined on Nullable should always shadow corresponding members on the underlying type. That is a pretty short list, though

I've written extension methods on Nullable<>. These would also shadow members on the underlying type, right?

paulomorgado commented 5 years ago

@MadsTorgersen,

I don't understand how M(null) is resolved to M(int). If it was M(int?), I would understand.

HaloFour commented 5 years ago

@CyrusNajmabadi

It seems to me highly likely that we will not get these rules "right" at launch. They'll be good, but will likely miss important patterns, or will get some things wrong in cases that users end up caring about.

This, a hundred times over. I'm sure that the team will get awfully close, but I think that it would be invaluable to leave a little wiggle room to allow expansion of the rules and to accept that new warnings may be produced, at the very least until the next major version (C# 9.0 presumably). At that point, maybe, the core rules could be locked in place and new warnings only produced via warning waves (if that ever happens).

ufcpp commented 5 years ago
using System;

struct Mutable
{
    public int Count;
    public void Increment() => ++Count;
}

class Program
{
    static void Main()
    {
        Mutable x = default;
        x.Increment();
        Console.WriteLine(x.Count); // 1

        Mutable? n = new Mutable();
        n?.Increment(); // if (n.HasValue) n.GetValueOrDefault().Increment(); → mutates copied value
        Console.WriteLine(n?.Count); // 0

        n.Increment(); // should be the same as "n?.Increment()" ?
        Console.WriteLine(n.Count); // should be 0? or 1?
    }
}
gafter commented 5 years ago

potential warnings saying: you're being too aggressive with silencing checks that wont' fire here. i.e. unnecessary usages of !

We're not going to do that. We will not punish people who are really trying to get this feature out of the way of doing their work. For example, we will not warn or error on chained uses of !, e.g. x!!.

Programmer: "Please Shut Up." Compiler: "I was't going to say anything. Please don't tell me to shut up unless I'm going to say something." Programmer: "You just said something. Please Shut Up."

TheUnlocked commented 5 years ago

x = y; - No warning.

x = y!; - Warning for unnecessarily ignoring a warning.

x = y!!; - Warning for unnecessarily ignoring a warning is ignored.

x = y!!!; - Warning for unnecessarily ignoring a warning about unnecessarily ignoring a warning.

qrli commented 5 years ago

@ufcpp That has to be 0, unless some hack is provided like ref T Nullable<T>.GetValueReference() and compiler uses that. A quirk but not new quirk with regard to value types.

ufcpp commented 5 years ago

@qrli Yes but the hack is not so tricky, just introducing a temporary variable:

Mutable x = n.GetValueOrDefault();
x.Increment();
n = x;
paulomorgado commented 5 years ago

x = y; - No warning. x = y!; - Warning for unnecessarily ignoring a warning. x = y!!; - Warning for unnecessarily ignoring a warning is ignored. x = y!!!; - Warning for unnecessarily ignoring a warning about unnecessarily ignoring a warning.

If I can't write amazing code, I'll write amazed code!!!

olmobrutall commented 5 years ago

What about producing a warning when using any member of Nullable, like Value or HasValue, and recommend !. and !=null instead?

Removing any syntactic differences will simplify writing generic code with nullables that works for value and reference types.

theunrepentantgeek commented 5 years ago

Adding a new warning is considered a breaking change to the compiler (because many many teams use the warnings as errors switch). This might happen if they introduce warning waves though.

olmobrutall commented 5 years ago

I was assuming for projects wit not null reference types on.

mattwar commented 5 years ago

The language design team originally opted to not have lifting apply to general method invocations when nullable value types were introduced, specifically because the behavior was seen as too surprising. I doubt we'd choose to go the other way now, unless we added some new syntax, like ?., but for arguments, that indicates that the lifting should occur.

jnm2 commented 5 years ago

How would lifting work for out parameters? If TryGetFoo has the signature (out FooValue foo), would the compiler smooth this over for me?

FooValue? foo;
if (cond)
{
    if (!TryGetFoo(out foo)) throw ...;
}
else { ... }

I ran into a real-world opportunity for this and was curious.

YairHalberstadt commented 5 years ago

@jnm2 I imagine that this will have to be sorted out using whatever technique is used for IsNullOrEmpty, Equals, and ReferenceEqual.

I think there was talk at one point of an attribute language used to say things like "if this returns false, this is null, else it's not null". I don't know where that is at the moment

yaakov-h commented 5 years ago

@YairHalberstadt https://github.com/dotnet/csharplang/wiki/Nullable-Reference-Types-Preview#annotation-attributes

jnm2 commented 5 years ago

Sorry, I should clarify my question. With reference types, the out parameter is the same runtime type as the variable even though one is nullable and one is not. With value types, the out parameter's runtime type no longer matches the variable type. Will the C# compiler enable the code sample above to compile, smoothing this over for me perhaps by introducing a temporary local variable to pass to the out variable and doing an assignment after the call? That's what I'd have to do manually in any case. (Yes, I know how this is different than passing the original variable.)

mattleibow commented 4 years ago

I just came across a case where nullableInt! would have been very useful: I have a web API that returns objects, but allows you to select which fields are available.

For example, you can request book details, and optionally tell it to only populate the id and title, or the id, title and the contents. As the book contents could be fairly large, you would want to skip that data when doing an auto-complete request. But, you want everything when loading the book to read.

My type would have been:

class Book {
    int? Id;
    string? Title;
    string? Contents;
}

The api call would be: http://api.website.com/books/search/?query=title+name&fields=id,name

I would have hoped to have been able to do this:

string title = book.Title!;

But I now have to do:

string title = book.Title!.Value;

I know this is not really a real API to request the book contents as part of a search, but it represents a feature of the API.

paulomorgado commented 4 years ago

Isn't that what Nullable<T>.Value is for?

jnm2 commented 4 years ago

But I really hate .Value. It usually lowers the readability of whatever I'm doing.

FiniteReality commented 4 years ago

I would have hoped to have been able to do this:

string title = book.Title!;

But I now have to do:

string title = book.Title!.Value;

I have the same issue. It would be amazing if the damnit (is it even called that anymore? :sweat_smile:) operator on NVTs called GetValueOrDefault.

mattleibow commented 4 years ago

I wouldn't have the ! call GetValueOrDefault as that is not what is expected. If I asked for the value, and there is no value, then I would want an exception. If I don't want an exception, then I should check for null or HasValue

FiniteReality commented 4 years ago

I wouldn't have the ! call GetValueOrDefault as that is not what is expected. If I asked for the value, and there is no value, then I would want an exception. If I don't want an exception, then I should check for null or HasValue

There is precedent for it returning the default value, as that is what ! does for NRTs - it suppresses the "value may be null" warning.

HaloFour commented 4 years ago

But there's also precedent that ! and any of the NRT syntax doesn't impact the behavior of the code. Granted, that can change when applied to NVTs.

FiniteReality commented 4 years ago

But there's also precedent that ! and any of the NRT syntax doesn't impact the behavior of the code. Granted, that can change when applied to NVTs.

Yeah, this is what I was implying; keeping the "semantics" the same and telling the compiler "I don't care that this may be null, give me the value damnit"

mattleibow commented 4 years ago

There is precedent for it returning the default value, as that is what ! does for NRTs - it suppresses the "value may be null" warning.

The ! doesn't give the default, it gives the value - and it happens to be null. With the Nullable<T>, the value is null - but that can't be represented by a int. So what should happen? I would think some exception.

TheUnlocked commented 4 years ago

I agree. You should only use the ! operator if you're sure that something is (or will be) non-null but the compiler can't guarantee it. If you're wrong, falling back to the default value is just asking for bugs. Better to fail loudly.

333fred commented 4 years ago

Note: we really didn't want to do this before, because the principle of the nullable feature is that we don't change runtime behavior. Now that C# 8 has shipped, we really can't do this as it would be a breaking change.

ViIvanov commented 4 years ago

I would have hoped to have been able to do this:

string title = book.Title!;

But I now have to do:

string title = book.Title!.Value;

Am I understand correctly and you need something like this:

string title = book.Title ?? throw new SomeException($"The {nameof(book.Title)} is null.");

?

jods4 commented 3 years ago

Out of all the NRT features, the one I would really like to see added to NVT is the ! operator.

As a few others have commented above, writing expressions with several .Value mixed in is really ugly and hurts readability.

When applied to a NVT, it seems natural that ! would mean .Value. True story: some devs on my team could not understand why it didn't work, as the syntax T? is the same for both refs and values -- although there's a big difference at runtime.

Implementing ! on NVT isn't hard, improves readability, and reduces the differences between NRT and NVT.

333fred commented 3 years ago

@jods4 this issue has long been shipped. If you have a different feature request, it will need to be a different discussion.

jods4 commented 3 years ago

@333fred Sorry about that, I didn't want to open a duplicate when there was another open issue about the same topic.

This issue has long been shipped. If you have a different feature request,

The 12 last comments on this issue were all about using ! on NVT and I see no end (shipped or rejected) to that thread?

If you think this can still be discussed and a new issue is the way to go, I'll open one.

333fred commented 3 years ago

If you think this can still be discussed and a new issue is the way to go, I'll open one.

This issue documents a feature that shipped in C# 8. If you have a new feature request, a new discussion is the way to go :).

thomaseyde commented 1 year ago

This issue documents a feature that shipped in C# 8

If so, why is this issue still open?

Joe4evr commented 1 year ago

@thomaseyde Because feature issues stay open until they are added to the ECMA spec. That's what the Implemented Needs ECMA Spec label means.