
Proposal: Nullable reference types and nullability checking #5032

Closed MadsTorgersen closed 7 years ago

MadsTorgersen commented 9 years ago

Null reference exceptions are rampant in languages like C#, where any reference type can reference a null value. Some type systems separate types into a T that cannot be null and an Option<T> (or similar) that can be null but cannot be dereferenced without an explicit null check that unpacks the non-null value.

This approach is attractive, but is difficult to add to a language where every type has a default value. For instance, a newly created array will contain all nulls. Also, such systems are notorious for problems around initialization and cyclic data structures, unless non-null types are permitted to at least temporarily contain null.

On top of these issues come challenges stemming from the fact that C# already has null-unsafe types, that allow both null values and dereferencing. How can a safer approach be added to the language without breaking changes, without leaving the language more complex than necessary and with a natural experience for hardening existing code against null errors in your own time?

Approach: Nullable reference types plus nullability warnings

The approach suggested here consists of two elements: nullable reference types alongside the existing (now non-nullable) reference types, and a set of nullability warnings over both.

The feature will not provide airtight guarantees, but should help find most nullability errors in code. It can be assumed that most values are intended not to be null, so this scheme keeps annotation overhead minimal, treating current, unannotated code as non-nullable everywhere.

Nullable reference types are helpful in that they help find code that may dereference null and help guard it with null checks. Making current reference types non-nullable is helpful in that it helps prevent variables from inadvertently containing a null value.

Nullable and non-nullable reference types

Every existing reference type T is now "non-nullable", and there is now a corresponding nullable reference type T?.

Syntactically speaking, nullable reference types don't add much, since nullable value types have the same syntax. However, a few syntactic corner cases are new, like T[]?.

From a semantic viewpoint, T and T? are mostly equivalent: they are the same type in that they have the same domain of values. Specifically, because the system is far from watertight, non-nullable reference types can contain null. The only way in which they differ is in the warnings caused by their use.

For type inference purposes T and T? are considered the same type, except that nullability is propagated to the outcome. If a reference type T is the result of type inference, and at least one of the candidate expressions is nullable, then the result is nullable too.
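
A sketch of how this propagation might play out (illustrative only; not taken from the proposal):

string a = "hi";  // string (non-nullable)
string? b = null; // string? (nullable)

var c = a.Length > 1 ? a : b; // inferred as string?, since one candidate is nullable
int n = c.Length;             // would warn: c may be null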

If an expression is by declaration of type T?, it will still be considered to be of type T if it occurs in a context where by flow analysis we consider it known that it is not null. Thus we don't need new language features to "unpack" a nullable value; the existing idioms in the language should suffice.

string s;
string? ns = "hello";

s = ns; // warning
if (ns != null) { s = ns; } // ok

WriteLine(ns.Length); // warning
WriteLine(ns?.Length); // ok

Warnings for nullable reference types

Values of nullable reference types should not be used in a context where a) they are not known to (probably) contain a non-null value and b) the use would require them to be non-null. Such uses will be flagged with a warning.

For a), "known to (probably) contain a non-null value" means that a flow analysis has determined they are very likely not to be null. There will be specific rules for this flow analysis, similar to those for definite assignment. It is an open question which variables are tracked by this analysis. Just locals and parameters? All dotted names?

For b), a use that requires a non-null value means dereferencing (e.g. with dot or invocation) or implicitly converting to a non-nullable reference type.

Warnings for non-nullable reference types

Variables of non-nullable reference types should not be assigned the literal null or default(T); nor should nullable value types be boxed to them. Such uses will result in a warning.
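
A sketch of each warning described above (illustrative, not actual compiler output):

string s1 = null;            // warning: null assigned to a non-nullable reference
string s2 = default(string); // warning: default of a reference type is null
int? maybe = 5;
object o = maybe;            // warning: boxing a nullable value type to a non-nullable reference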

Additionally, fields with a non-nullable reference type must be protected by their constructor so that they are a) not used before they are assigned, and b) assigned before the constructor returns. Otherwise a warning is issued. (As an alternative to (a) we can consider allowing use before assignment, but in that case treating the variable as nullable.)
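
A sketch of the constructor rules (hypothetical class; (a) and (b) refer to the rules above):

class Person
{
    string name; // non-nullable field

    public Person(string name)
    {
        int len = this.name.Length; // warning (a): field used before it is assigned
        this.name = name;           // assignment satisfies rule (b) on this path
    }

    public Person() { } // warning (b): 'name' not assigned before the constructor returns
}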

Note that there is no warning to prevent new arrays of non-nullable reference type from keeping the null elements they are initially created with. There is no good static way to ensure this. We could consider a requirement that something must be done to such an array before it can be read from, assigned or returned; e.g. there must be at least one element assignment to it, or it must be passed as an argument to something that could potentially initialize it. That would at least catch the situation where you simply forgot to initialize it. But it is doubtful that this has much value.
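
For example (a sketch of the loophole; under the proposal neither line would warn):

string[] names = new string[10]; // elements start out null despite the non-nullable element type
int n = names[0].Length;         // no warning, but throws NullReferenceException at runtime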

Generics

Constraints can be both nullable and non-nullable reference types. The default constraint for an unconstrained type parameter is object?.

A warning is issued if a type parameter with at least one non-nullable reference constraint is instantiated with a nullable reference type.

A type parameter with at least one non-nullable constraint is treated as a non-nullable type in terms of warnings given.

A type parameter with no non-nullable reference constraints is treated as both a nullable and a non-nullable reference type in terms of warnings given (since it could without warning have been instantiated with either). This means that both sets of warnings apply.

? is allowed to be applied to any type parameter T. For type parameters with the struct constraint it has the usual meaning. For all other type parameters it has this meaning, where S is the type with which T is instantiated: if S is a non-nullable reference type, then T? refers to S?; otherwise, T? refers to S itself.

Note: This rule is not elegant - in particular it is bad that the introduction of a struct constraint changes the meaning of ?. But we believe we need it to faithfully express the type of common APIs such as FirstOrDefault().
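
For illustration, FirstOrDefault might be annotated like this under the rule above (a sketch, not a confirmed signature):

public static T? FirstOrDefault<T>(this IEnumerable<T> source);

var s = new[] { "a" }.FirstOrDefault(); // T is string, so T? means string?: null when the sequence is empty
var i = new[] { 1 }.FirstOrDefault();   // T is int and there is no struct constraint, so T? just means int: 0 when empty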

Opting in and opting out

Some of the nullability warnings warn on code that exists without warnings today. There should be a way of opting out of those nullability warnings for compatibility purposes.

When opting in, assemblies generated should contain a module-level attribute with the purpose of signaling that nullable and non-nullable types in signatures should generate appropriate warnings in consuming code.
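
For example, opting in might cause the compiler to emit something along these lines (the attribute name here is purely hypothetical; the proposal does not name one):

[module: NullableAware] // hypothetical attribute emitted by the compiler on opt-in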

When consuming code references an assembly that does not have such a top-level attribute, the types in that assembly should be treated as neither nullable nor non-nullable. That is, neither set of warnings should apply to those types.

This mechanism exists so that code that was not written with nullability warnings in mind, e.g. code from a previous version of C#, does not trigger such warnings. Only assemblies that opt in, by carrying the compiler-produced attribute, will cause the nullability warnings to happen in consuming code accessing their signatures.

When warnings haven't been opted in to, the compiler should give some indication that there are likely bugs one would find by opting in. For instance, it could give (as an informational message, not a warning) a count of how many nullability warnings it would have given.

Even when a library has opted in, consuming code may be written with an earlier version of C#, and may not recognize the nullability annotations. Such code will work without warning. To facilitate smooth upgrade of the consuming code, it should probably be possible to opt out of the warnings from a given library that will now start to occur. Again, such per-assembly opt-out could be accompanied by an informational message reminding that nullability bugs may be going unnoticed.

Libraries and compatibility

An example: In my C# client code, I use libraries A and B:

// library A
public class A
{
  public static string M(string s1, string s2);
}

// library B
public class B
{
  public static object N(object o1, object o2);
}

// client C, referencing A and B
Console.WriteLine(A.M("a", null).Length);
Console.WriteLine(B.N("b", null).ToString());

Now library B upgrades to C# 7, and starts using nullability annotations:

// upgraded library B
public class B
{
  public static object? N(object o1, object o2); // o1 and o2 not supposed to be null
}

It is clear that my client code probably has a bug: apparently it was not supposed to pass null to B.N. However, the C# 6 compiler knows nothing of all this, and ignores the module-level attribute opting in to it.

Now I upgrade to C# 7 and start getting warnings on my call to B.N: the second argument shouldn't be null, and I shouldn't dot into the return value without checking it for null. It may not be convenient for me to look at those potential bugs right now; I just want a painless upgrade. So I can opt out of getting nullability warnings at all, or for that specific assembly. On compile, I am informed that I may have nullability bugs, so I don't forget to turn it on later.

Eventually I do, I get my warnings and I fix my bugs:

Console.WriteLine(B.N("b", "")?.ToString());

Passing the empty string instead of null, and using the null-conditional operator to test the result for null.

Now the owner of library A decides to add nullability annotations:

// library A
public class A
{
  public static string? M(string s1, string s2); // s1 and s2 shouldn't be null
}

As I compile against this new version, I get new nullability warnings in my code. Again, I may not be ready for this - I may have upgraded to the new version of the library just for bug fixes - and I may temporarily opt out for that assembly.

In my own time, I opt it in, get my warnings and fix my code. I am now completely in the new world. During the whole process I never got "broken" unless I asked for it with some explicit gesture (upgrade compiler or libraries), and was able to opt out if I wasn't ready. When I did opt in, the warnings told me that I used a library against its intentions, so fixing those places probably addressed a bug.

MgSam commented 9 years ago

So are the thoughts on the ! operator from the Aug 18 notes not part of the official proposal yet?

Additionally, fields with a non-nullable reference type must be protected by their constructor so that they are a) not used before they are assigned, and b) assigned before the constructor returns. Otherwise a warning is issued. (As an alternative to (a) we can consider allowing use before assignment, but in that case treating the variable as nullable.)

This rule seems like a non-starter to me, as it is impossible for the compiler to verify this if you use a helper method called from the constructor to assign fields. That is unless you provide an in-line way of disabling warnings (essentially telling the compiler that you know for certain the fields are being assigned). ReSharper has such a feature to disable warnings.

HaloFour commented 9 years ago

@MgSam Isn't that currently a pain-point with non-nullable fields in Apple Swift? What's worse is that the base constructor could call a virtual method overridden in the current class that could access those fields before they've been assigned so not even the constructor is good enough.
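
A sketch of that hazard (invented names; the field is non-nullable and assigned in the constructor, yet observed as null):

class Base
{
    public Base() { Describe(); } // runs before Derived's constructor body
    public virtual void Describe() { }
}

class Derived : Base
{
    string name; // non-nullable field

    public Derived(string name)
    {
        this.name = name; // too late: Base's constructor has already run
    }

    public override void Describe()
    {
        Console.WriteLine(name.Length); // NullReferenceException when invoked from Base's constructor
    }
}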

tpetricek commented 9 years ago

This looks nice!

Can the proposal clarify how the nullability annotations are going to be stored at the CLR level? (I suppose these have no runtime meaning and would just produce a .NET attribute or some other sort of metadata?)

It would be nice if some future version of the F# compiler could read & understand these and offer some sort of "strict" mode where nullable types are automatically exposed as option<'T> (or something along those lines).

One other possible related idea would be to have an annotation not just when referring to a type, but also when defining a type. For example, types defined in F# do not allow null values (when used safely from F#), so when the C# compiler sees them it should (ideally) always treat them as non-nullable; but when working with a type that came from some old .NET library that has nulls everywhere, it makes sense to treat the type as nullable by default. (This would perhaps make things too complicated - but some declaration-site/use-site option might be worth considering...)

rmschroeder commented 9 years ago

Seconding Tomas' points above: this is great as it stands, but it would be nice if there could be some way to enable a 'strict' mode that guaranteed no nulls and that could be trusted by consumers of libraries - or is there some way to determine whether a dependency ignored the warnings? I would also like to hear about how this would be exposed via metadata, reflection, etc.

mariusschulz commented 9 years ago

What I love about this approach is that non-nullability would be the default for reference types, which I think is a lot better than having to opt in to it explicitly every single time (like string! or object!).

@HaloFour In Swift, you can't call a base class' constructor until all (non-nullable) fields have been initialized. That prevents exactly the problem you mentioned.

sfiruch commented 9 years ago

I like this idea and implementation very much. +1 vote to enable (possible) warnings for arrays of non-nullable references.

YuvalItzchakov commented 9 years ago

First of all, kudos to the language team for this proposal. It definitely looks like effort and time has been put to thinking about this.

Regarding the proposal, it seems to me that non-nullability in C#, given the language's age and maturity, would be confusing to many developers, senior and junior alike. Given that the proposal is mainly about compile-time warnings and not errors, this would lead to even more confusion, as it would feel like something developers can easily ignore. Also, we are so used to nullability being about structs that suddenly allowing it on reference types can add to that same confusion and peculiar feeling.

This feature doesn't "feel" like it can be done properly without baking it into the CLR. I think attempts to work around it without creating a breaking change won't be as good as they could be.

FlorianRappl commented 9 years ago

One question that directly appeared was how library authors should handle parameters, which are not supposed to be null, but are given as null.

The example reads:

public static string? M(string s1, string s2); // s1 and s2 shouldn't be null

Now prior to C# 7 we could introduce a runtime check such as:

public static string M(string s1, string s2) // s1 and s2 can be null
{
  if (Object.ReferenceEquals(s1, null))
    throw new ArgumentNullException(nameof(s1));
  if (Object.ReferenceEquals(s2, null))
    throw new ArgumentNullException(nameof(s2));
  /* ... */ // won't crash due to s1 or s2 being null
}

This results, of course, in a runtime error if null is passed in. In C# 7, knowing that the input should be non-null, a warning is generated. Now we could omit these checks and maybe have a crash as a consequence of passing in null references; however, since the user has been warned, the origin should be obvious.

public static string? M(string s1, string s2) // s1 and s2 shouldn't be null
{
  /* ... */ // could crash due to s1 or s2 being null
}

Is it supposed to be used like that, or would the recommendation be to have both kinds of checks (a runtime guard plus the compiler mechanism for an initial warning)?

Needless to say, the runtime guard may have other benefits and provides backwards compatibility, but it also adds noise that could be reduced with the new approach.
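
A belt-and-braces sketch of keeping both, assuming that turns out to be the recommendation:

public static string? M(string s1, string s2) // non-nullable parameters: new callers get warnings
{
    // Runtime guard kept for callers compiled without nullability checking:
    if (s1 == null) throw new ArgumentNullException(nameof(s1));
    if (s2 == null) throw new ArgumentNullException(nameof(s2));
    return s1 + s2;
}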

Additionally I agree to @YuvalItzchakov (especially regarding baking it into the CLR).

sandersaares commented 9 years ago

@FlorianRappl I imagine the null-checking operator from https://github.com/dotnet/roslyn/issues/5033 would be an ideal match for that scenario if it could also apply to method arguments.

public static string? M(string s1!, string s2!)
{
}

Or maybe with an alternate syntax (inconsistent with null-checking operator elsewhere but potentially more suitable for method arguments):

public static string? M(string! s1, string! s2)
{
}

Such an approach would check for null (at runtime) in addition to emitting non-nullable reference warnings at compile time.

The visuals look a bit odd at first glance but having an easy way to specify "throw if null" is very valuable, especially if we also consider that nullability analysis by the compiler would be one basis for the non-nullable reference type warnings, meaning that you would normally run into the following issue:

public static void Bar(string s) { ... }

public static void Foo(string? s)
{
    Argument.VerifyIsNotNull(s, nameof(s)); // Think green, save energy by saving 1 line of code!

    Bar(s); // Warning: potentially null value used.
    // The compiler can't tell (without extra annotations) that a null check was done out of band.
}

With the null-checking operator applying to a method argument, this is no problem:

public static void Bar(string s) { ... }

public static void Foo(string s!)
{
    Bar(s); // The compiler knows that a null check was done, so no problem here.
}
HaloFour commented 9 years ago

@mariusschulz It solves one problem by creating another. Since the base constructor call must come first, you're stuck having to initialize those fields via initializers to some arbitrary non-null but still invalid values.

tpetrina commented 9 years ago

What if we have two functions defined as

public void foo(string s) {}
public void foo(string? s) {}

Are these two different functions or will this trigger an error? Can we restrict generics to accept only non-nullable references?

HaloFour commented 9 years ago

@tpetrina

Considering that the ? is just shorthand for attribute-based metadata, those two functions would have the same signature and such an overload would not be permitted.
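
That would mirror how object vs. dynamic overloads behave today; a sketch of the conflict under that reading:

public void Foo(string s) { }
public void Foo(string? s) { } // error under this reading: same CLR signature, differing only in metadata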

ruuda commented 9 years ago
string s;
string? ns = "hello";

s = ns; // warning
if (ns != null) { s = ns; } // ok

I am a big fan of non-nullable types and I applaud this proposal, but in my opinion this is a bad example. If string? ns = SomeFunctionThatReturnsAString() this would make sense, but in the current case perhaps it would be more appropriate to warn about the unnecessary nullability of ns, as it is assigned a non-null value.

govert commented 9 years ago

+1 for making non-nullable the default, and not introducing string!. This proposal points us in the right direction, even if the road is long.

Although the scope here is limited to a language feature, one might anticipate a future where the runtime type system is extended to include non-nullable reference types, and where the runtime can enforce type safety for non-nullable types. In other words, an 'airtight' system should still be the end goal.

With such a future in mind, +1 also for the suggestion of @rmschroeder to have an opt-in 'strict' compilation mode which ensures the resulting assembly would (potentially) be type-safe in such a non-nullable-aware runtime.

References between 'strict' and non-'strict' assemblies are still a problem. At least the compiler should be able to generate a null-checking wrapper that allows a 'strict' assembly to be safely used by a non-'strict' assembly, and vice versa. The wrapper would generate the null check to ensure that void MyMethod(string s) never receives a null when called from another assembly through the wrapper. Maybe the checking code is built into the 'strict' assembly, but not invoked when called from another 'strict' assembly.

While it makes sense to decouple the language feature from having a runtime that is non-nullable-aware, let's also keep in mind where we really want to go. That will allow C# 7.0 to build our trusted assemblies of the future.

JamesNK commented 9 years ago

There is already a "strict" mode, it's called treat warnings as errors.

govert commented 9 years ago

@JamesNK That could work, given the right pieces in place.

I kind of see this like "unsafe" code, which opts out of type safety, and is not implemented merely as warnings.

naasking commented 9 years ago

The discussion of generics is incomplete. For one, it doesn't actually explain how to declare a non-null type parameter. Unconstrained type parameters are T : object? by default, so if you explicitly constrain one as T : object, is it non-nullable?

Except T : object is currently forbidden as a constraint by [1]: constraints can't be the special type "object". I'm all for eliminating that restriction entirely, since it makes no sense (along with the sealed class/struct constraint error!).

Finally, creating a guaranteed non-null array seems achievable by using a new array constructor:

public static T[] Array.Create<T>(int length, Action<T[], int> initialize)
    where T : object

Then a simple flow analysis ensures that the unmodified index parameter is used to initialize the array parameter.
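
Usage might look like this (a sketch against the hypothetical API above):

// Flow analysis can verify every element is assigned through the callback.
string[] names = Array.Create<string>(3, (arr, i) => arr[i] = i.ToString());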

[1] https://github.com/dotnet/roslyn/blob/051ad4381100859442af93e16d8d8edbb681b05f/src/Compilers/CSharp/Portable/Binder/Binder_Constraints.cs#L164

xdaDaveShaw commented 9 years ago

Just a thought from someone who uses "Treat Warnings as Errors" everywhere: there's no mention of how likely false warnings are to appear, or of how people are going to deal with them. Will it be a case of people just sticking ?. instead of . to try to avoid dereferencing null? That could lead to other subtle bugs if they are not thinking things through. The last thing I'd want to see is a code base full of #pragma disable CSXXXX around blocks of code - it's bad enough when Code Analysis flags false warnings and they need to be suppressed, but at least those have support for a Justification, which can be enforced with another CA rule if needed :).

danieljohnson2 commented 9 years ago

I think you really should separate the checked-nullable-references (that is: T? for reference type T) from this proposal; there is some value in having a clear indication that this or that field or parameter is expected to be null sometimes, and having the compiler enforce that null is checked for.

But it seems to me that the rest of this is just not viable as given. The proposal is to declare the existing reference types non-nullable, without enforcing this at all. All existing C# code is to be declared broken, and to be punished with an avalanche of useless warnings.

I do feel some confidence about that last point: I don't believe you can generate the warnings accurately enough. The false positives will be overwhelming. We'll all wind up silencing that particular warning.

This is a shame. The advantage of non-nullable references, if they are enforced somehow, is not that we are forced to check for null everywhere. It is that we would not need to anymore.

MiddleTommy commented 9 years ago

I think getting rid of null (mostly) altogether is a bad idea. We are too entrenched in it. If your reference type cannot be null, we will be doing more default-value checking and error catching. We need better automatic ways to handle null. I think we need something like default values to ease the problem.

public int Count(this IEnumerable items != null) { return ....;}

In this method items is guaranteed to not be null by the compiler. There can be compiler checks to help this but also automatic ArgumentNullException throwing.

I use null a lot instead of throwing exceptions. I always felt exceptions mean something went wrong; null means something doesn't exist, not that something is wrong.

So there will be a host of people who would like the != null to just pass the method by returning a default:

public int Count(this IEnumerable items != null ?? 0) { return ...; }

This is stating that if items is null don't throw an exception, return the default value instead.

Eirenarch commented 9 years ago

I dislike the proposal. Reliance on warnings, not proving correctness and punishing existing code with a bunch of warnings all feel like we are discussing JavaScript rather than C#. I'd prefer a proposal that works in fewer cases but is proven to be correct and results in errors rather than warnings. The current one seems very confusing to existing programmers and will be very frustrating when the compiler declares their existing project broken.

ScottArbeit commented 9 years ago

This is brilliant. Can anything be easy and perfect 16 years into the lifespan of C#? No. Is this a robust and minimalist way to accomplish this? Yes. And two years after it's available in C# 7? We'll go through the phase of having warnings, we'll clean up our code, and we'll finally get rid of (almost) all of the null-reference exceptions we're all plagued with. Thank you, Mads, as a .NET developer since Beta 2, I really appreciate it.

jonathanmarston commented 9 years ago

I like the idea of non-nullable reference types in C#, but I'm not sold on them only being implemented as compiler warnings. Consider the following code:

string GetAsString(object o)
{
    return o.ToString();
}

Under the current proposal the method above would not produce any warnings or errors, but neither the compiler nor the runtime actually provide any guarantees that "o" isn't null.

Yes, the caller of this method would receive a warning if the compiler determines that the value being passed is not "very likely" to be non-null, but what if this method is part of a shared class library? And what if the consumer of this library isn't using C# 7, or ignored the warning? This code would generate an NRE just as it would today.

With the current proposal the writer of the library would still need to add a runtime null check to be safe, even though he was told that a reference type without a question mark is non-nullable. The compiler is just giving the illusion of null-safety without actually providing it. Developers will get used to receiving warnings when they write unsafe code (most of the time), assume that if they don't get any they wrote null-safe code, and start neglecting to write proper runtime null checks when necessary, in some ways making matters worse than they are today!

pawchen commented 9 years ago

+1 @YuvalItzchakov and @FlorianRappl for the concern about the potentially broad confusion.

Traditionally, reference types are already expected to be nullable. Adding new notation to specify the same thing, and changing the meaning of the existing type identifiers, can cause a lot of trouble. For example, when you read existing open source code, you have to check the documentation for the language version; when you use some library, you have to check that too, and the library wouldn't necessarily be written in C#. Today we just check the runtime version.

A more reasonable approach would be to add new notation for types that are expected to be non-nullable, like the ! mentioned by @sandersaares.

When old code calls into new code where the new code expects objects to be non-nullable, runtime failures are expected under both proposals (implicit non-nullable T or explicit non-nullable T!). I would imagine the binary of the new code would be the same, differing only in the human-readable source; in that case, the explicit ! seems easier for humans to spot when identifying the problem.

naasking commented 9 years ago

Treating this broad proposal as compile-time errors instead of warnings is simply infeasible for obvious reasons. Those who wish to assure null correctness right off the bat can just use the csc compiler flag /warnaserror [1].

@diryboy, the point of the proposal is to stop thinking about reference types as inherently nullable, because they're mostly not. There are few scenarios where a nullable reference type is justified, just as there are few scenarios where a nullable struct is justified, and making this explicit via ? is the sane default. We program with values, and whether null is a valid value is contextual, not universal to a whole class of types. The "null everywhere" property is baggage from other languages, and thank god it's finally dying off.

When old code calls into new code where new code expect objects to be non-nullable, runtime failures are expected under both proposals (implicit non-nullable T or explicit non-nullable T!).

Except old code already expected such runtime failures. The point is that linking new code with new code can prevent any such failures by addressing all of the warnings. The old code can also address these warnings the same way via recompilation.

Doing this via a new annotation won't fix existing code either. Warnings provide an easy migration path that doesn't break your existing compiles, to a point where you can eventually use /warnaserror everywhere and be assured your program will never throw a null reference exception.

[1] https://msdn.microsoft.com/en-us/library/406xhdz3.aspx

pawchen commented 9 years ago

Is there any good way to express the context where T? is actually T?

Consider this simple class: when execution succeeds, Result is non-null and Error is null; when it fails, the opposite.

class ExecutionResult
{
  public object? Result;
  public object? Error;
}

Under this proposal it seems one has to perform two null checks in either the success or the failure case.

If the above class were declared as

class ExecutionResult<T>
{
  public T? Result;
  public object? Error;
}

This brings even more questions. What is ExecutionResult<int?>? What about ExecutionResult<string?>?

Let's consider another case.

class Exception
{
  public Exception? InnerException { get; }
}

For Tasks that throw exceptions, the exception is wrapped into InnerException and thrown as an AggregateException; how does one specify that InnerException is not null? By overriding the property with Exception?

devuxer commented 9 years ago

I think this is an exciting proposal. Besides the promise of cutting way down on null reference exceptions, it would make expectations clear. If there is a ? after the type, null is acceptable, otherwise, it's not. I've always thought value types had an unfair advantage over reference types because they could be explicitly marked as nullable or not. With this proposal, reference types would no longer be second-class citizens in this respect.

That said, I'm a little confused by the opt-in process. I may be misunderstanding, but it seems like both the author and consumer of a C# 7 assembly must decide whether to opt in to default non-nullability.

jfrijters commented 9 years ago

I really like this proposal. One thing I'm missing is field initialization before base constructor invocation. This would solve many common cases for non-nullable fields (i.e. those initialized with values passed in to the constructor).

Something like this:

class NamedFoo : Foo {
  readonly string name;
  public NamedFoo(string name)
    : name(name), base() {
  }
}

Something else: I think it's interesting to consider encoding non-nullable T as modopt(NonNullable) in signatures and allowing method overloading on nullableness. This would make it easier to update APIs in a backward compatible way (as the previous C# compiler would prefer the overload without the modopt, i.e. the nullable one). The obvious downside is that this doesn't scale very well to multiple arguments, but there could be some guidance around that (it's probably possible to explain to people that overloads with the least number of nullable reference types will be preferred by downlevel compilers).

AdamsLair commented 9 years ago

Would this proposal work completely on a compiler level, or would it actually introduce changes to the underlying Types? And if so, what would be the effect on Reflection, should this proposal be implemented?

Type first = typeof(string);
Type second = typeof(string?);

  1. Would first and second be different / non-equal Types?
  2. Would second still be a string Type, or some kind of wrapping generic like NullableReference<string>?
  3. Would the answer to either of these change, depending on whether the (calling or object's defining) code was compiled with or without nullable references active?

Especially with regard to automated serialization and copying / cloning systems, this could make a difference. Since they can easily encounter objects / Types that were just passed on to them in some way, I'm not sure how well opt-in strategies would apply here.

I didn't read the entire thread and merely skimmed certain parts, so I might have overlooked the answer to this - in that case, feel free to disregard my comment.

JesperTreetop commented 9 years ago

@AdamsLair According to this part in the first post:

From a semantic viewpoint, T and T? are mostly equivalent: they are the same type in that they have the same domain of values. Specifically, because the system is far from watertight, non-nullable reference types can contain null. The only way in which they differ is in the warnings caused by their use.

It looks like they both have the same System.Type, as opposed to e.g. *T. It's an extra value that follows the type around during analysis (not known to be anything, known to be nullable, known to be non-nullable), maybe in a way that's comparable to dynamic vs object, and probably similar to dynamic being distinguished by DynamicAttribute in the IL since this will be something you'll be able to see in libraries and thus can't evaporate in compilation.

I support this proposal. There are tons of corner cases, I'm sure (like is T?[]? going to be a thing) and lots of things to figure out, but it is the best that can be done to move forward and allow for progress to be made without a full reboot. (I'm in favor of a reboot too, I just don't like the looks I get from the people who just spent six years rewriting the compiler.)

naasking commented 9 years ago

@diryboy:

What is ExecutionResult<int?>? What about ExecutionResult<string?>?

It's the same result as you'd expect with ExecutionResult<Nullable<int>>, if ExecutionResult were declared with where T : struct. If a ? appears next to a type, we expect it to be nullable. A Nullable<Nullable<int>> is possible. While semantically dubious, why should we care what the developer writes as long as it's not unsound?

Under this proposal seems one have to perform 2 null checks in either success or failure case.

Why? Presumably your constructors will enforce that one of either Result or Error cannot be null, so just check whether one is null, and if it is, you know the other isn't.

For Tasks that throws exceptions, it's wrapped into InnerException and thrown as AggregatedException, how to specify that InnerException is not null? By overriding the property with Exception?

That's a good example. I think the only answer is that it requires a new property that shadows the base property, i.e. "public new Exception InnerException { get { return (Exception)base.InnerException; } }", because you can't change the type while overriding (if ? is treated as a type). Clients that work with AggregateException see the new property and know it's not null, and clients accepting Exception see that it may be null and check for that as usual.

paul1956 commented 9 years ago

Then I should be able to do the following without getting a warning/error or being told I need a cast on "x?.getvalue", and A is unchanged if x is Nothing, correct?

Dim A As MyTypeValue = 4 ' non-nullable
A = x?.getvalue

What is the proposed behavior of a CType (cast) from a nullable to a non-nullable when the value being cast is Nothing as the result of null propagation?


pawchen commented 9 years ago

@naasking It's a trap. Nullable<Nullable<int>> won't compile. T?? causes a parser error, much as old C++ compilers mis-parsed A<B<C>> because >> (like ??) happens to be another token. So it's actually more reasonable that they collapse to a single ? for both generic examples of ExecutionResult.

Making the compiler consider the constructor looks very smart, but I'm not sure it's possible. I meant that when you consume ExecutionResult from some binary, the semantic analyzer only sees the "type" of Result and Error as nullable-of-something. ExecutionResult could be written in other languages.

I'm not sure whether one should override or shadow in the AggregateException case. I leaned towards overriding, as ? might just be some kind of extra metadata annotated via some attribute. Like dynamic, it's fine to do things like

class A
{
  public virtual dynamic M(){ return null; }
}
class B : A
{
  public override object M() { return 1; }
}
naasking commented 9 years ago

Nullable<Nullable<int>> won't compile.

But it won't compile due to extra rules added specifically for this case, not because it's semantic nonsense according to C#'s standard rules. System.Nullable<T> is a struct, so it satisfies the requirements of its own type parameter.

I agree collapsing is probably the better default, just for consistency with these extra rules, but there have been plenty of discussions in the functional programming community about whether "Maybe Maybe T" is sensible, and it really is necessary in some cases. Consider a Dictionary<TKey, TValue> with a lookup that returns TValue?, null when the key isn't present. Dictionary<string, object?> should be a valid instantiation, which means null should be a valid stored value, which means the lookup should return object??, and this is all perfectly sensible, but would be unrepresentable in C#.
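
A sketch of that situation, using a hypothetical Find method since the point is about expressiveness rather than the real Dictionary API:

public TValue? Find(TKey key); // null when the key is absent

var d = new Dictionary<string, object?>();
d["k"] = null;       // null is a legitimate stored value here
var v = d.Find("k"); // ought to be object??, to distinguish "absent" from "present but null"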

I thought the semantic analyzer only sees the "type" of Result and Error as nullable-of-something. ExecutionResult could be written in other languages.

Regardless, the developer clearly understands the intended semantics of ExecutionResult: one or the other is not null. It's up to ExecutionResult's developer to ensure those semantics, and the client relies on them. I still don't see any motivation for checking both members for null.

Not sure one should override or shadow in the AggregatedException case, I felt like overriding, as ? might just be some kind of extra metadata annotated via some attribute.

Shadowing is compatible with both scenarios, whether non-null is treated as a type or as mere metadata, i.e. it's future-compatible. Overriding is compatible with only one scenario. I don't see any motivation for overriding in this case, unless you can provide one?

pawchen commented 9 years ago

@naasking Nullable nulls, or Just Nothing... Given int?? i = new Nullable<Nullable<int>>(null), how would one construct something of the similar type object??? What would be the result of i == null? What about int?? j = new Nullable<Nullable<int>>(0), and the result of j == 0?

It seems to me that having this proposal implement a Maybe<TRef> to pair with Nullable<TVal> would make more sense for nullable nulls.

When you are given some type with some property defined as nullable, and the compiler gives you a warning (or even an error) if you don't check for null before use, would you bother checking how that type is constructed? In my ExecutionResult example:

// suppose we have a function expects non-nullable object
static void use( object o ) {...}

// check error first
if ( r.Error == null )
{
  use(r.Result);  // warning or even error
  use(r?.Result); // happy?
}
else ...

// check result first
if ( r.Result == null )
{
  use(r.Error);  // warning or even error
  use(r?.Error); // happy?
}
else ...

Unless we have subclasses SuccessfulResult and FailureResult that implement something like what we discussed in the AggregateException case, overriding/shadowing Result/Error and turning the above code into something like

var sr = r as SuccessfulResult;
if ( sr != null )
{
  use(sr.Result); // happy
}
else
{
  var fr = (FailureResult)r;
  use(fr.Error);  // happy
}
}

Shadowing introduces a new member, so different behavior can occur when the object is cast to the base type. There is no difference in this simple property-getter case, but it could be a problem when the subtype does more than that.

qrli commented 9 years ago

+1000 to non-nullable by default, thus a consistent syntax with value types.

PS: it seems better to have a compiler option to generate run-time non-nullable argument check code, at least for public functions.

jonathanmarston commented 9 years ago

@qrli

I like the idea of generated run-time null checks for non-nullable arguments. This would totally remove the need for the oft-repeated "if null throw ArgumentNullException" code. The only issue is that you'd now be doubling the checks in legacy code that already has them in place. Maybe one of the new warnings could be for comparing a non-nullable variable to null; this way the developer would get a warning as a signal to remove the redundant, manual checks.
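
A sketch of both halves (hypothetical lowering; nothing here is specified by the proposal):

public void Process(string s) // non-nullable parameter
{
    // Guard generated by the compiler under the suggested option:
    if (s == null) throw new ArgumentNullException(nameof(s));

    // A leftover manual guard like this one could then draw the suggested
    // "comparing a non-nullable variable to null" warning:
    if (s == null) throw new ArgumentNullException(nameof(s));
}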

danieljohnson2 commented 9 years ago

@jonathanmarston Adding runtime checks would make this feature rather less useless, but we'd really need a distinct 'mandatory T' type to trigger that, or all existing C# code would be broken right out of the gate. In that case, you might as well have errors rather than warnings: better to break everything at compile time than at runtime.

Then again, for anyone who treats warnings as errors, this proposal already breaks everything anyway. In for a penny, in for a pound, perhaps?

jonathanmarston commented 9 years ago

@danieljohnson2

In my opinion, being overly concerned with backward compatibility is a bit counter-productive for this feature. We are talking about a fundamental change to the way C# thinks about reference types here. The current proposal already acknowledges this by making even just the warnings opt-in, but I don't feel like warnings are enough. If I am opting in, let me go all the way and opt in to enforced non-nullability.

In my opinion, if this feature is implemented, it should come with a compiler option with three choices for how to handle non-nullables:

None: works the same as C# 6. You're opting out of the new functionality. Any upgraded projects would have this set by default and the developer would have to go to the project properties to change it to get the new functionality. This means all existing projects still compile fine in C# 7 by default.

Warn: works as described in the current proposal. This would give developers a migration path to start coding for non-nullables without diving in completely.

Enforce: the default for new C# 7 projects. Most of the warnings from the "warn" option are now errors, and runtime null argument checks are generated for non-nullable parameters in public methods. This protects class libraries that are fully using non-nullables from consumers that have not been upgraded to use non-nullables yet.

There are a few reasons why I prefer this over depending on warnings-as-errors for those of us who want to go all-in on non-nullables. The first is that I may have other warnings that I don't have time to fix right now (calling deprecated APIs, for instance) because fixing them would require a large refactoring, but I would like to start using non-nullables right away. Another is the issue of class libraries that have opted in to using non-nullables but consumers that have not. In this situation the developer of the class library would still have to write null argument checks, even though she has opted into and is using non-nullable parameters. It's totally counter-intuitive.

Even what I suggest above still leaves the rather ambiguous case of opted-in code calling old code that has not opted-in. I think the compiler giving the benefit of the doubt and not raising errors or warnings in this case is the best route (as opposed to flooding the developer with hundreds of warnings), but the type should somehow be displayed differently in the IDE so a developer knows that they are dealing with unknown nullability.

As a side note, what is the plan for string v = default(string)? Would this produce a warning? It would be really nice if somehow default(string) was string.Empty, and default(string?) was null...

qrli commented 9 years ago

@jonathanmarston

Enforce: the default for new C# 7 projects. Most of the warnings from the "warn" option are now errors, and runtime null argument checks are generated for non-nullable parameters in public methods.

I prefer runtime checks everywhere only for debug builds. For release builds, I'd want runtime checks only on public members if I am building a library, or none at all if I am building an application.

As a side note, what is the plan for string v = default(string)? Would this produce a warning? It would be really nice if somehow default(string) was string.Empty, and default(string?) was null...

That may be too much of a breaking change, and it is incompatible with the generic case. And what about custom string-like classes (e.g. if I implement a Matrix class) that have their own Empty value? I think the simple rule is that default(string) == null, and that it is not allowed for a non-nullable variable.

tpetrina commented 9 years ago

Can we define default members for each class/struct? Something like

public class Matrix
{
    public static default Matrix Empty { get; } = new Matrix();
}

// works with
var emptyMatrix = default(Matrix);
// also for arrays
var matrices = new Matrix[5];
danieljohnson2 commented 9 years ago

@jonathanmarston, backwards compatibility would certainly hinder this feature, but it is still a very important thing. This is a "fundamental change to the way C# thinks about reference types", and having it turned on by a compiler switch means you have two C# dialects that you can't mix in the same project.

Turning that switch on isn't going to be easy for existing code: it's not enough to sprinkle your code with '?', you have to find the idioms needed to convince the compiler that you are checking for null and apply them throughout.

This is going to be ugly.

jonathanmarston commented 9 years ago

@tpetrina

That's sort of like what I had in mind. If default became an overridable operator, a class implementation could make default whatever makes sense for that type. But what would default be for a class that doesn't override it?

@danieljohnson2

I totally agree. This could get messy. Maybe rather than fundamentally change C# it's time for C## instead?

ljr1981 commented 9 years ago

The Eiffel language has completely solved null-pointer dereferencing once and for all. We refer to it as "Void Safety". There is a simple and elegant solution, which requires knowledge by the compiler and language constructs that do not obfuscate. The solution also involves a couple of keywords with new code constructs: detachable and attached.

The language constructs are called CAPs, or Certified Attachment Patterns. They are code patterns which the compiler can parse and from which it can logically deduce that a reference will not become Void at any point where it is used.

This technology works for both synchronous and asynchronous software.

HaloFour commented 9 years ago

@ljr1981

"attached" and "detached" read as non-nullable and nullable respectively. Most of CAP seems to just be flow-analysis. I'm not seeing much in the paper on the subject that seems really new given the type-checking and null coalescing operators that C# has today. I'd think that the C# compiler could apply such techniques today on existing reference types without requiring any syntax changes.

The paper does highlight a lot of the same problems facing C#, particularly where generics and arrays are concerned. The pain of the migration is also mentioned.

paulomorgado commented 9 years ago

This was a hard proposal to understand.

If I understand it correctly, this is nothing more than method contracts on steroids. There are no guarantees made by the compiler, and even fewer by the runtime. The compiler just does dataflow analysis on variables declared to be nullable.

So, those in the non-nullable reference types parade may store away their drums and pipes. This isn't, nor will it be in the foreseeable future, about non-nullable reference types.

Although I understand this is the best that can be done with the current runtime, there are a few things that bother me:

  1. Just as value types are inherently not nullable, reference types are inherently nullable. Likewise, if a modifier is needed to indicate that a value type is nullable, a modifier should be needed to indicate that a reference type is not nullable.
  2. Reference types are inherently nullable. This proposal artificially tries to pass them off as inherently not nullable.

@MadsTorgersen, can you elaborate on why a modifier was chosen for nullable reference type variables rather than for non-nullable reference type variables?

ndepend commented 9 years ago

I agree with @paulomorgado: "this sounds artificial" + "I understand this is the best that can be done with the current runtime".

Also, unless I missed this piece of info (I searched for it in the various responses), we'd really need to understand how this would work at the metadata level to give our definitive thoughts on this (see the first questions from @tpetricek and @rmschroeder).

I am a heavy consumer of the pattern bool TryGet(object obj, out T t); to avoid things like T Get(object obj) where the result can be null in case of failure. With this syntax the caller code would have to be:

T? t;
if (!TryGet(obj, out t)) { return; } // t is null if, and only if, TryGet() returns false
if (t == null) { return; }

which is awkward.

Also, some tools should be developed on top of Roslyn to automatically modify legacy code to avoid the warnings. This could be done for at least the simple cases.

dsaf commented 9 years ago

@MadsTorgersen do you have any comment about @FlorianRappl 's question?

...would the recommendation be to have both kinds of checks (a runtime guard and the compiler mechanism to have an initial warning)? Needless to say, the runtime guard may have other benefits and provides backwards compatibility, but also adds noise that could be reduced with the new approach.

It would be nice to know, because it affects the code written by us today. Thank you.

tpetrina commented 9 years ago

After littering my code with [NotNull] and fixing obvious nullref bugs, I am ready for shorthand syntax. Actually, no. I would prefer that by default references are non-nullable in my projects.

Adding a ! suffix for non-nullability is a readability nightmare; using ? for nullable references is a much better choice.

NightElfik commented 9 years ago

I am really excited that the Billion Dollar Mistake is being addressed! As people have already mentioned (@rmschroeder, @govert, @jonathanmarston), I am strongly in favor of the following behaviors:

  1. By default, reference types should be non-null.
  2. The non-null checking should be strictly enforced by the compiler (leak-proof); this might be a compilation option.

With the current proposal, do I understand correctly that even if I have a string argument, somebody can in fact pass me null? And that if I decide to check it for null, I will get a warning?

I am personally trying not to use null at all in my recent C# programs. I use a struct Option<T> where T : class for nullable types and I am really happy with it; I haven't seen a NullReferenceException in a very long time :) Language support would be amazing!
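
A minimal sketch of such an Option type (one possible shape; not necessarily @NightElfik's implementation):

public struct Option<T> where T : class
{
    private readonly T value;

    public Option(T value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        this.value = value;
    }

    // The default instance (value == null) represents "no value".
    public bool HasValue { get { return value != null; } }

    public T Value
    {
        get
        {
            if (value == null) throw new InvalidOperationException("Option is empty.");
            return value;
        }
    }

    public static Option<T> None { get { return default(Option<T>); } }
}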

I understand that there is a big problem in making it backwards compatible. On the other hand, this is a very, very important feature and a huge step forward for C#. I could imagine a compiler switch that would make the null checking leak-proof at the cost of breaking backwards compatibility, and otherwise just generate warnings.

Thanks for the proposal!