dotnet / csharplang

The official repo for the design of the C# programming language

Champion "Nullable reference types" (16.3, Core 3) #36

Open MadsTorgersen opened 7 years ago

MadsTorgersen commented 7 years ago

Note this includes the object constraint: where T : object.

LDM:

shmuelie commented 7 years ago

@Pzixel I wasn't arguing against this, sorry if it came out that way. I was more trying to say how I see the parallels between what C# vNext is trying to do and what TypeScript does with nulls.

MikeyBurkman commented 7 years ago

That's just the thing though -- TS does almost nothing special with nulls as compared to other values. There is some syntactic sugar of course, and null assertions are a little bit different than, for instance, seeing if something is a string or number. But fundamentally, nullability is done through union types, and null is essentially now its own type with no functions/members on it.

The important change in TS 2.x was making null no longer a top-level type that is assignable to anything. Once they did this, then doing let x: string = null; made about as much sense as let x: string = 123;. This is a very fundamental difference from what C# has. As long as null remains a top-level type in C#, the implementation of this proposal will almost certainly differ a lot from what TS has, unfortunately.
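
A small sketch of the TS 2.x behavior being described (requires the `--strictNullChecks` compiler flag; `len` is an illustrative name, not something from this thread):

```typescript
// Under --strictNullChecks, null is no longer assignable to everything.
// let bad: string = null;      // error, about as sensible as: let bad: string = 123;
let ok: string | null = null;   // nullability is an explicit union member

function len(s: string | null): number {
  // Narrowing: in the false branch of the conditional, s has type string.
  return s === null ? 0 : s.length;
}
```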

Maybe Kotlin is a better choice of language to copy? I don't know much about it personally, but it supposedly adds nullability checks without having a robust union type system like TS or Ceylon.

jeffanders commented 7 years ago

@benjamin-hodgson To answer your first question my proposal states "It is important to note that the nullable modifier is only encoded via an attribute and it therefore does not affect the runtime or assembly representation of T".

Therefore foo._value has type string at runtime. As far as the runtime is concerned string can legally contain nulls.

In my original proposal (dotnet/roslyn#4443) based on T! the very first point I address is runtime types. So the following conditions should always be true.

typeof(string?) == typeof(string) // true
string? s1 = ""; 
string s2 = "";
s1.GetType() == s2.GetType() // true

So for your Foo<int> example foo._value is just an int and your code as written still all works as expected for an int.

For Foo<int?>, foo._value is an int? (or Nullable<int>) and your code as written still all works as expected for an int? (that is to say you can set the Value property to null and retrieve that without an error).

I think it might be clearer to state that, under my proposal #403, for a preserving type parameter T, T? should be pronounced as "defaultable T" rather than "nullable T".

Going back to the T? FirstOrDefault<T>(this IEnumerable<T> e) example from my proposal, the same reasoning applies when running through your scenarios again, this time for the return type and the value that would be returned for an empty sequence.

This preserves the existing runtime semantics while allowing the compiler to warn us at compile time if we use the return value in a way we should not.
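
For concreteness, a sketch of such a "defaultable T" helper (this uses the `T?`-on-unconstrained-`T` spelling that eventually shipped in C# 9; the method name is illustrative, not from the proposal):

```csharp
#nullable enable
using System.Collections.Generic;

static class SequenceExtensions
{
    // "Defaultable T": for an empty sequence this returns null for
    // reference types and zero/default for value types.
    public static T? FirstOrDefaultSketch<T>(this IEnumerable<T> source)
    {
        foreach (var item in source)
            return item;
        return default; // no warning, because the declared return type is T?
    }
}
```

For a Foo<int>-style instantiation this returns 0 on an empty sequence; for reference types it returns null, and the compiler can then warn on unchecked use of the result, preserving runtime semantics exactly as described above.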

gulshan commented 7 years ago

The latest sprint summary in the Roslyn repo https://github.com/dotnet/roslyn/issues/18719 mentioned this feature as "Non-Null References". I'm glad the word "type" was omitted.

fubar-coder commented 7 years ago

@MadsTorgersen @gafter If #52 requires a change to the CLR anyway, then maybe it's an opportunity to add non-nullable reference types (enforced by the CLR) as well?

EDIT: This would avoid the automatic if (ReferenceEquals(x, null)) check generation in most places which should result in a performance boost (and not a penalty) when using non-nullable reference types.

fubar-coder commented 7 years ago

If this feature gets implemented using the NotNullable<T> struct, then only the following parts of the CLR need to be modified:

The CLR opcodes must be modified to disallow default initialization, and then we get full support for non-nullable reference types. This should be much easier than implementing real non-nullable reference types.

It is easier to forbid default(T) (see dotnet/csharplang#146) just for NotNullable<T>, because for ldelem we can easily check whether the returned element is initialized (or invalid) by testing whether it's a null reference pointer.

The performance for a NotNullable<T> struct should increase as soon as dotnet/coreclr#11407 gets implemented.

There should be a project setting to enable automatic wrapping of reference types not annotated with ? in a NotNullable<T> struct. This would also mean inserting .Value accesses whenever a member of T is accessed. Compilers without support for NotNullable<T> would continue to use nullable reference types without breaking any compatibility. You could also explicitly use the NotNullable<T> struct without native compiler support.

An implicit conversion from nullable T to non-nullable T must emit a warning (or maybe even an error) - except when suppressed with a !. The use of the ! must not result in a null check.

Special consideration must be taken in cases like Dictionary<K,V>.TryGetValue(K, out V) with V being a NotNullable<T>. There are two possible solutions using new functions:

A NotNullable<T>? must be converted to T.
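
For concreteness, the wrapper the comment hypothesizes might look roughly like this. Note this is entirely hypothetical: no NotNullable<T> type exists in the BCL, and the default-initialization hole it documents is exactly what the comment wants the CLR to close:

```csharp
using System;

// Hypothetical wrapper; NOT a real BCL type.
public readonly struct NotNullable<T> where T : class
{
    private readonly T _value;

    public NotNullable(T value) =>
        _value = value ?? throw new ArgumentNullException(nameof(value));

    // A default-initialized NotNullable<T> (e.g. an element of
    // new NotNullable<string>[1]) still holds null; the CLR-level
    // change proposed above would forbid reaching this state.
    public T Value =>
        _value ?? throw new InvalidOperationException("Uninitialized NotNullable<T>.");

    public static implicit operator NotNullable<T>(T value) => new NotNullable<T>(value);
    public static implicit operator T(NotNullable<T> value) => value.Value;
}
```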

Kukkimonsuta commented 7 years ago

Nullability adornments should be represented in metadata as attributes. This means that downlevel compilers will ignore them.

It was noted before that typeof(string?) is equal to typeof(string). However, how will Task<string?> be represented? Consider the following example:

var stringType = typeof(Task<string>).GetGenericArguments()[0];
var nullableStringType = typeof(Task<string?>).GetGenericArguments()[0];

// as stated before, types should be equal
stringType == nullableStringType;

// but this returns null
stringType.GetTypeInfo().GetCustomAttribute<NotNullAttribute>();

// and this somehow should return `NotNullAttribute`
nullableStringType.GetTypeInfo().GetCustomAttribute<NotNullAttribute>();

Richiban commented 7 years ago

@Kukkimonsuta

stringType.GetTypeInfo().GetCustomAttribute<NotNullAttribute>();

I think there will need to be a special provision made for generic methods in general, i.e. the compiler will have to tag methods to say "I guarantee that this method does not return default(T)".

mattwar commented 7 years ago

@Kukkimonsuta the attributes are not on the constructed generic type, they are on the members of types that refer to it, and in the compiler's logic only inside method bodies.
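
For what it's worth, the design that eventually shipped in C# 8 follows exactly this shape: typeof and GetGenericArguments see identical, unannotated types, while the compiler emits nullability metadata (NullableAttribute / NullableContextAttribute) on the members that use them. A reflection sketch under that assumption:

```csharp
#nullable enable
using System;
using System.Linq;
using System.Threading.Tasks;

class Demo
{
    // Mixed annotations force the compiler to record per-position
    // nullability metadata for this member.
    public Task<string?> M() => Task.FromResult<string?>(null);
}

class Program
{
    static void Main()
    {
        var method = typeof(Demo).GetMethod("M")!;

        // The annotation is erased from the runtime type:
        Console.WriteLine(method.ReturnType == typeof(Task<string>)); // True

        // ...but nullability metadata survives on the member (or its
        // declaring scope), not on the constructed generic type:
        bool annotated = method.ReturnParameter.GetCustomAttributesData()
            .Concat(method.GetCustomAttributesData())
            .Concat(typeof(Demo).GetCustomAttributesData())
            .Any(a => a.AttributeType.Name.StartsWith("Nullable"));
        Console.WriteLine(annotated); // True
    }
}
```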

HaloFour commented 7 years ago

@Kukkimonsuta

At the boundary those types would still be adorned with attributes. They would likely work similarly to the DynamicAttribute which is used to denote when a normal object parameter should be treated as dynamic.

For example:

public Dictionary<int, (string, dynamic)> M() { ... }

//is really
[Dynamic(new bool[] { false, false, false, false, true })]
public Dictionary<int, ValueTuple<string, object>> M() { ... }

The array of bools counts recursively into the generic types and their generic type arguments as follows:

  1. false - Dictionary<,>
  2. false - int
  3. false - ValueTuple<,>
  4. false - string
  5. true - object

Kukkimonsuta commented 7 years ago

God that's ugly, but clever :) Thanks for explaining.

fubar-coder commented 7 years ago

Uh, isn't using dynamic also very slow?

Pzixel commented 7 years ago

@fubar-coder it's beside the point, but yes.

HaloFour commented 7 years ago

@Kukkimonsuta

I won't argue against that. 😁

@fubar-coder

Not due to the way they are encoded in attributes.

Joe4evr commented 7 years ago

@fubar-coder

If this feature gets implemented using the NotNullable<T> struct [...]

Nothing in your post beyond this point is relevant, because this proposal does not mention anything at all about a wrapper type. The whole thing is just compile-time static analysis, where the developer can declare that a variable/parameter/field/return type can or cannot have null as an intended/handled value.

In fact, as the team had already discussed over two years ago:

Probably the most damning objection to the wrapper structs is probably the degree to which they would hamper interoperation between the different variations of a type. For instance, the conversion from string! to string and on to string? wouldn't be a reference conversion at runtime. Hence, IEnumerable<string!> wouldn't convert to IEnumerable<string>, despite covariance.

And this was when T! was still a proposed syntax and types would have different variations to declare with. Now the proposal is that the types are really the same under the hood, and the compiler assumes that null isn't a valid/intended value at declaration by default (which is like 95~99% of the time).

The attribute-based approach has been deemed the best option by the LDM team because A) it's not a runtime breaking change and B) doesn't hamper interop with prior language versions and stuff like generics. The only downsides then are A) it's opt-in behavior* and B) it's not a silver bullet (developers can choose to ignore the warnings and the runtime can still allow null to be assigned anyway).

* Would be pretty nice if a future version of VS would auto opt-in to nullability analysis on File -> New Project as part of the templates. It doesn't have to be the same version of VS that the feature ships with, just something to consider.

fubar-coder commented 7 years ago

My main problems with this approach are that it might hurt performance and that it really doesn't help against NREs. ReSharper uses the same approach, and it's difficult to get right and you can still get NREs. IOW: a developer gains nothing besides a slight feeling that their code might be a little bit more stable.

yaakov-h commented 7 years ago

@fubar-coder I agree, but there are no good options at this point for enforcing non-nullability at runtime in a backwards-compatible manner.

I'd love to see something like Swift's nullability model, but that's clearly not going to happen.

What a developer does gain, if the analysis is correct or close to correct, is warnings at obvious places where null dereferences might occur, and confidence that the analysis system knows the code is safe.

Or, if a developer suppressed warnings (with #pragma or postfix-! or whatever), one would hope that the developer in question knows the codebase better than the flow analysis does.

I currently work on several extremely large codebases which employ Code Contracts. The value from that means I can count on my fingers the number of NREs we've had in Contracted areas of the codebase. It isn't perfect, but it's damn good (when it works), and I'd love to replace it with this proposal.

Joe4evr commented 7 years ago

Something I've been wondering for a while: Will there be room for better refinement in versions after nullability analysis initially ships? Specifically, I'm thinking additional attributes that API authors can apply to indicate when some property/field can (or even will) be null and thus give more accurate feedback to consumers of the API.

Pzixel commented 7 years ago

@Joe4evr declare a type without the ! suffix and it's going to be nullable. What extra attributes do you want here?

@all Do we really want the ! syntax here? Then we get an inconsistent mix of !, ?, and no suffix for nullable and non-nullable reference and value types. I understand that we want to preserve backward compatibility, but maybe we should break things here, like the foreach loop closure change in C# 5.0. I guess in virtually 99% of code we don't want nulls, so we'd have to spam these ! everywhere in our codebases, just like ConfigureAwait(false) today. Why can't we just accept that starting with C# 8.0, string is a not-null type, not a nullable one? It's quite easy to migrate old code to the new style: just add question marks everywhere. So we could break things and write a simple migration tool that performs all the required operations. That's much better than introducing an inconsistent syntax. Eric Lippert agrees with me here:

Ritchie's wry remark illustrates the lesson. To avoid the cost of fixing a few thousand lines of code on a handful of machines, we ended up with this design error repeated in many successor languages that now have a corpus of who-knows-how-many billion lines of code. If you're going to make a backward-compatibility-breaking change, no time is better than now; things will be worse in the future.

Kukkimonsuta commented 7 years ago

@Pzixel I think ! was in an earlier proposal; the latest proposal states that string would be not-null and string? would be nullable. Even though this is a breaking change, there is also supposed to be an opt-in/out mechanism, so you can actually use C# 8.0 without being forced to update your code.

Pzixel commented 7 years ago

@Kukkimonsuta I hope so because I'm tired of writing something like

public T GetValueOrDefault<T>(...) where T : class => null;
public T? GetValueOrDefault<T>(...) where T : struct => null;

I wish I had only the latter variant, because if I don't set the class constraint I cannot use null as a return value, and if I do, then I can't use T? to perform operations over value types. In this case I have to duplicate all APIs to work with classes and structs. If we accept a unified syntax with ?, we'd be able to write algorithms like this very consistently.

jnm2 commented 7 years ago

@Pzixel I wish I even could write that. I can only write this:

public T GetValueOrDefault<T>(...) where T : class => null;
public T? GetValueOrNull<T>(...) where T : struct => null;

navozenko commented 7 years ago

I have a question: what will happen to the API of the standard .NET libraries? Will it remain as-is, or will it be adjusted to the new syntax? For example, if a standard .NET method/property can accept or return null, how should I use it?

Pzixel commented 7 years ago

@navozenko I don't see how the syntax would change the API. It stays the same, except that in some places the compiler will insert attributes like "cannot be null". You don't consume the source code of .NET libraries, you consume compiled binaries. And they remain the same, except for some extra attributes that let a modern Visual Studio warn with things like "you check this for null when it cannot be null" or "possible null dereference". Older VS won't see any changes.

Joe4evr commented 7 years ago

@navozenko You may be under the mistaken impression (like some others also were) that this feature introduces new types for nullable and non-nullable references, but that's not the case. A string? would still be the exact same thing to the CLR as a string, it's just that the former has some attributes on it to allow compilers to see if that variable/field/return value/parameter may be null and would warn if you try to dereference it without checking first. As I stated earlier:

A) it's not a runtime breaking change and B) doesn't hamper interop with prior language versions and stuff like generics. The only downsides then are A) it's opt-in behavior and B) it's not a silver bullet (developers can choose to ignore the warnings and the runtime can still allow null to be assigned anyway).

When it comes to the BCL, I presume that it would be updated with the appropriate annotations wherever a null is considered to be a possible/handled value, Herculean task though it may be. (Most of this feature would fall flat on its face if the BCL didn't lead the way, IMO.)

navozenko commented 7 years ago

@Pzixel @Joe4evr I did not understand: will all parameters in standard .NET libraries be interpreted as nullable? Or will the attributes "nullable" and "non-nullable" be placed on them?

For example, which list will return a LINQ query or ToString(): nullable or non-nullable? That is, how will I write:

List<Foo> foos = items.Where(...).ToList();
string s = foo.ToString();

or

List<Foo>? foos = items.Where(...).ToList();
string? s = foo.ToString();

Pzixel commented 7 years ago

I did not understand: will all parameters in standard .NET libraries be interpreted as nullable? Or will the attributes "nullable" and "non-nullable" be placed on them?

The latter.

For example, which list will return a LINQ query or ToString(): nullable or non-nullable? That is, how will I write:

I think it will be

List<Foo> foos = items.Where(...).ToList();
string? s = foo.ToString();

Because ToList cannot return null by design, while ToString can be overloaded and return any string including null.

Joe4evr commented 7 years ago

while ToString can be ~~overloaded~~ overridden and return any string including null.

Yes, but then the override has to be declared like this:

public override string? ToString()
{
    return null; //since null is a returned value, the return type should reflect that
}

Pzixel commented 7 years ago

@Joe4evr an override (yes, thank you) cannot change the signature. So it would have to be string? on object. And yes, the declaration of the overridden method would be that one.

sharwell commented 7 years ago

@Pzixel object.ToString() has a non-null post-condition, so it would not be updated to have the return type string?.

Pzixel commented 7 years ago

@sharwell where? http://referencesource.microsoft.com/#mscorlib/system/object.cs,ff31a6bf27c58f89,references

shmuelie commented 7 years ago

@Joe4evr @Pzixel @sharwell overriding will not be affected by this, since it's an attribute on the method. Attributes are not part of the signature.

gafter commented 7 years ago

@SamuelEnglard The compiler will certainly give a warning when a method's override relaxes the nullability guarantee on its returned value. Although these nullability annotations are recorded in assemblies as mere attributes, the compiler pays attention to them.
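
A minimal sketch of that override scenario (using the eventually-shipped C# 8 syntax; the relaxed nullability on the override produces a warning but still compiles):

```csharp
#nullable enable
class Base
{
    public virtual string Describe() => "base";
}

class Derived : Base
{
    // Warning: nullability of the return type doesn't match the
    // overridden member. It compiles anyway, which is exactly the
    // point being debated in this thread.
    public override string? Describe() => null;
}
```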

shmuelie commented 7 years ago

@gafter True. My point was that since it's not changing the signature, you can change nullability (whether or not you should is a different story). So even though ToString() on Object says not nullable, you can make it nullable in your override.

Pzixel commented 7 years ago

@SamuelEnglard That's just because the C# team doesn't want to make these types first-class citizens. You can assign null to a not-null variable and suppress the warning in the same manner, or change a readonly field via reflection. That's beside the point, really. If the base object says it's not null, you cannot change it. You get the same warning as when assigning null to not-null; it's effectively an error, which is the best the C# team can offer without breaking things.

shmuelie commented 7 years ago

My comment was not to say that what they're doing is wrong. My point was simply that while yes an override cannot change the signature of the method, nullability is not part of the signature (though it looks like it is). I'm not saying whether anything is good or bad.

Pzixel commented 7 years ago

My position is that we have to treat these warnings as errors, because they are warnings only due to backward compatibility. They really don't differ from other errors such as "not all members of the interface are implemented".

I wrote a big post, but I deleted it; to summarize: nullability is part of the signature at the C# level. Yes, in IL it's the same, but for C# it's an error. This is really like readonly: at the IL level you can modify it, but not in C#. So if we're talking about the CLR, yes, you can change nullability; but talking about C#, no way. You get a "warning" which is actually an error; think of it as if C# had no type errors at all but only "warnings". Yet another reason to treat warnings as errors.

fubar-coder commented 7 years ago

If this is solved using attributes, wouldn't that cause problems with existing .NET runtimes? LDM-2017-02-21.md mentions this problem.

Pzixel commented 7 years ago

@fubar-coder it has nothing to do with generic attributes. Could you explain your position in more detail?

fubar-coder commented 7 years ago

@Pzixel I misread the title. I thought the problem would be that the .NET runtime might have problems with attributes attached to generic type arguments.

Joe4evr commented 7 years ago

Nah, the CLR has supported attaching attributes to generic type parameters forever (there's even an AttributeTargets value for them). It's attributes that are themselves generic that are talked about there.

shmuelie commented 7 years ago

@Pzixel

I wrote a big post, but I deleted it; to summarize: nullability is part of the signature at the C# level. Yes, in IL it's the same, but for C# it's an error. This is really like readonly: at the IL level you can modify it, but not in C#. So if we're talking about the CLR, yes, you can change nullability; but talking about C#, no way. You get a "warning" which is actually an error; think of it as if C# had no type errors at all but only "warnings". Yet another reason to treat warnings as errors.

I get what you're saying, BUT nullability isn't even part of the signature on the C# level. You can't use it as a way to overload, so void A(string str) and void A(string? str) won't compile (unless I've misunderstood the standard).

Pzixel commented 7 years ago

@SamuelEnglard of course it compiles. It just produces a warning. None of these changes will produce any additional errors at all! So you can write literally anything, mixing nulls and non-nulls, and still have a successful compilation. I guess you misunderstood the standard, yes. It's all about warnings, not errors. However, I'll do my best to force these warnings to be errors on my projects when this feature releases.

shmuelie commented 7 years ago

// C# with non-null reference types
class SomeClass
{
    void SomeMethod(string str) { };
    void SomeMethod(string? str) { };
}

Isn't valid C# (let alone IL).

HaloFour commented 7 years ago

@Pzixel

No, that won't compile because both methods have the same signature: void SomeMethod(System.String). You can't overload based on parameter attributes.

Yes, it's a shame that this feature can't have more teeth out of the gate, but I understand why. It's a question of adoption. Sure, there will be some developers who jump on this right out of the gate. They'll immediately opt-in and set those warnings to errors and spend the time and effort to correct their codebase as soon as possible. I'll be in that camp and as I suspect that most people here will be also. But for the majority of developers this represents a massive breaking change and a lot of additional work that they would rather not have to do (at least all at once) and if they are immediately greeted by a wall of error messages they'd be more likely to shy away from updating. Hopefully, over the course of a couple releases as the BCL and common third party assemblies are updated, the safeguards can become not only default but also stricter.

gulshan commented 7 years ago

I really want "Non-nullable reference types" as mentioned in the title, not just "Non-nullable references" actually proposed in the proposal. It seems the proposed backward compatible "soft-break" may bring more liabilities into the ecosystem rather than making the language safer. I think, maintaining binary backward compatibility, breaking source compatibility with an option for opt-out using explicit language version as an argument to the compiler is the way to go. That's just my opinion though.

Pzixel commented 7 years ago

@HaloFour sorry, I misread it. I meant that if you have virtual void A(string) and override void A(string?), it will compile with a warning. Of course you cannot declare two methods with the same signature in one class.

@gulshan I believe the .NET team just cannot afford the kind of breaking change you propose. I'd like this feature too. However, it requires an unreal amount of work for a small profit: safety at runtime. If you check everything at compile time, you just don't need it. The only drawback is that you cannot do some things, like having an overload of the same type but nullable, and so on. But I don't believe it's really worth performing a huge amount of work for these rare use cases. I was on your side some time ago, but now I see that the .NET team's way is better for many reasons.

Joe4evr commented 7 years ago

I think, maintaining binary backward compatibility, breaking source compatibility with an option for opt-out using explicit language version as an argument to the compiler is the way to go.

Remember that this conversation has been going on for years. I think both MS and the community discussed pretty much every possible option by this point, and after weighing all the pros and cons of each of those, MS concluded that this will be the one to go with.

Yes, it's opt-in and not the silver bullet that people had hoped for, but after so many discussions, it's damn well better than nothing at all. The knot has been cut, and changing the decision now is near impossible.

I meant that if you have virtual void A(string) and override void A(string?), it will compile with a warning.

Which is exactly what Gafter said.

HaloFour commented 7 years ago

Yes, it's opt-in and not the silver bullet that people had hoped for, but after so many discussions, it's damn well better than nothing at all.

I believe that it was also mentioned somewhere that the opt-in+warning approach might be the first iteration and that after wider adoption the default behavior from the compiler might evolve towards opt-out+error. I have no cite for that but this was probably during the Codeplex timeframe and I don't feel like digging through all of those issues and comments trying to find it.

Joe4evr commented 7 years ago

after wider adoption the default behavior from the compiler might evolve towards opt-out+error.

While I may or may not recall the same thing (human memory is quite easily tricked), it still won't be the silver bullet until it can be enforced at runtime.

To be clear: I'm perfectly fine with how the proposal stands right now. I'm just stating the argument that the last few skeptics still have.