dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

C# Design Notes for Mar 10 and 17, 2015 #1648

Closed MadsTorgersen closed 8 years ago

MadsTorgersen commented 9 years ago

C# Design Meeting Notes for Mar 10 and 17, 2015

Agenda

These two meetings looked exclusively at nullable/non-nullable reference types. I've written them up together to capture the clarity of insight we had by the time the meetings were over, rather than to retrace the circuitous path we took to get there.

  1. Nullable and non-nullable reference types
  2. Opt-in diagnostics
  3. Representation
  4. Potentially useful rules
  5. Safely dereferencing nullable reference types
  6. Generating null checks

1. Nullable and non-nullable reference types

The core features on the table are nullable and non-nullable reference types, as in string? and string! respectively. We might do one or both (or neither of course).

The value of these annotations would be to allow a developer to express intent, and to get errors or warnings when working against that intent.
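
To illustrate that intent, here is a sketch in the proposed strawman syntax (not valid C# today; GetName, GetComment and GetLegacy are hypothetical):

string! name = GetName();       // declared never null: dereference freely
string? comment = GetComment(); // declared possibly null: check before use
string legacy = GetLegacy();    // unannotated: today's meaning, intent unknown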

2. Opt-in diagnostics

However, depending on various design and implementation choices, some of these diagnostics would be a breaking change to add. To get the full value of the new feature while retaining backward compatibility, we therefore probably need to make enforcement of some or most of these diagnostics opt-in. That is certainly an uncomfortable concept, and adding switches that change the meaning of the language is not something we have much of an appetite for.

However, there are other ways of making diagnostics opt-in. We now have an infrastructure for custom analyzers (built on the Roslyn infrastructure). In principle, some or all of the diagnostics gained from using the nullability annotations could be custom diagnostics that you'd have to switch on.
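
As a minimal sketch of what such a custom diagnostic could look like on the Roslyn analyzer infrastructure (the rule ID, messages, and placeholder detection logic here are assumptions for illustration, not a shipped analyzer):

using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public class NullabilityAnalyzer : DiagnosticAnalyzer
{
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        id: "NULL001",
        title: "Possible null dereference",
        messageFormat: "Expression may be null; check for null before dereferencing",
        category: "Nullability",
        defaultSeverity: DiagnosticSeverity.Warning,
        isEnabledByDefault: false); // opt-in: off unless the project enables it

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        context.RegisterSyntaxNodeAction(AnalyzeMemberAccess, SyntaxKind.SimpleMemberAccessExpression);
    }

    private static void AnalyzeMemberAccess(SyntaxNodeAnalysisContext context)
    {
        var access = (MemberAccessExpressionSyntax)context.Node;
        // Real logic would consult the nullability annotations and flow state;
        // this placeholder only shows where the diagnostic would be reported.
        if (LooksNullable(access.Expression, context.SemanticModel))
            context.ReportDiagnostic(Diagnostic.Create(Rule, access.Name.GetLocation()));
    }

    private static bool LooksNullable(ExpressionSyntax expression, SemanticModel model)
        => false; // placeholder for annotation- and flow-based checking
}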

The downside of opt-in diagnostics is that we can forget any pretense of guarantees around nullability. The feature would help you find more errors, and might guide you in VS, but you wouldn't be able to automatically trust a string! not to be null.

There's an important upside though, in that it would allow you to gradually opt your code into the nullability checks, one project at a time.

3. Representation

The representation of the annotations in metadata is a key decision point, because it affects the number of diagnostics that can be added to the language itself without it being a breaking change. There are essentially four options:

  1. Attributes: We'd have string? be represented as string plus an attribute saying it's nullable. This is similar to how we represent dynamic today; for generic types etc. we'd use the same tricks we use for dynamic.
  2. Wrapper structs: There'd be struct types NullableRef<T> and NonNullableRef<T> or something like that. The structs would have a single field containing the actual reference (see the sketch after this list).
  3. Modreq's: These are annotations in metadata that cause an error from compilers that don't know about them.
  4. New expressiveness in IL: Something specific to denote these that only a new compiler can even read.
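
For concreteness, the wrapper structs from option 2 might look something like this (a sketch: only the type names come from the notes, the members shown are assumptions):

using System;

public struct NonNullableRef<T> where T : class
{
    private readonly T value; // the single field holding the actual reference

    public NonNullableRef(T value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        this.value = value;
    }

    // Null can still be observed via default(NonNullableRef<T>), which is
    // exactly the loophole discussed under the rules below.
    public T Value => value;
}

public struct NullableRef<T> where T : class
{
    private readonly T value; // null is a legal state here

    public NullableRef(T value) { this.value = value; }

    public bool HasValue => value != null;
    public T Value => value;
}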

We can probably dispense with 3 and 4. We've never used modreq's before, and who knows how existing compilers (of all .NET languages!) will react to them. Besides, they cannot be used on type arguments, so they don't have the right expressiveness. A truly new metadata annotation has similar problems with existing compilers, and also seems like overkill.

Options 1 and 2 are interesting because they both have meaning to existing compilers.

Say a library written in C# 7 offers this method:

public class C
{
    public static string? M(string! s) { ... }
}

With option 1, this would compile down to something like this:

public class C
{
    [Nullable] public static string M([NonNullable] string s) { ... }
}

A consuming program in C# 6 would not be constrained by those attributes, because the C# 6 compiler does not know about them. So this would be totally fine:

var l = C.M(null).Length;

Unfortunately, if something is fine in C# 6 it has to also be fine in C# 7. So C# 7 cannot have rules to prevent passing null to a nonnullable reference type, or prevent dereferencing a nullable reference type!

That's obviously a pretty toothless - and hence useless - version of the nullability feature in and of itself, given that the value was supposed to be in getting diagnostics to prevent null reference exceptions! This is where the opt-in possibility comes in. Essentially, if we use an attribute encoding, we need all the diagnostics that make nullability annotations useful to be opt-in, e.g. as custom diagnostics.
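
For concreteness, the attributes in this encoding might be declared like this (a sketch: the names come from the example above, everything else is an assumption):

using System;

[AttributeUsage(AttributeTargets.Method | AttributeTargets.Parameter |
                AttributeTargets.Field | AttributeTargets.Property |
                AttributeTargets.ReturnValue)]
public sealed class NullableAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Method | AttributeTargets.Parameter |
                AttributeTargets.Field | AttributeTargets.Property |
                AttributeTargets.ReturnValue)]
public sealed class NonNullableAttribute : Attribute { }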

With option 2, the library would compile down to this:

public class C
{
    public static NullableRef<string> M(NonNullableRef<string> s) { ... }
}

Now the C# 6 program above would not compile. The C# 6 compiler would see structs that can't be null and don't have a Length. Whatever members those structs do have, though, would be accessible, so C# 7 would still have to accept using them as structs. (We could mitigate this by not giving the structs any public members).

For the most part, this approach would make the C# 6 program able to do so little with the API that C# 7, instead of adding restrictions, can allow more things than C# 6.

There are exceptions, though. For instance, casting any returned such struct to object would box it in C# 6, whereas presumably the desired behavior in C# 7 would be to unwrap it. This is exactly where the CLR today has special behavior, boxing nullable value types by first unwrapping to the underlying type if possible.
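
For reference, the existing special-case behavior for nullable value types is easy to observe:

using System;

class NullableBoxing
{
    static void Main()
    {
        int? some = 42;
        object boxed = some;                   // boxes the underlying int,
        Console.WriteLine(boxed.GetType());    // not Nullable<int>: System.Int32

        int? none = null;
        object boxedNone = none;               // boxes to a null reference
        Console.WriteLine(boxedNone == null);  // True
    }
}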

Also, having these single-field structs everywhere is likely going to have an impact on runtime performance, even if the JIT can optimize many of them away.

Probably the most damning objection to the wrapper structs is the degree to which they would hamper interoperation between the different variations of a type. For instance, the conversion from string! to string and on to string? wouldn't be a reference conversion at runtime. Hence, IEnumerable<string!> wouldn't convert to IEnumerable<string>, despite covariance.

We are currently leaning strongly in the direction of an attribute-based representation, which means that there needs to be an opt-in mechanism for enforcement of the useful rules to kick in.

4. Potentially useful rules to enforce

Don't dereference C?: you must check for null or assert that the value is not null.

Don't pass null, C or C? to C!: you must check for null or assert that the value is not null.

Don't leave C! fields unassigned: require definite assignment at the end of the constructor. (Doesn't prevent observing null during initialization)

Avoid default(C!): it would be null!

Don't instantiate C![]: its elements would be null. This seems like a draconian restriction - as long as you only ever read elements of the array that were previously written, no-one would observe the default value. Many data structures wrapping arrays observe this discipline.

Don't instantiate G<C!>: this is because the above rules aren't enforced even on unconstrained type parameters, so they could be circumvented in generic types and methods. Again, this restriction seems draconian. No existing generic types could be used on nonnullable reference types. Maybe the generic types could opt in?

Don't null-check C!: oftentimes using e.g. ?. on something that's already non-nullable is redundant. However, since non-nullable reference types can still be null, maybe flagging such checks is not always so helpful?
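
To make these concrete, here are examples, in the strawman syntax (not valid C# today; GetInput and Greet are hypothetical), that each rule would flag:

string? s = GetInput();
int len = s.Length;             // rule 1: dereferencing C? without a null check

void Greet(string! name) { }
Greet(null);                    // rule 2: passing null to C!

class Person { string! Name; }  // rule 3: Name must be definitely assigned
                                // in every constructor

var d = default(string!);       // rule 4: d would be null

var a = new string![10];        // rule 5: the elements start out null

var list = new List<string!>(); // rule 6: existing generics can't guarantee
                                // non-nullability for T

string! t = GetInput();
var len2 = t?.Length;           // rule 7: null check on C! is redundant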

We very much understand that these rules can't be perfect. The trade-off needs to be between adding value and allowing continuity with existing code.

5. Safely dereferencing nullable reference types

For nullable reference types, the main useful error would come from dereferencing the value without checking for null. That would often be in the shape of the null-conditional operator:

string? s = ...;
var l = s?.Length;
var c = s?[3];

However, just as often you'd want the null test to guard a block of code, wherein dereferencing is safe. An obvious candidate is to use pattern matching:

string? ns = ...;
if (ns is string! s) // introduces non-null variable s
{
    var l = s.Length;
    var c = s[3];
}

It is somewhat annoying to have to introduce a new variable name. However, in real code the expression being tested (ns in the above example) is more likely to be a complex expression, not just a local variable. Or rather, the is expression is how you'd get a local variable for it in the first place.

More annoying is having to state the type again in ns is string! s. We should think of some shorthand, like ns is ! s or ns is var s or something else.

Whatever syntax we come up with here would be equally useful to nullable value types.
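
For instance, with a nullable value type the corresponding test might look like this (a sketch using the same pattern-matching shape as above):

int? ni = ...;
if (ni is int i) // introduces a plain int i when ni has a value
{
    var doubled = i * 2;
}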

6. Generating null checks for parameters

There'd be no guarantee that a string! parameter actually isn't going to be null. Most public APIs would probably still want to check arguments for null at runtime. Should we help with that by automatically generating null checks for C! parameters?

Every generated null check is performance overhead and IL bloat. So this may be a bad idea to do on every parameter with a non-nullable reference type. But we could have the user more compactly indicate the desire to do so. As a complete strawman syntax:

public void M(string!! s) { ... }

Where the double !! means the type is non-nullable and a runtime check should be generated.
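
The assumed lowering might look like this (a sketch; Body stands in for the method body):

// Source:
public void M(string!! s) { Body(); }

// Generated (sketch):
public void M(string s)
{
    if (s == null) throw new ArgumentNullException(nameof(s));
    Body();
}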

If we choose to also do contracts (#119), it would be natural for this feature to simply be a shorthand for a null-checking requires contract.

noodlefrenzy commented 9 years ago

I definitely think option 1 (Attributes/Opt-in diagnostics) is the right approach. The opt-in model also allows you to defer the decision on which rules to enforce to the user - they could opt-in to increasing levels of pedantry.

One fringe benefit you don't mention with respect to the Attribute/Opt-in model is that it would provide a ready-made boilerplate for others looking to add similar attributes/checks to their code. Given that this is likely the most useful model for an organization looking to add their own custom checks, having that boilerplate would be handy.

HaloFour commented 9 years ago

I think the attribute route is right, too. Given that massive sweeping changes to the CLR and framework are likely out of the question non-nullable reference types seems better represented as annotative for analysis purposes. This is how it works in Java, although I think that with integrated compiler support it can be a little better in C#. IIRC, this is also how Apple Swift handles it since Cocoa and that entire framework has no concept of optional types. But, like with Cocoa, once a direction has been established I do believe that it is very important that the entire framework be updated to support these attributes.

Obviously going the attribute route brings limitations. Statements like default(T!) largely don't make sense, although if T had a parameterless constructor you could just convert it to an instantiation. I don't know if such behavior is worth it. And then what about List<T!>? That's probably not an uncommon use-case, but how could that be represented? Maybe that's an argument for the wrapper struct route.

As for dereferencing, I think that the compiler could intelligently determine scopes where the variable could not be null and permit direct dereference without having to assign to a new variable, e.g.:

string? x = ...;
if (x != null) {
    int i = x.Length; // legal, x can't possibly be null here
}

This is similar to how Java analyzers in IntelliJ deal with nullability. The pattern matching syntax also works well.

But do we really need both T? and T! for reference types? Seems that the former is implied.

MgSam commented 9 years ago

This feature feels like something that will be weak in practice, or worse, misleading, and thus rarely used. Adding syntax that has no real teeth lulls the programmer into a false sense of security, making them elide null checks and ?. dereferences which they otherwise might have been doing defensively, potentially leading to more unexpected NullReferenceExceptions.

Since the best we can do is make compile-time suggestions that a variable is nullable or not, why not just use attributes [Null] and [NotNull]? This then doesn't require any special, and misleading, new syntax. This is the tack ReSharper takes, and I think the suggestions it gives around null values are probably as good as you'll be able to get with your own analyzers.

Also agree with HaloFour that the T? syntax is unnecessary in either scenario. All reference types are already nullable - why introduce a redundant syntax?

I appreciate that this is a commonly requested feature and you guys want to provide a solution here - but it seems pretty clear the solution is not going to be much better than the tools that are already available to the community. I'd much rather see design effort spent on new features which will have a large impact, rather than on half-baked solutions to impossible problems.

orthoxerox commented 9 years ago

I agree this change sounds toothless. Backwards compatibility has always been Microsoft's forte, so I understand your reluctance to introduce breaking changes, but I don't think we need special syntax for something that cannot be enforced across assemblies. Attributes, perhaps with automatic Contract.Requires/Contract.ForAll insertion, sound like the least confusing change.

qrli commented 9 years ago

Same opinion here. The attribute approach looks like the right direction. ? and [CanBeNull] are redundant; either ! or [NotNull] is enough. Since it requires analyzers to enforce, maybe it should be considered together with contracts?

Przemyslaw-W commented 9 years ago

I don't think "string?" is redundant syntax. It clearly signals the intention that null is allowed here, as opposed to just "string", which can be inherited from today's code and which a future compiler and diagnostics cannot reason about. Today we need to write "string" for both cases, and a future diagnostic cannot be sure which case - "string?" or "string!" - we had in mind.

I personally think that non-nullable reference types are as big and important a feature as generics were for C# 2.0, so the ultimate approach would be CLR/IL changes. But it looks like that is not going to happen, unfortunately. As such, I also agree attributes seem the best approach.


mirhagk commented 9 years ago

I like all the ideas, but I'd also like to make the suggestion to always generate the null checks while in debug mode. Yes it's additional IL bloat and added overhead, but for debug mode that's okay. That will hopefully strike a good balance of having the checks while developing, but during release they don't cause performance issues.
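
For comparison, one way to get such checks only in debug builds today; Debug.Assert calls are compiled away when DEBUG is not defined:

using System.Diagnostics;

public class Greeter
{
    public string Greet(string name)
    {
        // Stripped from release builds because Debug.Assert is
        // marked [Conditional("DEBUG")].
        Debug.Assert(name != null, "name must not be null");
        return "Hello, " + name;
    }
}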

In the long term is it possible to have the compiler aware of the analyzers that have run, or possibly even have optimizers be additional components like analyzers? That way the compiled code can be optimized with the advantage of the guarantees that the analyzer enforces. (The same idea would be applicable for purity analyzers, perhaps others as well)

sharwell commented 9 years ago

Additional diagnostics to consider (pulled from a NullUsageProcessor I implemented in Java).

SriramSakthivel commented 9 years ago

How is that nullable reference type going to help? Reference types are already nullable, aren't they? What would be the difference between string and string? ?

mirhagk commented 9 years ago

@SriramSakthivel The point of string? is to make it clear that you know the string could be null. In an ideal codebase there would be no plain string, since you'd make a conscious decision about whether something could be null or not.

HaloFour commented 9 years ago

Given that we can't start over that would leave C#/.NET with effectively five different kinds of variables. That just seems way over complicated to me. Not to mention confusing because these new types will not play well with generics and cannot be enforced in a huge number of scenarios. On top of that, unless there is also a coordinated effort across the BCL we'll see little benefit across most code-bases, and we'll still see potentially tons of NREs popping up out of calls into the BCL due to parameters still being "raw" reference types.

If we're just going the attribute route I think that we should just skip the special syntax and do exactly what Java IDEs do, respect and enforce those attributes as warnings or errors. Add the attributes to the BCL and make Roslyn support issuing warnings based on their application to method parameters. I'd save syntax changes for when there can be a coordinated effort between all of the teams to actually modify the CLR to have real semantics for non-nullable reference types as a core concept of the framework and modify the BCL to support those types properly across the board.

dsaf commented 9 years ago

1. "...but you wouldn't be able to automatically trust a string! to not be null." "There'd be no guarantees that a string! parameter actually isn't going to be null."

It sounds very close to what ReSharper is doing:

http://stackoverflow.com/questions/23984569/should-i-use-both-notnull-and-contractannotationnull-halt

In that case it definitely should not be implemented on a syntax level, but rather use plain attributes.

2. Opt-in diagnostics.

Since null is a bad practice anyway, it means we will need to put the exclamation mark pretty much everywhere - without even trusting it completely. I tried it with ReSharper annotations and it sucks. I would rather agree with this obnoxious individual: http://simonhdickson.github.io/blog/2014/02/27/fsharp-31-as-csharp-60-strict-mode

Maybe some sort of compiler flag to turn opt-in into opt-out?

3. As a thought experiment: would it be possible to implement everything F# has to offer but with C# syntax? The only downside is that it would cannibalize every other .NET language.

davidwin commented 9 years ago

Once the BCL and other major libraries have been annotated with non-nullability, it might be interesting to introduce a "strict" C# mode, where the default reference type is non-nullable and type? is used for nullable references (including plain references from types in "loose" assemblies). The bang syntax would basically be a no-op, and there would be no "middle ground" for compatibility.

Hopefully, non-nullability is much more common than nullability in practice, so this feature shouldn't force correct code to! be! littered! with! exclamation! points! because of backward compatibility concerns.

MadsTorgersen commented 9 years ago

Thanks all, those are great comments!

There's a lot of disagreement it seems; I'll comment on some of the themes emerging:

Encoding: There seems to be agreement that if we do the feature, and as long as we cannot modify the CLR for a first class representation, we should take the attribute route.

Is the feature worthwhile?: To be honest we aren't sure ourselves. What we do know is that it is worthwhile aggressively pursuing a design when a feature is so highly requested. When a bunch of really smart people inside and outside of Microsoft have collectively produced the best possible design, then we can see if what we ended up with should go in. If our best just isn't good enough, then we should and will feel free to walk away.

Syntax vs. attributes: I appreciate the comments about dedicated syntax being misleading, or about saving it for "when" the CLR can represent things natively. That said, there are very good reasons for having built-in syntax. For one, it is much more lightweight: if you are concerned about !s littering your code, try again with [NotNull]...

Are three kinds of types necessary?: Why add both string! and string?? Can't one of them be represented by today's unannotated string? We thought about that and we haven't entirely given up on the idea. I should have put something in the notes about it. We talked about a "two-type approach" versus a "three-type approach".

Among the comments, there isn't agreement on which of the two should be represented by the unannotated type. We'd have to pick wisely. There are good arguments for making string mean string!: we think (as one of the comments also suggests) that the non-nullable intent is much more common than the nullable intent, so there'd be less need for adding new annotations when you decide to embrace nullability annotations.

Arguably, with a two-type approach you end up with a much better language in the end. However, the transition is more jarring: the minute you opt in (through whichever mechanism we land on), we would reinterpret all your as-yet unannotated reference types as non-nullable, possibly causing a slew of warnings. Wouldn't this critically discourage adoption?

Also, we end up in a world where the same code ends up meaning different things, depending on whether it's in C# 6 (and earlier) or C# 7 (and later). When you look at a piece of source text, how do you know which meaning is intended?

We're not entirely settled on the matter.

Annotating BCL types: I agree that this feature has much less value if BCL types aren't annotated. We can only annotate them if such annotation doesn't become a breaking change. The opt-in semantics help with that: if you just want to continue to compile against the BCL, well if you don't opt in you don't see the annotations and you aren't broken. Annotating existing libraries will only break people who opt-in. There's a reasonable YAFIYGI argument to be made that breaking people who opt in is exactly what they want!

MgSam commented 9 years ago

We think (as one of the comments also suggests) that the non-nullable intent is much more common than the nullable intent. So there'd be less need for adding new annotations when you decide to embrace nullability annotations.

Despite the outcry of requests for non-nullability, I don't believe this is the case. You always need something to represent an empty or unknown value; if you're not using null, then for your types you have to build your own sensible value to represent empty. While this is arguably a good practice, I've very rarely actually seen it done. I definitely would not want an unannotated variable to be assumed to be non-nullable, as it would break almost every C# codebase I've seen.

iskiselev commented 9 years ago

It would probably be possible to add a new generic constraint that specifies a not-null constraint for a type parameter (it could be encoded with an attribute, the same as for dynamic). The compiler looks able to require users to implement such a generic correctly (checks on all methods with input/output of such a parameter, and always assigning a value on initialization). In that case, it would be safe to trust such generics, and users would be able to create G<T!> if G had been declared with such a constraint.
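
A hypothetical shape for such a constraint (purely illustrative strawman syntax, not taken from the notes):

// Strawman: the constraint marks T as a non-nullable reference type,
// encoded in metadata with an attribute, as suggested above.
public class G<T> where T : class!
{
    private T item; // would have to be definitely assigned in every constructor

    public G(T item)
    {
        this.item = item; // null could never flow in, since T is non-nullable
    }
}

// If G is declared with the constraint, G<string!> would be safe to allow.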

gafter commented 9 years ago

There is a conflict between getting CLR support (so that the feature has more teeth) and compatibility. With CLR support, the core BCL libraries are either retrofitted with this feature (potentially breaking compatibility with existing clients), or they are not retrofitted (in which case we do not reap most of the benefits of the feature).

mirhagk commented 9 years ago

You always need something to represent an empty or unknown value;

That's simply not true. Take, for instance, the following method:

string GetUsersFavouriteColour(string username)

It makes no sense to call this method with a null username. I don't know if anyone has done the statistics, but on the majority of projects I've worked on, null was very rarely needed.

HaloFour commented 9 years ago

@mirhagk And if the user doesn't have a favorite color? Even if you use an empty string you're relying on a sentinel value for the sake of representing a non-value.

Either way, my points aren't that either side is wrong. It's that the ship has long since sailed. Adding it after the fact is inherently problematic. I do like the concept when it is handled as an intrinsic part of a new language. It's kind of slick in Apple Swift.

Short of some painful and massive CLR changes what I'd rather see come out of this is the concept of methods advertising what represents legal arguments, enforcing those arguments and having the analysis to identify that at compile time. Nullability is just the tip of that iceberg. Why can't Decimal.Divide warn me that I might call it with a second parameter of 0?

mirhagk commented 9 years ago

And if the user doesn't have a favorite color?

Then the correct signature would be:

string? GetUsersFavouriteColour(string! username)

You have to pass a username in always.

the concept of methods advertising what represents legal arguments, enforcing those arguments and having the analysis to identify that at compile time

Yes, ideally this should work in conjunction with contracts, and have ! merely be syntactic sugar for the most common contract.

Although the method of having an analyzer check contracts and enforce them doesn't require any C# changes; it only requires a Roslyn analyzer, along with ensuring that analyzer is actually used (perhaps by including it in the default Visual Studio templates).

dsaf commented 9 years ago

@MadsTorgersen

It is much more lightweight. If you are concerned about !s littering your code, try again with [NotNull]...

How about this:

[assembly: NullOptOut()]

Opting out of nullability assembly-wide, opting in only where necessary - can't get any lighter :).

...you cannot put attributes on type arguments...

Maybe that should be addressed? If there were some way of specifying custom attributes in more places, then it would make units of measure possible/easier as well:

https://msdn.microsoft.com/en-us/library/dd233243.aspx

If they wouldn't make sense as post-compilation meta-data then they could be erased.

HaloFour commented 9 years ago

Using attributes for dynamic is less problematic because conversion to dynamic is inherently widening and safe. In this case it would be narrowing. Unless the compiler did always emit the run-time checks (what really is the cost of a brnull anyway? and is an ArgumentNullException that much better than a NullReferenceException?) the code should still be written defensively to expect and gracefully handle null.

svick commented 9 years ago

@gafter What about a compromise between the two? There would be CLR-enforced ! and also [NotNull]. If you opt-in, the compiler will convert [NotNull] into !.

This would mean that the feature would have teeth (! cannot actually be null, because the CLR enforces it) and it would be compatible (code that doesn't opt-in will just ignore [NotNull], so it will behave the same).

The issues I can see with this solution:

  1. It's more complicated to understand (there are two different ways to mark non-null).
  2. If a [NotNull] method actually returns null, you will get an exception. But I think that throwing exceptions for buggy library code is okay.

HaloFour commented 9 years ago

@svick Assuming that the compiler uses the same encoding method used for DynamicAttribute it might be a bit tricky to know that to represent List<string!> you'd have to use NotNull(false, true).
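
For reference, this mirrors how the compiler encodes dynamic today; the attribute form below is the emitted metadata, not compilable source (C# forbids applying DynamicAttribute explicitly):

using System.Collections.Generic;

public class Example
{
    // Source declaration:
    public void M(List<dynamic> list) { }

    // In metadata this becomes, roughly:
    //   void M([Dynamic(new[] { false, true })] List<object> list)
    // One flag per type in the construction: false for List<> itself,
    // true for its dynamic type argument.
}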

That said, I don't really see a reasonable way for the compiler to ever enforce non-nullability on generic types. Assuming the parameter type of List<string!> where could the compiler possibly ensure that the list doesn't contain any null references? At best I think the compiler could emit checks around any method that would accept/return the generic type and wrap it with a null check, but then you're still dealing with an exception that occurs mid-method.

qrli commented 9 years ago

@MadsTorgersen From my experience using ReSharper annotations: I use the default, which treats parameters and return values as NotNull unless annotated with CanBeNull. Because I find the CanBeNull case is much less common, I only need to annotate a few methods.

So suppose I start to use C# 7 with string meaning string!, and it still relies on an analyzer to enforce; then it would be totally fine with the existing code. The compiler/analyzer will not complain, because they treat parameters and return values as non-nullable. Then I can change the [CanBeNull] annotations to things like string?, so that the compiler/analyzer can start to complain about missing null checks. So it will be opt-in when I start to use string?.

So I don't see the "jarring" at all, because it is an attribute-based approach, not enforced down to the metal anyway. Maybe I missed something?

ErikSchierboom commented 9 years ago

In my experience, the default is almost always that an instance should be non-null. Therefore, having string! equal string makes perfect sense to me.

paulomorgado commented 9 years ago

@ErikSchierboom, how would you represent what you know to be the city where I live?

ErikSchierboom commented 9 years ago

@paulomorgado Depends on the context. If I keep a list of people living in cities, it would be a non-nullable string, otherwise it would be a nullable.

My point is that when I use reference types, most of the time what I actually want is for them to be non-nullable.

marchy commented 9 years ago

Are you guys analyzing how Kotlin has gone about this problem? They are dealing with the same issues with an unchangeable JVM but have made it work beautifully with solutions that I'm not seeing being discussed here (ie: see points 2 & 3 below).

1. Opt-in is not the way. I strongly urge you not to take the opt-in approach if this feature is to have any real adoption and if the future of C# is going to remain as bright as its past. Requiring developers to add the "!" will severely impact the adoption of the feature: senior devs will be rigorous and disciplined and will adopt the "!" (just as they are likely already using an attribute or precondition-library approach). Intermediate & rookie developers, however, likely won't understand enough about the importance of non-nullability to make the extra effort of "polluting" their code with "!" symbols everywhere. There is little value in building a feature for only the senior slice of the .NET developer base. I can already see the debates generated across organizations as senior devs/architects try to sell their development teams on why adding the "!" everywhere throughout the code is a good idea. This is not pragmatic by a long shot. In reality many people will forget the "!" or it will be used very inconsistently - just as the "readonly" modifier really should be used on many object fields but in practice is not. Opt-in-ness is simply NOT an acceptable compromise for backwards compatibility and side-by-side support of libraries written in previous versions of C#. It will not achieve wide adoption of nullability in C# and is likely not much better than doing nothing at all.

2. The different-versions-of-C# problem. The .NET ecosystem has a centralized, de-facto IDE (Visual Studio) for a reason. Let's use it! If you open a file in an assembly built as C# 6, use a UX mechanism to clearly inform the user which meaning of nullability that file was written under. That UX mechanism could be a big, bold banner at the top of the file tab informing the programmer that types have C# 6 nullability semantics; or better yet, use the file decorations introduced into Visual Studio a while ago (in the move to the WPF engine?) to add presentation-time characters to the code, so that types in C# 6 files get displayed as "Type?" instead of "Type". This doesn't have to affect the raw characters on disk; it can be a purely presentation-time addition, so that the programmer doesn't have to flip between the 2 semantics in their head as they browse C# 7 and C# 6 code.

3. The undeclared-type solution. As opposed to the opt-in approach, a third type could be introduced (ie: "Type??") which has the exact same semantics as nullable "Type?" but explicitly calls out that the programmer/library has not yet decided whether this should be nullable or not. This type is purely for backwards support; it allows libraries to be updated in a partial manner and makes migration of old code very sane. The moment you mark a file/library as C# 7, Visual Studio should add "??" to all the types in that file. This doubly acts as a highly visible TODO list for code that still needs to be migrated to C# 7 and switched into the new nullable "Type?" or non-nullable "Type" syntax. When mixed with point #2 above, any file or return value coming from a C# 6 codebase can show up as "??" in the IDE (both in the editor as well as in IntelliSense return values). Taking this approach would ensure real, effective impact on adoption by developers of all levels by ensuring nullability stands out and is brought front and centre in the language. It would also guarantee we do the right thing for C# and its future, as opposed to being chained down conservatively by its past.

NOTE: Kotlin takes this very approach and it works beautifully. See the Platform Types concept here: http://kotlinlang.org/docs/reference/java-interop.html. They use the syntax "Type!" for this semantic, meaning "either Type or Type?", instead of the "??" proposed above. I find their syntax a tad awkward and think something like "??" would be more appropriate. Browsing non-annotated Java code and return values in the IntelliJ IDE shows them using the "platform type" syntax. This same approach can be applied to C#, especially since Microsoft controls both the C# language and the de facto IDE of the ecosystem.

Guys, let's please not sacrifice the future of C# because of conservative fears for which there are creative solutions. We can have our cake and eat it too!

Joe4evr commented 9 years ago

Requiring developers to add in the "!" will severely impact the adoption of the feature

As Mads says, though,

[W]e end up in a world where the same code ends up meaning different things, depending on whether it's in C# 6 (and earlier) or C# 7 (and later). When you look at a piece of source text, how do you know which meaning is intended?

 

The .NET ecosystem has a centralized, de-facto IDE (Visual Studio) for a reason. Let's use it! If you open a file in an assembly which is C# 6 built, use a UX mechanism to clearly inform the user which meaning of nullability that file was written under.

Except OmniSharp is a thing these days, so there's half a dozen other IDEs not under Microsoft's control that would have to include such UX features. I estimate the odds of that as slim to none.

Guys let's please not sacrifice the future of C# because of conservative fears for which there are creative solutions for. We can have our cake and eat it too!

I'd like to see this as well (maybe), but the big discussions surrounding non-nullability indicate that the implementation should be done as completely and correctly as possible from the first stable release onward. I'd rather see them postpone a game-changer like this until they can get it right (or drop it altogether if needed), rather than having it be a half-baked solution resulting in it not being used at all.

dsaf commented 9 years ago

@Joe4evr

...rather than having it be a half-baked solution resulting in it not being used at all.

When Code Contracts failed*, at least there was some comfort in knowing that it's just an API. Now that syntax-based LINQ has failed*, it will stay in the language forever.

(*) failed, as in not getting wide popularity on its own. Code Contracts is too slow and verbose to be worth it; even Microsoft doesn't want to use it - look at the EF7 repo. LINQ query syntax is competing with the LINQ extension methods and not winning; there is not even a query-based way to say ToList: http://stackoverflow.com/questions/7767473/linq-query-syntax-and-extension-methods

GeirGrusom commented 9 years ago

LINQ query syntax is competing with the LINQ extension methods and not winning; there is not even a query-based way to say ToList

Because it would be meaningless in a query.

dsaf commented 9 years ago

@GeirGrusom what about First? SQL has TOP.

GeirGrusom commented 9 years ago

@dsaf that is a perfect example :)

HaloFour commented 9 years ago

@dsaf I'm not sure how one would compile the metrics but I would be very surprised if the LINQ query extensions weren't extensively used. Using a single project like EF7 as an anecdote isn't very representative. I personally switch between them depending on the situation. For the most part I prefer the extension method syntax, but I think that the query syntax is more succinct for group joins, outer joins or projecting a sub-expression (let clause).

Aside that, I agree. I'm sure that the compiler team has to weigh the reality of features failing to gain traction and ending up as cruft grammar within the language. Understanding why Code Contracts didn't take off is probably significantly more important than what behavior the proposed language feature might have.

dsaf commented 9 years ago

@HaloFour I was referring to Code Contracts when I mentioned EF7 - they have opted for JetBrains annotations instead (very sparingly). I generally agree with what you are saying about LINQ query syntax, but think more can be done to make developers prefer it, hence I just created #1937 and #1938.

HaloFour commented 9 years ago

@dsaf Oh, well, in that case we agree. JetBrains probably has it (mostly) right. More important than what it does when the parameters are invalid is how it helps to avoid them being invalid. The bread-and-butter cases overlap perfectly with the non-nullability proposals.

Also agreed about LINQ. There are plenty of common use cases where people do mix both and doing so within a single expression isn't the most pleasant syntax in the world. It's hard to say where to draw that line since LINQ is so open-ended. VB.NET does offer Skip, Skip While, Take, Take While, Distinct and Aggregate which are useful. Some common terminal clauses could be useful, too.

var dictionary = from person in people
    where person.Age >= 21
    take 10
    select person
    into dictionary by person.Name;

Wading into some kind of convention of decorating static extension methods with attributes to support extending the query syntax would be interesting but probably become very messy very quickly.

axel-habermaier commented 9 years ago

@HaloFour: Are you aware of how F# solves this problem quite elegantly?

paulomorgado commented 9 years ago

@axel-habermaier, I'm not. Would you care to share how?

axel-habermaier commented 9 years ago

@paulomorgado: Well, F# has this concept of computation expressions which are roughly comparable to monads. The syntax for those expressions is user-extensible and they use it to implement sequence expressions (like yield return in C#), query expressions (like LINQ in C#), and async expressions (like async/await in C#), among other things. For a full list of query operators supported by F#, see the F# documentation. And if you don't like what F# provides out-of-the-box, just roll your own! While I'm not sure whether you can just extend F#'s query expression class (never tried), remember that it's just a library feature and the library is open source, so you can easily copy it to your own code and extend it.

However, I'm not so sure that a similar concept would fit well into C#, but it might be worth looking into it. Maybe there is a way to generalize the implementations of those features, make them user-extensible and all of that while still retaining backwards compatibility.

qrli commented 9 years ago

I think it is easily possible to make LINQ query operators extensible via libraries: just project extension methods to query operators. That at least solves the majority of simple cases, e.g.

var dictionary = from person in people
    where person.Age >= 21
    Take 10
    select person
    ToDictionary person.Name;

Note the above Take and ToDictionary are plain extension methods, which the compiler can project as contextual query operators, without adding new keywords.

Taking this approach, it seems only from and in need to be keywords to start a query expression (maybe also join, let, etc. to simplify complex expressions); all the others can be done by extension methods.

svick commented 9 years ago

@qrli If you want a syntax like that, I believe you would somehow need to distinguish between Take, which keeps projection variables (like person) intact, and ToDictionary, which destroys all projection variables and thus effectively ends the query.

F# uses an attribute for that.

qrli commented 9 years ago

@svick I'm thinking of it in a simpler way. The person variable is just syntactic sugar to simplify lambdas, so that it only needs to be defined once. So it does not really need to be kept throughout the query. Just use it as a clue to identify lambda inputs, so that the query is naively transformed to:

var dictionary = people
    .Where(person => person.Age >= 21)
    .Take(10)
    .Select(person => person)
    .ToDictionary(person => person.Name);

And then some optimization would remove the redundant Select(). Of course, if let or join is used, the transform will not be this simple, but it should still be achievable.

gafter commented 8 years ago

Design notes have been archived at https://github.com/dotnet/roslyn/blob/future/docs/designNotes/2015-03-10%2C17%20C%23%20Design%20Meeting.md but discussion can continue here.