[Proposal] Introduce "var?" for nullable type inference

LWChris commented 4 years ago

Problem

Consider this code:

#nullable enable
public void Foo()
{
    var val1 = p.Count; // int
    var val2 = p.Max; // int?
    var ref1 = p.GetOrCreate(1); // TheType
    var ref2 = p.TryGet(2); // TheType?

    // val1 = null;
    val2 = null;
    ref1 = null;
    ref2 = null;
}

With #nullable enable, the line ref1 = null causes a warning, because the type of ref1 was inferred from GetOrCreate whose return type isn't nullable. The line val1 = null would obviously not even compile.

In order to have all variables nullable, I'd either have to change the return type of GetOrCreate and Count to nullable types (which is bad because both will never return null, so why declare otherwise), or I have to use explicit type, which can range from int? val1 = p.Count; to TheTypeThatRequiresAnImport<WithGenerics, WhichRequire, EvenMoreImports>? ref1 = p.GetOrCreate(1);

Proposal

If var ref1 can be resolved to TheType ref1, then var? ref1 could be resolved to TheType? ref1.

Considerations

This syntax can theoretically be allowed for reference types and value types. For both cases, var? will ensure that the declared variable is nullable, regardless of the right-hand side's nullablity (for value types) respectively nullable annotation (for reference types).

For value types, the behavior will be:

public void Foo()
{
    var? val1 = p.Count; // int? val1 = new Nullable<int>(p.Count);
    var? val2 = p.Max; // int? val2 = p.Max;
}

which is equivalent to

public void Foo()
{
    var val1 = MakeNullable(p.Count);
    var val2 = MakeNullable(p.Max);
}

private static T? MakeNullable<T>(T value) where T : struct => new T?(value);
private static T? MakeNullable<T>(T? value) where T : struct => value;

For reference types, the question mark behaves the same like the one in TheType? obj:

var? ref1 = p.GetOrCreate(1); // TheType? ref1 = p.GetOrCreate(1);
var? ref2 = p.TryGet(2); // TheType? ref2 = p.TryGet(2);
#nullable disable
var? ref3 = p.GetOrCreate(3); // Warning: The annotation for nullable reference types should only be used in code within a '#nullable' annotations context.

HaloFour commented 4 years ago

This will no longer produce a warning in C# 9.0 as the language team has decided to always treat var as nullable and to rely on flow analysis to determine when to warn. Assigning null will no longer warn, but trying to deference after that assignment will.

Example

LWChris commented 4 years ago

1) Your example shows a warning for me at ref1 = null. Do I have to set the language to something like "C# 9.0 Preview" somewhere on the page? 2) Apart from that I thought solving the question of whether something will ever be null is equivalent to solving the halting problem, I think "always treat var as nullable" sounds dangerously close to "always treat references as nullable". To me, this looks like a step in the opposite direction of where the whole concept of "non-nullable-references" was aiming at. In my opinion, you should treat var as what it is - that is "the type of the right-hand side", including the nullability.

HaloFour commented 4 years ago

@LWChris

Your example shows a warning for me at ref1 = null. Do I have to set the language to something like "C# 9.0 Preview" somewhere on the page?

If the selected branch is "master" then it should be using the C# 9.0 bits. I do not see the warnings there, but I do see the warning if I change the branch back to "Default".

I think "always treat var as nullable" sounds dangerously close to "always treat references as nullable". To me, this looks like a step in the opposite direction of where the whole concept of "non-nullable-references" was aiming at.

The flow analysis of NRTs is pretty good at determining when it's safe to dereference the variable and it's already a core aspect of the experience. The behavior of the variable will remain the same. If the method returns a non-nullable, the compiler will consider the variable to not be null and will not warn on dereferencing. If you assign null to the variable the compiler will not warn at the point of assignment, but it will warn if you attempt to dereference the variable again without first checking for null:

string SomeMethod() => "";

var s = SomeMethod();
var length = s.Length; // does not warn
s = null; // does not warn
length = s.Length; // warns
if (s != null) {
    length = s.Length; // does not warn
}

The nullability annotations aren't that important on locals, the compiler will infer from usage. It's at the boundary where the annotations are much more important because the compiler can't know the expected state of the value crossing that boundary.

In my opinion, you should treat var as what it is - that is "the type of the right-hand side", including the nullability.

That is what the language team thought, too. But the experience and feedback from that decision is that the behavior is suboptimal, especially around existing code, so the team decided to reverse that decision. The warning on assignment isn't particularly helpful if that variable isn't dereferenced again without checking for nullability.

Anyway, here's the meeting notes from when the language team already considered the addition of var? and decided on this behavior for C# 9.0 instead:

https://github.com/dotnet/csharplang/blob/master/meetings/2019/LDM-2019-12-18.md#var

sharwell commented 4 years ago

@HaloFour the behavior you link to is enabled for C# 8 as well, provided the compiler version is new enough.

LWChris commented 4 years ago

s = null; // does not warn I was really hoping this would change once and for all when I saw non-nullable references found their way into C#. IMO, it should. It's a pity, but oh well. C# is a language, not my language, so it's not my choice to make.

I'd rather opt into being pedantic about forcing you to express what you wanted by using var or var?, than relying on inherently weaker local context analysis. It's one thing to not mess with old code just for convenience (protected virtual event NotifyCollectionChangedEventHandler? CollectionChanged;), but another to mess with it to prevent future errors, and when the "fix" is merely adding the ? to the var keyword a few lines earlier. Even more so, if adding that one character doesn't get rid of all warnings immediately, you apparently just identified a risk you never thought about. Sounds promising, not annoying to me.

Thanks a lot for the link though.

CyrusNajmabadi commented 4 years ago

I was really hoping this would change once and for all when I saw non-nullable references found their way into C#. IMO, it should. It's a pity, but oh well. C# is a language, not my language, so it's not my choice to make.

You can always write an analyzer yourself to find and report thsi if you want that extra restriction :)

I'd rather opt into being pedantic about forcing you to express what you wanted

We looked into this, and it was effectively awful. So much code that was null safe and was written in idiomatic C# was then flagged as needing additional work.

This violates my own goal for the feature which is that good, existing, coding practices should not feel like something that is now punishing you.

LWChris commented 4 years ago

You can always write an analyzer yourself to find and report thsi if you want that extra restriction :)

Not every programmer has the skill or time to write custom analyzers to mitigate shortcomings of the native compiler. ;)

So much code that was null safe and was written in idiomatic C# was then flagged as needing additional work.

I understand your point and see where you're coming from. But quite honestly - with this behavior for #nullable enable, the C# language team might as well consider to remove the whole feature again. After all the point of non-nullable-references is to introduce references that cannot be null. Now they can be null, and we still have to rely on flow analysis. But flow analysis can be done with or without #nullable enable, right? So what will be the point of having #nullable enable scopes in the first place?

If x = null is an error for int x, then it should be an error for SomeReferenceType x when #nullable enable is defined.

If it bothers me that I must write var? x = Foo(); x = null; instead of var x = Foo(); x = null;, because I know I have always checked for null when I use x later on, then why did I even enable the feature in the first place? I might as well disable it and rely on flow analysis anyway, since apparently flow analysis is "good enough" to catch all issues.

And maybe, after killing #nullable, C# may introduce #nonnullable, this time not with SomeReferenceType? vs. SomeReferenceType, but with SomeReferenceType vs. SomeReferenceType! or so.

HaloFour commented 4 years ago

@LWChris

The goal of NRTs is to avoid NREs, not nulls. A null assigned to a local that is never dereferenced is innocuous. You can argue that this makes annotating locals with their nullability is pointless if flow analysis can make this determination, and I agree with you. The annotation makes sense at the boundary where flow analysis can't track what happens. But I disagree that this decision somehow weakens NRTs or makes them irrelevant. NRTs were never intended to be a change to the type system, they're meant to be guardrails, and they need to be a very pragmatic solution so that can be practically applied to tons and tons of existing code.

333fred commented 4 years ago

But flow analysis can be done with or without #nullable enable, right?

No, that is not how the design works. You must, at minimum, turn on the flow analysis with #nullable enable warnings. Nullable does not run in disabled locations.

LWChris commented 4 years ago

#nullable enable

public class C {
    readonly string S = "s";

    public void A() {
        string s = S;
        bool flag = s == null; // 1
        int length = s.Length; // 2
    }
}

Question 1: What is the point of having a wrong warning "Dereference of a possibly null reference." in line 2 instead of a correct warning "Expression is always false." in line 1 with #nullable enable? Question 2: If we establish that showing the warnings as described above was somehow useful, then what is the point of not showing them without #nullable enable?

In other words: if NRT are meant to be "guardrails", then why are those rails installed in a way that it is easy to make them ineffective (by adding a useless null-check), and if we deemed them sensible nevertheless, why would I have to explicitly ask for the guardrails to be installed?

HaloFour commented 4 years ago

@LWChris

The guardrails exist to steer you away from NREs. That may warn spuriously, but it won't NRE. More interesting would be non-pathological cases where the compiler doesn't warn and can result in an NRE.

333fred commented 4 years ago

Consider instead a public API: It's annotated as not accepting null, but of course someone could be calling you from a nullable disabled context or they could put a ! on it because they're pretty sure they're not passing null. However, you still want to check the input for null, and if you do so, we should make sure you actually handle it. Or perhaps the return type of a function is not annotated, or the person who annotated it was wrong. Nullable is not perfect: it's designed to take your input into account. We assume that you know what you're doing, and that if you check for null then there's a real reason to do so.

CyrusNajmabadi commented 4 years ago

Not every programmer has the skill or time to write custom analyzers to mitigate shortcomings of the native compiler

I disagree. The cost to doing something like this is extremely small. So either this is a significant issue for you, in which case writing hte analyzer is a cheap and easy way to solve the problem. Or it isn't an significant issue, in which case it doesn't really matter if it solved :)

After all the point of non-nullable-references is to introduce references that cannot be null.

This was never the point. The point was to diminish the quantity of null reference exceptions, and to introduce a system that had a low amount of false positives and false negatives.

in line 2 instead of a correct warning "Expression is always false." in line 1 with #nullable enable?

It would be fine to introduce an analyzer to give such a warning. No one is stopping that.

LWChris commented 4 years ago

This was never the point.

If that is so, this is the root cause of my problems with that behavior, and apparently some others. Because everything that has been named, said, documented or marketed about this feature promises exactly what you say "was never the point".

Just read the introduction of the official documentation:

C# 8.0 introduces nullable reference types and non-nullable reference types that enable you to make important statements about the properties for reference type variables:

A reference isn't supposed to be null. When variables aren't supposed to be null, the compiler enforces rules that ensure it's safe to dereference these variables without first checking that it isn't null:

The variable must be initialized to a non-null value.

The variable can never be assigned the value null.

A reference may be null. When variables may be null, the compiler enforces different rules to ensure that you've correctly checked for a null reference:

The variable may only be dereferenced when the compiler can guarantee that the value isn't null.

These variables may be initialized with the default null value and may be assigned the value null in other code.

[...] With the addition of nullable reference types, you can declare your intent more clearly.

The term "non-nullable reference types" is misleading, because they are still considered nullable enough to allow assignment without warning, even if you explicitly declared otherwise at every possible location.
"that enable you to make important statements about the properties for reference type variables", except that these statements are going to be ignored at the next best opportunity.
"you can declare your intent more clearly", but the compiler is very eager to ignore your declaration and replace it with its own assumptions.
"the compiler enforces rules that ensure it's safe to dereference these variables without first checking that it isn't null", which turns out to be "you can shatter that enforcement by writing x == null somewhere" and then we will urge you to introduce more checks instead of allowing you to get rid of unneccesary checks.
"The variable can never be assigned the value null.", except you wrote x == null somewhere, or you used the var keyword like advised. And it's not just "not an error", it's not even a warning anymore.

What's even worse, the documentation continues to compare it to the "earlier" behavior:

This new feature provides significant benefits over the handling of reference variables in earlier versions of C# where the design intent can't be determined from the variable declaration. The compiler didn't provide safety against null reference exceptions for reference types:

A reference can be null. The compiler doesn't issue warnings when a reference type is initialized to null, or later assigned to null. The compiler issues warnings when these variables are dereferenced without null checks.

A reference is assumed to be not null. The compiler doesn't issue any warnings when reference types are dereferenced. The compiler issues warnings if a variable is set to an expression that may be null.

Comparing this to the current behavior:

"earlier versions of C# where the design intent can't be determined from the variable declaration" - now I can seemingly declare design intent at declaration, but actually somewhere the code I may accidentially change the believed intent, so we're back at "you have to read all the code to know if a variable is considered "not null" or "nullable".

Statements about when a reference "can be null":

"The compiler doesn't issue warnings when a reference type is [...] later assigned to null." - yup, got that behavior back.
"The compiler issues warnings when these variables are dereferenced without null checks." - yup, got that behavior back as well.

Statement about when a refereces "is assumed to be not null" (in fact, the feature was marketed to get rid of "assume" and introduce "declare", which is stronger):

"The compiler doesn't issue any warnings when reference types are dereferenced." - you may now see warnings even if you declare "not null", so this got even worse.
"The compiler issues warnings if a variable is set to an expression that may be null." - not if you used the var keyword, or checked for null anywhere above. Got worse as well.

I hope you can see what a gigantic let down this feature appears to have become, compared to what was promised. I want you to take a step back and not look at why a decision was made, but look at the bigger picture of what the feature now brings to the table in comparison of what is said it brings to the table.

Optimizing the behavior under the premise of "show as few warnings as possible" and "signal 'everything is fine, you don't have to change a thing' as long as possible" is creating a meaningless tool. NRT/NNRT promised and is marketed to allow me to get rid of vague assumptions and replace those with rigid declarations. But everything that has been done ever since were softening those "declarations" back to "assumptions". These new assumptions may be smarter than before, which is a fine thing in itself, but why even introduce "better algorithms" as a new feature, not as improvement of the already existing feature.

I'm genuinely disappointed. It was announced as "Declare if a reference can be null." and has devolved into "We try to hopefully figure out correctly most of the times whether a reference may be [allowed to be set to] null and will show you some warnings at some places where we assume they are appropiate, and if you think long enough about them you may even realize that the problem is not actually where we mark your code as problematic, but at an utterly different place in your code that has nuked our fragile heuristics that silently overrode your declared intents."

LWChris commented 4 years ago

In short:

The goal was to create a feature that allows developers to escape the "might be null" situation.
This would have enabled developers to code in a context where they don't have to worry about NREs, even without tedious null checks at every place.
Using the premise "null checks are implicit declarations of nullability" instead of "a declaration of non-null is a declaration of non-null" creates a confusing behavior and inhibits the goal of creating a null-safe context, because now, checking for null on a non-null variable is not considered unnecessary, but instead destroys the whole null-safe context, re-introducing the necessity to check for nulls everywhere else.

LWChris commented 4 years ago

I can really not stress enough how much of a logical break in compiler behavior this is. Consider this code:

var x = 4;
var y = 2 << x;
x = x * 1.0;

In the first line, the compiler infers the type of x to be int, so the second line compiles fine. The third line, albeit uncritical under mathematical aspects, will yield an error, because this line of code is considered wrong when x is an int. The compiler doesn't bother to implicitly assume "everything will be fine", even if it could technically check that.

Now watch what happens when I declare the type of x to be double, like so:

double x = 4; // The type is declared "double"
var y = 2 << x; // Writing "int << double" is wrong, let's make this an error
x = x * 1.0; // Assigning a "double" value to a "double" variable is fine

As you can see, the compiler will treat my explicit type declaration as given, and therefore say "line 2 is wrong".

Now look at what the compiler does with my declared non-nullability:

#nullable enable
string s = "s"; // The type is declared "non-nullable"
bool flag = s == null; // This only makes sense if "s" is nullable; let's assume "nullable" instead
int length = s.Length; // You should not dereference a nullable variable

If in my example from above the compiler behaved in the same way like NRTs are implemented, then this would be the result:

double x = 4; // The type is declared "double"
var y = 2 << x; // This only makes sense if "x" is an int; let's assume "int" instead
x = x * 1.0; // You cannot assign a double to x without casting

The idea of some analysis checking "What you do with that variable only makes sense if the variable is treated differently from what was declared", which then does not lead to the conclusion "you should not do this with that variable" but instead to "I assume a different type than what is stated at the declaration"... that's insane!

HaloFour commented 4 years ago

@LWChris

You're comparing inference of type to inference of nullability, which aren't the same. NRTs aren't a part of the type system, they are only a linter.

LWChris commented 4 years ago

@HaloFour

I am aware that NNRTs vs their respective NRTs are not two different types, like VTs vs belonging NVTs are. I know they are realized with annotations in order to maintain backwards compatibility, and thus make the behavior a question of individual annotation analysis opposed to existing type logic. But that doesn't mean that to-be-introduced analysis may not strive to produce a behavior for NNRTs that is consistent with the compiler behavior for VTs.

The decision to treat a null-check as more declarative of a variable's nullability than the actual declaration is just that - a decision. It's not a necessity introduced by the way nullability had to be implemented under the hood.

It may be clear to you why nullability for reference types is not implemented the same way like for value types, but there is no reason why it may not be treated the same way. The written code provides all necessary information to accurately infer nullability and then treat the variable identically to how it is done for value types. The decision to treat var differently for RTs than for VTs, and to treat Type vs Type? differently from int vs int? was a (sensible) decision made after establishing the assumption that "a null-check implies nullability". But maybe establishing that assumption was wrong in the first place, and should be exchanged for "a variable declared non-null is non-null". This achieves the same result for NPEs, but more accurate recommendations for the user and consistent behavior, which is what people likely expected and what was described/promised/marketed.

HaloFour commented 4 years ago

@LWChris

The decisions made around NRTs were made giving priority to making them easy to adopt with minimal code changes. Generating a ton of warnings around common existing code patterns that are not at risk for producing NREs is not a good experience. The patterns for dealing with nullability in reference types have been established over 18 years, whereas nullability in value types wasn't at all possible until NVTs were added to the runtime and language. NRTs are a giant pile of compromises optimized around pragmatism and a low barrier to entry to adoption and ecosystem penetration.

I view NRTs like I do generics in Java, which were designed with similar considerations to the existing language and ecosystem. Both can be pretty trivially intentionally defeated by the developer, but for non-pathological code they effectively guard against incorrect usage of the values the vast majority of the time.

LWChris commented 4 years ago

Thanos travelled the whole universe to collect all Inifinity stones. He created the most powerful tool one can imagine. And then he used it to cure the symptoms, not the cause.

They created a tool that had the power to abolish the cause (reference type variables might be null), and they chose to cure the symptom (NPEs).

The result is giant pile, but that pile is a mess, with behavior that is unexpected, unnecessary, and inconsistent with everything you know about nullability and C#.

int num = 1;
bool flag = num == null;
string text = num.ToString();

string text2 = "s";
bool flag2 = text2 == null;
int length = text2.Length;

There is no reason why num == null has a warning, but text2 == null doesn't, why text2.Length has a warning, but num.ToString() doesn't.

string text = "s";
int length = text.Length;
bool flag = text == null;
int length2 = text.Length;

There is no reason why text.Length is okay in the second line but not in the fourth.

#nullable enable
string? text = "s";
bool flag = text == null;
int length = text.Length;
#nullable disable
int length2 = text.Length;

There is no reason why the fourth line should issue a helpful warning, but not the sixth.

Feel free to implement this bunch of mildly-useful helpers, but name them "better reference type nullability annotations" which "hint" something and don't waste/abuse the Type? syntax when you actually mean to implement [MaybeNotNull] Type.

What is implemented and planned is certainly not introducing anything close to the concept of "non-nullable-reference-types" which "enforce" something.

If it must be so convenient and frictionless to adopt the feature, then why even give users a choice? If the ruleset and logic has to be softened so much that basically 99% of the code stays warning-free when enabled, then why make it an opt-in feature at all?

HaloFour commented 4 years ago

@LWChris

And then he used it to cure the symptoms, not the cause.

They created a tool that had the power to abolish the cause (reference type variables might be null), and they chose to cure the symptom (NPEs).

The cause of an NRE is dereferencing null, not assigning null. nulls are perfectly valid values, and almost completely unavoidable in the runtime. Would NRTs be designed differently if we were designing .NET and C# 1.0 right now? Almost certainly yes. But we're not designing C# 1.0. We're designing C# 8.0+, and we don't throw out 18 years of existing patterns and practices for the sake of idealism.

The result is giant pile, but that pile is a mess, with behavior that is unexpected, unnecessary, and inconsistent with everything you know about nullability and C#.

Everything you knew about nullability was that every reference variable could be null. Now you get some hints to help you identify where they might be null where you thought it couldn't be. The reality is that this feature has found and identified potential NREs in many existing codebases, including Roslyn itself. If it generated a ton of false positives and I had to question the compiler on every warning I'd be less inclined to trust its judgment. If I had to rewrite swaths of code to enable it my adoption rate would plummet and I'd likely introduce a whole slew of new bugs.

There is no reason why text.Length is okay in the second line but not in the fourth.

Your pathological code is giving the compiler reason to doubt its flow analysis. It assumes you know better. I'd suggest not writing pathological code.

You know how easy it is to get a String into a List<Integer> in Java? Trivial. Know how often that's actually a problem? Negligibly. The perfect is the enemy of the good.

There is no reason why the fourth line should issue a helpful warning, but not the sixth.

There absolutely is. Issuing a new warning is a breaking change.

I'm sorry that you don't like NRTs. I don't like everything about them either. But they are the culmination of more than six years of public discourse on the subject and comparison of a very wide range of options.

CyrusNajmabadi commented 4 years ago

Optimizing the behavior under the premise of "show as few warnings as possible" and "signal 'everything is fine, you don't have to change a thing' as long as possible" is creating a meaningless tool.

We've found literally dozens of cases in our own product where unintentional null refs could occur. We've been able to find and fix them because of nrt. Furthermore, the design has allowed us to gradually adopt net (generally on a per file basis, but sometimes when more granularly than that). That adoption had been reasonably painless due to the amount of false negatives and positives being quite low.

You say this is a meaningless tool. However, based on the results alone in our codebase I would say that is absolutely not the case.

LWChris commented 4 years ago

Let me be clear. What has been done was good and fruitful and is certainly a good thing. I do not at all question that. But I don't see why this is treated as the whole solution for NRT. As I said, if the design is chosen as is, under the premise of avoiding to show any warning that is not most likely valid and definetely show a warning where there is a mistake, then this analysis should be its own separate opt-in feature or not being opt-in at all (depends on the C# definition of "breaking" change; to me a compile- or runtime error is a breaking change, but not a design time warning).

But the other opt-in feature that actually introduces what is called "non-nullable reference types" and the syntax of Type? for "nullable reference types", that should be what it says. And even if enabling that requires me to change the code at appropiate places by either adding ? or deleting an unnecessary null-check - if this comes in exchange for getting warrantied runtime safety without cluttering my code with lots of unnecessary null checks, this is still a pretty strong incentive to me.

CyrusNajmabadi commented 4 years ago

But I don't see why this is treated as the whole solution for NRT.

It has not been treated as the whole solution for NRT. We've already done more work in the NRT space, and i expect that we will continue to do so for many more versions of the language.

CyrusNajmabadi commented 4 years ago

As I said, if the design is chosen as is, under the premise of avoiding to show any warning that is not most likely valid and definetely show a warning where there is a mistake,

Your premise is incorrect. There has never been a design that would "definitely show a warning where tehre is a mistake". The designs have all been around a balance of getting lots of true positives/negatives and minimizing false positives/negatives. There has never been a design of where a warning is always shown when there is a mistake.

CyrusNajmabadi commented 4 years ago

that should be what it says

This is fine to have as a personal opinion on your part. However, you need to recognize that it is not shared by other people (including myself). The goal is to take pragmatic and not dogmatic approaches here to maximize the utility of this for the vast majority of the user base. Strict adherence to rules that would just provide more pain to users, while providing no additional benefit in terms of correctness is seen as a net negative.

Look, i totally get your views and why you want it to be this way :) Indeed, your views on this are how we initially implemented this feature, thinking that would be the right way to go. However, immediately we could see across the board how bad an experience this was causing for users. Simply ignoring that and sticking with the position of "that should be what it says" would have done nothing good for this feature. It would have made it much harder for people to adopt, and it wouldn't have produced any less null-refs for hte people that did adopt it. It was, objectively, a strictly worse outcome for the entire ecosystem.

CyrusNajmabadi commented 4 years ago

this is still a pretty strong incentive to me.

And htere's the rub. That's your assessment of the pros/cons. But they apply to you. We have to holistically look at the entire ecosystem out there. Thousands of companies, millions of devs, and tons of sources of feedback indicating a wide variety of views on this topic. Your pros/cons tell you that you would prefer the more strict view of things. And trust me, you're not alone. However, you're not a majority or plurality.

In language design, this is the norm. Every decision we make always has some group left out or feeling like their own personal weighting has gone negative with our decision. That's simply part of hte process, and can feel sucky when you're in that group. Trust me, i'm on the LDM and i've been in that group likely at least once per language revision. However, i recognize that just because the decision made then might not be the best decision for my personal needs, that it's still likely the best one for the ecosystem as a whole.

LWChris commented 4 years ago

Thank you @CyrusNajmabadi - I appreciate your posts.

I understand as mentioned above that C# is a language, not my language.

I just hope the "stricter" ruleset will become available due to some setting similar to current #nullable enable at some point. My dream has always been that int? and Type? will behave in the same way, so that

var x = A;
x = B;
x.Foo();

will mean

x = B is an error when A is a non-nullable-reference type and B is nullable
x.Foo() will warn if A and B are nullable
x.Foo() will not warn if A is nullable but B is not nullable

:)

CyrusNajmabadi commented 4 years ago

I just hope the "stricter" ruleset will become available

This is a prime case for an analyzer. Indeed, any time the topic is around "the language should be more restrictive and produce errors in this code that was previously legal", all that really means is "write an analyzer to find and mark the patterns you don't want to allow".

Now you don't need to wait on anything. You could write this today and have the stricter semantics you desire.

markm77 commented 3 years ago

I realise the decision has some time ago been taken that inferred var reference types are always nullable and therefore use of var effectively strips off nullability information.

To work around this, I now use explicit types instead of var. This works great if you can put up with the verbosity. You simply write initial code using var then use your IDE tooling to convert to explicit types and then manually strip off unnecessary question marks to make types non-nullable.

Perhaps (hopefully) in future the C# team will introduce a nullable-smart var, e.g. say with <Nullable2>enable</Nullable2>. 😀

HaloFour commented 3 years ago

@markm77

Perhaps (hopefully) in future the C# team will introduce a nullable-smart var

IMO that's what we already have. The compiler is smart enough to know that it's not the assignment that will result in a NullReferenceException, it's the attempt to use it after such an assignment. So we've removed an extraneous warning that hindered adoption without reducing the safety that the feature intends.

markm77 commented 3 years ago

Hi @HaloFour I understand. But I'd rather not have to work backwards and replace var with explicit types one by one to solve nullable warnings in highly nested code.

Using explicit types also has another benefit - it makes nullable types stand out - for me these days all reference types should be non-nullable unless there is a good reason and seeing these exceptional cases clearly is helpful. Although I'd rather have a nullable-aware var. 😀

HaloFour commented 3 years ago

@markm77

But I'd rather not have to work backwards and replace var with explicit types one by one to solve nullable warnings in highly nested code.

Under what circumstance do you have to do this? The compiler uses flow analysis to determine the state of nullability of the variable at the point of usage. It doesn't matter if var is inferred to be string? if that variable is definitely assigned to a non-nullable value then there would be no warning.

markm77 commented 3 years ago

Thanks for your comment @HaloFour . I realise with some experiments I may have not understood the situation correctly. For example see code below which shows where warnings occur.

It would appear the flow analysis is willing to check a type annotation on assignment but then proceeds to completely ignore it for further checking.

So maybe I need to review my thoughts on var....

            string? s1 = (string?) "hi";
            string t1 = s1; // CS8600 warning

            string? s2 = "hi";
            string t2 = s2;

            string s3 = (string?) "hi"; // CS8600 warning
            string t3 = s3; // CS8600 warning

            string s4 = "hi";
            string t4 = s4;

sharwell commented 3 years ago

for me these days all reference types should be non-nullable unless there is a good reason and seeing these exceptional cases clearly is helpful

The current definition of var automatically supports this case. The variable will have a nullable value if there is a good reason (i.e. a nullable value is assigned to it), and it will have a non-nullable value otherwise. The static type of the local variable is irrelevant and does not appear in the compiled code at all.

markm77 commented 3 years ago

Yes thanks @sharwell , seems like flow analysis nullability of a variable is set at object creation and by forcing (e.g. via a cast) but crucially not by assignment. Seems incorrect assignment triggers a CS8600 warning but the flow analysis nullability of a variable is not reset.

I use Rider as my IDE so I'm not sure I can see the flow analysis nullability of a variable - my understanding is this is possible in Visual Studio. I guess this is crucial information if using var and trying to resolve warnings.

HaloFour commented 3 years ago

@markm77

seems like flow analysis nullability of a variable is set at object creation and by forcing (e.g. via a cast) but crucially not by assignment.

It should definitely be set by assignment. See the following example, the C# compiler should only produce a single warning:

SharpLab

markm77 commented 3 years ago

Hi @HaloFour , I was referring to case like below where assigning to non-nullable s3 does not make s3 non-nullable for flow analysis purposes. Hence we get two warnings.

string s3 = (string?) "hi"; // CS8600 warning
string t3 = s3; // CS8600 warning

HaloFour commented 3 years ago

@markm77

That is a strange case. I don't know why the second assignment would also cause a warning. Not sure what that has to do with var, though.

333fred commented 3 years ago

Hi @HaloFour , I was referring to case like below where assigning to non-nullable s3 does not make s3 non-nullable for flow analysis purposes. Hence we get two warnings.
string s3 = (string?) "hi"; // CS8600 warning
string t3 = s3; // CS8600 warning

This is a case of the compiler assuming you know what you are doing. You took a value that was non-null and cast it to something that could be null (string -> string?). The compiler sees that cast and says "Alright, well, the programmer told me this could be null. I'm going to treat it as such."

markm77 commented 3 years ago

Sure, @333fred , it would seem flow analysis decides that (string?) "hi" is nullable as per the programmer's wishes. I guess I could have used any nullable string in place of this. (Although it is interesting that a cast resets flow analysis nullability.)

I guess the main effect of interest to me was that then assigning this to non-nullable s3 didn't reset the flow analysis nullability. Hence the second warning.

To respond to @HaloFour , the relevance to var is that if assignments from non-nullable to nullable don't reset the flow-analysis nullability then they don't strip nullable info off types which was my main concern about using var.

markm77 commented 3 years ago

Small suggestion to anyone on the C# or complier team that it might be really helpful to document somewhere the difference between static nullability and flow analysis nullability of a reference type - and in particular what rules determine when the flow analysis nullability is reset.... I don't think I saw this anywhere and I've read quite a lot of stuff on NRT....

Anyway thanks everyone for improving my understanding of NRT tonight!

markm77 commented 3 years ago

Nice example below showing the same variable causing the opposite problems due to two notions of "nullability"....

            string s3 = (string?) "hi"; // CS8600 warning
            string t3 = s3; // CS8600 warning (assume due to flow analysis nullability of s3 being "nullable")
            s3 = null; // CS8600 warning (assume due to static nullability of s3 being "non-nullable")

333fred commented 3 years ago

Nice example below showing the same variable causing the opposite problems due to two notions of "nullability"....

            string s3 = (string?) "hi"; // CS8600 warning
            string t3 = s3; // CS8600 warning (assume due to flow analysis nullability of s3 being "nullable")
            s3 = null; // CS8600 warning (assume due to static nullability of s3 being "non-nullable")

These two cases aren't meaningfully different though. Both of these warnings are because you're assigning something that could be null to something that does not allow null. Note that you can observe the flow state of a variable by hovering over a usage. So for example:

QuickInfo shows you both the declared type, string?, and the current flow state, 'test' is not null here. This screenshot is vscode, but VS does the same thing.

markm77 commented 3 years ago

Thanks for this @333fred , yes complier behaviour makes sense and flow state being visible is great and reduces downside of always-nullable var.

I'm thinking about whether to experiment using var again instead of explicit types. I currently use Rider as my IDE and am not sure if that exposes flow state. I guess in terms of catching code issues the main downside of using var is that unwanted nullable assignments to a var variable will not be flagged (although you might pick issues up later when calling a function or passing a return value).

LWChris commented 3 years ago

Using explicit types also has another benefit - it makes nullable types stand out - for me these days all reference types should be non-nullable unless there is a good reason and seeing these exceptional cases clearly is helpful.

But it has the disadvantage of requiring more usings, plus if the type is more complex, e. g. a return type from a complex LINQ statement, it gets really wordy. I really hope at some point in the future "var" will be "dumb" again, i. e. simply take the right hand side, without any assumptions or pseudo-clever guesswork.

Return type of my method is int?, var should take int?. Return type is IEnumerable<Task<int?[]>>, var should take IEnumerable<Task<int?[]>>, not IEnumerable<Task<int?[]>>?.

LWChris commented 3 years ago

I just hope the "stricter" ruleset will become available

This is a prime case for an analyzer. Indeed, any time the topic is around "the language should be more restrictive and produce errors in this code that was previously legal", all that really means is "write an analyzer to find and mark the patterns you don't want to allow".

Now you don't need to wait on anything. You could write this today and have the stricter semantics you desire.

Can you link me to such an analyzer, i. e. one where declaring string s strictly inhibits s = null with #nullable enable? I tried looking into this, but having never done anything remotely akin to writing an analyzer, this task is way above the amount of time I can spend on learning how to do this right now.

markm77 commented 3 years ago

I really hope at some point in the future "var" will be "dumb" again, i. e. simply take the right hand side, without any assumptions or pseudo-clever guesswork.

Chris, obviously I was also disappointed about var being always nullable given my strong preference for "non-nullable by default and where possible".

However in terms of the future I wonder if a more realistic and in some ways more useful improvement might be a new non-nullable var with a new keyword (say varn) which allows the programmer to express that a new variable should be non-nullable with inferred type. varn would allow accidental nullable assignments (including at point of declaration) to be flagged without losing the convenience of inferred types (var/varn). This would also allow the programmer to see nullability at a glance (no hover required 😀) due to variables being var or varn. Just a thought....

markm77 commented 3 years ago

Here's another way use of var in nullable contexts could be improved today without language change.

What if the IDE used different syntax highlighting for the var keyword to reveal whether a variable is non-nullable convertible or not?

That way any unwanted nullable assignments would change the colour of the variable's var keyword and the "truly nullable variables" in a file would be clearly flagged to the programmer by the colour of the var keyword. (I suspect this is very possible because in Rider when using explicit types the "non-nullable convertible" variable types have a greyed-out ?.)

Something changing colour is not as good as a warning but this would not involve language change and seems feasible.

Edit: Or, dare I suggest it, instead of colour change maybe the IDE could suffix var with a ? annotation when the variable is not "non-nullable convertible".

markm77 commented 3 years ago

Here's another option for improving var with NRT in future which would not break existing code.

Why not make the var inferred type nullable or not based on both the RHS of the declaration and subsequent assignments? I.e. set it based on "flow engine nullability"?

Then the IDE could indicate when the var inferred type is nullable with a suffixed ? annotation?

This would remove the completely un-necessary confusion of "flow engine nullability" differing from "static nullability" in the case of var.

(Optionally, we could also consider e.g. nonnullable var and nullable var to force the nullability to fix detection issues or express a requirement. Explicit types also do the job but are not great for re-factoring and sometimes quietly convert to casts.)

dotnet / csharplang