dotnet / csharplang

The official repo for the design of the C# programming language
10.94k stars 998 forks source link

Champion: Simplified parameter null validation code #2145

Closed jaredpar closed 2 years ago

jaredpar commented 5 years ago

Specification: https://github.com/dotnet/csharplang/blob/main/proposals/param-nullchecking.md

In short though this allows for standard null validation on parameters to be simplified using a small annotation on parameters:

// Before
void Insert(string s) {
  if (s is null)
    throw new ArgumentNullException(nameof(s));

  ...
}

// After
void Insert(string s!!) {
  ...
}

LDM history:

Thaina commented 5 years ago

Shouldn't this be string! ?

And also doesn't C# will always warning on nullable reference type from now on?

jaredpar commented 5 years ago

@Thaina

Shouldn't this be string! ?

No. This feature is for runtime value checking only. It does not affect the type system in any way. Hence the syntax annotation is on the value identifier, not the type.

And also doesn't C# will always warning on nullable reference type from now on?

Nullable reference type checking is an opt-in feature. Also it is designed to warn on potential null references in the code base but it doesn't affect the execution semantics of it.

qrli commented 5 years ago

Maybe over simple, though :laughing: And it looks like a declaration rather than implementation

DavidArno commented 5 years ago

Great to see this proposal has resurfaced, and has been championed. I think it an excellent idea but thought it had been lost amongst all the other work going on around NRTs. Thanks, @jaredpar. 👍

Joe4evr commented 5 years ago

Shouldn't this be string! ?

No. This feature is for runtime value checking only. It does not affect the type system in any way. Hence the syntax annotation is on the value identifier, not the type.

You say that, but the original idea does put it on the type.

GeirGrusom commented 5 years ago

It's the dammit operator placed on the argument. It makes sense that it is applied to the value rather than the type.

alrz commented 5 years ago

It makes sense that it is applied to the value rather than the type.

that logic works only for parameters not property setters (that's why we'd need to invent syntax like set!). to me, this is more of a very specific code generation problem (what if I just want to Assert?), IMO the functionality should be a part of method contracts, or alternatively, some special attribute (which could be extended to other validation logics as well).

Thaina commented 5 years ago

@Joe4evr If I understand it right (from the jaredpar's comment after mine) this is actually newer feature unrelated to nullable reference type. It just a syntactic sugar to throw exception while nullable reference type is compile time warning

Logerfo commented 5 years ago

is this feature supposed to ship with 8.0 as well?

DavidArno commented 5 years ago

@Logerfo, my money is on "unlikely, but not impossible". Seems a candidate for 8.1+ to me. But I could be wrong...

bbarry commented 5 years ago

In isolation I think this is great but I'd much rather see a more comprehensive contract annotation system with a potential way to expand it for pattern matching validations and potentially even arbitrary code validation as well as return value validations, even if those advancements don't come as part of a first version of this.

In short I worry that this:

ReadOnlySpan<char> Foo(string s!, int n) 
{
    ...
}

means something like this will never happen:

ReadOnlySpan<char> Foo(string s, int n)
    requires s != null // perhaps: requires s! 
    requires s.Length > 0
    requires n >= 0
    requires n < s.Length
    ensures value.Length > 0 
{
    ...
}
Joe4evr commented 5 years ago

If I understand it right (from the jaredpar's comment after mine) this is actually newer feature unrelated to nullable reference type.

You're not wrong in that this feature could be achieved without NRTs, but the initial idea of having the compiler generate null checks for parameters (which is an extremely common scenario) came up during discussions of the NRT design. (Or rather, that was the moment the LDT would take it seriously. Theoretically, we could've had a syntax for generated null-checks ages ago.)

And while it's true that the check is about the value and not the type, applying the ! to the type 1) makes it more symmetrical, and 2) would probably make a lot of edge cases moot. (Granted, the reverse index syntax shows how much they care about symmetry. ¯\_(ツ)_/¯)

GrabYourPitchforks commented 5 years ago

I spoke with @tannergooding a little regarding this. He suggested that it might be more sensible for the compiler to emit a call to a helper method to perform the throw instead. I agree with this, but for somewhat different reasons.

In my idealized scenario, the null check would actually be emitted by the JIT / codegen, not the C# compiler. The reason for this is that the JIT can generally handle this more optimally since it's already responsible for setting up things like exception dispatch. Consider the following method, whose prologue is quite common throughout the framework.

public void DoSomethingWithArrays(byte[] buffer, int offset, int count)
{
  if (buffer is null) { /* throw */ }
  if ((uint)offset > (uint)buffer.Length || (uint)count > (uint)(buffer.Length - offset)) { /* throw */ }

  /* actual method logic goes here */
}

With the proposed syntax the method declaration DoSomethingWithArrays(byte[] buffer!, int offset, int count) would allow the developer to elide the first line, and the C# compiler would emit essentially the same thing.

What I would love to see is the JIT elide the check entirely. Instead, optimistically and immediately attempt to dereference buffer.Length without performing a null check:

mov eax, dword ptr [rcx + 8h] ; rcx := buffer, move buffer.Length into eax

If rcx (buffer) is null, this will immediately trigger an interrupt just like any other null ref would have. The runtime would need to perform some sleight of hand to deduce that this should become an ArgumentNullException instead of a standard NullReferenceException. The end result is that the original method is faster in the happy path of no exception being thrown - less code gen to emit, smaller method sizes, fewer branches - at the expense of making exception dispatch a little more complicated. Generally this seems like a good tradeoff to me.

Coming back to the start of this comment, the reason this matters from a compiler perspective is that we should plan for a future JIT which might have these optimizations. There are a handful of ways we could do this: put a special attribute on the arguments, call a helper method to perform the null check, or smuggle some other metadata that allows the JIT to recognize and remove the compiler-emitted checks. I don't really have a vested interest in which particular mechanism is used, but I wanted to make the compiler team aware of the implications of the feature design.

HaloFour commented 5 years ago

@GrabYourPitchforks

And what if this method doesn't immediately attempt to use the non-nullable argument and instead does some body of business logic which definitely should not be happening if the argument is null? That seems like a lot of tricky optimization that would only apply to very specific codepaths.

GrabYourPitchforks commented 5 years ago

@HaloFour In most cases I've seen the null check is only meant to avoid a null ref exception on the line immediately following, which generally is a candidate for this optimization. This should serve the majority of developers well by improving the efficiency of their code. Any other more complex code would still go down the standard "perform an explicit null check" path.

HaloFour commented 5 years ago

@GrabYourPitchforks

So the specific optimization would only apply to a method accepting a single reference argument that shouldn't be null where the code immediately attempts a deference? Is that really so common? And if it is, wouldn't it be simple enough for the JIT to recognize and optimize without requiring a new special handshake from the compiler?

I'm not making the argument against making things faster for a common case, especially if it's common in the BCL itself, this just seems oddly specific. Even in your use case above, what happens if the C# code validates that offset is >= 0 before accessing buffer.Length, does that defeat this potential optimization?

HaloFour commented 5 years ago

@GrabYourPitchforks

Rereading your original comment it does sound like you're suggesting that this is something that the JIT could do, aided by the compiler emitting a call to a helper method to throw the exception instead of throwing it directly. If that's the case, sounds good to me. :)

jaredpar commented 5 years ago

@GrabYourPitchforks

Coming back to the start of this comment, the reason this matters from a compiler perspective is that we should plan for a future JIT which might have these optimizations

In this case I think we are fairly safe to change the implementation of the throw at a later date. Consider if the runtime introduced a helper Runtime.ParameterNullCheck<T>(T arg). A newer version of the compiler could bind to that if it existed and fall back to a generated helper if it didn't. There isn't much of a compat concern here as there is no user code involved.

Korporal commented 5 years ago

I think C# is getting unwieldy, newer features are leading to more and more idiosyncratic syntax and non intuitive behavior, consider the assymetric Index concept and the odd way ref and var behave. The language is becoming more and more inelegant each year. Its time for a new language with a richer grammar, these C based grammars are crippled.

HaloFour commented 5 years ago

@Korporal

What does any of that rant have do with this proposal?

Korporal commented 5 years ago

@Korporal

What does any of that rant have do with this proposal?

@HaloFour - I'm voicing an opinion on the contrived use of "!" which usually means "not" but will now have another meaning. This is being done because the options are limited due to the inflexible grammar.

I then go on to suggest that perhaps a new language with a better more flexible grammar is the way to go rather than performing syntactic acrobatics, for example a grammar that has no reserved words would perhaps offer much greater options for adding new features over time.

HaloFour commented 5 years ago

@Korporal

Funny, your rant doesn't mention that at all.

Syntax often has multiple meanings in many different languages. Many languages overload the definition of !, such as Rust and Swift.

I then go on to suggest that perhaps a new language

You're free to start a new language if you want, or use any of the other many, many languages in the ecosystem, .NET or otherwise. But I'd suggest that design discussions for languages that are not C# aren't appropriate for a repo specific for the design of the C# language, nor are you likely to recruit many like-minded individuals.

CyrusNajmabadi commented 5 years ago

I then go on to suggest that perhaps a new language with a better more flexible grammar is the way to go rather than performing syntactic acrobatics,

Can you please take that discussion elsewhere? It is specifically out of scope and unnecessary in this specific issue. You are free to do any of the following as it suits you:

  1. take it to gitter.im. That's a good place to wax philosophical about what you want the language to be.
  2. blog about it on any platform of your choice.
  3. create a honey-pot issue for that discussion. while i would prefer it not be in csharplang at all, this would at least localize the dicsussion to a single space instead of leaking out and getting mixed up into real issues that are being worked on.
MrJul commented 5 years ago

I find it weird that this feature is being championed at a time where another feature makes it almost unnecessary.

I believe that a few years from now, when nullable reference types and C# 8 are completely adopted by the BCL and most popular libraries, this type of check will be redundant: why would you want to check for a null argument, when you already know you're being passed a non-null one (unless you're an existing public library method already documented to throw ArgumentNullException)?

I know about the implementation details of null reference types, and that everyone is still "free" to pass null to non-null arguments. In practice, once everything is correctly annotated, that won't really happen.

I don't have any crystal ball, but I already saw a similar effect in a large codebase. Everything in this code base (and I mean every field, property, parameter and return type of every method) is annotated with ReSharper's [NotNull] and [CanBeNull] attributes. (I can't wait to convert them to C# 8 nullable/non-nullable references!) At first, any public method had the classic if (arg == null) throw new ArgumentNullException() for every argument, and private/internal methods had debug assertions.

We never encountered those. With everything annotated and compile-time checked we didn't see any NullReferenceException, ArgumentNullException or null assertions fired in years in first party code. Believe in the tools! :) We started to gradually remove all those checks, since they sometimes take more lines that the method itself, plus they have a (very small) runtime cost for what is basically dead code.

Please, at least consider using an attribute like [NullCheck] instead, keeping the simple suffix syntax available in the future for other purposes. This also seems the kind of feature which won't be needed with proper code generators / AOP.

qrli commented 5 years ago

Reference #2153 Especially, to make the two features to work as one: https://github.com/dotnet/csharplang/issues/2153#issuecomment-455779032

yaakov-h commented 5 years ago

With everything annotated and compile-time checked we didn't see any NullReferenceException, ArgumentNullException or null assertions fired in years in first party code.

My team has seen the same effect from hundreds of thousands of unit tests, and from Code Contracts.

However, we often have developers try and use code in new and exciting ways, like jumping into an assembly from PowerShell or C# Interactive or similar. Plus, a lot of code such as DI ends up going into Reflection and similar.

As such, the approach my team is taking is:

Given the above, I personally do still see a use for this, even with a history of correctness proofs.

fschmied commented 5 years ago

On my team, we are currently using Fody NullGuard to automatically generate argument null checks in conjunction with the Implicit Nullability ReSharper add-in, which implements static checking for non-nullable reference types. The nice thing is that with this combination (and a little bit of configuration), NullGuard already knows which method parameters need argument null checks and which ones don't. Switching to the feature of this proposal would require us to explicitly annotate the parameters again with no gain, which feels redundant and cumbersome to me.

You might argue that being explicit is better in this case, as you enable a side effect, but if you're embracing non-nullable reference types, the side effect is only the second line of defense (for Reflection calls, type checker insufficiency, etc) and, in my opinion, doesn't deserve syntax of its own.

About the proposed syntax, I feel it's too special-purpose and closed for future extension, as it can't be extended to more general contact checking.

And finally about the position of the exclamation mark: the proposal argues that it can't be put on the type because it's not part of the type in the type system. But it really could be part of the type system quite easily by making T! denote a runtime-checked, i.e., guaranteed, hard non-nullable type (as opposed to an ordinary non-nullable type, which isn't guaranteed to be non-null). For hard non-nullable types, the compiler and JIT could implement stricter rules, e.g., emit errors instead of warning or elide null-checks as an optimization when passing a guaranteed non-null value to a T! parameter.

So, T? would be a nullable reference type, T a soft non-nullable reference type, T! a hard non-nullable reference type. Very symmetric and, IMO, more useful than a non-extensible shorthand for a very specific element of contract annotations. Even if you don't plan to implement different behaviors in the compiler at first, the syntax would allow for evolution into this direction in the future.

MichaelKetting commented 5 years ago

@fschmied I'm not a fan of the post-parameter syntax, either, but I can understand where @jaredpar is comming from when he explains why the LDT prefers the annotation on the parameter value instead of the type. It does work hand in hand with the postfix annotation for overriding the compiler's nullablity reasoning and declaring a value as non-null at runtime.

Declaring an explicitly non-nullable type for type system (or rather not doing it) goes back to the early design decisions for the nullable reference type feature and the reasoning why the nullable reference types are the new feature and not the not-nullable reference types. Doing it just for parameter declarations could work, of course, but it would be inconsistent to use a different type declaration for parameters and everywhere else in the code base. And if you start to truly embrace explicit non-nullable reference types, you get exclamation points everywhere in the code.

So, yeah, I would love to be able to just annotate/configure on a type, file, and project level which parameters (and return value possibly) to get argument checks and skip all those redundant annotations, but it's a workable comprimise.

MichaelKetting commented 5 years ago

There is another aspect for preferring the current proposal of type name! instead of type! name: Type modifiers in C# propagate through all usages of a particular value. In this instance, the type of a parameter must be identical in across all interface implementations, overrides, etc and I do not believe we would want the argument check to be something that is a mandatory part of the interface declaration and vice versa has to be declared on the interface in order to be allowed to use in the implementations.

Just to clarify, I'm not arguming agianst Liskov here, it's just a matter of style, e.g. you might decide to just not care about a particular parameter's value and thus also not care to guard against it being null. Or you want to skip the check bcause you're doing high performance code and the possible NullReferenceException instead of an ArgumentNullException would be an acceptable tradeof.

mikernet commented 5 years ago

@MichaelKetting Interface implementations should just be free to include or exclude ! as desired. Interface methods with no default implementation shouldn't even allow ! since it's an implementation concern. Interface methods with a default implementation can allow it, but other implementations of that method should be free to drop the ! if desired.

DavidArno commented 5 years ago

@mikernet,

You raise a good point here that I'm not sure has been covered before. My understanding is that ! isn't part of the method signature; it's just a shorthand way of asking the compiler to add null checking to the body of the method. So it makes no sense to allow it to be used for abstract definitions, including interface definitions that lack a default implementation. The following should be a compilation error as it's meaningless:

interface Foo
{
    void Bar(string s!); // ! cannot be used here.
}
GeirGrusom commented 5 years ago

It could indicate that there should be an <exception type="ArgumentNullException" /> in the documentation or otherwise that its an error for an implementation to omit it.

Edit: though I think it's fairly useless though since functionally you should already be warned against passing a null value to the function, so it doesn't serve much purpose that isn't already shown.

MichaelKetting commented 5 years ago

@mikernet This depends on where the ! is used. On the parameter name, yes, I concur. Thing is, I was discussing the pros and cons for using the ! the parameter type instead of the parameter name. And allowing a signature change between declaration and implementation (or with an override) has not been part of C#'s langauge philosophy until now.

mikernet commented 5 years ago

@MichaelKetting It wouldn't matter, the implementation would be the same. Just like nullable reference types with ? won't be part of the signature, either would !.

mikernet commented 5 years ago

And for consistency with NRTs I definitely vote for having it on the type.

MichaelKetting commented 5 years ago

@mikernet Maybe there is a misunderstanding or the C# team has revised it's deisgn from the last public drop available on SharpLab (Oct 29th, 2018), but ? is very much part of the signature to the C# compiler:

using System;
#nullable enable
public interface I {
  void M(string? x);
}
public class C : I {
    public void M(string x) {
    }
}

gives your the following warning:

warning CS8614: Nullability of reference types in type of parameter 'x' doesn't match implicitly implemented member 'void I.M(string? x)'.

To me, this implies that the type of parameter 'x' is intended to include the nullability status just as it does with value types. The fact that it's just a warning, only comes into play during compile time and is presently (if ever) not a CLR feature isn't really important when it comes to reasoning about the semantics.

mikernet commented 5 years ago

@MichaelKetting It's a warning and it can be overridden. It's still not part of the method signature, it's just a warning the compiler gives you. If it was part of the method signature you wouldn't be able to get around it, it would be a compiler error. Regardless, this is different because it doesn't and shouldn't affect the analyzer the way ? does, it's purely a runtime implementation shortcut.

EDIT: sorry the last sentence was confusing, edited for clarity.

mikernet commented 5 years ago
using System;
#nullable enable
public interface I {
  void M(string? x);
}
public class C : I {
    #pragma warning disable CS8614 
    public void M(string x) {
    }
    #pragma warning restore CS8614 
}
mikernet commented 5 years ago

Nullable value types are different - it IS part of the method signature because an int? is a different type than int. There's no way to implement a method with an int? parameter with a method that takes an int parameter.

Having said that...one thing that has just dawned upon me while writing this is that there may be situations where you want to apply this to a nullable value type parameter. This would be very rare so I'm now torn about whether it makes sense to put the ! after the name for this use case, but consider this example:

using System;
#nullable enable
public interface I {
  void M(int? x);
}
public class C : I {
    public void M(int? x!) {
    }
}

There may be situations where you are overriding or implementing a method that takes a nullable value type but you still want to add a runtime null check. Though I guess there's always this

    public void M(int?! x) {
    }

Hahaha...it looks like a very confused and surprised int...

Someone above mentioned a more general contract syntax that would support more than just nullability checks but I haven't seen much discussion about that - is there anything on the horizon for that actually happening, i.e. a championed issue? I think this is ultimately the most important discussion to have here.

If more general contract syntax is where we ultimately want to end up, then does it even make sense to cloud the language with another way to do a subset of the same thing, or is this still worthwhile?

Right now I'm personally leaning towards "this is still worthwhile". I personally rather write this:

    public void M(string! x) 
        requires x.Length > 10
    {
    }

Than this:

    public void M(string x) 
        requires x != null
        requires x.Length > 10
    {
    }
MichaelKetting commented 5 years ago

@mikernet

Nullable value types are different - it IS part of the method signature because an int? is a different type than int. There's no way to implement a method with an int? parameter with a method that takes an int parameter.

Yes, that's true, but to me, this is merely an implementation detail, same as the descision to make the NRT-stuff just warnings. The LDT simply couldn't change the CLR and/or make NRT feature on the same level as generics or nullable value types without breaking the .NET eco-system. Thus, we are left with the next-best thing.

This in turn means (to me) that I should view C# 8 and the semantics expressed by the code in same light as I would nullable value types.

The second aspect, to me, is that generating a null-check for the argument is a descision that's bound to the parameter and not the type (I like the original reasoning of the feature proposal in that regard).

Anyway, I find your suprisingly confused integer interesing. Boolean would be even more fun, maybe even a Schrödinger's value :D Although, I'm not sure where I would ever be trying to do an argument null a nullable value type since this would mean a Liskov violation if the base type or an interface requires the NVT and could just change the contract on all other cases.

I did notice, too, that the contract discussion went quiet here. To me, the problem is mainly one of scope. From what I gather, the ! could make it into C#8 with NRTs, a contract syntax most definitely would be out of scope until at least the next major version. The other problem is, that there is still a lot of noise there, as you mentioned. Too much, for me, to use with ~10,000 argument null checks like I have in my code base (and no, I'm not joking about that number). So yeah, a shorthand will still be needed.

KalleOlaviNiemitalo commented 4 years ago

Is the void Insert(string s!) feature intended to be compatible with .NET Framework 4.8 and earlier versions, too? It looks like it could easily be made compatible, but this paragraph in Introducing .NET 5 makes me doubt: "C# will move forward in lock-step with .NET 5. Developers writing .NET 5 apps will have access to the latest C# version and features."

MichaelKetting commented 4 years ago

Well, there's no technical reason why it shouldn't be supported, and it would be a major bummer if it weren't. Anyway, my takeaway was that this refers to features that require runtime support (default interface implementations) or special APIs only available in .NET Standard 2.1 / .NET 5 (e.g Index/Range).

I'm looking forward to learn how the compiler will check what's supported and what's not, since right now a lot of the API stuff is simply handled in terms of availability checking.

Grauenwolf commented 4 years ago

Would it be better to look for a [NotNull] attribute? Reasoning...

  1. Many libraries already have these.
  2. It is easier to see than ! when looking at the code
  3. It establishes a pattern for other declarative attribute checks.

The last one is important because we often want to check for empty as well (strings/lists).

Grauenwolf commented 4 years ago

I would also like to add that attributes are easier for existing tools to see (static analyzers, documentation generators).

CyrusNajmabadi commented 4 years ago

Would it be better to look for a [NotNull] attribute? Many libraries already have these.

That could change the meaning of existing code.

Grauenwolf commented 4 years ago

True, but you could restrict it to a specific namespace.

EDIT: Actually, I take that back. It would be opt-in at the compiler level, so exisitng code would be safe.

mikernet commented 4 years ago

@Grauenwolf If you want more generalized parameter checking then I think the other proposals mentioned above for better contract syntax is much cleaner than attributes. That said, I would love to see AOP features built into the compiler for other uses of attributes, but that's a different topic.

Grauenwolf commented 4 years ago

AOP is something that we've been talking about since Roslyn was announced. Maybe .NET 9 will finally be our opportunity.

RikkiGibson commented 4 years ago

I noticed a bit in the spec that's probably a typo:

Intersection with Nullable Reference Types ...

void WarnCase<T>(
    string? name!, // Warning: combining explicit null checking with a nullable type
    T value1 // Okay
)

Seems like the parameter T value1 was probably meant to be T value!

joshlang commented 4 years ago

This is my favorite upcoming language feature

SOOOOOOOOOOOOOO many lines of code will be saved. Replaced with a single character. Holy moly I'm gonna be so efficient.

I like writing single-statement methods. Functional-ish type programming. Often, I don't just because the ?? throw … statements ruin the flow of what what I'm trying to do and make things less elegant/clear.

114% of all parameterized constructors start off with null checks or ?? null-checking assignments.

With a single character or lack thereof, I can instantly see whether I've forgotten a null check.

And I could totally abuse it by doing stuff like: void Yay(string[] awesomeStrings!) => AwesomeStrings = awesomeStrings.Select(x! => x).ToArray();

ALSO! You could add a compiler warning. If <Nullable>enable</Nullable>, the compiler could warn if a method signature is void Hello(string there) and suggest that it becomes void Hello(string there!)

Anyway... 8.0. PLEASE.