dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.02k stars 4.03k forks source link

Proposal: guard statement #11562

Closed gafter closed 8 years ago

gafter commented 8 years ago

When reviewing the scoping rules for out variables and pattern variables, the LDM discovered that there is sometimes an unfortunate interaction with a pattern of coding that we call the guard pattern. The guard pattern is a coding pattern where you test for the exceptional conditions early, handle them, and then bypass the rest of a method by returning, the rest of a loop by continuing, etc. The idea is to not have to deeply nest the "normal" control path. You'd write code like this:

    int parsedValue;
    if (!int.TryParse(s, out parsedValue))
    {
        // handle or report the problem
        // this block does not complete normally
    }

    // use parsedValue here

To take advantage of the out var feature, and avoid having to write the type of the out variable, you'd like to write

    if (!int.TryParse(s, out var parsedValue))
    {
        // handle or report the problem
        return;
    }

    // use parsedValue here

but that doesn't work because parsedValue isn't in scope in the enclosing statement. Similar issues arise when using pattern-matching:

    Tuple<string, int> expression = ...;
    if (!(expression is @(string key, int value))
    {
        // either the tuple was null or the string was null.
        // handle the problem
        continue;
    }

    // use key, value here

(The @ here assumes that tuple patterns are preceded by an @ character)

but that doesn't work because key and value aren't in scope after the if statement.

Swift addresses this category of issue by introducing the guard statement. Translating that into the concepts we've been proposing for C# with tuples and pattern-matching, the equivalent construct would be a new kind of statement:

statement
    : guard_statement
    ;
guard_statement
    : 'guard' '(' expression ')' 'else' embedded_statement
    ;

Open Issue: should we require parens around the expression as part of the syntax?

A guard statement is like an inverted if statement, or an if statement without a "then" block. It has the following semantics:

  1. The expression of a _guardstatement must be a boolean expression.
  2. The _embeddedstatement is reached when the expression is false.
  3. The statement following the _guardstatement is reached when the expression is true.
  4. The _embeddedstatement may not complete normally.
  5. Any pattern variables and out variables declared in expression are in the scope of the block enclosing the _guardstatement.

This would allow you to handle the previous examples as follows:

    guard (int.TryParse(s, out var parsedValue))
    else {
        // handle or report the problem
        return;
    }

    // use parsedValue here
    Tuple<string, int> expression = ...;
    guard (expression is @(string key, int value))
    else {
        // either the tuple was null or the key was null.
        // handle the problem
        continue;
    }

    // use key, value here

/cc @dotnet/ldm

mattwar commented 8 years ago

I'm thinking the keyword 'assert' would read better here than 'guard'. I'm not sure what guard is guarding.. it seems like it is named after the jargon-y term named after the common pattern of writing code. And I transpose the u and the a when typing guard all the time. Where as 'assert' is nice because it reads like I'm making a claim about the expression being true and implies that there is nothing special to be done when it is so.

gafter commented 8 years ago

@mattwar An assertion is making a claim about the expression, but the guard doesn't make such a claim. It tests whether the condition is true or not, expecting that it could be either, and provides code to handle the "not true" case and "get out of there".

CyrusNajmabadi commented 8 years ago

Why not just:

guard (expr)
    statement

?

The 'else' reads super weird to me.

With the above i can then write:

guard (n != null)
    throw new ArgumentNullException(n)

Or

guard (int.TryParse(out var v)) {
    return;
}

// use v
Pilchie commented 8 years ago

@CyrusNajmabadi That sounds like Perl's unless statement

mattwar commented 8 years ago

@gafter It seems to work just like a Debug.Assert expression, except it not debug and I get to say what happens when it fails. That's a strong association that every programmer already has.

Guard just doesn't tell me what it does. It tells me I have to do a web search to find out.

CyrusNajmabadi commented 8 years ago

Also, i thnk i like 'unless' as well. it reads nicely. "unless n is not null, throw this thing. unless this int parses, return from what i'm going." 'guard' doesn't immediately translate well in my head.

gafter commented 8 years ago

Plenty of discussion over on #8181 which however does not discuss the scope rules. Also related is #6400 and the let statement in the pattern-matching specification.

svick commented 8 years ago

Wouldn't it be simpler and more consistent if out var variables were in scope in the enclosing statement?

To me this makes the most sense: out parameters are always set. It's common that they are set to a useful value only under some conditions, but it shouldn't be up to the language to guess what that condition is.

gafter commented 8 years ago

@svick The same issues occur with pattern-matching, and for the same reasons. We're pretty comfortable with the tentative scope rules we have today, but this is one particular use case where they rub us the wrong way.

ljw1004 commented 8 years ago

I want to offer a counter-proposal which I think is simpler and satisfies most of the same scenarios:

_Proposal: Lifting of out variable declarations_. Whenever a variable is declared with the out modifier, its scope is expanded to the enclosing block.

Example: TryGet

dictionary.TryGet("key", out var value);

// easier than the existing code:
Tuple<Func<string,bool>, string, string> value;
dictionary.TryGet("key", out value);

Example: pattern matching

if (!(o is out string s)) return;
Console.WriteLine(s);

Example: guards

if (!int.TryParse(input, out int i)) throw new ArgumentException(nameof(input));
Console.WriteLine(i);

Unexample: limited scope out varaibles

if (int.TryParse(input, out int i)) { Console.WriteLine(i); }
// Under this proposal, there's no way to limit the scope of "out arguments" to just the 'then' block

if (o is string s) { Console.WriteLine(s); }
// But there is still a way to limit the scope of pattern variables to just the 'then' block

Example: loop

for (out int i=0; i<10; i++)
{
  if (stop_early) break;
}
Console.WriteLine($"We got to {i}");

Example: foreach loop

foreach (out var x in GetEnumeration(out var dummy)) { ... }
// Note that "x" isn't definitely assigned after the foreach (in case the enumeration was empty).
// I just put this example to show that the rules are uniform: the "out" modifier
// always pushes out the variable's scope, regardless if it's the loop variable
// or an out variable

Example: normal variable declaration

void f() {
  out int x = 15;
}
// Although technically meaningful, it has the same effect as just "int x = 15" here,
// and I think it should be disallowed as confusing.

The idea is that in many cases (e.g. patterns, "is" testing, "then" clauses) then there's no need to expand to the enclosing scope. But in many cases (e.g. guards) there is. This proposal gives the user control of whether the variable's scope expands out, in an at-a-glance understandable syntax.

As the "unexample" above shows, there's one corner where this proposal doesn't satisfy: out arguments with limited scope. I think this corner is small enough (and the penalties of omitting this corner small enough) that it's worth the sacrifice to get a simple feature.

alrz commented 8 years ago

This is useful for out variable declaration, but I'd prefer let if I want to match a potentially incomplete pattern.

let (string key, int value) = expression else continue;
guard (int.TryParse(str, out var parsedValue)) else continue;

In both of these examples, the end of embedded-statement must be unreachable, so in my opinion guard and else should be used here. In contrast, unless does not imply such requirement.

I'll also note that guard is a potential alternative mechanism for writing method contracts, because the proposed syntax in #119 just blows up the method signature. Rel: https://github.com/dotnet/roslyn/issues/11308#issuecomment-219213073

HaloFour commented 8 years ago

The one thing I like about guard is that it serves as a helper for validation logic to ensure that the method does bail out:

private void Foo(string s) {
    if (s == null) {
        Log.Warn("s is null!"); 
        // oops, forgot to return or throw
    }
    int l = s.Length; // boom
}

private void Bar(string s) {
    guard (s == null) else {
        Log.Warn("s is null");
        // compiler error, must use throw, return, break or continue to exit current block
    }
    int l = s.Length;
}

This proposed behavior that pattern and out variables survive into the enclosing scope is a welcome addition.

As for the guard keyword specifically, I don't really care, but I don't think that unless conveys the point well enough that the current block is not allowed to continue if the condition does not match. This fits more into validation. "Guard clause" does seem to be the accepted CS term for this specific form of condition, although outside of Swift (which has nearly identical syntax to this proposal) it seems to relate more to pattern guards.

qrli commented 8 years ago

It seems all we need is to invert the scope of the out var, so why not:

    if not (int.TryParse(s, out var parsedValue))
    {
        // handle or report the problem
        return;
    }

    // use parsedValue here

In this way, you don't need to introduce a completely new statement, but only an alternative of if: the compound if not. It also solves another minor issue: the ! before the expression is hard to see and usually you need an extra level of parenthesis.

HaloFour commented 8 years ago

@qrli You're still introducing a new statement and a new keyword. An "inverted if" also doesn't imply that it would force exiting the current block.

Thaina commented 8 years ago

@qrli I have request if! syntax in the past

But somehow I think it could possible if we just use else(expression)

Normally we must use else if or else { } plainly. So we could add else() as another way to use it

I'm not native english speaker so if this sound weird I would support unless instead of if not or guard

dsaf commented 8 years ago

Doesn't this encourage e.g. pushing argument validation down the call chain?

I would personally like to see method contracts #119 implemented first. The concepts are different of course, I understand that.

Alternatively, how about D-style contracts, but changed so that variables declared in the in contract block are accessible to the body?

qrli commented 8 years ago

@HaloFour It adds only a contextual keyword instead of a keyword, which is already better. Exiting can still be enforced by compiler, if necessary. However, I don't think it is necessary to enforce exiting. There is nothing to continue and access them, as out variables are definitively assigned anyway. e.g. I may log the invalid input but continue process with default value.

Porges commented 8 years ago

@gafter Regarding "that doesn't work because key and value aren't in scope after the if statement"... couldn't we fix this by broadening the scope of the variables declared in the if statement? I think this would work because we already have wording in the spec that the variable is only "definitely assigned" if the match returns true.

More formally:


Currently the patterns proposal contains the following wording:

If the pattern appears in the condition of an if statement, its scope is the condition and controlled statement of the if statement, but not its else clause.

I suggest that the scope of the variable is broadened to be as if the variable was declared just before the if statement. This is because we already have the following restriction:

Every identifier of the pattern introduces a new local variable that is definitely assigned after the is operator is true (i.e. definitely assigned when true).

So if you write the following, the semantics are already as desired:

if (maybeX is Some x)
{
    // x is in scope and definitely assigned
}
else
{
   // x is in scope, but not definitely assigned (unusable)
}

The way the specification is currently written precludes the possibility of useful code such as:

void Handle(Option<T> maybeX)
{
    if (!(maybeX is Some x)) return;
    // x is now in scope and definitely assigned
}

Or:

if (!(dictionaryFromTheFuture.TryGetValue(key) is Some x))
{
    x = defaultValue;
}

// x is in scope and definitely assigned

And this code from the original post is now valid:

    Tuple<string, int> expression = ...;
    if (!(expression is @(string key, int value))
    {
        continue;
    }

    // use key, value here - they are in scope and definitely assigned

These examples also suggest that an unless (if (!...)) keyword could work quite nicely :grin:

gafter commented 8 years ago

@Porges The reasons for the current scoping rules are, among other things, to help you with code such as a series of if-then-else, so you don't have to invent a new name for each branch.

Porges commented 8 years ago

@gafter hmm, I would have thought that that would be a rarer situation than things like the above. Thanks for the reasoning, though.

Perhaps you should be permitted to shadow a non-definitely-assigned variable? :grinning:

gafter commented 8 years ago

Perhaps you should be permitted to shadow a non-definitely-assigned variable?

We need to know the scoping of the variables (so we know what is being assigned) before figuring out whether they are definitely assigned or not, so that doesn't work.

Thaina commented 8 years ago

Maybe we could use return if

void Guard(string s)
{
    return if(!int.TryParse(s, out var value))
    {
        // handle or report the problem
        // auto return after this block
    }

    // use value
}

int GuardError(string s)
{
    return if(!int.TryParse(s, out var value))
    {
        // handle or report the problem
        // ERROR it must return something
    }

    // use value
}

int Guard(string s)
{
    return if(!int.TryParse(s, out var value))
    {
        // handle or report the problem
        return 0; // need to return value
    }

    // use value
}

int GuardReturn(string s)
{
    // Maybe this syntax
    return (0) if(!int.TryParse(s, out var value))
    {
        // handle or report the problem
    }

    // use value
}

int GuardReturn(string s)
{
    // Or this syntax
    return if(!int.TryParse(s, out var value))
    {
        // handle or report the problem
        0; // Auto return last line
    }
   /* Above code would be transpiled
    if(!int.TryParse(s, out var value))
    {
        // handle or report the problem
        return 0;
    }
   */

    // use value

    // So we could
    return if(!int.TryParse(s, out var value))
    {
        int.Parse(0); // Auto return last line of block
    }
}
bondsbw commented 8 years ago

@Thaina I kind of like that syntax, I assume reusing existing keywords is generally preferred to inventing new ones so long as they don't make things more confusing.

Would that syntax support throw as well?

And how about a statement form if returning is all it needs to do?

void Guard(string s)
{
    return if(!int.TryParse(s, out var value));

    // use value
}
Thaina commented 8 years ago

I think this guard/returnif is just a syntax to enforce error and will be ignore at compile time so everything will work normally include throw

qrli commented 8 years ago

@Thaina Or use break if

break if (!int.TryParse(s, out var value))
{
    return 0; // or throw new ArgumentException();
}
Thaina commented 8 years ago

@qrli I think that interesting. But would it be confused in while or switch block?

Also I like return more if it would auto return on void

bondsbw commented 8 years ago

Agreed, break can be a valid keyword in the same locations but it is semantically different from return. That could make readability more difficult.

Miista commented 8 years ago

Couldn't this be fixed by simply declaring value before the if statement?

HaloFour commented 8 years ago

@Miista

That would only handle the scope issues if using out var, but that wouldn't work for type-switch or variable patterns where you are not permitted to assign the result into an existing variable and the scope of the variable is limited to a prescribed scope:

if (!(foo is Bar bar)) {
    // bar is in scope here, but not definitely assigned
}
else {
    // bar is not in scope here
}
// bar is not in scope here either

guard (foo is Bar bar) else {
    // bar is not in scope here
}
// bar is in scope here and is defined

The other feature that guard has is to not permit code to fall out of the scope of the compensating block. It would be a compiler error if the block doesn't throw, return, break or continue.

omariom commented 8 years ago

Trying ensure :eyes:

ensure (int.TryParse(s, out var parsedValue)) else return;

ensure (ptr != null) else
{
    Environment.FailFast();
    return;
}
qrli commented 8 years ago

@omariom I think the issue with ensure is that it reads like part of code contract, while code contract logic may be completely removed in release build.

aluanhaddad commented 8 years ago

Just to :bike::house: how about

when (foo == null) {
    return; // throw, break or continue
}
bondsbw commented 8 years ago

That doesn't seem different enough from if to make me think it would behave differently.

aluanhaddad commented 8 years ago

@bondsbw Fair enough, I just want to avoid the else clause in some way.

omariom commented 8 years ago

Everyone seems like the idea. But naming is hard.. At least we don't have to invalidate cache :smile:

qrli commented 8 years ago

It is not only about naming.

1st: The guard statement is mostly an inverted if. Without out vars, it is more or less interchangeable with if. It means we will have two ways to do the same thing, which reminds of the headache of C++.

2nd: "Invert if" is a routinely used refactor. With guard, there will be two ways to invert when no out var, and one way to invert to guard when there is out var. This is a bit confusing.

I'd like keep it as simple if if possible. One not-so-good idea is to allow if without body:

if (s == null) else {
  return;
}
AdamSpeight2008 commented 8 years ago

Why is guards needed? Since the out variable brings that variable into scope. The user can use && or other logic operator in the expression. Which acts as a guard.

if( Int.TryParse(s, out x) && (x > 0)

Only case is in switch blocks (or match expressions), which current does support condition switch statements. This could be provide with a when condition.

switch (obj)
{
  case obj Is Int64 x when x > 0
// etc
HaloFour commented 8 years ago

@AdamSpeight2008

The guard statement above is not related to pattern guards.

With if and switch the scope of out variables and variable patterns are limited to within the then-stmt or switch-section respectively:

if (int.TryParse(s, out var x)) {
    // x in scope here
}
// x no longer in scope here

As guard is intended to be used for validation where the condition would be negated. If using if in this scenario you could not bail out of the current method while retaining the out variable in scope:

if (!int.TryParse(s, out var x)) {
    Log.Warn("The string is not a number!");
    return;
}
// x no longer in scope here

So guard changes the scoping rules to allow the variable to be used in the enclosing scope:

guard (!int.TryParse(s, out var x)) else {
    Log.Warn("The string is not a number!");
    return;
}
// x is in scope here

And since guard is intended for validation purposes the embedded-statement in the else clause is required to exit the enclosing scope via return, throw, break or continue.

stepanbenes commented 8 years ago

@HaloFour Didn't you forget to write out var x? Changes in the scoping rules apply only to the newly created variables in out argument. In your example x is already in scope.

HaloFour commented 8 years ago

@stepanbenes That I did, I will fix the examples.

AdamSpeight2008 commented 8 years ago

@HaloFour Thank you for explaining the difference but I feel that my point still stands. Since in the negated case (the proposed usage case).

It would easier to just define the variable to use in the out variable position. In the existing scope.

int value = 0;
if( !Int.TryParse( text, out value ) )
{
// log stuff
return ;
 }
// do stuff with value

The example of guards proposed assume the usage ( of out variables ) will be in a conditional expression like if or switch etc.

bool result = Int.TryParse( text, out var value );
// value is defined and assigned ( possibly with default(T)

Also having guard statements change the scoping rules in this one use case, would get confusing where guards and non guard usage of out variable in the same section.

AdamSpeight2008 commented 8 years ago

@ljw1004

Inclusive Scope The out variable is accessible in the enclosing scope.

if (int.TryParse(input, out int i)) { Console.WriteLine(i); }
// Under this proposal, there's no way to limit the scope of "out arguments" to just the 'then' block

Exclusive scope The out variable is exclusive to the next enclosed scope, and not the enclosing scope. I propose that put is use to indicate this.

if (int.TryParse(input, put int i)) { Console.WriteLine(i); }
// Under this proposal, there's no way to limit the scope of "out arguments" to just the 'then' block
HaloFour commented 8 years ago

@AdamSpeight2008

It would easier to just define the variable to use in the out variable position. In the existing scope.

For out variables you're right, you can just declare the variable in the parent scope. However, with variable patterns that becomes impossible as the spec explicitly forbids assigning them to existing variables.

bool result = Int.TryParse( text, out var value );
// value is defined and assigned ( possibly with default(T)

Per the out declaration spec, value would not be in scope after that statement.

Also having guard statements change the scoping rules in this one use case, would get confusing where guards and non guard usage of out variable in the same section.

Considering that's mostly the point of guard I'm not sure that would really be confusing. This same feature with very similar syntax already exists in Apple Swift and it seems to be well received.

AdamSpeight2008 commented 8 years ago

@HaloFour Maybe the spec should be changed? To promoting the out into existing scope, reusing and existing variable, for the implicit case. As it supports both use cases. Then have some optional way of designating the two explicit cases. eg using attributes.

Explicitly Local

if( Int.TryParse( text, [local] out var value ) )
{
  // value is defined and assigned within here.
}
else
{
 // value is not define here.
}

Guard

if( !Int.TryParse( text, [guard] out var value ) )
{
  // value is not defined here.
}
else
{
 // value is defined and assigned
Thaina commented 8 years ago

I would go against attribute syntax. It too dirty and complicate

For local scope I think maybe we could add more syntax, maybe out let ?

if( Int.TryParse( text, out let value ) )
  // value is defined and assigned within here.
else // value is not define here.

// value is not define here ? maybe

For guard I still stand with return if We don't actually want to guard, we actually want it to return for sure. So why not just return

AdamSpeight2008 commented 8 years ago

@Thaina It's not about return the value, but into scope(s) should the out var exist in afterwards. return !if looks likes the function ends and returns there, and the code after is potentially unreachable.

Also I don't think it would go against attribute syntax, as I see the attribute is attached to the out var parameter argument.

Thaina commented 8 years ago

@AdamSpeight2008 Yes I know it not about return value but it about return out of scope so return if is doing just that. And it is a new syntax so we could explain in document that it will do work in if block before return

And about attribute, I would not like it if attribute can cause syntax error IMHO. It should be syntax keyword that could go that far

AdamSpeight2008 commented 8 years ago

@Thaina It's subtle change the prevailing usage or return. I'd go with a contextual keyword.

if( Int.TryParse( text, local out var value ) )
if( !Int.TryParse( text, guard out var value ) )
Thaina commented 8 years ago

if( Int.TryParse( text, out let value ) )

or at least if( Int.TryParse( text, out local value ) )

Shorter and cleaner

guard is another problem, it cannot do that syntax. it not parameter specific. it is scope specific

qrli commented 8 years ago

@gafter @HaloFour I just got a new idea inspired by "Destructuring (assignment?) statement #6400" The let else statement is very similar to this guard else, except it takes a pattern instead of a conditional expression. So it feels like the pattern expression can be tweaked.

let bool(true) = int.TryParse(s, out var parsedValue) else return;
// scope of out var is treated the same as the returned variable, so they are all accessible here

But the pattern part looks a bit weird. I guess the pattern syntax could extended to support function call, though I do not have a concrete idea yet. The ideal imagination for me is:

let int.TryParse(s, out var parsedValue) else return;