Discussion: Conditional branches, `return when (...) with (...)`

lachbaer commented 7 years ago

Idea

Many consider it a bad habit to break the code flow with jump instructions like return, break and continue. One valid reason is, that the points of the jump are easily overseen.

A very common scenario are conditional returns, one e.g. being the check for null.

I want to discuss a syntax, that would allow for a cleaner appearance of the conditional 'jumps'.

See the following (made-up) code example:

public static IEnumerable<T> GetValueTypeItems<T>(
  IList<T?> collection, params int[] indexes)
  where T : struct
{
  if (collection == null)
  {
    yield break;
  }

  foreach (int index in indexes)
  {
    T? item = collection[index];
    if (item != null) yield return (T)item;
  }
}

With a syntax like

return when (condition) with (return-value)

the above code could be written like

public static IEnumerable<T> GetValueTypeItems<T>(
  IList<T?> collection, params int[] indexes)
  where T : struct
{
  yield break when (collection == null);

  foreach (int index in indexes)
  {
    T? item = collection[index];
    yield return when (item != null)
                 with ((T)item);
  }
}

when and with can be interchanged.

    yield return with ((T)item)
                 when (item != null);

Syntax

The paranthesis after the when and with expressions are mandatory for the lexer and parser.
In case no return value is needed, like in void methods or the yield break above, the with part must be ommited
You should also be able to return with (default) when (...), where default is the implied default of the methods return type.
Like with traditional if clauses, a (block-) statement can be followed after with or when. This block/statement is executed before the branch/return.
Any locals declared in the when portion are visible within that block statement.

I chose when and with, because they are more contextual than if. with sounds quite verbose to me and does in this context not conflict with the with for record types.

Practical example

In theory the idea might look unnecessary. But lease look at the following code, copied from https://github.com/dotnet/roslyn/blob/master/src/Compilers/CSharp/Portable/Binder/Binder_Operators.cs#L1415

I have rewritten that function (below) using the proposed syntax. I think that the expression of the code is much better there!

private ConstantValue FoldBinaryOperator(
    CSharpSyntaxNode syntax,
    BinaryOperatorKind kind,
    BoundExpression left,
    BoundExpression right,
    SpecialType resultType,
    DiagnosticBag diagnostics,
    ref int compoundStringLength)
{
    Debug.Assert(left != null);
    Debug.Assert(right != null);

    if (left.HasAnyErrors || right.HasAnyErrors)
    {
        return null;
    }

    // SPEC VIOLATION: see method definition for details
    ConstantValue nullableEqualityResult = TryFoldingNullableEquality(kind, left, right);
    if (nullableEqualityResult != null)
    {
        return nullableEqualityResult;
    }

    var valueLeft = left.ConstantValue;
    var valueRight = right.ConstantValue;
    if (valueLeft == null || valueRight == null)
    {
        return null;
    }

    if (valueLeft.IsBad || valueRight.IsBad)
    {
        return ConstantValue.Bad;
    }

    if (kind.IsEnum() && !kind.IsLifted())
    {
        return FoldEnumBinaryOperator(syntax, kind, left, right, diagnostics);
    }

    // Divisions by zero on integral types and decimal always fail even in an unchecked context.
    if (IsDivisionByZero(kind, valueRight))
    {
        Error(diagnostics, ErrorCode.ERR_IntDivByZero, syntax);
        return ConstantValue.Bad;
    }

    object newValue = null;

    // Certain binary operations never fail; bool & bool, for example. If we are in one of those
    // cases, simply fold the operation and return.
    //
    // Although remainder and division always overflow at runtime with arguments int.MinValue/long.MinValue and -1 
    // (regardless of checked context) the constant folding behavior is different. 
    // Remainder never overflows at compile time while division does.
    newValue = FoldNeverOverflowBinaryOperators(kind, valueLeft, valueRight);
    if (newValue != null)
    {
        return ConstantValue.Create(newValue, resultType);
    }

    ConstantValue concatResult = FoldStringConcatenation(kind, valueLeft, valueRight, ref compoundStringLength);
    if (concatResult != null)
    {
        if (concatResult.IsBad)
        {
            Error(diagnostics, ErrorCode.ERR_ConstantStringTooLong, syntax);
        }

        return concatResult;
    }

    // Certain binary operations always fail if they overflow even when in an unchecked context;
    // decimal + decimal, for example. If we are in one of those cases, make the attempt. If it
    // succeeds, return the result. If not, give a compile-time error regardless of context.
    try
    {
        newValue = FoldDecimalBinaryOperators(kind, valueLeft, valueRight);
    }
    catch (OverflowException)
    {
        Error(diagnostics, ErrorCode.ERR_DecConstError, syntax);
        return ConstantValue.Bad;
    }

    if (newValue != null)
    {
        return ConstantValue.Create(newValue, resultType);
    }

    if (CheckOverflowAtCompileTime)
    {
        try
        {
            newValue = FoldCheckedIntegralBinaryOperator(kind, valueLeft, valueRight);
        }
        catch (OverflowException)
        {
            Error(diagnostics, ErrorCode.ERR_CheckedOverflow, syntax);
            return ConstantValue.Bad;
        }
    }
    else
    {
        newValue = FoldUncheckedIntegralBinaryOperator(kind, valueLeft, valueRight);
    }

    if (newValue != null)
    {
        return ConstantValue.Create(newValue, resultType);
    }

    return null;
}

Rewritten with return when () while ():

private ConstantValue FoldBinaryOperator(
    CSharpSyntaxNode syntax,
    BinaryOperatorKind kind,
    BoundExpression left,
    BoundExpression right,
    SpecialType resultType,
    DiagnosticBag diagnostics,
    ref int compoundStringLength)
{
    Debug.Assert(left != null);
    Debug.Assert(right != null);

    return when (left.HasAnyErrors || right.HasAnyErrors) with (null);

    // SPEC VIOLATION: see method definition for details
    ConstantValue nullableEqualityResult = TryFoldingNullableEquality(kind, left, right);
    return when (nullableEqualityResult != null) with (nullableEqualityResult);

    var valueLeft = left.ConstantValue;
    var valueRight = right.ConstantValue;
    return when (valueLeft == null || valueRight == null) with (default);

    return when (valueLeft.IsBad || valueRight.IsBad) with (ConstantValue.Bad);

    return when (kind.IsEnum() && !kind.IsLifted())
           with (FoldEnumBinaryOperator(syntax, kind, left, right, diagnostics));

    // Divisions by zero on integral types and decimal always fail even in an unchecked context.
    return when (IsDivisionByZero(kind, valueRight)) with (ConstantValue.Bad)
        Error(diagnostics, ErrorCode.ERR_IntDivByZero, syntax);

    object newValue = null;

    // Certain binary operations never fail; bool & bool, for example. If we are in one of those
    // cases, simply fold the operation and return.
    //
    // Although remainder and division always overflow at runtime with arguments int.MinValue/long.MinValue and -1 
    // (regardless of checked context) the constant folding behavior is different. 
    // Remainder never overflows at compile time while division does.
    newValue = FoldNeverOverflowBinaryOperators(kind, valueLeft, valueRight);
    return when (newValue != null)
           with (ConstantValue.Create(newValue, resultType));

    ConstantValue concatResult = FoldStringConcatenation(kind, valueLeft, valueRight, ref compoundStringLength);
    return when (concatResult != null) with (concatResult)
    {
        if (concatResult.IsBad)
        {
            Error(diagnostics, ErrorCode.ERR_ConstantStringTooLong, syntax);
        }
    }

    // Certain binary operations always fail if they overflow even when in an unchecked context;
    // decimal + decimal, for example. If we are in one of those cases, make the attempt. If it
    // succeeds, return the result. If not, give a compile-time error regardless of context.
    try
    {
        newValue = FoldDecimalBinaryOperators(kind, valueLeft, valueRight);
    }
    catch (OverflowException)
    {
        Error(diagnostics, ErrorCode.ERR_DecConstError, syntax);
        return ConstantValue.Bad;
    }

    return with (ConstantValue.Create(newValue, resultType))
           when (newValue != null);

    if (CheckOverflowAtCompileTime)
    {
        try
        {
            newValue = FoldCheckedIntegralBinaryOperator(kind, valueLeft, valueRight);
        }
        catch (OverflowException)
        {
            Error(diagnostics, ErrorCode.ERR_CheckedOverflow, syntax);
            return ConstantValue.Bad;
        }
    }
    else
    {
        newValue = FoldUncheckedIntegralBinaryOperator(kind, valueLeft, valueRight);
    }

    return when (newValue != null)
           with (ConstantValue.Create(newValue, resultType));

    return null;
}

CyrusNajmabadi commented 7 years ago

But the type returned by the block must be an ErrorEnum.

If you really needed that, then you could just create a local function that returns ErrorEnum, and then call that. But, that said, i have no idea why you would actually need these annotations...

As I said, it needn't be that comprehensive. This is a discussion and not a very concrete proposal. It should only show a way that could be gone.

You're not being very convincing :) I can't even see why i would ever want these or what value they would add. It would be akin to me saying:

"Hey! wouldn't it be cool if you could write this:"

[result: Sum(a, b)]
return a + b;

[result: If(GreaterThan(Sum(a, b), c), d, e)]
if ((a + b) > c)
{
    return d;
}
else 
{
    return e;
}

So... great... i can do this. But what purpose does it actually serve? Why would i want to do this? Why is it valuable? What benefit does it give the user? It just seems like it's redundantly stating exactly what the code is doing directly next to it...

CyrusNajmabadi commented 7 years ago

I said I want it forced on everyone for newly written code.

That is a still a break. If people cannot take existing code and compile it in a new project, then that's not ok. Now you have to explain to them that there are 'dialects' of C# and that they cannot mix/match.

Again, if you want this then use an analyzer. You can write your own, and I know some already exsit for this purpose. We are not going to change the default for C# to break on code that people have been reasonably writing and maintaining for 10+ years just because you don't like it :)

lachbaer commented 7 years ago

I quote myself:

In this concrete proposal marking a statement or block with [return: Return] would serve two purposes.

Unwinding the "deeply hidden" return statement to the beginning of the statement. That way it is much more easy to spot return points.
Assuring that the block actually returns at appropriate places - kind of a built-in test. With if (...) { / many other statements / return; } you cannot accomplish a code-analysation for a return without marking the statement itself.

Especially point 1. seems important to me!

CyrusNajmabadi commented 7 years ago

Unwinding the "deeply hidden" return statement to the beginning of the statement.

I see nothing 'deeply hidden'.

if (a > b) 
{
     return c;
}

There is nothing hidden about this... :)

CyrusNajmabadi commented 7 years ago

Assuring that the block actually returns at appropriate places - kind of a built-in test.

I'm really struggling to see the value here of that. If you want a test, just write a test. It sounds like you want some sort of contract feature, but that you want that contract to literally apply to each and every statement. i.e.

[ItsOkToUseAnIfHere]
if (...) 
{
}

[YouHaveToReturnHere]
return;

I'm really not seeing hte value here. Contracts as a way of describing the obervable functionality of your API make sense. Contracts as a way of restating your flow and mandating that that flow be exactly that seem to serve no purpose.

CyrusNajmabadi commented 7 years ago

you cannot accomplish a code-analysation for a return without marking the statement itself.

I don't understand why this is something i would want... What sort of 'code analysis' on the return do you want to have?

lachbaer commented 7 years ago

Yes it it. The return is "hidden" behind the if and two lines below it.

if (a > b)
{
    Console.WriteLine("A is larger than B, nothing to do here...");
    return;
}

Now it is already 3 lines below the if.

[return: Return]
if (a > b)
{ ...

It doesn't matter where the return is in this statement, but it gives me - the reader the opportunity to immediately see, that this conditional block is there for the sole purpose of returning the method.

CyrusNajmabadi commented 7 years ago

Yes it it. The return is "hidden" behind the if and two lines below it.

I think you and i have different definitions of the word "Hidden". :)

but it gives me - the reader the opportunity to immediately see, that this conditional block is there for the sole purpose of returning the method.

Why is that valuable? Why is that more valuable than anything else that might happen in the if block? For example:


[ThisIfMayThrowAnExceptionInside]
[ThisIfCallsThroughAnInterfaceMethod]
[ThisIfContainCodeThatAllocates]
[ThisIfHasAnotherIf]
[ThisIfDeclaresVariablesInsideOfIt]
[ThisIfHasALabelInItWhichCanBeReachedFromAnotherBlock]
[ThisIfDoesNotDefinitelyAssignAllVariablesThatFlowIntoIt]
if (...)
{
    ...
}

:)

Great, now i have a lot of up front information about the 'if' :) Now what?

CyrusNajmabadi commented 7 years ago

but it gives me - the reader the opportunity to immediately see, that this conditional block is there for the sole purpose of returning the method.

I recommend you write an analyzer for this purpose. You can simply create comments that you then check in your code for this sort of thing. For example,

//! return
if (...)
{
     ...  if there is a return in here and there is no //! return on the if, then fail the analyzer
}

CyrusNajmabadi commented 7 years ago

With the analyzer approach you have full flexibility to define and dictate 100% of the sorts of things you think should be called out ahead-of-time with all your code. As you discover new and interesting things you want to avoid/ban, you can add them. If people find these valuable for their own projects they can get your analyzer and use it themselves.

lachbaer commented 7 years ago

I'm really struggling to see the value here of that. If you want a test, just write a test. It sounds like you want some sort of contract feature, but that you want that contract to literally apply to each and every statement.

😢 😭 What am I doing wrong that I am misunderstood all the time.... ?!?!?!?! 😭

What sort of 'code analysis' on the return do you want to have?

Question: how would you, Cyrus Najmabadi, ensure with a test or code analysis, that a (long) if block with further nested statements always returns from the method? And don't tell me to outsource the contents of the if to a further method! In Roslyn I see long if blocks everywhere - I don't condemn that, but it simply shows that the real world is bit different from the raw theory.

lachbaer commented 7 years ago

Great, now i have a lot of up front information about the 'if' :) Now what?

You are exaggerating now! This is for the concrete case of returns only...

lachbaer commented 7 years ago

how would you ensure with a test or code analysis, that a (long) if block with further nested statements always returns from the method?

Okay, I see, by using a comment.

lachbaer commented 7 years ago

I think you and i have different definitions of the word "Hidden". :)

And here we go again => human factors 😄

CyrusNajmabadi commented 7 years ago

What am I doing wrong that I am misunderstood all the time

You're asking for things without giving clear explanations for why they would be valuable, or why I would want to use them in my own code :)

ensure with a test or code analysis, that a (long) if block with further nested statements always returns from the method?

You've lost me already. Why would i want to test for this or analyze this? I don't approach problems going: "i need to ensure that this if statement must return."

That's not how I code, nor does it even make sense to me as something i would care about. What i care about is what funcitonality a 'method/class/module/component' provides, and testing that it does so properly. Asking how i would test if there is a return is like asking me to test a component by validating that it is implemented under the covers with lambdas. I simply don't care. It's a not a relevant or appropriate question that even comes into my mind.

And don't tell me to outsource the contents of the if to a further method! In Roslyn I see long if blocks everywhere

I don't understand. Roslyn does what it does because we don't see a problem here. If you have a problem with this style of coding you are free to code differently (for example, the way i mentioned). We don't do that ourselves because we're not the ones complaining about this sort of thing :)

CyrusNajmabadi commented 7 years ago

You are exaggerate now! This is for the concrete case of returns only...

Why are 'returns' any more important or interesting than anything else i mentioned?

CyrusNajmabadi commented 7 years ago

And here we go again => human factors

Ok. So again, why would we just do this for 'returns'. When the next person comes along and says "i want to know if other code jumps into this if block, give me an attribute for that", what do we say? Or "does ths code allocate?" or "does this code call out to other methods?" etc. etc.

You've basically said "this is a problem that i have, provide me a solution to for exactly that issue". But that's not how we do language design. Our language features need to be either:

broadly applicable. i.e. a huge amount of our user base will want this and benefit from it.
hugely beneficial. i.e. it might be a smaller group (but still need to be a lot of people), but the gain they get is enormous.

If you propose a feature which is not only applicable to just you (and maybe a handful of other people), and provides little benefit (i.e. tells you tehre is a return, when you can see the return 3 lines lower)... then that's not going ot be something anyone is champing at to get in the language.

lachbaer commented 7 years ago

If you have a problem with this style of coding you are free to code differently

I do not have a problem. I wanted to point out that probably many others would always vote for outsourcing with many valid arguments. But in the real world other teams have other opinions. Where your team doesn't see a problem, another team does. I'm completely liberal on this point!

CyrusNajmabadi commented 7 years ago

Okay, I see, by using a comment.

Yes. You're effectively saying "i want to put special documentation in my code, and have it validated". You may also be saying (though i'm not sure) "i want hte absense of this documentation to be considered a problem".

Both of these goals are solved by analyzers. Write up whatever documentation you want for your code. Write whatever analyzers you want. Make it publicly available if you want. Roslyn is there for you to get this today.

CyrusNajmabadi commented 7 years ago

I wanted to point out that probably many others would always vote for outsourcing with many valid arguments.

I'm not sure i know what this means.

lachbaer commented 7 years ago

You haven't commented on my comment on [Condition.CLang] ;-)

It was just a syntax sample of statement attributes with no further meaning. Though that one could help to copy-paste legacy C-code algorithms into an unsafe block of a C# program without having to spend too much time for language agnostic adoptions (thus making the 'C' in C# a bit prouder of itself ;-)

CyrusNajmabadi commented 7 years ago

@lachbaer See above:

Our language features need to be either:

broadly applicable. i.e. a huge amount of our user base will want this and benefit from it.

hugely beneficial. i.e. it might be a smaller group (but still need to be a lot of people), but the gain they get is enormous.

If:

we had tons of people tyring to copy/paste legacy C into C# and
those people were persistently complaining that the differences in the language were a problem and
we felt we could effectively smooth things out here then

maybe we'd do something here**. In the absence of that, we're not going to take on a language feature just because it seems "nifty" :)

--

**

It's worth nothing that, at times, we have done precisely this. 'ref-returns' are an example of how we've worked to make our language work better with the patterns native systems like C use to get very low overhead, very high performance systems. We did not go for an approach where you could just 'drop in C' (that really isn't something that could even work given how C# ends up executing at the end of the day). But we took inspiration from how many native systems work, and we produced something that would provide similar benefits, while still working within the constraints of our system (i.e. where we want to ensure safety by default).

lachbaer commented 7 years ago

Would you just consider to validate the following argument for statement-attributes?

Allowing attributes on statements and blocks would allow for custom code analyzers
Analyzers don't have to look for comments, because...
Comments can stay comments and don't serve a programmatic purpose
Attributes underly a concrete syntax, whereas comments can have unseen typos

Whether I use it for decorating return blocks and analyze them is then up to me. But it gives all programmers/team a syntax supported pattern.

Addendum: Especially now with Roslyn allowing us to do comprehensive analysis further places for attributes would be nice

CyrusNajmabadi commented 7 years ago

Proposal for Attributes-Everywhere is here: https://github.com/dotnet/roslyn/issues/6671

lachbaer commented 7 years ago

No place yet in dotnet/csharplang?

CyrusNajmabadi commented 7 years ago

Doesn't look like anyone has pushed on it to go anywhere.

lachbaer commented 7 years ago

But the soup of ifnot and until is not eaten yet! 😉

Our language features need to be either:

broadly applicable. i.e. a huge amount of our user base will want this and benefit from it.

hugely beneficial. i.e. it might be a smaller group (but still need to be a lot of people), but the gain they get is enormous.

Whereas for ifnot we can still argue a lot, until seems SO intuitive to me, much better than leading new programmers to while (!...). And all experienced developers will intuitively know what it means. Again that is human factors, with human meaning a broad mass of existing people - not just me ;-)

lachbaer commented 7 years ago

Doesn't look like anyone has pushed on it to go anywhere

Would you mind to do so? Now that it came up here.

CyrusNajmabadi commented 7 years ago

I am personally not interested in championing that feature. Sorry :). You can check with StephenToub though on that bug that i linked to.

lachbaer commented 7 years ago

As I will close the topic anyhow shortly, Cyrus, have you already started on the property-scoped fields? I tried to start on semi-auto-properties with the field keyword and thought that a new PropertyScopeBinder could be helpful here.

CyrusNajmabadi commented 7 years ago

@lachbaer I put in some effort a while back, but ran into major 'scope' issues. Specifically that the size of the feature kept growing larger and larger. For example, once i started into it it became clear that while fields were a nice start, some people would then want a shared method that only the accessors could use. And then there were cases where you'd want to be able to access the fields/other-members from derived types, etc.

The scope grew so large that i had to put it on hold while i focused on other things.

lachbaer commented 7 years ago

@CyrusNajmabadi Complex topic! My opinion is to keep it all KISS, meaning that everything declared within a property is by definition private. For anything more complex the programmer should fall back to 'good old C#' 😃 Property-local functions are a logical "MUST" now that local functions are available. But those, too, should just be private.

CyrusNajmabadi commented 7 years ago

That's definitely an approach we could go :)

lachbaer commented 7 years ago

Though this proposal has a very broad agreement 😁 I'm closing this issue.

The best alternative (to me) are code analyzers that operate on attributes.

dotnet / csharplang