Proposal: `tail return` recursive calls

agocke commented 5 years ago

There are a number of instances where it may be clearer to express your algorithm via recursion, especially tail recursion, but unfortunately all recursion in C# currently requires O(n) space for just the call stack, meaning that your algorithm cannot take less than O(n) space.

It is proposed that C# provide a new statement form and a new expression form to allow tail-recursive calls to consume O(1) space.

Syntax and Semantics

The statement form would be tail return <expr>, analogous to yield return, where the expr target is required and must be a tail recursive call.

The expression form would be tail <expr>, where the <expr> would have the same restrictions as the statement form, and the tail <expr> expression would only be legal as the expression in an expression-bodied member.

Codegen

There is a tail instruction in the CLR, but if that instruction is not reliable or desired, the compiler can lower this into a goto after assigning the new arguments for the recursive call.

For instance,

void M(int x)
{
    ...
    tail return M(x - 1);
}

Could be rewritten to

void M(int x)
{
    start:
    ...
    x = x - 1;
    goto start;
}

Limitations

Because the compiler cannot reliably rewrite inter-procedural calls, this would only currently be legal for tail-recursive calls. It would not be allowed for e.g., mutually tail recursive calls. If the CLR can reliably perform tail calls without significant downsides on all supported platforms, the feature could be broadened from the current specification.

agocke commented 5 years ago

cc @gafter

YairHalberstadt commented 5 years ago

Has any analysis been done to see how many locations in eg. Roslyn could benefit from this?

Is a new keyword necessary, or would an attribute be sufficient, as this doesn't change the semantics of the program?

YairHalberstadt commented 5 years ago

Would rewriting mutually recursive local functions be considered?

CyrusNajmabadi commented 5 years ago

So, presumably you can write;

if (x)
{
    tail return MyMethod(a, b, c);
}
else
{
    tail return MyMethod(d, e, f)
}

But would you be able to do:

tail return x ? MyMethod(a, b, c) : MyMethod(d, e, f);

?

agocke commented 5 years ago

@CyrusNajmabadi Not as written, no.

agocke commented 5 years ago

@YairHalberstadt Regardless of what we could do for local functions, the restrictions as listed would not allow mutual tail recursion.

As for attributes, the important part is really the call, not the definition, right? We can't put attributes on expressions, so if the call is really the part we want to annotate, attributes are not suitable.

CyrusNajmabadi commented 5 years ago

@agocke Understood. Probably not critical to support. but it would be nice if it was relaxed to teh generalized definition of tail-calls. Namely that they must be the last thing the function executes. In such a formalization, conditionals would be fine.

Just for my own edification, would this be challenging to support?

agocke commented 5 years ago

Probably not. The nice thing about tail calls is you don't really care about all the stuff nested inside the expression, you only need to know that 1) the outer expression is a call and 2) the call is recursive.

For conditional, that would just add an introspection "through" the conditional to both sides. From a language point of view it's a bit unfortunate to dig through expressions, but it's probably necessary for the broader semantics to work properly with conditional.

CyrusNajmabadi commented 5 years ago

For conditional, that would just add an introspection "through" the conditional to both sides. From a language point of view it's a bit unfortunate to dig through expressions, but it's probably necessary for the broader semantics to work properly with conditional.

Makes sense. Thanks for clarifying!

HaloFour commented 5 years ago

It seems a little weird trying to generate explicit tail call behavior in the C# language given that IL supports it and RyuJIT supports it as an optimization implicitly. I can understand the concerns with the reliability of those features and explicit language support would make it more predictable.

Given that tail recursion and functional programming seem to go hand-in-hand I'd want to ensure that a feature like this would cover those common bases, like being embedded in a switch expression. My (limited) experience with @tailrec in Scala is that it's often somewhat tricky/obnoxious to find the right order of operations to satisfy the compiler.

CyrusNajmabadi commented 5 years ago

One more question:

What if you do:

void M(int x)
{
    ...
    tail return other.M(x - 1);
}

Seems like this should be supported. This would just be equiv to slamming over the 'this' param with 'other' and then tailing, right?

Or will this only be allowed for static methods, or tail-methods with 'this' as the receiver? Thanks!

CyrusNajmabadi commented 5 years ago

It seems a little weird trying to generate explicit tail call behavior in the C# language

I actually don't have a problem with this. It's less about the compiler actually generating something appropraite, and it's more about actually ensuring you either get tail calls, or you fail to compile. For example, in my programming life i've made constant mistake around code i thought should be tail-called. It is so simple to do something accidental that makes tail calls stop working, and you can end up shooting yourself in the foot.

So, i see the presence of tail in the language much more of a check to say "you better not have screwed this up!".

It's similar to a static local function for me. Yes, i can just not capture anything. But static makes sure that if i mess that up in the future it just won't compile.

HaloFour commented 5 years ago

I know that C# does not yet have list patterns, but something like this would be critical to support:

public int Sum(int seed, List<int> list) => list switch {
    [ ] => seed,
    [value ... rest] => tail Sum(seed + value, rest),
};

Edit: Reordered the cases so that the tail call happens last. I believe Scala requires this. Would be nice if C# didn't have to.

HaloFour commented 5 years ago

@CyrusNajmabadi

It's less about the compiler actually generating something appropraite, and it's more about actually ensuring you either get tail calls, or you fail to compile.

This proposal seems to be about emitting the explicit goto IL to implement the tail call and skipping whatever IL/JIT support there is. I'd agree that I'd probably prefer a tail operator (or attribute+analyzer) to simply enforce what the runtime supports, but that's probably a difficult target to nail down. IIRC the runtime is free to ignore the tail. opcode prefix, and implicit support varies quite a bit.

CyrusNajmabadi commented 5 years ago

but something like this would be critical to support:

I would just like the proposal relaxed to "return points". Then, it should "just work" for conditional-exprs, switch-statements (though that likely doesn't need anything special), switch-exprs, expr-bodies, and anything else that has that concept.

agocke commented 5 years ago

@HaloFour Good point. That would not work with this proposal but it does seem unfortunate that it wouldn't.

That's really the same problem as conditional, where making this work requires digging through the semantics of the switch, which is a little gross 😕

HaloFour commented 5 years ago

Scala example:

@tailrec
private def sumWithAccumulator(list: List[Int], accumulator: Int): Int = {
    list match {
        case Nil => accumulator
        case x :: xs => sumWithAccumulator(xs, accumulator + x)
    }
}

CyrusNajmabadi commented 5 years ago

This proposal seems to be about emitting the explicit goto IL to implement the tail call and skipping whatever IL/JIT support there is.

I'm breaking it apart in my head. First, i'm just assessing what the language feature is. And i think it would be valuable to have this language feature (for the reason i specified above). If the language feature is good, it's definitely worth discussing what the right emit-strategy is. Honestly, given how simple the 'goto' form should be, i really would have no problem with the langauge just spitting that out. However, if there's high confidence that the .tail instruction works for the entirety of the ecosystem, i think it's fine if that's the emit strategy.

CyrusNajmabadi commented 5 years ago

which is a little gross

From a language perspective, it seems utterly natural to me :) Or do you mean 'gross' in the sense of the impl?

agocke commented 5 years ago

First, i'm just assessing what the language feature is.

This proposal is primarily about getting guaranteed O(1) space growth for tail call recursion.

Any codegen strategy that can meet that requirement is up for grabs.

CyrusNajmabadi commented 5 years ago

Minor point, but syntactically, we could just consider this being a 'tail expression' and then only allowing the 'tail expression' in certain constructs.

that way, you would write it more like this return a ? tail Foo() : tail Foo(). You would also do Foo() => x switch { y => tail Foo() ....

In this formalization, expression-bodies are not special cased. They just naturally fall out.

Note: this is how we handled 'ref' as well. i.e. we don't have a ref-conditional internally. We just allow 'ref expressions' and those expressions are only allowed in special constructs. From an impl perspective, this also makes it much easier to do error tolerance.

CyrusNajmabadi commented 5 years ago

Note: the approach as currently stated seems like it would not allow using 'tail' for void methods. Maybe that's not a big deal, but it seems a bit odd. If this is relaxed to "the last op must be a call to the enclosing method", then this can support tail recursive void methods fine, where the last call is just a recursive call to yourself.

agocke commented 5 years ago

Note: this is how we handled 'ref' as well. i.e. we don't have a ref-conditional internally. We just allow 'ref expressions' and those expressions are only allowed in special constructs.

Nope, that's not how it works. In the language we created new special "ref" variants of each of those language forms and then special-cased those behaviors, creating an incredibly complex ref matrix. You're thinking of how we implemented it in the compiler, where there is such a thing as a "ref expression" (which implements said incredibly complicated ref matrix 😄).

HaloFour commented 5 years ago

@CyrusNajmabadi

If the language feature is good, it's definitely worth discussing what the right emit-strategy is. Honestly, given how simple the 'goto' form should be, i really would have no problem with the langauge just spitting that out. However, if there's high confidence that the .tail instruction works for the entirety of the ecosystem, i think it's fine if that's the emit strategy.

I agree, it's worth exploring. It's just weird to me that we're adding a third mechanism for tail recursion. But maybe the third time is the charm? 😁

@agocke

In these cases would we want to allow the same method to invoke itself normally? If not, why not take a page from Scala and make it a modifier of the method itself? Or an attribute that the compiler recognizes? That would involve no changes to the grammar of the language, I think.

I wouldn't have much of an opinion one way or the other, I'm only making the comparison with Scala. I can see a benefit to a compiler-enforced modifier/attribute that ensures that the entire method supports tail recursion. But I can also see an advantage to the escape hatch to support normal recursion in an alternate code path.

Also, I think it's worthwhile to bring in the F# folks and discuss the scenarios in which that compiler will emit the tail. IL prefix, their experiences as to when the runtime actually supports it and whether similar goto-based codegen was discussed.

agocke commented 5 years ago

@HaloFour I see no reason why you couldn't mix tail recursive calls and regular recursive calls.

HaloFour commented 5 years ago

I didn't realize that the F# compiler also rewrites the function into a loop. I thought they emitted tail. and let the runtime optimize it. Perhaps they had the same experience of it being unreliable?

let rec sum values accum =
    match values with
    | [] -> accum
    | head :: tail -> sum tail (accum + head)

https://sharplab.io/#v2:DYLgZgzgNALiCWwA+wCmMAEAnVBjDEArgLYYBuAhsIahBhbriRgLwCwAUBtxsRTLgAW5KjToB3eDEGceGJBgDaAXQwBaAHz1GJWTwWDUFACYYQIDDAqJ1WoqSs2AFAyakA1BkMmAlJyA

Note: Unlike Scala, F# doesn't require the tail call to be in the last case.

CyrusNajmabadi commented 5 years ago

Nope, that's not how it works. In the language we created new special "ref" variants of each of those language forms and then special-cased those behaviors, creating an incredibly complex ref matrix. You're thinking of how we implemented it in the compiler, where there is such a thing as a "ref expression" (which implements said incredibly complicated ref matrix 😄).

We originally considered entirely different syntactic forms here where the 'ref' wouldn't necessarily even be in a position that it could be interpretted to be part of hte expression. We then with a language form that had the benefit that you could think of it either as a specialized form, or as a 'ref expression'. We went with that because of the attractive nature at the grammar and the impl level.

I'm arguing we should do the same. Instead of thnking about 'tail' as being related to 'return', think about 'tail' as being something you do with a call. That is both a nicer way to conceptualize it (imo) and it leads to a nicer implementation. It's win/win for me.

YairHalberstadt commented 5 years ago

I think it's probably useful for proposals like this to have a number of real life examples of cases where it might be applied, so that the LDM can be sure they've considered all potential use cases.

Here is an example of a method I'm writing which both uses void tail recursion, and mixes tail recursive and non tail recursive calls:

        internal static void VisitRankSpecifiers<TArg>(this TypeSyntax type, Action<ArrayRankSpecifierSyntax, TArg> action, TArg argument)
        {
            switch (type)
            {
                case ArrayTypeSyntax arrayTypeSyntax:
                    arrayTypeSyntax.ElementType.VisitRankSpecifiers(action, argument);
                    foreach (var rankSpecifier in arrayTypeSyntax.RankSpecifiers)
                    {
                        action(rankSpecifier, argument);
                    }
                    break;
                case NullableTypeSyntax nullableTypeSyntax:
                    nullableTypeSyntax.ElementType.VisitRankSpecifiers(action, argument);
                    break;
                case PointerTypeSyntax pointerTypeSyntax:
                    pointerTypeSyntax.ElementType.VisitRankSpecifiers(action, argument);
                    break;
                case TupleTypeSyntax tupleTypeSyntax:
                    foreach (var element in tupleTypeSyntax.Elements)
                    {
                        element.Type.VisitRankSpecifiers(action, argument);
                    }
                    break;
                case RefTypeSyntax refTypeSyntax:
                    refTypeSyntax.Type.VisitRankSpecifiers(action, argument);
                    break;
                case GenericNameSyntax genericNameSyntax:
                    foreach (var typeArgument in genericNameSyntax.TypeArgumentList.Arguments)
                    {
                        typeArgument.VisitRankSpecifiers(action, argument);
                    }
                    break;
                case QualifiedNameSyntax qualifiedNameSyntax:
                    qualifiedNameSyntax.Left.VisitRankSpecifiers(action, argument);
                    qualifiedNameSyntax.Right.VisitRankSpecifiers(action, argument);
                    break;
                case IdentifierNameSyntax _:
                    break;
                case OmittedTypeArgumentSyntax _:
                    break;
                case PredefinedTypeSyntax _:
                    break;
                case AliasQualifiedNameSyntax _:
                    break;
                default:
                    throw ExceptionUtilities.Unreachable;
            }

An extra optimization that tail recursion would make possible, would be to avoid copying argument on every call, which is helpful for large structs. I imagine though that is an implementation detail, and wouldn't be guaranteed by the spec.

Ideally I would want to rewrite it as:

        internal static void VisitRankSpecifiers<TArg>(this TypeSyntax type, Action<ArrayRankSpecifierSyntax, TArg> action, TArg argument)
        {
            switch (type)
            {
                case ArrayTypeSyntax arrayTypeSyntax:
                    arrayTypeSyntax.ElementType.VisitRankSpecifiers(action, argument);
                    foreach (var rankSpecifier in arrayTypeSyntax.RankSpecifiers)
                    {
                        action(rankSpecifier, argument);
                    }
                    break;
                case NullableTypeSyntax nullableTypeSyntax:
                    tail nullableTypeSyntax.ElementType.VisitRankSpecifiers(action, argument);
                    break;
                case PointerTypeSyntax pointerTypeSyntax:
                    tail pointerTypeSyntax.ElementType.VisitRankSpecifiers(action, argument);
                    break;
                case TupleTypeSyntax tupleTypeSyntax:
                    foreach (var element in tupleTypeSyntax.Elements)
                    {
                        element.Type.VisitRankSpecifiers(action, argument);
                    }
                    break;
                case RefTypeSyntax refTypeSyntax:
                    tail refTypeSyntax.Type.VisitRankSpecifiers(action, argument);
                    break;
                case GenericNameSyntax genericNameSyntax:
                    foreach (var typeArgument in genericNameSyntax.TypeArgumentList.Arguments)
                    {
                        typeArgument.VisitRankSpecifiers(action, argument);
                    }
                    break;
                case QualifiedNameSyntax qualifiedNameSyntax:
                    qualifiedNameSyntax.Left.VisitRankSpecifiers(action, argument);
                    tail qualifiedNameSyntax.Right.VisitRankSpecifiers(action, argument);
                    break;
                case IdentifierNameSyntax _:
                    break;
                case OmittedTypeArgumentSyntax _:
                    break;
                case PredefinedTypeSyntax _:
                    break;
                case AliasQualifiedNameSyntax _:
                    break;
                default:
                    throw ExceptionUtilities.Unreachable;
            }

Note that the tail calls are always last to be executed, but since this is a void method, do not return.

Given that in many cases (aprox 77% in corefx) a generic type will only have one type argument, it would also be useful if we could make the last call of a loop tail recursive. I imagine to do that though we would have to rewrite the loop explicitly as:

for(int i = 0; i < genericNameSyntax.TypeArgumentList.Arguments - 1; i++)
    genericNameSyntax.TypeArgumentList.Arguments[i].VisitRankSpecifiers(action, argument);
if(genericNameSyntax.TypeArgumentList.Arguments.Count > 0)
    tail genericNameSyntax.TypeArgumentList.Arguments[genericNameSyntax.TypeArgumentList.Arguments.Count - 1].VisitRankSpecifiers(action, argument);
break;

and it would be out of scope for this feature to do that automatically.

orthoxerox commented 5 years ago

Obligatory reference to an earlier discussion: https://github.com/dotnet/roslyn/issues/1235

While I agree that interprocedural tail recursion and tail recursion modulo cons would be great, I'd rather the LDM hammer out the simplest case first: tail call support in non-virtual self-recursive methods. That alone would solve like 80% of all cases.

PathogenDavid commented 5 years ago

How would the goto codegen interact with stackalloc? As far as I can find, there's no way to explicitly free memory allocated with the localloc instruction.

On the tail. side of things, according to the spec, it can't be used inside of any exception blocks.

jnm2 commented 5 years ago

@CyrusNajmabadi

I'm arguing we should do the same. Instead of thnking about 'tail' as being related to 'return', think about 'tail' as being something you do with a call. That is both a nicer way to conceptualize it (imo) and it leads to a nicer implementation. It's win/win for me.

This makes complete sense to me.

Also, there are times I want a tail call when I can't use the return keyword because the return type of the current method and the called method are both void.

john-h-k commented 5 years ago

You can simply guarantee the O(1) recursion size by translating it to a loop

gafter commented 5 years ago

@agocke The standard computer-science meaning of a tail call is a call which is the final action. It is not necessary for it to be a self-call to be a tail call. The last "Limitations" paragraph of the OP appears to assume "tail call" means "tail self call". It is probably worth clarifying earlier in the proposal that this is only intended to support self tail calls.

Many of the places where I would love to use tail calls in the Roslyn compilers are in the visitors. Many of the methods on the call stack in the binder or flow analysis are simple methods that end in a (not self) tail call, where it would be valuable to discard the stack frame.

agocke commented 5 years ago

@gafter That's true. That's what I meant by "recursive" call. It's technically true that recursion does not require direct recursion, but I was trying to keep it simple

samuel-vidal commented 5 years ago

Comment moved to : https://github.com/dotnet/roslyn/issues/1235#issuecomment-493623126

I think it is an excellent idea

With C# supporting a very confortable style of functional programming it becomes increasingly valuable to have a construct to help guaranteeing stack safety.

If we agree this is the purpose of that proposal, I think we should precise its semantic as follows:

The proposed 'tail' syntax should mean exactly that : the compiler is asked to check that the call will be optimized by the Jit and emit the necessary IL.

In my opinion it would be preferable to make the IL .tail work than to make it a compiler code rewrite (this is way too narrow, and way more complex).

A tail call is one that is performed just before exiting a method. Either returning (nothing) directly if that call returns void or return the result of that call in case it returns something. No operation on the return value should be performed (not even implicit cast, although upcast should be fine).

(In particular, not all recursive call are tail-recursive, and not all tail-calls need to be part of recursion.)

An other key point is that for a call to be tail, both methods have to share the same return type (can be relaxed a bit with covariance).

Also, assuming the criteria is implemented in the compiler, the .tail IL should be emitted even in the absence of the tail constraint.

proposed syntax : tail qualifiedmethodname(parameters); // (the syntax is not tied to the return statement).

examples

void method1()
{
    tail method1();     // tail call
    return;
    ...
    method2();      // tail call
    return;
}

void method2()
{
    method1();      // tail call
    return;
    ...
    method3();      // Not tail
    return;
}

int method3()
{
    method1();      // Not tail
    return 1;
    ...
    return method3();       // tail call
    ...
    return method4() + 1;       // Not tail
}

int method4()
{
    return tail method3();      // tail call
    ...
    tail method3();     // compilation fails with an error (or a warning)
    return 1;
    ...
    retrun x == j ? tail method3() : method4() + 1; // only the first call is tail.
}

A few points:

The real difficulty here, I think, will be the interaction with ref/in/out arguments (let alone stack alloc etc) rather than the tail call being to the same method. A point to have in mind is that a virtual method calling itself is actually performing another virtual dispatch (could be confusing for the user).

Scope : The compilation should fail or produce a warning if the condition for guaranteeing tail call optimization by the Jit are not met (or too hard to figure out e.g. in the case of in/out/ref args etc).

john-h-k commented 5 years ago

the compiler is asked to check that the call will be optimized by the Jit

compiler can't check that. most it can do is emit .tail prefix

CyrusNajmabadi commented 5 years ago

compiler can't check that. most it can do is emit .tail prefix

Compiler can also literally just do the tail recursion itself. Which would actually be my preference. I wouldn't want to ever do tail return but then end up at runtime stack overflowing.

john-h-k commented 5 years ago

yeah, but it cannot check the JIT optimization or JIT behaviour in any way

john-h-k commented 5 years ago

what do you mean by "do the tail recursion itself" tho - the .tail prefix?

gafter commented 5 years ago

Compiler can also literally just do the tail recursion itself

Not without changing the signature of methods in the case of mutual recursion.

CyrusNajmabadi commented 5 years ago

what do you mean by "do the tail recursion itself" tho - the .tail prefix?

Actually performing the optimization in the method. i.e. not calling the recursive method, but instead just converting it to a goto directly. See the codegen part of the proposal in the OP.

CyrusNajmabadi commented 5 years ago

Not without changing the signature of methods in the case of mutual recursion.

i believe andy's proposal put that out of scope. but i could be mistaken. I thought this only covered self-recursion.

samuel-vidal commented 5 years ago

The design I would favor, would be to strengthen the contract with the runtime in a way that all CLR language could benefit from. (sorry new here: would you suggest me to make a separate proposal or this discussion is a fine place).

CyrusNajmabadi commented 5 years ago

No worries. Given your interest, i do personally think a coreclr issue would be more relevant if this is about ensuring behavior around the .tail instruction.

samuel-vidal commented 5 years ago

Also, I should not have been that specific. I could change:

The proposed 'tail' syntax should mean exactly that : the compiler is asked to check that the call will be optimized by the Jit and emit the necessary IL.

to

The proposed 'tail' syntax should mean exactly that : the compiler is asked to check that 1) the call is tail and 2) guarantee that the call will be optimized (either by the Jit through emitting the necessary IL or otherwise).

john-h-k commented 5 years ago

@samuel-vidal

That contract already exists

john-h-k commented 5 years ago

Also, that is not the runtime @samuel-vidal , that is just the compiler. There are various different runtimes and JITs

The proposed 'tail' syntax should mean exactly that : the compiler is asked to check that 1) the call is tail and 2) guarantee that the call will be optimized (either by the Jit through emitting the necessary IL or otherwise).

Compiler cannot check JIT behaviour as i said before. It can ONLY:

Emit the tail. prefix itself, which guarantees desired behaviour according to ECMA 335
Emit the tail call itself, transforming it to a loop

samuel-vidal commented 5 years ago

@johnbeisner I think that is fair enough. Since the contract exists and is guaranteed to be honored by the runtime then point 2) is simply emitting tail. prefix.

I was under the impression that the tail call optimization is subject to a long list of Byzantine conditions in order to have the actual desired effect. So it would be valuable to have the compiler warn the user if any of those conditions are not met. I came accross that list for example (nb it is quite old and I have no idea how relevant it still is).

samuel-vidal commented 5 years ago

Also interaction with try/catch is another can of worms, a return tail call(param) within a try is not exactly a tail call, you can't erase all the stack state for the caller method. (so the conditions you noted above can't be met). That would have to be one of the case when the compiler says no : 'Tail constraints within a try block are not allowed'.

Thaina commented 5 years ago

Totally +1

I would also wish that the compiler will just error when we put tail into any line that not possible to be tail called

Also, could it be that we would just have tail as a standalone keyword. And support for tail block?

void M(int i)
{
    if(i <= 0)
        return i;

    tail { // tail call all this block
        int tmp = M(i - 1);
        DoSomething(ref tmp);
        return tmp;
    }
}

dotnet / csharplang