carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
https://github.com/carbon-language/carbon-lang/blob/trunk/README.md
Other
32.27k stars 1.47k forks source link

`tail return` proposal #1761

Open josh11b opened 1 year ago

josh11b commented 1 year ago

I think the Carbon project would welcome a proposal to add an explicit syntax that forces a tail call. This is aligned with our goals of predictable performance and giving the user explicit control.

My syntax suggestion: tail return F(...);.

Background:

pache-Ak commented 1 year ago

I agree, we need something to help complier. But I don't want it is a keyword. In my opinion, attribute is a good choice. The program will run the same whether we use this flag or not. But if we use it, the compiler might give us better performance. This matches the meaning of the C++ attribute. I think keywords should have an effect on program semantics, but it tail doesn't.

OlaFosheimGrostad commented 1 year ago

Try to avoid "stealing" names that are commonly used in algorithms: head, tail, next, prev, iter, etc…

dialupmars commented 1 year ago

Could it work if we made it so if you used t.return or something similar, it would then function like a normal tail return? Or could be similar to print and printf, the extra letter changes what the argument is.

emlai commented 1 year ago

I agree with using an attribute, to avoid stealing common variable names, for example: @tailcall return F(...);.

Also I think it's too early for this proposal. There are much more important things that would be better focused on right now.

geoffromer commented 1 year ago

The "Tail Calls" section of this C++ proposal might also be a useful reference, if you'll pardon the self-citation.

I personally agree that tail return is probably not the best syntax, but syntax is easy to change, so I think we shouldn't focus too much on that. The more interesting questions to consider are things like:

josh11b commented 1 year ago
OlaFosheimGrostad commented 1 year ago
  • What requirements does tail return impose on the caller and the callee? Tail call elimination only works if no code at all is executed after the tail call, including things like destructors. So do we disallow local variables with nontrivial destructors, or do we run destructors before the tail call is entered? This feeds back into the previous point, because running the destructors early could observably change the behavior of the program, which makes it harder to argue this is merely an optimization.

Is the idea that you can write a tail return statement anywhere in the function and require that the compiler attempts to do a program transform that makes tail call optimization possible, while preserving externally observable effects? Basically some sort of program synthesis with verification of effects?

That is quite heavy duty… and you would need full access to all source code used. But, having better functional programming support than C++ would be a great selling point.

OlaFosheimGrostad commented 1 year ago
  • I think we should generate an error if any code would execute after the tail call.

If you can move the constructor/destructor pair out of the resulting loop and get the same behaviour, why not? If the only observable effect is that you get fewer calls to malloc()/free() because reuse is possible, why not?

Added: You could define a concept of «internal side effects» as a change in behaviour that is contained to the runtime/standard library. Then allow all of those «internal side effects» in functions that request tail return and other 'high level optimizations'. That could also include doing things like reserving a predicted buffer size for std::vector before the loop and use shrink_to_fit after loop in order to prevent many increasing reallocations inside the loop.

dilijev commented 1 year ago

In the ES spec (ECMA Script / javascript) proposals for tail call/return (and then revising the spec in that area for practical reasons) there were a few big issues for discussion and debated in terms of implementation:

  1. Should tail call behavior be the default? (Largely agreed no, except by the functional programming community)
    • My answer: not default behavior, opt-in.
  2. Should there be a syntax to opt-in? (or, if (1) is yes, a syntax to opt-out)
    • My answer: yes, syntax to opt-in.
  3. Should the opt-in be (a) per-call-site, or (b) per function (attribute or syntax at point of declaration)
    • My answer: opt-in per call-site.
    • This is a question worth debating, IMO. At least the ECMAScript community had a fair amount of debate over it, as I recall.
    • Per function at declaration means any caller of a library API doesn't need to know the function is tail-callable and gets a performance benefit. This seems broadly a bad idea because it would be surprising.
    • Per call site means the programmer calling the function is aware of the tail call semantics and will not be surprised.
  4. Should there be an error if a tail-call-function [marked at declaration as tail call [i.e. 3b]] is called but not in tail position (probably not -- unless the programming paradigm truly expects all tail call functions to be optimized even if it's clearly not tail position)
    • My answer: if 3(b), I think such a function should still be allowed in non-tail position, because the programmer will be aware of the semantic difference if they understand tail calls - IOW it will not be a surprise that it works, and it will not block the programmer from doing what they want (using a function to perform its function in a reasonable context regardless of tail position)
  5. Must a tail call be part of a return statement? What if the function's return is void? Can we / should we spec tail-call of functions in tail call position when there is no return?
    • Having tail return be the only type of tail call simplifies things because there's no implicit function in tail position with suddenly confusing behavior (potentially confusing in the event of a crash or debugging missing frames you'd expect to be there, if you aren't thinking about tail call when you write the code)
    • In most languages functions that have a void return type can issue a return; statement to exit early - if we allow a function that returns void to return void; as a trivial case, that allows return <expr-evaluating-to-void>; then, syntactically, we can also tail return <void>; - that's an elegant solution to requiring tail call to be a return in the context of void functions.
  6. Debugging and crashes (backtraces)! If you tail call do you lose all the debugging context? What context can be kept? (e.g. Number of tail call frames overwritten, the number of mutually recursive functions involved if it's not directly self-recursive, etc.) Obviously you'd want to collapse it somehow for the same reason that tail call wants to get rid of O(n) recursive frames.
  7. Are there to be any restrictions on the complexity or feature set of a function that can be tail-called? For example, functional programming generally likes immutable parameters, will pointer arguments be allowed?
nigeltao commented 1 year ago

I think tail return should require (emphasis added) the implementation to perform tail-call elimination.

There's the Carbon explorer (interpreter) and a production Carbon compiler is (or will be) based on LLVM. It's not obvious to me whether it will be feasible to transpile Carbon to C++ the way the original CFront transpiled C++ to C code (not to object code). Feasible meaning not just a proof-of-concept, but something you could use in production.

C++ is Turing complete, but still, requiring TCE (Tail Call Elimination) could make transpilation harder.

There's no definitive right answer and it's all trade-offs. I was just wondering whether it was an explicit goal or non-goal to rule in or out a CFront-like thing. (This question could possibly spin out as a separate issue).

geoffromer commented 1 year ago

I was just wondering whether it was an explicit goal or non-goal to rule in or out a CFront-like thing.

See the interoperability goals proposal:

The Carbon toolchain will support, as an option, generating a C++ header and object file from a Carbon library, with an ABI that's suitable for use with non-Carbon toolchains.

But see also the following paragraph, about how C++ code generation will not necessarily support all Carbon features with full fidelity. This is only intended as an interoperability tool; we definitely don't expect or intend to go through a phase where all Carbon code is transpiled to C++, the way all C++ code was transpiled to C in the CFront era.

josh11b commented 1 year ago

@chandlerc Would you like to weigh in on the CFront-style transpiler is something we want to support?

OlaFosheimGrostad commented 1 year ago

Just for reference: ATS is using fn for regular functions and fnx for mutually recursive functions that should be tail call optimized: http://ats-lang.sourceforge.net/DOCUMENT/INT2PROGINATS/HTML/x715.html

chandlerc commented 1 year ago

@chandlerc Would you like to weigh in on the CFront-style transpiler is something we want to support?

I'm not opposed to some folks working on this if they want, but I think there are good reasons why neither the explorer nor the main toolchain should take this direction.

For the explorer, I think its primary value is in simplicity and very clearly and unambiguously matching the intended semantics of the language. I would worry about trying to generate code via mapping into C/C++ source code here being very difficult to avoid accidental semantic effects from the target language. The most effective way I know to do this is to map it into a very low level of the target language, which works but is also somewhat difficult and labor intensive. It's not clear this provides enough value on top of the fairly simple and effective high level interpreter in the explorer. But this is just my impression, and I don't have the deepest understanding of the explorer so I may be missing things.

For the toolchain, our goal is to allow this to become a production quality toolchain as the experiment matures and undertakes increasingly significant real-world evaluations. I suspect that compiling down to C/C++ source code would significantly hamper this because it would be hard to realize some of the specific value propositions of Carbon with this design. For example, I would expect such a toolchain to be somewhat brittle, have significantly worse compile times, and potentially generate poor performing code because of being forced to indirect through a source language rather than directly working with the underlying compiler infrastructure.

The generics model of Carbon would also have to be entirely handled before emitting the low-level code, and I suspect a very large fraction of the complexity of any toolchain will be consumed in the generics and type checking infrastructure that won't be saved by lowering to C/C++.

But again, while these points may argue against using this strategy for these two efforts, they don't necessarily argue against having such an implementation for other reasons.

nigeltao commented 1 year ago

while these points may argue against using this strategy for these two efforts, they don't necessarily argue against having such an implementation for other reasons.

I don't think it's controversial that neither the explorer nor the main toolchain uses transpliation. I'm more concerned about other toolchains.

My question is, if it comes down to "(1) e.g. Mandatory TCE (Tail Call Elimination); (2) Transpiling Carbon to C++ is viable; choose one", is there early guidance on Carbon's choice?

chandlerc commented 1 year ago

while these points may argue against using this strategy for these two efforts, they don't necessarily argue against having such an implementation for other reasons.

I don't think it's controversial that neither the explorer nor the main toolchain uses transpliation. I'm more concerned about other toolchains.

My question is, if it comes down to "(1) e.g. Mandatory TCE (Tail Call Elimination); (2) Transpiling Carbon to C++ is viable; choose one", is there early guidance on Carbon's choice?

This is a useful distillation and gets to my underlying hesitancy with transpilation.

I do not think Carbon's semantics should be constrained by what we can transpile to C/C++.

So I think that means if the only tradeoff is the above, it's option (1) all the way.

There are lots of fancy, kludgy ways to transpile to get very precise functionality like this. If Carbon benefits from guaranteed TCE, then those will be necessary, and that's the cost of trying to transpile. I really want carbon to be free to innovate semantically here.

nigeltao commented 1 year ago

It's tangential from tail return (and Carbon) but, speaking of transpilation and CFront, https://github.com/hsutter/cppfront was recently mentioned at CppCon.

github-actions[bot] commented 1 year ago

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time. This issue is labeled inactive because the last activity was over 90 days ago.

Akashgola123 commented 1 year ago

I want to work on this issue please assign this issue to me

jonmeow commented 1 year ago

@Akashgola123 We typically won't assign issues. I've added some documentation for why. Please feel to work on it if you have time, but please through the proposal process to get a better understanding of what to expect.

jonmeow commented 1 year ago

Also, just to note -- I've checked with josh11b, and as proposals can be a little more complex, I've unmarked this as a "good first issue".