tail optimization does not seem to be applied when `--debug` flag is used

aturley commented 6 years ago

System Description

OS: Mac OS X 10.13 ponyc version: 0.20.0-18533c587 [release], compiled with: llvm 3.9.1 -- Apple LLVM version 9.0.0 (clang-900.0.37)

Description of Issue

When I compile a program that uses tail call optimization with the --debug flag and run it, it will eventually crash due to a bus error. If I compile the same program without the --debug flag and run it, it will not crash.

I ran the debug version in LLDB and when it crashed I got the backtrace which showed that the function was being called recursively without optimizing out the recursion.

Steps to reproduce

Compile this program with the --debug flag:


primitive HasTailCall
fun apply(limit: U64, current: U64 = 0): U64 =>
if limit == current then
  current
else
  apply(limit, current + 1)
end

actor Main new create(env: Env) => env.out.print(HasTailCall(500_000_000_000).string())


2. Run the program.

## Expected Behavior

The program should run and print out `500000000000`.

## Actual Behavior

The program crashes.

SeanTAllen commented 6 years ago

Technically TCO is an "optimization" and in that sense, this isn't surprising. That said, I think most folks at this point think of TCO as a language feature. On the sync call. Both @jemc and I agreed that if we can provide TCO when --debug is used, we should do that. This would require some investigation.

At this point we are calling this a bug that needs more investigation (although you could easily call it an enhancement).

Praetonus commented 6 years ago

TCO is currently performed by the LLVM optimisation passes. There are some issues with making it a guaranteed optimisation, namely because it isn't available on ARM. More details on the specific requirements are available here.

If we want to make TCO a language feature we'll have to handle it ourselves in the compiler frontend. Handling every kind of TCO would be very hard (and probably pretty much useless) but we could limit the feature to only handle tail recursion, which is the relevant thing here and would be easier to implement.

That said, it would still require some non-negligible efforts and I'm not convinced of the benefits in a language with loops as a first class concept. In a language without loops TCO is essential since it's the only way to "loop" without smashing your stack. If your language has loops then you can rewrite your tail recursive algorithm into an iterative one trivially and avoid the problem entirely.

I'd also suggest to recategorise the issue as an enhancement since without TCO as a language feature the crash is the result of an out of memory problem, which we don't handle.

jemc commented 6 years ago

Reclassified.

@Praetonus can you elaborate on the "non-negligble efforts" for handling tail recursion in the compiler frontend? Do you already have an idea of what this would look like?

Praetonus commented 6 years ago

We'd need to

Detect tail-recursive functions. Aside from the bit explained below where we need to be careful with partial functions, this is an easy step.
Special-case code generation for tail-recursive calls to generate a loop instead. The hard part here would be to figure out how to map the debug information from a recursive source to an iterative IR in a sane way.

Also, we won't be able to transform tail-recursive functions that are partial and contain a try else block which itself contains the recursive call since we'd need to retain information about the parent frame in order to correctly setup the destination frame.

jemc commented 6 years ago

Thanks.

In the short term, I think we do need to update to the tutorial to make this point clear - that it is not a language feature, but is an optimization that is applied only in release mode.

7sharp9 commented 6 years ago

In F# tailcalls are normally not normally turned on in debug mode either.

ponylang / ponyc