idris-lang / Idris2

A purely functional programming language with first class types
https://idris-lang.org/
Other
2.5k stars 375 forks source link

Tail-call optimize the standard libs #2112

Open stefan-hoeck opened 2 years ago

stefan-hoeck commented 2 years ago

Summary

I suggest to add transform rules to tail-call optimize most of the recursive functions in the standard libs, if possible with a proof that the tail-recursive versions behave the same as the non-tail-recursive ones.

Motivation

Many functions on List, Vect, and SnocList use recursion but are not tail-recursive (this includes for instance range, filter, mapMaybe, Vect.reverse, (<><), and many more). All of these functions are mostly useless for bulk operations on backends with limited stack size, such as the JS backends. This issue has been brought up in the performance channel on discord (where it was shown that this would have a beneficial impact even on the Chez Scheme backend), as well as the mailing list and some pull requests (for instance #2100).

The proposal

I propose to add tail-recursive implementations for these functions as suggested for the List.length function in #2100. Where necessary, we keep the original non-tail-recursive versions, since these are often easier to use in proofs, but add a transform rule to use the optimized version at runtime. In addition, we add tests for the Node backend making sure that the transform rules behave as expected. We also no longer accept PRs adding new recursive functions to the standard libs without such transform rules, unless for the rare cases where it is not reasonably feasible to come up with a tail-call optimized implementation.

Examples and Technical Implementation

See #2100.

Alternatives considered

An alternative would be to have an optimization step in the compiler taking care of these issues. This would of course have the advantage that it would take the burden from Idris programmers to care about all of this. However, right now it seems unclear (for me at least), what the impact of such an optimization step would be on performance (compiler and generated code) and readability / debugability of the generated code. Nor is it clear (for me at least), how and at what IR such an optimization step should be implemented.

On the other hand, such an optimization step might also be able to transform and tail-call optimize recursive monadic code, something that is not possible with the Monad interface we have in the prelude (see idris2-tailrec for a possible workaround).

Conclusion

As a long-term goal it might be better to have an optimization step for these issues in the compiler, but since it is not yet clear if and when this going to happen I suggest we start optimizing existing (and new!) recursive functions, to make them useful for as many backends as possible now.

Note: If people agree on this, I could work on these transformations during this December's developers meetup. I won't be able to attend to a lot of talks or activities during daytime, as I'll be mostly occupied with lectures at work, but adding a bunch of transformation rules throughout the week should be possible.

galtys commented 2 years ago

Please could you correct my thinking that the problem with tail-optimization and stack overflow can be solved by a series of transformations as described by Bartosz Milewski

The transformations would be as follows:

  1. Transform all recursion into tail recursion via CPS (Continuation Passing Style)
  2. CPS are really "lambda" closures, but they can be further transformed into named functions with the enviromnent passed as a parameter
  3. Turn the enviroments into a recursive sum type, which effectively replaces the stack
  4. After this transformation, what is left is a stack based machine and a non-recursive iteration (no closures)

Since the Idris2 IR's available for backend development are "the whole" program, the transformation steps could be applied to the whole program.

References: Bartosz Milewski – Replacing functions with data: https://www.youtube.com/watch?v=wppzFzzD4b8 John C. Reynolds - Definitional Interpreters for Higher-Order Programming Languages : https://surface.syr.edu/cgi/viewcontent.cgi?article=1012&context=lcsmith_other James Koppel - The Best Refactoring You've Never Heard Of: https://www.pathsensitive.com/2019/07/the-best-refactoring-youve-never-heard.html