Closed gitfoxi closed 8 years ago
I'm a schemehead and appreciate TCO as much as anybody, but we're not actually planning to implement this so I'm just going to close this. Apologies.
There's a volcano of conversation coming from julia-dev. I'll definitely search it as well as Issues in the future before bringing something up. Clearly this topic has been discussed. I think the final word is:
Jeff Bezanson Nov 24 No, no update. It's not a high priority, but eventually we will probably do some of the optimizations, but not the semantic guarantee.
On Wed, Nov 27, 2013 at 6:24 PM, Isaiah notifications@github.com wrote:
See https://groups.google.com/forum/#!searchin/julia-dev/tail-call/julia-dev/POP6YXCnP-k/qjEw2OiFVd8J
— Reply to this email directly or view it on GitHub: https://github.com/JuliaLang/julia/issues/4964#issuecomment-29435198
Michael
Oh hey Jeff. I just quoted you from julia-dev but you had already responded to me. How are you doing that?!? You actually showed up in person before I finished reading the thread where you rejected the idea last year.
I want to figure out how this community is so engaged. A few hours before thanksgiving and people are online. This is like the Reddit of ... I'm out of metaphors. I need to go soak my head. What? Wow.
On Wed, Nov 27, 2013 at 7:14 PM, Jeff Bezanson notifications@github.comwrote:
I'm a schemehead and appreciate TCO as much as anybody, but we're not actually planning to implement this so I'm just going to close this. Apologies.
Michael
I refuse to believe you're out of metaphors. We're all enjoying them :)
Yes, we are enjoying all the metaphors. Please keep them coming. :-)
I must admit, I'm not sure why or even how I've gotten so involved with Julia. I think it has something to do with the other contributors, but the experiments proving this are still running.
I refuse to believe you're out of metaphors. We're all enjoying them :)
+1. I'm confident and hopeful that you can come up with something. I haven't yet seen one that involves both bicycle grease and wildebeests :-).
I must admit, I'm not sure why or even how I've gotten so involved with Julia. I think it has something to do with the other contributors, but the experiments proving this are still running.
I can't figure it out either, but it's extraordinarily addictive. Perhaps because it's so rewarding.
I can't figure it out either, but it's extraordinarily addictive. Perhaps because it's so rewarding.
Julia specializes in having a very high result-to-effort ratio :-)
Mentioning TCO and "tail call optimization" here so that we can avoid future duplicates of this issue.
If someone wants to go ahead and implement tail call elimination, then feel free to go for it. I'll leave this open for that eventuality.
+1
Also, language support for lazy evaluation is the other side of that coin. LazySequences.jl demonstrates the power but not the performance that you could have.
On Thu, Jan 9, 2014 at 5:51 AM, Stefan Karpinski notifications@github.comwrote:
Reopened #4964 https://github.com/JuliaLang/julia/issues/4964.
Michael
Look, we don't really need this and it is just a huge pain. It also conflicts with exposing C-callable entry points.
How about supporting the LLVM sibling calls? It doesn't mess with calling conventions and it would already go a long way, assuming the last criterion, which I don't understand, isn't too bothersome.
This might also be relevant (and not engaging if you do use fastcc somewhere):
Yes, we might as well mark some calls as tail calls and turn on this optimization. But that last criterion is pretty strict; the caller and callee have to have some of the same arguments. I don't know how often that happens. It seems like it excludes even tail-recursive counting loops.
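For concreteness, a minimal sketch (not from the thread) of the kind of tail-recursive counting loop that criterion seems to exclude:

```julia
# Hypothetical illustration: a tail-recursive countdown. The call in tail
# position passes a different argument value (n - 1) than the caller
# received, so under the "same arguments" restriction quoted above it
# would apparently not qualify for the optimization.
countdown(n) = n == 0 ? :done : countdown(n - 1)

countdown(10)        # fine at small depth
# countdown(10^9)    # overflows the stack without tail-call elimination
```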
For anyone stumbling across this as I did, I implemented a little macro called @rec for Lazy.jl that may help you out.
using Lazy
@rec function fryeggs()
return fryeggs()
end
fryeggs() # may take a few hours
@rec re-enables Julia's unique cooking facilities (compatible metal-cased laptop sold separately; disable fan for best results).
Less interesting example:
@rec reduce(f::Function, v, xs::List) =
isempty(xs) ? v : reduce(f, f(v, first(xs)), rest(xs))
Might be nice to have simple optimisations like this as a compile step, even if optimising all tail calls is a pain.
Recursion is a very in-style thing for Julia, because multiple dispatch is used to recognise the terminating condition based on a change of argument. Thus programmers are already thinking recursively.
Though I'm not sure the best use cases are solved with tail-call recursion optimisation, versus a more significant optimisation that flattens a broader (overlapping) class of recursive algorithms, which I suspect LLVM is doing already.
Please consider TCO for implementing finite state machines. In Erlang this is relatively easy because of TCO; it would be great if Julia included it too:
%%%% get_content_service.erl
%% Implement next state machine:
%%
%% From (A) to (B) when "begin_tag"
%% From (B) to (B) when "content"
%% From (B) to (A) when "end_tag"
%% From (A) to end when "end_document"
%%
%%                  end
%%                   ^                       ___  content
%%      end_document |                      /    \
%%                   |      begin_tag       |    /
%%            -> (A) ---------------------> (B) <-
%%                ^                          |
%%                 \ ________________________/
%%                          end_tag
%%
%%
-module(get_content_service).
-export([start/0]).
-export([a/1]).
start() ->
    register(get_content_service, spawn(get_content_service, a, [[]])),
    run_test(),
    ok.
a(Values) ->
    receive
        {begin_tag} -> b(Values);
        {end_document, From} -> From ! {result, Values}
    end.
b(Values) ->
    receive
        {content, Value} ->
            NewValues = lists:append(Values, [Value]),
            b(NewValues);
        {end_tag} -> a(Values)
    end.
run_test() ->
    get_content_service ! {begin_tag},
    get_content_service ! {content, 1},
    get_content_service ! {content, 2},
    get_content_service ! {end_tag},
    get_content_service ! {begin_tag},
    get_content_service ! {content, 99},
    get_content_service ! {end_tag},
    get_content_service ! {end_document, self()},
    receive
        {result, Values} -> io:format("~w~n", [Values])
    end.
Then in erlang shell:
1> c("get_content_service").
{ok,get_content_service}
2> get_content_service:start().
[1,2,99]
ok
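For comparison, a rough sketch of how the same two-state machine could be written in Julia today without tail-call elimination, by collapsing the mutually tail-calling state functions into an explicit loop over a state tag (illustrative only, not an existing API; content messages are modeled here as plain integers):

```julia
# State machine: :a --begin_tag--> :b, :b --content--> :b,
# :b --end_tag--> :a, :a --end_document--> done.
function get_content(msgs)
    state, values = :a, Int[]
    for msg in msgs
        if state == :a
            msg === :begin_tag    && (state = :b)
            msg === :end_document && return values
        else # state == :b
            msg isa Integer       && push!(values, msg)
            msg === :end_tag      && (state = :a)
        end
    end
    return values
end

get_content([:begin_tag, 1, 2, :end_tag, :begin_tag, 99, :end_tag, :end_document])
# returns [1, 2, 99]
```

Note what is lost relative to the Erlang version: each state is no longer its own function, which is exactly the style that guaranteed TCO makes viable.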
No one here is against TCO, the problem with saying "we have TCO" is that we then have to guarantee that the optimization is performed for any call in tail position, and LLVM does not at the moment provide a way to enforce this. I would expect it to happen in simple cases though.
I would expect the pop of the GC frame would prevent TCO from happening in many cases... (not this one though)
Correct me if I'm wrong, but can't a tail-recursive call be rewritten to a loop by the Julia compiler? I mean that Julia's recursive code would be transformed into loop LLVM IR code.
As an optimization, sure, with very low priority. As a semantic guarantee, very unlikely.
@yuyichao What do you mean with semantic guarantee?
That functions like
function rec()
return rec()
end
will be guaranteed to make a tail call and not overflow the stack.
Thanks for replying.
And why could not the julia compiler write this to
while(true) end
There's no reason it can't for this case, but it most likely can't guarantee that. Doing that in the compiler level requires guaranteed inferablility (the compiler has to know that you are calling the same function/calling with the same argument types), which is not part of the semantics/guarantee. Doing that without the compiler transformation requires a runtime interface that can do tail call, that's pretty hard/impossible to do (when you need to interleave compiled and "interpreted" frame and somehow tail call both of them).
Local recursion is the easy case. Full tail-call semantics mean that every call in tail position must use no stack space, no matter how many functions are involved or what the structure of the call graph is.
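A standard illustration (added here for context) of why local recursion isn't enough: mutually recursive functions whose tail calls bounce between each other. Neither function can be rewritten into a self-loop in isolation, so only a guarantee covering the whole call graph helps:

```julia
# Mutually tail-recursive definitions; every call is in tail position,
# yet a simple "recursion to loop" rewrite of one function can't apply.
iseven_rec(n) = n == 0 ? true  : isodd_rec(n - 1)
isodd_rec(n)  = n == 0 ? false : iseven_rec(n - 1)

iseven_rec(1000)     # returns true
# iseven_rec(10^9)   # StackOverflowError without full tail-call elimination
```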
Is it an impossible task to guarantee it, or does it only slow down your compilation step? If only the latter is true, then something like a @doTCO tag in source code or julia -O3 on the command line would be a solution. I have another question: is it possible to manipulate the Julia compiler via a plugin framework? I've heard that some languages support it.
It is impossible with the current compiler infrastructure if only because LLVM doesn't support it. Whether there is a hypothetical implementation of the language that could provide the guarantee is up in the air. Hooking into the compiler is generally done via generated functions at this point.
It's not impossible, it's just very difficult for little practical benefit. Also, the point of "guaranteed" TCO is not raw performance --- it's not always faster --- the point is to be able to express constant-space control flow with function calls.
Julia supports macros and @generated functions, which effectively let you extend the language at two stages. Look through the metaprogramming section of the manual. Beyond that I'd recommend asking more specific questions on discourse, explaining what you'd like to accomplish.
Maybe it's been obvious for you all along, but it just occurred to me that you could use more than one LLVM calling convention, @JeffBezanson maybe that's what you meant by "very difficult" rather than "impossible with the current compiler infrastructure".
Couldn't you use a calling convention that supports TCO for Julia => Julia calls, and use CCC for Julia => C and C => Julia? (the latter with a thin wrapper on the C side of things to bridge the CC mismatch... alternatively, Julia functions that are called from both C and Julia could be compiled twice, once for each CC...)
That would enable TCO for pure Julia code, though of course inserting a C frame would break the optimization.
A quick DDG search yielded this: http://lists.llvm.org/pipermail/llvm-dev/2008-December/018955.html so it is not impossible to achieve.
Edit: noting that LLVM only supports TCO for a couple of CPU arch, still not sure it is worth it. My use case would have been fast parser combinators, but I understand it isn't the main focus of the language... I've been out for a while, the progress made in the mean time is impressive :-)
Using multiple LLVM calling conventions is not the problem. Basically, to have this as a guarantee rather than an optimization, you'll need to change many different parts of the runtime so that they will not get in the way of tail calls.
Ooh, ok my bad...
Couldn't you let users request TCO explicitly (@TCO return foo()?) and have the compiler error out if type inference and devirtualization can't be achieved?
A feature that depends on type inference and compilation is not something we want to have. We could of course have a very limited version that is defined to work, but at a glance it seems hard to come up with a restriction that's both useful and easily implementable.
Sorry to bother, but after 3 years I wonder: is there tail call support in Julia now?
No; it's easy to check with the function at the top of this thread on the latest master. On my one-day-old master I got:
julia> rec()
ERROR: StackOverflowError:
Stacktrace:
[1] rec()
@ Main ./REPL[1]:2
That said, there's still the different Easter-egg language (with TCO):
$ julia --lisp
Why do you need tail call elimination?
Why do you need tail call elimination?
So that (mutually) recursive functions can be compiled efficiently.
Why do you need tail call elimination?
For this
Starting Julia...
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.1 (2021-04-23)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia> fib(n, a=zero(n), b=one(n)) = n == 0 ? a : fib(n-1, b, a+b)
fib (generic function with 3 methods)
julia> fib(10)
55
julia> fib(100_000)
2754320626097736315
julia> fib(1_000_000)
ERROR: StackOverflowError:
Stacktrace:
[1] fib(n::Int64, a::Int64, b::Int64) (repeats 79984 times)
@ Main ./none:1
julia> fib(big(1_000_000))
signal (4): Illegal instruction: 4
in expression starting at none:1
Also, what's with the signal (4): Illegal instruction: 4? Wasn't expecting that!
FWIW, it overflows before 100_000:
julia> fib(n, a=zero(n), b=one(n)) = n == 0 ? a : fib(n-1, b, a+b)
fib (generic function with 3 methods)
julia> fib(10)
55
julia> fib(100_000)
2754320626097736315
julia> fib(Int128(100_000))
123084731589716971363493501662014098043
(Of course, all else equal, I'd always prefer having an optimization over not having it.)
(Pretty sure the Int128 answer above overflowed, too.)
I think that Say What You Mean is a pretty powerful rule in this instance. If you want the language to use an iterative implementation, write an iterative implementation. Julia is really good for code transformations, but baking trivial and not-very-useful ones into the compiler can be a slippery slope.
A whole class of languages (functional programming languages) prefer a recursive over iterative stylistically, making claims such as: "programs are easier to understand without mutation or side-effects" and "there are many constructs that are more naturally represented as recursive than iterative". Some of these languages don't support mutation or loops at all. Designers and users of these functional programming languages Say What They Mean in terms of semantic guarantees and expected output, and expect the compiler to generate efficient compiled instructions from that.
Right now I don't know how to write a non-trivial program (e.g. compute Fibonacci numbers) in Julia without explicitly handling side-effects. This ability might not be very useful to some people, but I think it is incredibly useful to others. It is sad for me to see Julia's syntax and architecture come so close to supporting a functional programming paradigm, but fall short in this critical way.
@time fib(big(52_325)) # the highest I can do; one higher and it hangs. On 32-bit Julia I can actually go higher, to 130_000 (not 140_000), so this seems like a bug.
Isn't a @tailrec macro good enough (it doesn't support mutual recursion), which gets around the limitation?
(@v1.8) pkg> add https://github.com/TakekazuKATO/TailRec.jl
julia> using TailRec
julia> @tailrec fib(n, a=zero(n), b=one(n)) = n == 0 ? a : fib(n-1, b, a+b)
fib (generic function with 3 methods)
julia> @benchmark fib(46) # Without the macro is 51x slower, that's the largest Int that doesn't overflow, and maybe an argument to have the macro in Base
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.957 ns … 28.149 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 2.040 ns ┊ GC (median): 0.00%
Time (mean ± σ): 2.088 ns ± 0.871 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█ █▃ ▂ ▂
█▃▄▁█▁██▆▄▇█▄▆▃█▅▄▄▆▅▁▃▅▄▁▁▄▄▃▃▁▃▄▃▄▃▅▅▅▇▇███████▇███▇▇▆▅▅ █
1.96 ns Histogram: log(frequency) by time 2.67 ns <
julia> @benchmark fib(big(50_000)) # 14% faster than without the macro. Without big and without the macro it's 3x slower, but both give wrong results, except for modulo math, which is likely never useful for fib:
BenchmarkTools.Trial: 83 samples with 1 evaluation.
Range (min … max): 48.710 ms … 164.389 ms ┊ GC (min … max): 13.67% … 70.29%
Time (median): 58.329 ms ┊ GC (median): 18.29%
Time (mean ± σ): 60.706 ms ± 14.401 ms ┊ GC (mean ± σ): 20.66% ± 8.19%
▅▂█▂ ▃ ▂▃
▃▃█████▅██▆██▃█▅▁▃▃▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃ ▁
48.7 ms Histogram: frequency by time 121 ms <
Memory estimate: 107.83 MiB, allocs estimate: 250006.
julia> @btime fib(big(1000_000)); # @benchmark gives the same number of allocations, except stating "over"
10.078 s (5000006 allocations: 40.50 GiB)
@PallHaraldsson @tailrec is seriously outperforming the loop I wrote. Do you know how this is happening?
#]add https://github.com/TakekazuKATO/TailRec.jl
using TailRec
@tailrec fib_tr(n, a=zero(n), b=one(n)) = n == 0 ? a : fib_tr(n-1, b, a+b)
#fib_tr (generic function with 3 methods)
function fib_loop(n::Integer)
a = zero(n)
b = one(n)
for _ in 1:n
a, b = b, a+b
end
a
end
#fib_loop (generic function with 1 method)
fib_tr(big(10_000)) == fib_loop(big(10_000))
#true
@btime fib_tr(big(10_000));
# 2.398 ms (50006 allocations: 5.01 MiB)
@btime fib_loop(big(10_000));
# 4.300 ms (100006 allocations: 6.24 MiB)
Yes, but not because of TailRec: they were not strictly equivalent. Written more like this, it's only 1% slower:
julia> function fib_loop(n::Integer)
a = zero(n)
b = one(n)
for _ in n-1:-1:0 # and then n fewer allocations, and thus yet faster, with n:big(-1):0
a, b = b, a+b
end
a
end
I admit I had to stare at the code a bit. After noticing the difference in the number of allocations, it seems subtracting 1 from a BigInt is optimized for allocations: adding 1 can make the result (for a positive number) larger by one bit, so it seems they always allocate on that assumption, while subtracting 1 from a positive number can't grow it.
and then in 58% of the time with even fewer allocations:
julia> function fib_loop(n::Integer)
a = zero(n)
b = one(n)
for _ in 1:Int64(n) # should be safe here?
a, b = b, a+b
end
a
end
It's been mentioned in several discussions of tail call elimination that it's not really an optimization. Optimizations are optional transforms that do not change the behavior of the program but make it faster. Tail call elimination is not optional when you need it; it changes the behavior of the program—stack overflow versus no stack overflow, and in the case of an error, it changes the stack traces substantially; and it's also not necessarily faster to reuse the current stack frame than to start a new one. That's why the title of this issue was changed from "tail call optimization" to "tail call elimination".
All of this suggests that it is probably better to have an explicit tail call annotation—after all, when you need it, the compiler is not really free to not do the "optimization" and if the code changes in such a way that the compiler cannot do the transformation anymore (e.g. you accidentally put some other operation around the tail call), then it should be an error rather than silently working by doing normal recursion. I'm not the only one to observe this: https://news.ycombinator.com/item?id=15698624.
Anyway, I think it would make sense to introduce a @tailcall f(...) macro—or maybe @goto f(...) or @return f(...)—that explicitly replaces the current stack frame with a new function invocation. If you call it with anything besides a function invocation, it should error at macro expansion time. It would also allow invoking any function, not just the one that's currently running, since one of the use cases for tail calls is implementing state machines by transferring between a set of functions without pushing the stack. Another feature of an explicit @tailcall is that since it explicitly replaces the current function on the stack frame, if an error occurs, those other frames will never appear in the stack trace.
I guess an aggressive alternative would be that any time someone does return f(...) we consider it a tail call and replace the stack frame rather than recursing normally, but that feels like a bit too much.
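Absent compiler support, the explicit-replacement semantics of a hypothetical @tailcall can be approximated in user space with a trampoline: a tail call returns a thunk describing the next call instead of recursing, and a driver loop keeps invoking thunks in constant stack space. A sketch (TailCall and trampoline are made-up names, not a proposed API):

```julia
# A tail "call" is reified as a value holding the function and its arguments.
struct TailCall
    f::Function
    args::Tuple
end

# The driver loop: keep invoking until the result is no longer a TailCall.
function trampoline(f, args...)
    r = TailCall(f, args)
    while r isa TailCall
        r = r.f(r.args...)
    end
    return r
end

# fib in tail form, with a returned TailCall standing in for @tailcall.
fib_step(n, a, b) = n == 0 ? a : TailCall(fib_step, (n - 1, b, a + b))

trampoline(fib_step, 10, big(0), big(1))   # returns 55, in constant stack space
```

This also handles the state-machine use case, since a thunk may name any function, not just the one currently running; the cost is an allocation per hop, which is why people keep asking for compiler support instead.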
@StefanKarpinski Hm. Devil's advocate: maybe implicit TCE would be useful as a compiler/interpreter option?
Why? If you're writing code that needs TCE then you need it and being able to run Julia with/without just means your code might be broken by something you don't control. What is the use case for optional TCE? Explicitly required TCE is the only thing that makes sense to me, which then shouldn't be controlled by a flag, it should be a semantic guarantee.
@StefanKarpinski I don't know about the compiler/interpreter flag, but having some kind of global-ish switch (e.g., at the module/package level) would help a lot with brevity. I suggest having the @tailcall macro support modules as well, where all the tail calls in that module get optimized. Perhaps we could add a disabler macro as well, @notailcall.
It's interesting that this is valid syntax in Julia and Lua:
The difference is that in Lua, when you call rec() it will stare at you and engage your motherboard's built-in space heater until you ctrl-C out. In Julia, you get a StackOverflowError; so the stack got 80,000 things deep. That's interesting.
Why does this matter? I'm not sure. But some people care a lot:
http://www.lua.org/pil/6.3.html