ta0kira / zeolite

Zeolite is a statically-typed, general-purpose programming language.
Apache License 2.0
18 stars 0 forks source link

Fix C++ code gen for tail calls. #34

Closed ta0kira closed 4 years ago

ta0kira commented 4 years ago

A tail call in Zeolite should remain a tail call in C++ so that the C++ compiler has the option to optimize it.

Currently, the result is assigned to returns and then returns is returned. Just return the result directly if the return doesn't need to be constructed, i.e., any return other than return { ... } or return _.

ta0kira commented 4 years ago

Note that tail calls probably can't be optimized by the C++ compiler if tracing is enabled, which is the default. This is because the trace context can't destruct until the function returns. There also might be similar issues with local variables; especially since args are passed as const references.

ta0kira commented 4 years ago

For future reference, local vars also complicate things if they have non-trivial destruction. As long as no local variables or named returns are used, there should be no non-trivial destruction.

Note that this relies on String args not being unboxed like the other primitive types are, as well as args being const&.

So, in theory tail calls should already be optimizable if the call is essentially a delegation and only has (non-String) primitive local variables, and tracing (at least in that function) has been disabled.

ta0kira commented 4 years ago

https://github.com/ta0kira/zeolite/commit/d1a0e843d98d278b39e00caeb4cbbc2960d76389 adds support for disabling tracing; however, clang++ doesn't seem to optimize the tail calls, even with manual edits to the generated C++. This could be due to the non-trivial destructors of boxed arguments passed to the recursive call.

So, it seems unlikely that tail calls will be optimized by the C++ compiler unless the generated code passes only trivially-destructed data (temporary const& doesn't count here), even if it's in just a subset of situations. For example, removing empty param/arg lists for functions that don't need params/args, and passing unboxed primitive types when the dispatching routine is being bypassed.

Leaving this closed, since it seems like at best it will be unclear to the user if/when optimization is going to happen.