In #26 we looked at what "speculative optimization" is in theory. Now we will take a look at the practice.
To be clear: we will say that the compiler speculates on FOO to mean that it generates machine code using the assumption that condition FOO is true and that it prefixes this generated code with guard instructions to ensure that it only runs when this condition really is true. So the compiler optimizes for the case where the condition is true at the expense of the case where it is not true.
The most important speculative optimizations that RaptorJIT (the LuaJIT family) does are:
Speculating on the type of each local variable and temporary value. Each value is assigned to a machine register and given a static type (nil, false, true, userdata, object, table, thread, prototype, function, number, 8/16/32/64-bit signed/unsigned integer, or any number of user-defined FFI types). The values are type checked when loaded from memory and then operated on "naked" (unboxed) in machine registers.
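As an illustrative sketch (the function names and values here are invented for the example), here is the difference between a loop the type speculation rewards and one it punishes:

```lua
-- Type-stable: 'acc' is always a number, so the JIT can keep it
-- unboxed ("naked") in a machine register behind a single type guard.
local function sum(t)
  local acc = 0
  for i = 1, #t do
    acc = acc + t[i]        -- speculated: t[i] is a number
  end
  return acc
end

-- Type-unstable: 'x' holds a number on some iterations and a string
-- on others, so the number speculation fails whenever the type flips.
local function unstable(t)
  local x = 0
  for i = 1, #t do
    if t[i] > 0 then x = x + 1 else x = "nonpositive" end
  end
  return x
end
```

The first loop compiles to tight arithmetic on registers; the second forces the compiler to re-check and re-box the value.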
Speculating on the definition of each called function. Every call to a Lua function is inlined, always, without exception. Generated code does perform function lookup but only to verify the identity of the function being called. The body of the called function is always inline in the generated code. This inlining is fully recursive into subroutines and sub-subroutines, etc. This means that generated code never contains branches into separately compiled code and it does not even maintain a call stack with return addresses.
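A sketch of what this means in practice (illustrative names): the trace through the hot loop inlines the body of the callee and guards only on its identity, so rebinding the function, even to an equivalent definition, invalidates the speculation.

```lua
local f = function(x) return x * 2 end

local function hot(n)
  local total = 0
  for i = 1, n do
    total = total + f(i)    -- body of f is inlined; a guard checks f's identity
  end
  return total
end

-- Same behaviour, different identity: the guard on the old closure
-- now fails and the compiled trace must be abandoned.
f = function(x) return x * 2 end
```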
Speculating on the outcome of each conditional branch. The JIT speculates on whether a given if statement will take the then branch or the else branch and it only generates code for that case. This means that the control flow of generated code is strictly linear, with no internal branches, and no control-flow analysis is needed during optimization.
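For example (a hypothetical loop), the recorded trace commits to one side of the branch; the guard is cheap only if the program keeps taking that side:

```lua
-- The JIT records whichever side of the 'if' the hot loop actually takes
-- and emits only that path, guarded. If t[i] > 0 almost always holds,
-- the guard rarely fails; a 50/50 branch forces constant side exits.
local function count_positive(t)
  local n = 0
  for i = 1, #t do
    if t[i] > 0 then
      n = n + 1
    end
  end
  return n
end
```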
Speculating on specific characteristics of values. The JIT will sometimes speculate that a table object has a specific number of hash buckets in order to optimize constant-key lookups, or speculate that a numeric value has a specific value (e.g. 32) to make arithmetic more efficient (e.g. compiling division as a bit-shift).
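A sketch of the arithmetic case (the divisor 32 is the example value from above): when the divisor is the same on every iteration of a hot trace, the JIT can guard on that value and strength-reduce the division.

```lua
-- If every call on the trace passes divisor = 32, the JIT can speculate
-- on that value: guard divisor == 32, then compile the division of an
-- integer n as a right shift by 5 instead of a general divide.
local function bucket(n, divisor)
  return math.floor(n / divisor)
end
```

Calling the same function with many different divisors defeats this, forcing the general (slower) division.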
These are the main ways that the RaptorJIT compiler speculatively optimizes programs. The key to writing efficient code is to anticipate the compiler's thought process and ensure that its speculations will tend to be successful. Each time a speculative optimization suffers a misprediction - its premise is found to be false at runtime - this triggers an expensive transfer of control to search for another piece of generated code that is applicable.
The key to writing inefficient code is to frequently contradict these speculations. Use lots of different types of values in the same variable; replace function definitions in frequently called modules; switch the bias of your if statements between the then and else clauses; look up the same constant keys in hashtables of many different shapes and sizes. These things may seem quite natural, and may perform perfectly well with other compilers, but they are anathema to this tracing JIT.