Open lukego opened 7 years ago
Hey @lukego, I learned a lot from your write-up like this one!
Have you looked into http://terralang.org ? It uses LuaJIT as compile-time macro language, as well as hosting language at runtime (LuaJIT + llvm).
Cheers!
Thanks for the kind feedback @FongHou!
Terra looks interesting. Lately I am happy simply programming though and I don't feel the need for a metaprogramming language. Snabb-wise I think a great strength of LuaJIT is that you can learn the language so quickly by reading Programming in Lua. If you already know Python/Ruby/Scheme/Smalltalk/etc then you can get up and running very quickly. I am not sure the same is true of "high brow" languages like Terra, Rust, C++, Haskell, etc but maybe that's my own background speaking.
Just need to write an excellent HOWTO for making the tracing JIT do what you want it to & keep fleshing out the associated profiling and benchmarking infrastructure :).
Modest proposal: Let's become true LuaJIT experts!
Specifically let's understand all the relevant details well enough to write a really simple HOWTO for Snabb hackers to write predictably fast code. I bet that we can come up with a very simple answer but that formulating it will be a hard intellectual challenge. "Hard fun!"
This issue is to explore one specific aspect of JIT: how to arrange our code so that we are not sensitive to side-trace behaviour. I will also link a couple of relevant LuaJIT issues where I have been looking into this in the background.
LuaJIT deep dive warning!
Example
Consider this humble piece of code:
The first line of the loop looks innocent but the dynamic nature of Lua potentially implies quite a lot of work:
link
in the_G
hashtable of globals.receive
in thelink
module hashtable.packet
in the_G
hashtable.free
in thepacket
module hashtable.link
,packet
, andfreelist
.So in principle we need to execute dozens of extra instructions beyond doing our actual work of moving a packet from a link to a freelist.
Present situation
How does this work out in practice? There are really three different times when that work can be done:
So how does this behave in real Snabb applications? I would say that usually we get the nice second case due to LuaJIT's "loop optimization" that compiles separate code for iteration 1 (slow) and iteration n>1 (fast - reusing work from first iteration.) However, we also often get the nasty first case that repeats work on each iteration due to
if
branches in our loops that break out of the loop optimization. One example of the fast case is lukego/blog#6 and of the slow case is lukego/blog#8. This uncertainty causes much gnashing of teeth about "side traces" (see e.g. https://github.com/LuaJIT/LuaJIT/issues/218).Let me illustrate the situation with some data. Here is a performance comparison on snabbnfv showing the impact of disabling the loop optimization entirely (
-O-loop
) so that the worst case is always triggered:This causes around a 50% performance drop. So this is a direct measurement of how dependent that application is to the LuaJIT loop optimization eliminating work.
Possible way forward
So - new idea - how about if we could trigger the third case where the JIT does all of the work at compile time and inserts the results directly into the machine code as constants? This makes intuitive sense because it actually is always the same module definitions that we are looking up in practice and there is no need to repeat this work all the time. If we could eliminate this busy-work entirely then it would make the code run faster even when loop optimization is not applicable and make us less sensitive to the whole side-trace issue.
Sounds good eh? Initial experiments on microbenchmarks are promising (see https://github.com/LuaJIT/LuaJIT/issues/248#issuecomment-273841796) but more investigation is needed. I think the first major step would be to optimize the benchmark shown above i.e. make a Snabb application run fast even when loop optimization is disabled (and without putting lots of boring annotations in the source like
local link_transmit = link.transmit
etc that I find unacceptably ugly.)