Open kakkoyun opened 1 year ago
hey @kakkoyun what's the
- priority
- complexity, of this issue?
thanks
> hey @kakkoyun what's the
> - priority
This is actually quite close to the top of our priority list. However, the team needs a couple of months to get to it.
> - complexity, of this issue?
This could be quite complex. You can check PRs #1933 and #1984 to get a sense of the scope. But it shouldn't intimidate you.
If you want to give it a shot, go for it!
I actually think it's a little trickier than Ruby/Python, because primarily we want to support LuaJIT. That means we need something that's a mix of the native unwinder and the Python/Ruby unwinder, which we can switch to. @javierhonduco is already looking at how we could make switching work, but I think it's harder than it looks at first glance.
On a positive note though, there is some prior art suggesting that something similar to the Python/Ruby unwinders, used purely for reading the frames, should indeed work: https://github.com/yunwei37/nginx-lua-ebpf-toolkit
Let's wait for @javierhonduco's investigation result. Thus, I'm assigning it to him for now. @Namanl2001, we will keep you in the loop; we would love this to be handled by the community ❤️
@kakkoyun Has there been any progress on this ticket?
Hey @sichvoge, it's on our immediate roadmap now. That being said, adding Lua support will still take 1-2 months.
cc @Sylfrena
Awesome thanks Kemal!
The ideal goal is to have a Lua/Native mixed frames profiler. The apisix profiler claims to do this, but it's not clear how, since it relies on bpf_get_stackid, which only works with frame-pointer-enabled stacks. Some comments at the end of the developer guide lead me to believe it has issues. I suspect it reliably gets the Lua stack but not the native stack beneath it. Another drawback is that it relies on uprobes attached to lua_pcall/lua_resume to get the "current" Lua state.
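To illustrate the frame pointer limitation, here is a minimal sketch in Python that simulates a frame-pointer walk over a made-up stack. It is not how bpf_get_stackid is implemented; the memory layout, addresses, and the `walk_frame_pointers` helper are all hypothetical and only show why the walk stops as soon as a frame does not keep a valid frame pointer.

```python
from typing import Dict, List


def walk_frame_pointers(memory: Dict[int, int], rbp: int, rip: int,
                        max_depth: int = 64) -> List[int]:
    """Follow the saved-rbp chain on a simulated x86-64 stack.

    Assumed layout (hypothetical data): memory[rbp] holds the caller's rbp
    and memory[rbp + 8] holds the return address into the caller.
    """
    frames = [rip]
    for _ in range(max_depth):
        # If rbp does not point at readable stack memory, the chain is lost.
        if rbp not in memory or rbp + 8 not in memory:
            break
        frames.append(memory[rbp + 8])
        next_rbp = memory[rbp]
        # Stacks grow down, so a caller's frame must sit at a higher address.
        if next_rbp <= rbp:
            break
        rbp = next_rbp
    return frames


# A well-formed chain of three frames (all values are made up).
stack = {
    0x1000: 0x1020, 0x1008: 0xA1,
    0x1020: 0x1040, 0x1028: 0xB2,
    0x1040: 0x0,    0x1048: 0xC3,
}

full = walk_frame_pointers(stack, rbp=0x1000, rip=0xF0)
# Same stack, but the sampled code used rbp for something else.
lost = walk_frame_pointers(stack, rbp=0xDEAD, rip=0xF0)
```

In the second call only the sampled instruction pointer is recovered, which matches the suspicion above that the native frames beneath such code would be missing.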
The LuaJIT profiler avoids this problem (needing uprobes to find the current Lua state) by stashing the Lua global pointer.
Without the ability to walk the stack this is dubious. If we could walk stack frames and identify the C/JIT call boundary, it's probably relatively straightforward to pluck the Lua state pointer as the first arg. Since our unwind tables tell us which is a C frame and which is a JIT frame, this seems doable. The Lua global isn't stored anywhere like a thread local, so we can't do what works for Python et al. It's possible that we could do something container specific, i.e. peek into openresty's module code and try to find the Lua context pointers, but then we'd have to manage that for everything that uses LuaJIT.
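A rough sketch of the boundary-detection step only, in Python, assuming we already have a list of unwound program counters and a summary of address ranges tagged as C or JIT. The range table, the `classify` and `find_c_jit_boundary` helpers, and all addresses are hypothetical; recovering the first argument at the boundary frame is not shown.

```python
import bisect
from typing import List, Optional, Tuple

# Hypothetical summary of unwind-table knowledge: sorted, non-overlapping
# (start, end, kind) ranges where kind is "c" or "jit".
Range = Tuple[int, int, str]


def classify(ranges: List[Range], pc: int) -> str:
    """Return the kind of the range containing pc, or "unknown"."""
    starts = [r[0] for r in ranges]
    i = bisect.bisect_right(starts, pc) - 1
    if i >= 0 and ranges[i][0] <= pc < ranges[i][1]:
        return ranges[i][2]
    return "unknown"


def find_c_jit_boundary(ranges: List[Range], pcs: List[int]) -> Optional[int]:
    """Return the index of the first C frame that directly called a JIT frame.

    pcs is ordered innermost frame first. Returns None if no such
    transition exists in the sampled stack.
    """
    kinds = [classify(ranges, pc) for pc in pcs]
    for i in range(1, len(kinds)):
        if kinds[i - 1] == "jit" and kinds[i] == "c":
            return i
    return None


# Made-up address space: one C mapping and one JIT region.
ranges = [(0x1000, 0x2000, "c"), (0x7000, 0x8000, "jit")]

mixed_stack = [0x7010, 0x7200, 0x1500, 0x1100]  # jit, jit, c, c
native_only = [0x1500, 0x1100]                  # c, c

boundary = find_c_jit_boundary(ranges, mixed_stack)
no_boundary = find_c_jit_boundary(ranges, native_only)
```

The frame at the returned index would be the candidate for reading the Lua state pointer, under the assumption that the C function at the boundary takes it as its first argument.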
This needs to be analyzed but my guess is the uprobe overhead is much smaller than the average Lua program execution time in most contexts.
It was suggested on the Lua mailing list that we can walk Lua frames using the DWARF information Lua emits to handle the stack unwinding machinery needed for C++ exceptions. This doesn't work, however. Lua stack unwinding only works when the Lua frame is at a boundary/exit point (i.e. another function call); it doesn't work for unwinding at an arbitrary instruction. This was the best description of this issue I could find: https://news.ycombinator.com/item?id=37926172. This explains why perf and gdb can't unwind Lua stacks if the starting context is some random instruction in a LuaJIT'd frame. Even if you use the latest libunwind to unwind the Lua stack in process, it doesn't work and in fact will crash (actually, newer versions of libunwind don't crash but still can't walk through a LuaJIT frame).
So our choices seem to be:
I think the best course of action is to start with #1 and, in parallel, request/advocate/contribute #2. I don't know what the state of LuaJIT development is, but this issue is encouraging: https://github.com/LuaJIT/LuaJIT/issues/1092, especially the bit about debugging.