Closed agentzh closed 5 years ago
We could strip off the 48-bit value for non-pointer lightuserdata values used solely for hash table keys (in the registry hash table, for example), as implemented in #1151, but we cannot do that for real C pointers like cycle
. Maybe the ultimate solution here is to avoid using lightuserdata for such C pointer values but use proper cdata objects which do not have this limitation in LuaJIT's GC64 mode.
Add to that, another place that is definitely problematic is
https://github.com/openresty/lua-nginx-module/blob/master/src/ngx_http_lua_util.h#L354
Now we can hack around all places with the mask and restores the original address when reading, but I fear that is gonna get ugly very soon and may break again once the memory layout changes (mostly due to new architecture). It seems to me there is no portable way of fixing this unless LuaJIT stops using memory addresses to piggyback information.
Background reading:
@dndx No, you cannot disable GC64 on ARM64 since GC64 is required on that architecture.
The only hole here seems like the lightuserdata thing. We can replace it with FFI cdata, at least in theory, I think.
@thehowl Is it possible to configure your Linux kernel to use the "memory layout with 64KB pages + 2 levels" as mentioned in LuaJIT/LuaJIT#49?
In the meantime, we'll try phasing out the use of lightuserdata for real C pointer data by migrating them to LuaJIT's FFI cdata objects to remove this limitation for good.
Sorry about not having answered to your requests yet, I'm actively working on trying to change the bloody kernel, but it seems to be a crusade-level challenge to do that. Built a debian package for the custom kernel I compiled, rebooted aaand it was still using the old kernel (I'm guessing this has to do with the way Scaleway servers boot up, and I'm looking into the matter). Hopefully with some more research I should be able to try out what you asked me to do, though it's quite the nightmare it seems.
Sorry for the double update, but I've finally tracked down the fact that I can't change the kernel. (sigh)
I'm using Scaleway, and as I found out, the only way to use a custom kernel is through kexec. And from what is said on this post ARMv8 servers (the ones I'm using) DO NOT have kexec and this has been requested here: https://github.com/scaleway/kernel-tools/issues/348 (no answer from the Scaleway/online.net team yet). So, I'm terribly sorry, but I really can't change the kernel to use 64kb pages + 2 levels :(
Checking in on this issue, because it's impacting an effort for multi-architecture OpenWhisk. I'm happy to do some lifting and build PR's if you can point me toward where the work needs to be done.
@jonpspri If all the lightuserdata pointers have a fixed offset, then removing and re-adding that offset for all lightuserdata values would work. I'm not familiar with that architecture though. Help needed here.
Yeah, it doesn't look quite that simple, but it is close. There are two use cases that I see:
1) Pointers being used as keys to the registry -- no big deal; mask them to 47 bits and they'll still work without collision because of how the linker manages them. I think I see a branch where that work has already been done, but if not it's a simple macro/inline function to make it happen.
2) Pointers to actual data structures (i.e. ___ngx_req). More problematic because they need to remain intact. The best likely path is to convert them to use userdata. That shouldn't require a pile of effort because the instances are fairly well contained, but I will move slowly.
I'm setting up an Nginx::Test environment on a VM right now so I can make the changes on an x86 environment and ensure they don't break core capabilities before verifying on an ARM64.
@jonpspri Using userdata for those pointer values will be much more expensive since they become GC objects now. A lightuserdata value can be fit into a TValue object directly without the need of GC management. I wonder if it's possible to re-patch the truncated lightuserdata value everytime we read from the Lua VM. This is still way cheaper than using GC objects.
@jonpspri After thinking on this a bit more, or maybe we can make the GC object persist in the global env table, this way we can avoid allocating and deallocating new GC objects upon every invocation of filter handler invocation.
Hi @agentzh, @jonpspri Curious to know if any progress made on this issue .
We would like to have this issue resolved and would be willing to help out to get it to a resolution faster Thanks.
@agentzh - it looks like the neovim folks tackled this exact problem successfully in their PR https://github.com/neovim/neovim/pull/7880 "Remove all places where lightuserdata is used". I am looking through Github to find other similar issues and hopefully solutions to what looks like a common problem.
Another instance of a successful fix is at https://github.com/PowerDNS/pdns/pull/6512 from PowerDNS.
@vielmetti I looked into the PRs you referenced and they look more like workarounds than real fixes. Please note that lua_pushstring
and lua_pushlstring
are WAY more expensive than lua_pushlightuserdata
since the former requires creating new GC object and also the relatively expensive lj_str_new()
call.
Another thing we need to find a solution to is to port the opitmized lj_str_new()
using Intel's fast SSE 4.2 instruction set's CRC instruction over to ARM64 equivalents otherwise the hash collisions on ARM64 can be really ugly:
Please note that lua_pushstring and lua_pushlstring are WAY more expensive than lua_pushlightuserdata since the former requires creating new GC object and also the relatively expensive lj_str_new() call.
The PowerDNS link has benchmark results, for what it's worth.
Pointers being used as keys to the registry -- no big deal; mask them to 47 bits and they'll still work without collision because of how the linker manages them.
Do you have pointers to documentation that makes this promise? As you can see from the PowerDNS results, the effective pain is just a few percent, but avoiding it would be even better.
@Habbie I am not sure if lua_pushstring
will solve the problem here. We have to be able to store pointer addresses to data structures like ngx_request_t
or ngx_cycle_t
and just being able to store their type is not good enough. Unless you meant to store the value of the address as strings, but that seems to be very hacky compared to the cdata approach.
@Habbie One common problem with those micro-benchmarks is that they may not be really testing things or cases that they are supposed to cover (like hash collisions are happening or irrelevant GC allocations or over overwhelming operations in other parts of the code shadowing the real overhead).
I'll open up the opportunity to get test cycles (or benchmark cycles) on a hosted arm64 machine at Packet. I run the Works on Arm project and we have a cluster (which @Habbie can speak to directly from having used it) for arm64 tests. This effort is funded by Arm and has resources both for short-term test and development and long-term CI. See http://github.com/worksonarm/cluster for details.
@vielmetti Thanks Ed, for helping to revisit this issue. @agentzh Thanks for the points you raise about the performance issues and workarounds. We discovered this whole issue with lightuserdata while building ONAP components for an initial demo, where OpenResty is leveraged. We solved our immediate problem by leveraging one of the workarounds employed by others, but certainly the longer term solution must still be found and optimized performance established for Arm platforms.
@Habbie I am not sure if lua_pushstring will solve the problem here. We have to be able to store pointer addresses to data structures like ngx_request_t or ngx_cycle_t and just being able to store their type is not good enough. Unless you meant to store the value of the address as strings, but that seems to be very hacky compared to the cdata approach.
Where addresses (the address of a type_info
) were used as indexes, we replaced those with (semi-human readable) strings (the mangled type names) - https://github.com/ahupowerdns/luawrapper/pull/46
Where addresses needed to be stored for later retrieval, we boxed them in a full userdata - https://github.com/ahupowerdns/luawrapper/pull/45
Where we were using addresses as indexes but really didn't need to, we replaced that with luaL_ref
- https://github.com/ahupowerdns/luawrapper/pull/44
We did not go for cdata
as we'd like our code to stay portable to non-LuaJIT. If real world experience turns out much worse than the micro-benchmark, we'll look into making the change above a compile time choice.
I'm setting up an Nginx::Test environment on a VM right now so I can make the changes on an x86 environment and ensure they don't break core capabilities before verifying on an ARM64.
Highly recommend this approach - I did the same for PowerDNS.
Another thing we need to find a solution to is to port the opitmized lj_str_new() using Intel's fast SSE 4.2 instruction set's CRC instruction over to ARM64 equivalents otherwise the hash collisions on ARM64 can be really ugly:
openresty/luajit2@7923c63
Thanks for pointing this out. I am trying to port this one to ARM64.
Noted in the above referenced issue that OpenWhisk for arm64 is blocked on this.
@agentzh Any updates?
We've finally prepared all the patches for all the OpenResty components. All our tests have successfully passed in a real arm64 box, including the valgrind memcheck tests. We're currently doing the final round of code review.
One good news is that the simplest Lua handler on OpenResty is also overall 10% faster with wrk benchmark on x86_64 (and we should see comparable results on all the other architectures we already supported). We generally didn't follow the suggestions from the LuaJIT community since they tend to make things slower rather than faster :)
@agentzh Hi, I have the same question, cound you tell me where to get the patches or the version?
nginx: [error] init_by_lua error: fail to require cjson.safe module,err:bad light userdata pointer stack traceback: [builtin#20]: at 0xffff88d18920 [C]: in function 'require' /usr/local/NSP/etc/router_tmp/router_config.lua:18: in main chunk
@luohuichang Just try this for now:
https://openresty.org/download/openresty-1.15.8.1rc0.tar.gz
See below for building instructions:
https://openresty.org/en/installation.html
This is already resolved in the git master branch and this rc0 version. Closing this.
@agentzh , Hi, I have the same question with @luohuichang , and I try the version openresty-1.15.8.1rc0.tar.gz. It's really work. But when I integrate it with Kong. we get the following error:
[root@kwephicprc03693 bin]# ./kong
2019/03/04 17:39:56 [warn] 26412#0: 2 [lua] _G write guard:12: __newindex(): writing a global lua variable ('_NETTLE_LIB_PATH') which may lead to race conditions between concurrent requests, so prefer the use of 'local' variables
stack traceback:
./kong:5: in function 'file_gen'
init_worker_by_lua:45: in function
Below is the script file content named kong:
[root@kwephicprc03693 bin]# cat kong
_NETTLE_LIB_PATH = "/opt/cloud/openresty/lualib/nettle-3.4/" _KEY_COMPONENTS_PATH = "/opt/cloud/openresty/nginx/conf/"
package.path = "./?.lua;./?/init.lua;" .. package.path package.cpath = "./?.so;./?/init.so;/opt/cloud/openresty/luajit/share/lua/5.1/?.so;/opt/cloud/openresty/luajit/share/lua/5.1/?/init.so;" .. package.cpath
the problem mentioned above occur on ARM platform.
@zllovezj The new version of OpenResty just exposes misuse of globals in Kong (or its plugins). You should contact the Kong team to address them or support you to address them. Not an issue in OpenResty itself.
@agentzh thanks
@agentzh Hi, I find the problem "bad light userdata pointer stack traceback" in the lib lpeg.so on ARM64. and I find the source file change you solved the problem on openresty like below.
@timusketeers You should contact the maintainers of that lpeg library instead. We are not maintaining that module. Maybe it actually uses the light userdata as real C pointers and dereferences them.
For Lua modules maintained by us, like lua-cjson, you can try it out by just using the latest OpenResty 1.15.8.1 RC1. And we'd like to hear about any issues with OpenResty's Lua C modules.
@agentzh I will contract the maintainers of lpeg, thanks
The original discussion started here:
https://github.com/openresty/lua-nginx-module/issues/757#issuecomment-328519741
But it really does not belong there.