Kong / openresty-patches

Moved to https://github.com/Kong/kong-build-tools
Apache License 2.0
14 stars 6 forks source link

fix(luajit) allow regular pointers to be used as lightuserdata in ARM64 #47

Closed hishamhm closed 5 years ago

hishamhm commented 5 years ago

This patch allows for regular pointers to be used as lightuserdata in ARM64.

The source of the problem is that ARM64 has 48 bits of addressable memory in userspace (256 GiB), but LuaJIT only supports 47 bits for lightuserdata (128 GiB). This means that half of the memory is not addressable.

This is what the ARM64 user space memory layout looks like on Linux:

    / 0000000000000000 1111111111111111 1111111111111111 1111111111111111 |
    | 0000000000000000 1111111111111111 1111111111111111 1111111111111110 V stack
 64G| ...
    | 0000000000000000 1100000000000000 0000000000000000 0000000000000001
    \ 0000000000000000 1100000000000000 0000000000000000 0000000000000000
    / 0000000000000000 1011111111111111 1111111111111111 1111111111111111
    | 0000000000000000 1011111111111111 1111111111111111 1111111111111110
 64G| ...
    | 0000000000000000 1000000000000000 0000000000000000 0000000000000001
    \ 0000000000000000 1000000000000000 0000000000000000 0000000000000000
    / 0000000000000000 0111111111111111 1111111111111111 1111111111111111
    | 0000000000000000 0111111111111111 1111111111111111 1111111111111110
 64G| ...
    | 0000000000000000 0100000000000000 0000000000000000 0000000000000001
    \ 0000000000000000 0100000000000000 0000000000000000 0000000000000000
    / 0000000000000000 0011111111111111 1111111111111111 1111111111111111
    | 0000000000000000 0011111111111111 1111111111111111 1111111111111110
 64G| ...
    | 0000000000000000 0000000000000000 0000000000000000 0000000000000001 ^ heap
    \ 0000000000000000 0000000000000000 0000000000000000 0000000000000000 |

Notice that the stack grows downwards from the top (this is also the area where static variables are allocated), and the heap grows upwards from the bottom. By losing bit 48, that means that only the bottom 128 GiB are addressable, meaning that all pointers to the stack and to static variables are invalid, and cause a "bad light userdata pointer" error.

This patch modifies ARM64 support so that instead of losing access to the top 128 GiB, we lose access of the middle 128 GiB, allowing for heap and stack pointers to be valid, up to 64 GiB each. This greatly restores compatibility with third-party Lua modules, most of which do not expect over 64 GiB of addressable space but do expect to be able to store lightuserdata of static variables: it allows, for example, lua-cjson 2.1.0.6 to run unmodified with LuaJIT on ARM64 (because it stores lightuserdata pointing to static variables).

This patch does this safely: LuaJIT's original check for bad pointers is still in place.

How this patch works:

LuaJIT does a check and produces a "bad light userdata pointer" if any bit from bits 48 to 64 are set.

On input (lua_pushlightuserdata), this patch checks bits 47 and 48:

In other words, if bit 47 is set, flip bit 48.

On output (lua_touserdata), this patch checks bit 47, and if it is set, it sets bit 48 as well.

hishamhm commented 5 years ago

@lukego a review on this is welcome if you have any cycles :)

lukego commented 5 years ago

Heavy stuff! :-) I'm an "FFI guy" so I haven't deep dived on lightuserdata in the VM especially much but here are some thoughts - mostly questions - that come to mind:

Can we be confident that Linux won't use stack addresses in the excluded range and cause runtime errors? Is there a risk of non-deterministic runtime errors due to address-space layout randomization (ASLR) causing the stack to cross the boundary into unrepresentable values?

Have we updated all of the necessary parts of the code? Do we need to update the JIT compiler to apply the same bit-twiddling transformation when assembling inline loads/stores/conversions? Do we need to update the snapshot-restoring code for flushing lightuserdata values from machine registers onto the Lua stack?

Would it be simpler to extend the lightuserdata type to support the full 48-bit address space instead? Currently the VM has one type-tag value for lightuserdata but if we allocated a second then we should be able to represent twice as many values (e.g. separate type-tags depending on whether the high bit of the address is set.) If we wanted to do this then I believe there would be a couple of available type-tag values to choose from.

hishamhm commented 5 years ago

Can we be confident that Linux won't use stack addresses in the excluded range and cause runtime errors?

Only if the stack grew beyond 64GiB, but we get a stack overflow in LuaJIT way before that.

Is there a risk of non-deterministic runtime errors due to address-space layout randomization (ASLR) causing the stack to cross the boundary into unrepresentable values?

I thought of that and I checked it: I haven't found ARM64 specific info, but it looks like ASLR on Linux generally randomizes addresses within a relatively short range (e.g. 8MB) — it won't start your heap in the middle of the memory space, for example, because that would mean that you just lost access to half the memory below it. So this patch means we have effective 64GiB heap space give or take the ASLR variation, which is a few megs.

Do we need to update the JIT compiler to apply the same bit-twiddling transformation when assembling inline loads/stores/conversions? Do we need to update the snapshot-restoring code for flushing lightuserdata values from machine registers onto the Lua stack?

I don't think so, because as far as Lua is concerned lightuserdata is an opaque value, which only needs to respect equality. The JIT compiler just goes about its business passing the encoded value around.

There is one problematic situation that I can think of, though, which is casting a lightuserdata to a FFI C pointer: that would produce an incorrect pointer for stack pointers. I don't know how up-to-date the LuaJIT website is on this regard, but it mentions that this conversion isn't ever JIT-compiled so fixing this could be doable without a lot of pain.

This should be sufficient: (untested)

--- a/LuaJIT-2.1-20190507/src/lj_cconv.c    2019-05-02 17:58:14.000000000 -0300
+++ b/LuaJIT-2.1-20190507/src/lj_cconv.c    2019-07-22 13:29:19.007491869 -0300
@@ -612,6 +612,10 @@
       tmpptr = *(void **)tmpptr;
   } else if (tvislightud(o)) {
     tmpptr = lightudV(o);
+#if LJ_TARGET_ARM64 
+    if ((uintptr_t) tmpptr >> 46)
+      tmpptr = (void*) ((uintptr_t) (tmpptr) | (1UL << 47));
+#endif
   } else if (tvisfunc(o)) {
     void *p = lj_ccallback_new(cts, d, funcV(o));
     if (p) {

I also considered moving the decoding up to the definition of lightudV, so that both lj_api.c would lj_cconv.c use it: (also untested)

diff -u bundle/LuaJIT-2.1-20190507/src/lj_obj.h build/LuaJIT-2.1-20190507/src/lj_obj.h
--- bundle/LuaJIT-2.1-20190507/src/lj_obj.h 2019-05-02 17:58:14.000000000 -0300
+++ build/LuaJIT-2.1-20190507/src/lj_obj.h  2019-07-22 12:53:40.290579574 -0300
@@ -801,8 +801,16 @@
 #endif
 #define boolV(o)   check_exp(tvisbool(o), (LJ_TFALSE - itype(o)))
 #if LJ_64
+#if LJ_TARGET_ARM64
+#define lightudV(o) \
+  check_exp(tvislightud(o), \
+    ( ((uintptr_t) (o)->u64 & (1UL << 46))
+    ? (void *)(((o)->u64 & U64x(00007fff,ffffffff)) ^ (1UL << 47))
+    : (void *)( (o)->u64 & U64x(00007fff,ffffffff)) ) )
+#else
 #define lightudV(o) \
   check_exp(tvislightud(o), (void *)((o)->u64 & U64x(00007fff,ffffffff)))
+#endif
 #else
 #define lightudV(o)    check_exp(tvislightud(o), gcrefp((o)->gcr, void))
 #endif

However I'm not a fan of this solution because lightudV is also used in a couple of other places (including converting lua_CFunctions, where the masking doesn't seem necessary). I would rather keep the patch as explicit as possible and not have it rely on propagating its effects implicitly.

I think this FFI conversion is a bit of a corner case, since FFI libs tend to use FFI pointers, and Lua/C modules use lightuserdata, so I don't expect this cross-over to be common. (The fact that the OpenResty port only ran into this problem in modules that exercise the Lua/C API also indicates this to be the case.) Still, I would be up for adding the lj_cconv.c change above to the patch and giving it a spin if that's deemed necessary.

Would it be simpler to extend the lightuserdata type to support the full 48-bit address space instead?

Definitely not simpler, given it's a 4 year old conversation with no sign of a definitive solution in sight. :)

Currently the VM has one type-tag value for lightuserdata but if we allocated a second then we should be able to represent twice as many values (e.g. separate type-tags depending on whether the high bit of the address is set.) If we wanted to do this then I believe there would be a couple of available type-tag values to choose from.

That would give you one extra bit, but it would require way more extensive changes, including hitting the JIT-compiler side as well — which you know about a lot better than me and I think it wouldn't be fun to do at all. Especially since it is a lost cause: 52-bit user addresses are coming anyway so we are going to hit the problem again soon and adding a bit won't make it (though replicating the hack of this PR for 52-bit VA systems would be possible, if one can accept the 64G-stack/64G-heap limitation). My pragmatic stance is that either LuaJIT supports true 64-bit lightuserdata or we'll have to keep papering over it like this patch does. The fact that Mike closed the original issue doesn't give me much hope about the former.

Those were great questions! Thanks for the feedback!

hishamhm commented 5 years ago

So I went ahead and tested the lj_cconv.c change. Lightuserdata to FFI is working fine now too!

Given the following Lua/C module:

#include <lua.h>
#include <stdio.h>

int var = 123;

int mod_ptr(lua_State* L) {
   printf("sent ptr is %p\n", &var);
   lua_pushlightuserdata(L, &var);
   return 1;
}

int mod_getptr(lua_State* L) {
   void* ptr = lua_touserdata(L, 1);
   printf("received ptr is %p\n", ptr);
   printf("value is %d\n", *((int*) ptr));
   return 0;
}

int luaopen_mod(lua_State* L) {
   lua_newtable(L);
   lua_pushliteral(L, "ptr");
   lua_pushcfunction(L, mod_ptr);
   lua_settable(L, -3);
   lua_pushliteral(L, "getptr");
   lua_pushcfunction(L, mod_getptr);
   lua_settable(L, -3);
   return 1;
}

We can get a lightuserdata pointer to static memory, and pass it to an FFI function:

root@hisham-arm64-test:~/build# gcc -o mod.so --shared mod.c -I$(luarocks config variables.LUA_INCDIR)
root@hisham-arm64-test:~/build# $(luarocks config variables.LUA)
LuaJIT 2.1.0-beta3 -- Copyright (C) 2005-2017 Mike Pall. http://luajit.org/
JIT: ON fold cse dce fwd dse narrow loop abc sink fuse
> mod = require("mod")
> lu = mod.ptr()
sent ptr is 0xffff92aac050
> mod.getptr(lu)
received ptr is 0xffff92aac050
value is 123
> ffi.cdef[[   
int printf(const char *fmt, ...);
]]
> ffi.C.printf("yay, value is: %p\n", lu)
yay, value is: 0xffff92aac050
> 
thibaultcha commented 5 years ago

We also need to patch lua_cpcall which also accepts user-provided lightuserdata values. An updated version of this patch would look like so, and address both of lua_pushlightuserdata and lua_cpcall:

diff --git a/src/lj_obj.h b/src/lj_obj.h
index a89ea0d..bdd1603 100644
@@ -841,7 +847,12 @@ static LJ_AINLINE void setlightudV(TValue *o, void *p)
 #endif
 }

-#if LJ_64
+#if LJ_TARGET_ARM64
+#define checklightudptr(L, p) \
+  do { if ((uint64_t)p >> 46) p = (void *)((uint64_t)p | (1UL << 47)); \
+    (((uint64_t)(p) >> 47) ? (lj_err_msg(L, LJ_ERR_BADLU), NULL) : (p)); \
+  } while (0)
+#elif LJ_64
 #define checklightudptr(L, p) \
   (((uint64_t)(p) >> 47) ? (lj_err_msg(L, LJ_ERR_BADLU), NULL) : (p))
 #else

I am currently running test suites (LuaJIT's and OpenResty's) with this patch (and variations like above) on an ARM64 box.

hishamhm commented 5 years ago

@thibaultcha Great catch about lua_cpcall!

I see you're turning an expression into a statement there, which I believe is relying on a GCC extension to make this work as an argument to setlightudV. I would keep things standard and just add the check in both places, to avoid annoyances with future compiler versions.

dndx commented 5 years ago

Ah just realized @thibaultcha mentioned about this as well. But still I think lightudV is being used in lj_dispatch.c, lj_obj.c as well. Maybe patching lightudV directly makes the copy-n-paste less.

hishamhm commented 5 years ago

@dndx Yeah, I mentioned lightudV as an option above, but chose not to do it precisely because of the rippling effects. It looked okay at first glance, but the correctness is not as evident as the self-contained changes in the Lua/C API module. I chose to go the safe path over playing code-golf on the number of lines of the patch. :)

thibaultcha commented 5 years ago

I am about to propose an updated version of this patch (given it passes the test suites) which I believe fixes @dndx's concerns already and also addresses a few other edge-cases I have in mind, afaict.

dndx commented 5 years ago

Yeah, I agree with @thibaultcha here that as long as all the test cases pass it should be alright to change setlightudV and lightudV to statements. Since those functions are internal to LuaJIT and as long as the code compiles and test pass it should generate no externally visible effect. Especially if this helps us reduce the patch size.

hishamhm commented 5 years ago

All right, as long as the third-party modules keep working I'm happy! :) Looking forward for the revised patch.

thibaultcha commented 5 years ago

@hishamhm I was about to post it here actually - don't want to keep this PR open?

hutchic commented 5 years ago

Ran this patch on an arm box

root@ip-172-31-25-242:~# uname -a
Linux ip-172-31-25-242 4.15.0-1043-aws #45-Ubuntu SMP Mon Jun 24 14:08:49 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux
root@ip-172-31-25-242:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.2 LTS
Release:        18.04
Codename:       bionic

Without patch

Error:
bad light userdata pointer
stack traceback:
        [C]: in function 'match'
        /usr/local/share/lua/5.1/pgmoon/arrays.lua:92: in function 'fn'
        /usr/local/share/lua/5.1/pgmoon/init.lua:485: in function 'parse_data_row'
        /usr/local/share/lua/5.1/pgmoon/init.lua:371: in function 'format_query_result'
        /usr/local/share/lua/5.1/pgmoon/init.lua:313: in function 'query'
        .../share/lua/5.1/kong/db/strategies/postgres/connector.lua:487: in function 'query'
        .../share/lua/5.1/kong/db/strategies/postgres/connector.lua:689: in function 'schema_migrations'
        /usr/local/share/lua/5.1/kong/db/migrations/state.lua:213: in function 'load'
        /usr/local/share/lua/5.1/kong/db/init.lua:388: in function 'schema_state'
        /usr/local/share/lua/5.1/kong/cmd/migrations.lua:95: in function 'cmd_exec'
        /usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:87>
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:44>
        /usr/local/bin/kong:9: in function 'file_gen'
        init_worker_by_lua:48: in function <init_worker_by_lua:46>
        [C]: in function 'xpcall'
        init_worker_by_lua:55: in function <init_worker_by_lua:53>

With patch

kong migrations list --vv        
ERROR: bad light userdata pointer
stack traceback:
        [C]: in function 'require'
        /usr/local/share/lua/5.1/kong/db/dao/init.lua:1: in main chunk
        [C]: in function 'require'
        /usr/local/share/lua/5.1/kong/db/init.lua:1: in main chunk
        [C]: in function 'require'
        /usr/local/share/lua/5.1/kong/cmd/migrations.lua:1: in main chunk
        [C]: in function 'require'
        /usr/local/share/lua/5.1/kong/cmd/init.lua:54: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:44>
        /usr/local/bin/kong:9: in function 'file_gen'
        init_worker_by_lua:48: in function <init_worker_by_lua:46>
        [C]: in function 'xpcall'
        init_worker_by_lua:55: in function <init_worker_by_lua:53>
hishamhm commented 5 years ago

@hutchic Thanks for the test report. I just reproduced this and got the same results.

The reason is because OpenResty's fork of lua-cjson modifies the pointer value (precisely to avoid the issue that this patch solves), and ends up producing a pointer that is in that "middle section" of the address space (making it invalid to the patched LuaJIT's eyes).

So I tested reverting the lua-cjson workaround (making it work like the vanilla lua-cjson, which I was I had originally tested), and I was able to start Kong and run its testsuite.

The Kong testsuite runs but some tests fail: a few of them because they assume pairs order, which is undefined according to Lua, but the tests seem to be assuming the same random order we seem to always get in x86_64, which is not the same under ARM64. I haven't dug into all of the failing tests, but things like the balancer tests run fine, running Postgres (meaning the solves the error triggered by pgmoon in @hutchic's first traceback).

If the masking trick is done only by OpenResty's lua-cjson, I think we should add another patch reverting that as a companion to this one.

hutchic commented 5 years ago

open point of discussion: how can we test this?

hishamhm commented 5 years ago

If the masking trick is done only by OpenResty's lua-cjson, I think we should add another patch reverting that as a companion to this one.

I think the OpenResty devs @thibaultcha and @dndx have the answer to this one: was this masking trick performed anywhere else?

hishamhm commented 5 years ago

@hutchic I added the lua-cjson patch to this branch as well, so it should be possible to re-run the test you did that produced the backtrace above, and now it should work.

thibaultcha commented 5 years ago

Yes, it is done in the ngx_http_lua and ngx_stream_lua modules as well: https://github.com/openresty/lua-nginx-module/blob/d154e5a7c1cbb7810d0224862e577fc8a6b8e3ac/src/ngx_http_lua_common.h#L149

hishamhm commented 5 years ago

@thibaultcha thanks for the feedback! I also saw that luaossl 20190731 already applied the same masking trick.

Since the behavior incompatible with this patch has started to proliferate, I came up with an alternative solution. I'm pushing another revision.

carnei-ro commented 5 years ago

@hishamhm any updates? I'd like to run kong on my raspberry pi :heart:

thibaultcha commented 5 years ago

Superseded by https://github.com/Kong/openresty-patches/pull/49 - closing this PR.

hishamhm commented 5 years ago

@carnei-ro if you're using Raspbian on your Pi, that is still built as ARM32, so all Kong versions should build out of the box. For ARM64, Kong 1.3 packages are indeed on the way!