I also reported this issue to the lujait project (https://github.com/LuaJIT/LuaJIT/issues/1161), but got little useful response. So I'm also reporting it here, hoping for some more fruitful results.
Background: we ran into an issue running dnsdist on FreeBSD stable/14 (which is the branch that will grow into the next release).
It turned out that we were allocating an object using lua_newuserdata() that required 16-byte alignment, but the default luajit allocator uses 8 as alignment. The string copy constructor is inlined, and uses the fact that the data is supposed to be 16-byte aligned. As a consequence movaps instructions are emitted, that operate on data dat is no 16-byte aligned runtime.
See https://github.com/PowerDNS/pdns/issues/13766 for the back story.
This is what happened in detail:
The object has alignment requirement 16, because it contains a std::function object, which has alignment requirement 16, because the implementation of std::function has a std::__1::aligned_storage<24ul, 16ul>::type __buf_ field.
A placement new constructs the object in a non-16 byte aligned piece of memory using the string copy constructor.
The compiler is smart enough to conclude from the 16-byte alignment requirement of the struct that the string field (which is the the first field of it) is (or at least should be) also 16-byte aligned.
The inlined string copy constructor uses this knowledge and uses the fast movaps instructions that require 16-byte alignment.
The data isn't actually 16-byte aligned -> SIGBUS.
It is no surprise that alignof(max_align_t) on this platform is 16. There are more platforms where that is the case. So far we escaped having problem on those, since only version 17 of c++lib has an inlined string copy constructor. An out-of-line string copy copy constructor does not have the ability to take advantage of alignment requirements other than that of string itself. Our expectation is that with more systems starting to use c++lib 17 or newer, this issue will pop up in more places. But in general compilers have gotten smarter in using the alignment knowledge of the objects they manipulate. Especially on arm64, which has more strict alignment requirements than amd64, this might cause extra problems.
A fundamental solution would be to to change the default alignment of the luajit allocator to agree to what the C/C++ compiler and runtime library assume, i.e. make sure that lua_newuserdata() returns allocations aligned to alignof(max_align_t). This may waste some memory but that unavoidable if you want no alignment related crashes.
I am aware that luajit offers facilities to either use a custom allocator or use the system provided one. But it is important that luajit does the right thing out of the box. Most package maintainers and developers using luajit won't be aware of the subtle alignment issues that can arise when using the default setup as it is now.
Hello,
I also reported this issue to the lujait project (https://github.com/LuaJIT/LuaJIT/issues/1161), but got little useful response. So I'm also reporting it here, hoping for some more fruitful results.
Background: we ran into an issue running
dnsdist
on FreeBSD stable/14 (which is the branch that will grow into the next release). It turned out that we were allocating an object usinglua_newuserdata()
that required 16-byte alignment, but the default luajit allocator uses 8 as alignment. The string copy constructor is inlined, and uses the fact that the data is supposed to be 16-byte aligned. As a consequencemovaps
instructions are emitted, that operate on data dat is no 16-byte aligned runtime. See https://github.com/PowerDNS/pdns/issues/13766 for the back story.This is what happened in detail:
std::function
object, which has alignment requirement 16, because the implementation ofstd::function
has astd::__1::aligned_storage<24ul, 16ul>::type __buf_
field.string
field (which is the the first field of it) is (or at least should be) also 16-byte aligned.string
copy constructor uses this knowledge and uses the fastmovaps
instructions that require 16-byte alignment.It is no surprise that
alignof(max_align_t)
on this platform is 16. There are more platforms where that is the case. So far we escaped having problem on those, since only version 17 of c++lib has an inlined string copy constructor. An out-of-line string copy copy constructor does not have the ability to take advantage of alignment requirements other than that ofstring
itself. Our expectation is that with more systems starting to use c++lib 17 or newer, this issue will pop up in more places. But in general compilers have gotten smarter in using the alignment knowledge of the objects they manipulate. Especially on arm64, which has more strict alignment requirements than amd64, this might cause extra problems.A fundamental solution would be to to change the default alignment of the luajit allocator to agree to what the C/C++ compiler and runtime library assume, i.e. make sure that
lua_newuserdata()
returns allocations aligned toalignof(max_align_t)
. This may waste some memory but that unavoidable if you want no alignment related crashes.I am aware that luajit offers facilities to either use a custom allocator or use the system provided one. But it is important that luajit does the right thing out of the box. Most package maintainers and developers using luajit won't be aware of the subtle alignment issues that can arise when using the default setup as it is now.