crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.32k stars 1.61k forks source link

LLVM 18 breaks 128-bit integers in the interpreter #14832

Open straight-shoota opened 1 month ago

straight-shoota commented 1 month ago

An interpreter built with a LLVM 18 compiler crashes with invalid memory access when using a 128-bit integer type.

The reproduction requires a compiler with LLVM 18 which builds a compiler with interpreter. Using a 128-bit integer value in that interpreter causes an invalid memory access:

$ crystal --version
Crystal 1.13.1 (2024-07-12)

LLVM: 18.1.8
Default target: x86_64-unknown-linux-gn
$ make crystal interpreter=1 release=1
Using /usr/bin/llvm-config-18 [version=18.1.3]
CRYSTAL_CONFIG_BUILD_COMMIT="" CRYSTAL_CONFIG_PATH='$ORIGIN/../share/crystal/src' SOURCE_DATE_EPOCH="1720742400" CC="cc -fuse-ld=lld" CRYSTAL
_CONFIG_LIBRARY_PATH='$ORIGIN/../lib/crystal' ./bin/crystal build -D strict_multi_assign -D preview_overload_order --release  -o .build/cryst
al src/compiler/crystal.cr -D without_openssl -D without_zlib -D use_pcre2
$ bin/crystal i
Using compiled compiler at .build/crystal
Crystal interpreter 1.13.1 (2024-07-12).
EXPERIMENTAL SOFTWARE: if you find a bug, please consider opening an issue in
https://github.com/crystal-lang/crystal/issues/new/
icr:1> 1_i128
 => Invalid memory access (signal 11) at address 0x0
[0x55c5918c6629] ?? +94307038815785 in /app/.build/crystal
[0x55c5918c65e0] ?? +94307038815712 in /app/.build/crystal
[0x7f83b0326320] ?? +140203573535520 in /lib/x86_64-linux-gnu/libc.so.6
[0x55c591f72740] ?? +94307045812032 in /app/.build/crystal
[0x55c591d8ca5b] ?? +94307043822171 in /app/.build/crystal
[0x55c591f981f6] ?? +94307045966326 in /app/.build/crystal
[0x55c59235c83d] ?? +94307049916477 in /app/.build/crystal
[0x55c5923591cc] ?? +94307049902540 in /app/.build/crystal
[0x55c5922b34e5] ?? +94307049223397 in /app/.build/crystal
[0x55c59186773f] __crystal_main +36271 in /app/.build/crystal
[0x55c59186ddc9] main +73 in /app/.build/crystal
[0x7f83b030b1ca] ?? +140203573424586 in /lib/x86_64-linux-gnu/libc.so.6
[0x7f83b030b28b] __libc_start_main +139 in /lib/x86_64-linux-gnu/libc.so.6
[0x55c59185e8c5] _start +37 in /app/.build/crystal
[0x0] ???

The bug does not reproduce when building the interpreter in debug mode, nor in non-release mode.

This is likely related to LLVM changing the alignment of 128-bit integer types to 128 bits in LLVM 18: https://reviews.llvm.org/D86310

The interpreter may expect the old alignment somewhere.

So far it has only materialized in the interpreter and it's quite likely that this is an interpreter-specific bug. But at this point we cannot certainly exclude any effect on other applications.

This issue appeared while trying to update the previous Crystal release to 1.13.1: https://github.com/crystal-lang/crystal/pull/14810

ysbaddaden commented 1 month ago

This method is suspicious; does it calculate the alignment on a 8 bytes boundary?

https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/interpreter/context.cr#L455-L462

straight-shoota commented 1 month ago

Yeah, looks like it. I've stumbled over that as well. Although for size 16 it returns 16, so it probably doesn't matter for this issue...

straight-shoota commented 1 month ago

Ah, no that indeed is a problem. align is solely based on size, not on the current stack pointer. It needs to take the current stack pointer into account. If the address is not aligned to 16 bytes, we won't be able to get a 16-byte alignment. The interpreter stack does not seem to anticipate an alignment larger than 8 bytes.

HertzDevil commented 1 month ago

IIRC the interpreter memcpys the stack contents rather than dereferencing the stack pointer directly, so if there is an aligned 128-bit load/store at a misaligned address, it is probably somewhere else

straight-shoota commented 1 month ago

stack_push and stack_pop use pointer dereferencing. Or do you mean something else?

crysbot commented 1 month ago

This issue has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/invalid-memory-access-signal-11-on-1-13-1-not-present-on-1-12-2/7051/1

straight-shoota commented 1 month ago

https://github.com/crystal-lang/crystal/pull/14843 seems to fix the memory access error. However, it does so by simply avoiding misaligned stack access errors. However, the misalignment still exists.

So it does feels a bit like a workaround. It does not fix the underlying issue that 128-bit integers are not properly aligned.

# interpreter with patch #14843 and LLVM 18
pad = 1_i8
x = 1_i128
pointerof(x).address % alignof(typeof(x)) # => 8

Note: This issue only depends on the LLVM version of the interpreter itself, not of the parent compiler. The interpreter's LLVM version determines the alignment value.

If the alignmed was correct, we would not need #14843 to make the interpreter work.

It's probably still a good idea to merge #14843, though. It removes a dependency on the alignment policy of the parent compiler. This seems like a good idea anyway.