Closed lwakefield closed 2 months ago
~Well... I found a "fix" but I'm now trying to work out why it works, because it feels very hacky...~
Edit - seems this doesn't consistently work, which is fine because I was definitely throwing spaghetti and seeing what stuck...
https://github.com/lwakefield/crystal-pg/commit/1012f8b4a91b1f0c7b941a3fca704ccf9b428bbe
```diff
 begin
-  yield @sized_io
+  r = yield @sized_io
+  r
```
This smells like a bug in LLVM (or the Crystal compiler).
I can reproduce the invalid memory access only on Crystal built with LLVM 18, not with LLVM 17.
Yes, I was coming to the conclusion that I was going to be out of my depth shortly... So that checks out!
@straight-shoota - do you have any tips to generate useful diagnostics? I'm planning on simply logging an issue over on https://github.com/crystal-lang/crystal with as much diagnostic information as possible (everything I posted here plus some versions of crystal/llvm/pg).
Sounds good. I don't think much information about the environment is necessary because the issue clearly reproduces with LLVM 18 but not with LLVM 17:
The error reproduces with the latest 1.13.1 compiler built with LLVM 18:
```
$ crystal --version
Crystal 1.13.1 (2024-07-12)
LLVM: 18.1.8
Default target: x86_64-unknown-linux-gnu
$ crystal build bool.cr --release
$ ./bool
Invalid memory access (signal 11) at address 0x0
[0x55f506c614e9] ?? +94510868993257 in ./memory-llvm18
[0x55f506c614a0] ?? +94510868993184 in ./memory-llvm18
[0x7fbfda745520] ?? +140461980538144 in /lib/x86_64-linux-gnu/libc.so.6
[0x55f506cc1f6e] ?? +94510869389166 in ./memory-llvm18
[0x55f506c2d6d2] __crystal_main +60258 in ./memory-llvm18
[0x55f506c34b5b] main +59 in ./memory-llvm18
[0x7fbfda72cd90] ?? +140461980437904 in /lib/x86_64-linux-gnu/libc.so.6
[0x7fbfda72ce40] __libc_start_main +128 in /lib/x86_64-linux-gnu/libc.so.6
[0x55f506c1eaa5] _start +37 in ./memory-llvm18
[0x0] ??
```
It does not reproduce with the latest 1.13.1 compiler built with LLVM 17:
```
$ crystal --version
Crystal 1.13.1 (2024-07-12)
LLVM: 17.0.6
Default target: x86_64-pc-linux-gnu
$ crystal build bool.cr --release
Using compiled compiler at /crystal/.build/crystal
$ ./bool
true
```
I created https://github.com/crystal-lang/crystal/issues/14898 to track this issue in the Crystal repo.
Thanks @straight-shoota for opening it there, and nice find @lwakefield! I'm going to close this one here since it seems like the new issue in crystal-lang/crystal is a better spot. But if anyone disagrees let me know and we can reopen this one.
Just wanted to say thanks to everyone in this discussion! :heart: Great job tracking it down to such a simple example @lwakefield -- this appears to have fixed our issue too: Invalid memory access (signal 11) on 1.13.1, not present on 1.12.2?
Don't know if this issue is related to https://github.com/crystal-lang/crystal-sqlite3/issues/96, but anyway, will try later.
If it reproduces with a Crystal compiler <= 1.13.1 that uses LLVM 18, but not when using LLVM < 18, it's probably the same issue. You may also check against Crystal nightly where the bug is fixed and it should work regardless of LLVM version.
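In case it helps anyone triaging, here is a rough sketch of that check as shell functions. It only parses the `crystal --version` output format shown earlier in this thread; the function names (`crystal_version`, `llvm_major`, `version_le`, `likely_same_bug`) are made up for illustration, and a match is just a hint that the compiler falls in the suspect combination, not proof that it is the same bug.

```shell
# Hypothetical helpers (not part of Crystal or crystal-pg).
# All of them read `crystal --version` output from stdin.

# Print the Crystal version, e.g. "1.13.1".
crystal_version() {
  sed -n 's/^Crystal \([0-9][0-9.]*\).*/\1/p'
}

# Print the LLVM major version, e.g. "18".
llvm_major() {
  sed -n 's/^LLVM: \([0-9][0-9]*\)\..*/\1/p'
}

# Succeed if dotted version $1 <= $2 (assumes plain x.y.z versions).
version_le() {
  first=$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
  [ "$first" = "$1" ]
}

# Succeed if the compiler is Crystal <= 1.13.1 built against LLVM 18,
# i.e. the combination described in the comment above.
likely_same_bug() {
  info=$(cat)
  crystal=$(printf '%s\n' "$info" | crystal_version)
  llvm=$(printf '%s\n' "$info" | llvm_major)
  [ -n "$crystal" ] && [ "$llvm" = "18" ] && version_le "$crystal" "1.13.1"
}
```

Usage would be something like `crystal --version | likely_same_bug && echo "suspect Crystal/LLVM combination"`. If that matches, rebuilding with an LLVM 17 compiler or with nightly is the real test.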
I can still reproduce my issue on Crystal 1.13.2, as shown in the following screenshot.
I've been chasing this error for a hot second, and have finally narrowed it down (reproducibly, at least for me) to "pg"!
The wrinkle is, the crash only shows up when built with the `--release` flag, which of course has more obtuse debugging output...
From what I can tell so far, this only occurs when querying and casting for `Bool`. Though TBD if I can recreate with other data types...
So with:
and
but...
Creating this issue in case it is obvious to others, others experience something similar, or others have pointers for me. But ideally I'm planning on chasing this down myself...