Closed: xorrvin closed this issue 3 months ago
Update: I've compiled Erlang 26.2.5 from source and additionally built the debug VM. I changed the Elixir script so that it executes cerl -debug, and now it crashes with a different core file (still unintelligible, though):
coredump_debug.tgz
I tried running a shell in gdb so that I could execute mix compile from within the debugger, and got this (I've removed most of the repeated "JITed symbol file is not an object file, ignoring it." messages):
(gdb) r -c "mix compile"
Starting program: /bin/sh -c "mix compile"
process 18498 is executing new program: /usr/bin/env
process 18498 is executing new program: /bin/sh
[New process 18498]
process 18498 is executing new program: /bin/sh
[New process 18498]
process 18498 is executing new program: /root/otp_src_26.2.5/bin/x86_64-unknown-netbsd10.0/erlexec
process 18498 is executing new program: /root/otp_src_26.2.5/bin/x86_64-unknown-netbsd10.0/beam.debug.smp
[New LWP 13717 of process 18498]
[New LWP 22609 of process 18498]
JITed symbol file is not an object file, ignoring it.
JITed symbol file is not an object file, ignoring it.
JITed symbol file is not an object file, ignoring it.
[New LWP 8870 of process 18498]
JITed symbol file is not an object file, ignoring it.
JITed symbol file is not an object file, ignoring it.
[New LWP 5249 of process 18498]
[New LWP 11159 of process 18498]
[New LWP 7127 of process 18498]
[New LWP 944 of process 18498]
[New LWP 7076 of process 18498]
[New LWP 28065 of process 18498]
[New LWP 16383 of process 18498]
[New LWP 17406 of process 18498]
[New LWP 16425 of process 18498]
[New LWP 2267 of process 18498]
[New LWP 17568 of process 18498]
[New LWP 28090 of process 18498]
[New LWP 6614 of process 18498]
[New LWP 29903 of process 18498]
[New LWP 147 of process 18498]
[New LWP 26091 of process 18498]
JITed symbol file is not an object file, ignoring it.
JITed symbol file is not an object file, ignoring it.
gmake[1]: Entering directory '/home/builder/elixirtest/vix/c_src'
gmake[1]: Leaving directory '/home/builder/elixirtest/vix/c_src'
JITed symbol file is not an object file, ignoring it.
JITed symbol file is not an object file, ignoring it.
JITed symbol file is not an object file, ignoring it.
JITed symbol file is not an object file, ignoring it.
Compiling 28 files (.ex)
JITed symbol file is not an object file, ignoring it.
JITed symbol file is not an object file, ignoring it.
--Type <RET> for more, q to quit, c to continue without paging--
Thread 7 "" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 5249 of process 18498]
_rtld_call_ifunc (obj=0x7044910ed400, mask=mask@entry=0x7044d29baa90, cur_objgen=cur_objgen@entry=3)
at /usr/src/libexec/ld.elf_so/reloc.c:325
325 *where = target;
(gdb) bt
#0 _rtld_call_ifunc (obj=0x7044910ed400, mask=mask@entry=0x7044d29baa90, cur_objgen=cur_objgen@entry=3)
at /usr/src/libexec/ld.elf_so/reloc.c:325
#1 0x00007f7fb22065ad in _rtld_call_ifunc_functions (cur_objgen=3, obj=<optimized out>, mask=0x7044d29baa90)
at /usr/src/libexec/ld.elf_so/rtld.c:273
#2 _rtld_call_ifunc_functions (cur_objgen=3, obj=<optimized out>, mask=0x7044d29baa90)
at /usr/src/libexec/ld.elf_so/rtld.c:266
#3 _rtld_call_init_functions (mask=mask@entry=0x7044d29baa90) at /usr/src/libexec/ld.elf_so/rtld.c:297
#4 0x00007f7fb2207698 in dlopen (name=<optimized out>, mode=2) at /usr/src/libexec/ld.elf_so/rtld.c:1082
#5 0x00000000007949f4 in erts_sys_ddll_open_noext (
dlname=0x7044d414c1e8 "/home/builder/elixirtest/vix/_build/prod/lib/vix/priv/vix.so", handle=0x7044d29bac30,
err=0x7044d29babd0) at sys/unix/erl_unix_sys_ddll.c:131
#6 0x00000000007949a7 in erts_sys_ddll_open (
full_name=0x7044d414c188 "/home/builder/elixirtest/vix/_build/prod/lib/vix/priv/vix", handle=0x7044d29bac30,
err=0x7044d29babd0) at sys/unix/erl_unix_sys_ddll.c:116
#7 0x000000000071a0c0 in erts_load_nif (c_p=0x70448afe9bb0, I=0x7f7ff6d26214, filename=123439695930610, args=15)
at beam/erl_nif.c:4678
#8 0x000000000048db29 in beam_jit_load_nif (c_p=0x70448afe9bb0, I=0x7f7ff6d26214, reg=0x7044d29badc0)
at beam/jit/beam_jit_common.cpp:683
#9 0x00007f7ff5ef03d8 in ?? ()
#10 0x0000000000000000 in ?? ()
So it looks like it crashes when opening the compiled library. My current theory is that the library does another dlopen and something goes wrong there?
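One way to check whether the crash depends on the Erlang VM at all is to dlopen the same file from a throwaway process. Below is a minimal sketch in Python, assuming a Python interpreter is available on the machine; `try_dlopen` is a hypothetical helper name. The load happens in a child interpreter so that a SIGSEGV in the dynamic linker only kills the child. It uses RTLD_NOW, which I take to correspond to the mode=2 seen in frame #4. Note that a NIF library may simply fail to load outside the VM (its enif_* symbols are normally provided by the emulator), so pointing this at the libvips shared object itself may be more informative.

```python
import subprocess
import sys

# Code run in the child: dlopen the given path with RTLD_NOW.
CHILD = (
    "import ctypes, os, sys\n"
    "ctypes.CDLL(sys.argv[1], mode=os.RTLD_NOW)\n"
)


def try_dlopen(path):
    """Load `path` in a child interpreter and report how the child ended."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, path],
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        # Negative return code: the child was terminated by a signal.
        return f"killed by signal {-proc.returncode}"
    if proc.returncode != 0:
        # ctypes raised OSError (missing file, unresolved symbol, ...).
        return "load failed"
    return "loaded"


if __name__ == "__main__":
    # Path taken from the backtrace above; substitute any .so of interest.
    print(try_dlopen("/home/builder/elixirtest/vix/_build/prod/lib/vix/priv/vix.so"))
```

If the child is killed by a signal here too, the problem is reproducible without Erlang/OTP in the picture.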
Another update: I've isolated it to the libvips version. The Git version works okay, while the latest stable release segfaults the VM.
After further debugging, it turns out that the culprit is how the native code is linked: removing -Wl,-z,relro from the library's linker arguments solves the issue. It seems to be NetBSD-specific:
https://mail-index.netbsd.org/netbsd-bugs/2023/12/26/msg080904.html
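To tell whether a given build of a library was linked with -z relro, one can look for a PT_GNU_RELRO entry in its program headers. The following is a rough sketch, not a full ELF parser: `has_relro` is a hypothetical helper, and it only handles 64-bit little-endian ELF files (which is what x86_64 NetBSD produces).

```python
import struct

PT_GNU_RELRO = 0x6474E552  # program header type the linker emits for -z relro


def has_relro(path):
    """Return True if the ELF64 little-endian file has a PT_GNU_RELRO segment."""
    with open(path, "rb") as f:
        data = f.read()
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    for i in range(e_phnum):
        # p_type is the first 32-bit field of each program header entry.
        (p_type,) = struct.unpack_from("<I", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_RELRO:
            return True
    return False
```

Running this against the crashing and the working builds of the library should show whether the RELRO segment is the only relevant difference between them.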
Seems like this is not an issue with Erlang/OTP so I'm closing this issue.
Describe the bug
Trying to install vix (an Elixir wrapper for the libvips graphics library) crashes the VM with a segfault. This package uses native code, so there's a compile/make phase involved; however, the crash occurs after all native code is linked.
To Reproduce
This may be quite tedious, but in essence you'll need to spin up a NetBSD 10 VM and bootstrap pkgsrc. Erlang and Elixir are in the main tree, and libvips can be found at https://github.com/NetBSD/pkgsrc-wip/tree/master/libvips. I understand this is a lot to ask, so I can provide access to my VM on request or run some commands if needed.
Anyway, it goes like this.
This is needed to indicate that the target library is already provided by the system; otherwise the vix script would try to compile it and fail on NetBSD:
Cloning the latest version (doesn't really matter at this point):
Installing deps:
Compilation:
Trying again, to ensure it's not connected to the native code:
Expected behavior
No crash.
Affected versions
Erlang v26.2.5, Elixir v1.14.5, Elixir v1.16.2
Additional context
I thought that maybe it crashes due to some unresolved native dependency, but the .so looks alright:
I'm really puzzled about what goes wrong, because opening the coredump yields nothing spectacular. I'm attaching the coredump here for reference: coredump.tgz