omniosorg / omnios-extra

Packages for OmniOS extra
https://omnios.org

neovim crashes with "E41: Out of memory!" #1539

Open jclulow opened 1 week ago

jclulow commented 1 week ago

@pfmooney and I have seen random crashes from Neovim 0.10 lately, and I have finally had some cycles to dig in and debug it. The crux of the issue is that the embedded LuaJIT library is doing some relatively daft things with memory mappings: depending on how unlucky you get with random numbers, one of its mappings lands in the path of the process heap, and once the heap has grown up to it, malloc() can no longer allocate more memory.

When this happens, the process exits unceremoniously with somewhat corrupted output:

E41: Out of memory!
                   Vim: Finished.
$

I used DTrace to get a sense of what was causing the allocation failures. They're all failures from malloc(), and if you look at a regular nvim process that has not yet failed, it's not hard to see why:

$ pmap -x $(pgrep -f /bin/nvim.--embed)
27112:  /opt/ooce/neovim/bin/nvim --embed
         Address     Kbytes        RSS       Anon     Locked Mode   Mapped File
0000000000400000       5396       4560          -          - r-x--  nvim
0000000000954000         92         92         64          - rw---  nvim
000000000096B000        104         92         92          - rw---  nvim
0000000001204000       7152       7144       7144          - rw---    [ heap ]
000000000B3C0000         64         56          -          - r-x--    [ anon ]
000000000B3D0000         64         64          -          - r-x--    [ anon ]
00007396754A3000       2048       1964       1964          - rw---    [ anon ]
FFFFFC7FECA12000        324        288          -          - r-x--  ld.so.1
FFFFFC7FECA73000         12         12         12          - rwx--  ld.so.1
FFFFFC7FECA76000          8          8          8          - rwx--  ld.so.1
FFFFFC7FECE90000          4          4          4          - rwx--    [ anon ]
...

Critically, in this particular instance, the heap currently extends up to 0x1900000 (it starts at 0x1204000 and pmap reports it as 7152KB long), and the MAP_ANON mappings that LuaJIT has created are as low as 0xB3C0000, which is only about 155MB away! The initial address that's chosen for the mapping is made up by a random number generator in LuaJIT, and constrained to be within about 2GB of some of the language runtime functions; these are compiled into the binary, so they're down under the heap in the address space. Further compounding the issue, LuaJIT appears to really want to grow its allocation downward, setting it on a collision course with the heap.
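
To make the arithmetic concrete, here is a small sketch that uses the addresses from the pmap listing above. The helper names are made up for illustration, and the signed 32-bit reach check is an assumption about why the mapping has to stay within about 2GB of the runtime functions; it is not lifted from LuaJIT's source.

```c
#include <stdint.h>

/* Values copied from the pmap -x listing above. */
#define HEAP_BASE       UINT64_C(0x0000000001204000)    /* start of [ heap ] */
#define HEAP_KBYTES     UINT64_C(7152)                  /* Kbytes column */
#define LOWEST_ANON     UINT64_C(0x000000000B3C0000)    /* lowest low [ anon ] */

/* Current end of the heap: base plus size (pmap -x reports kilobytes). */
uint64_t
heap_end(void)
{
        return (HEAP_BASE + HEAP_KBYTES * 1024);
}

/* Bytes the heap can still grow before it runs into the mapping. */
uint64_t
heap_headroom(void)
{
        return (LOWEST_ANON - heap_end());
}

/*
 * Can a signed 32-bit displacement reach "to" from "from"?  Assumed to be
 * the reason for the "within about 2GB" constraint.
 */
int
within_rel32(uint64_t from, uint64_t to)
{
        int64_t delta = (int64_t)(to - from);

        return (delta >= INT32_MIN && delta <= INT32_MAX);
}
```

With these numbers the heap ends at 0x1900000 and has a little under 155MB of headroom before it reaches 0xB3C0000 (about 161MB when measured from the start of the heap). The low [ anon ] mappings are within rel32 reach of the nvim text at 0x400000, while the mapping at 0x7396754A3000 is not.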

This doesn't happen as much on, say, Ubuntu, because they are able to build and ship LuaJIT as a shared library. This library ends up loaded way up high in the address space, very far from the heap, and so mappings in that 2GB neighbourhood are less of an issue. We should try to do the same thing, but I believe some investigation may be required to make that work correctly.

In the meantime, it seems possible to work around the issue by preloading libumem and setting UMEM_OPTIONS to backend=mmap in the environment. This causes malloc() to be provided by umem, and to avoid the classic sbrk()-based heap entirely. We can provide a default value for UMEM_OPTIONS by adding a C function with the appropriate name to the resultant binary; e.g.,

const char *
_umem_options_init(void)
{
        return ("backend=mmap");
}
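
One rough way to check whether the workaround has taken effect is to see if allocations still come from the sbrk() heap. The following is only a heuristic sketch: in_brk_heap() and data_anchor are hypothetical names, and it assumes the usual layout in which the program's static data sits just below the sbrk()-grown heap.

```c
#define _DEFAULT_SOURCE         /* for sbrk() on glibc; harmless elsewhere */
#include <stdint.h>
#include <unistd.h>

static char data_anchor;        /* lives in the binary's data, below the heap */

/*
 * Heuristic: does p lie between the program's static data and the current
 * break, i.e. inside the sbrk()-grown heap?
 */
int
in_brk_heap(const void *p)
{
        uintptr_t a = (uintptr_t)p;

        return (a > (uintptr_t)&data_anchor && a < (uintptr_t)sbrk(0));
}
```

If malloc() is backed by mmap, in_brk_heap(malloc(64)) would be expected to return 0. With a classic sbrk()-based allocator, small allocations will typically return 1.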

Note that this is not a fix, and we should still also investigate building LuaJIT as a shared library, to at least be on the same footing as other common platforms that ship Neovim and LuaJIT.

jclulow commented 1 week ago

Related notes on the issue with building it as a shared library:

 $ PATH=/usr/gnu/bin:$PATH gmake -j8
==== Building LuaJIT 2.1 ====
gmake -C src
gmake[1]: Entering directory '/ws/safari/luajit/src'
DYNLINK   libluajit.so
Text relocation remains                         referenced
    against symbol                  offset      in file
lj_tab_len                          0x63c       lj_vm_dyn.o
lj_meta_cat                         0xa71       lj_vm_dyn.o
lj_func_closeuv                     0xd15       lj_vm_dyn.o
lj_gc_barrieruv                     0xc20       lj_vm_dyn.o
lj_gc_barrieruv                     0xc88       lj_vm_dyn.o
lj_tab_newkey                       0x120d      lj_vm_dyn.o
lj_tab_dup                          0xe28       lj_vm_dyn.o
lj_gc_step_fixtop                   0xdf5       lj_vm_dyn.o
lj_gc_step_fixtop                   0xe5c       lj_vm_dyn.o
lj_tab_new                          0xdba       lj_vm_dyn.o
lj_func_newL_gc                     0xd55       lj_vm_dyn.o
lj_state_growstack                  0x16c8      lj_vm_dyn.o
lj_state_growstack                  0x1e31      lj_vm_dyn.o
lj_state_growstack                  0x1eed      lj_vm_dyn.o
lj_state_growstack                  0x2bd4      lj_vm_dyn.o
lj_state_growstack                  0x2d17      lj_vm_dyn.o
lj_state_growstack                  0x39b9      lj_vm_dyn.o
lj_tab_resize                       0x133a      lj_vm_dyn.o
lj_err_throw                        0x1e6b      lj_vm_dyn.o
lj_tab_setinth                      0x2314      lj_vm_dyn.o
lj_meta_tset                        0x22a1      lj_vm_dyn.o
lj_tab_getinth                      0x2211      lj_vm_dyn.o
lj_tab_getinth                      0x294e      lj_vm_dyn.o
lj_meta_tget                        0x21b8      lj_vm_dyn.o
lj_meta_istype                      0x2405      lj_vm_dyn.o
lj_meta_equal_cd                    0x23e6      lj_vm_dyn.o
lj_meta_equal                       0x23c7      lj_vm_dyn.o
lj_meta_comp                        0x234a      lj_vm_dyn.o
lj_meta_for                         0x2525      lj_vm_dyn.o
lj_meta_call                        0x24ca      lj_vm_dyn.o
lj_meta_len                         0x249d      lj_vm_dyn.o
lj_meta_arith                       0x245a      lj_vm_dyn.o
lj_tab_next                         0x2831      lj_vm_dyn.o
lj_strfmt_num                       0x27e3      lj_vm_dyn.o
lj_tab_get                          0x273a      lj_vm_dyn.o
lj_ffh_coroutine_wrap_err           0x2d04      lj_vm_dyn.o
lj_buf_putstr_lower                 0x34d1      lj_vm_dyn.o
lj_buf_tostr                        0x346d      lj_vm_dyn.o
lj_buf_tostr                        0x34d9      lj_vm_dyn.o
lj_buf_tostr                        0x3545      lj_vm_dyn.o
lj_buf_putstr_reverse               0x3465      lj_vm_dyn.o
lj_str_new                          0x3335      lj_vm_dyn.o
lj_buf_putstr_upper                 0x353d      lj_vm_dyn.o
lj_dispatch_call                    0x3aec      lj_vm_dyn.o
lj_trace_hot                        0x3abd      lj_vm_dyn.o
lj_dispatch_ins                     0x3a58      lj_vm_dyn.o
lj_gc_step                          0x39e5      lj_vm_dyn.o
lj_err_trace                        0x3d99      lj_vm_dyn.o
lj_trace_exit                       0x3cbb      lj_vm_dyn.o
lj_dispatch_profile                 0x3bd0      lj_vm_dyn.o
lj_dispatch_stitch                  0x3ba6      lj_vm_dyn.o
lj_err_unwind_dwarf                 0x12        lj_vm_dyn.o
lj_ccallback_leave                  0x4080      lj_vm_dyn.o
lj_ccallback_enter                  0x401f      lj_vm_dyn.o
ld: fatal: relocations remain against allocatable but non-writable sections
collect2: error: ld returned 1 exit status
gmake[1]: *** [Makefile:737: libluajit.so] Error 1
gmake[1]: Leaving directory '/ws/safari/luajit/src'
gmake: *** [Makefile:126: default] Error 2

These relocations are ostensibly all of type R_AMD64_PC32:

 $ elfdump -r src/lj_vm_dyn.o

Relocation Section:  .rela.text
    type                               offset             addend  section        symbol
  R_AMD64_PC32                          0x63c 0xfffffffffffffffc  .rela.text     lj_tab_len
  R_AMD64_PLT32                         0xa2e 0xfffffffffffffffc  .rela.text     pow
  R_AMD64_PC32                          0xa71 0xfffffffffffffffc  .rela.text     lj_meta_cat
  R_AMD64_PC32                          0xc20 0xfffffffffffffffc  .rela.text     lj_gc_barrieruv
  R_AMD64_PC32                          0xc88 0xfffffffffffffffc  .rela.text     lj_gc_barrieruv
  R_AMD64_PC32                          0xd15 0xfffffffffffffffc  .rela.text     lj_func_closeuv
  R_AMD64_PC32                          0xd55 0xfffffffffffffffc  .rela.text     lj_func_newL_gc
  R_AMD64_PC32                          0xdba 0xfffffffffffffffc  .rela.text     lj_tab_new
  R_AMD64_PC32                          0xdf5 0xfffffffffffffffc  .rela.text     lj_gc_step_fixtop
  R_AMD64_PC32                          0xe28 0xfffffffffffffffc  .rela.text     lj_tab_dup
  R_AMD64_PC32                          0xe5c 0xfffffffffffffffc  .rela.text     lj_gc_step_fixtop
  R_AMD64_PC32                         0x120d 0xfffffffffffffffc  .rela.text     lj_tab_newkey
  R_AMD64_PC32                         0x133a 0xfffffffffffffffc  .rela.text     lj_tab_resize
  R_AMD64_PC32                         0x16c8 0xfffffffffffffffc  .rela.text     lj_state_growstack
  R_AMD64_PC32                         0x1e31 0xfffffffffffffffc  .rela.text     lj_state_growstack
  R_AMD64_PC32                         0x1eed 0xfffffffffffffffc  .rela.text     lj_state_growstack
  R_AMD64_PC32                         0x21b8 0xfffffffffffffffc  .rela.text     lj_meta_tget
  R_AMD64_PC32                         0x2211 0xfffffffffffffffc  .rela.text     lj_tab_getinth
  R_AMD64_PC32                         0x22a1 0xfffffffffffffffc  .rela.text     lj_meta_tset
  R_AMD64_PC32                         0x2314 0xfffffffffffffffc  .rela.text     lj_tab_setinth
  R_AMD64_PC32                         0x234a 0xfffffffffffffffc  .rela.text     lj_meta_comp
  R_AMD64_PC32                         0x23c7 0xfffffffffffffffc  .rela.text     lj_meta_equal
  R_AMD64_PC32                         0x23e6 0xfffffffffffffffc  .rela.text     lj_meta_equal_cd
  R_AMD64_PC32                         0x2405 0xfffffffffffffffc  .rela.text     lj_meta_istype
  R_AMD64_PC32                         0x245a 0xfffffffffffffffc  .rela.text     lj_meta_arith
  R_AMD64_PC32                         0x249d 0xfffffffffffffffc  .rela.text     lj_meta_len
  R_AMD64_PC32                         0x24ca 0xfffffffffffffffc  .rela.text     lj_meta_call
  R_AMD64_PC32                         0x2525 0xfffffffffffffffc  .rela.text     lj_meta_for
  R_AMD64_PC32                         0x273a 0xfffffffffffffffc  .rela.text     lj_tab_get
  R_AMD64_PC32                         0x27e3 0xfffffffffffffffc  .rela.text     lj_strfmt_num
  R_AMD64_PC32                         0x2831 0xfffffffffffffffc  .rela.text     lj_tab_next
  R_AMD64_PC32                         0x294e 0xfffffffffffffffc  .rela.text     lj_tab_getinth
  R_AMD64_PC32                         0x2bd4 0xfffffffffffffffc  .rela.text     lj_state_growstack
  R_AMD64_PC32                         0x2d04 0xfffffffffffffffc  .rela.text     lj_ffh_coroutine_wrap_err
  R_AMD64_PC32                         0x2d17 0xfffffffffffffffc  .rela.text     lj_state_growstack
  R_AMD64_PLT32                        0x2e55 0xfffffffffffffffc  .rela.text     log
  R_AMD64_PLT32                        0x2e83 0xfffffffffffffffc  .rela.text     log10
  R_AMD64_PLT32                        0x2eb1 0xfffffffffffffffc  .rela.text     exp
  R_AMD64_PLT32                        0x2edf 0xfffffffffffffffc  .rela.text     sin
  R_AMD64_PLT32                        0x2f0d 0xfffffffffffffffc  .rela.text     cos
  R_AMD64_PLT32                        0x2f3b 0xfffffffffffffffc  .rela.text     tan
  R_AMD64_PLT32                        0x2f69 0xfffffffffffffffc  .rela.text     asin
  R_AMD64_PLT32                        0x2f97 0xfffffffffffffffc  .rela.text     acos
  R_AMD64_PLT32                        0x2fc5 0xfffffffffffffffc  .rela.text     atan
  R_AMD64_PLT32                        0x2ff3 0xfffffffffffffffc  .rela.text     sinh
  R_AMD64_PLT32                        0x3021 0xfffffffffffffffc  .rela.text     cosh
  R_AMD64_PLT32                        0x304f 0xfffffffffffffffc  .rela.text     tanh
  R_AMD64_PLT32                        0x3094 0xfffffffffffffffc  .rela.text     pow
  R_AMD64_PLT32                        0x30d9 0xfffffffffffffffc  .rela.text     atan2
  R_AMD64_PLT32                        0x311e 0xfffffffffffffffc  .rela.text     fmod
  R_AMD64_PLT32                        0x3191 0xfffffffffffffffc  .rela.text     frexp
  R_AMD64_PLT32                        0x31dd 0xfffffffffffffffc  .rela.text     modf
  R_AMD64_PC32                         0x3335 0xfffffffffffffffc  .rela.text     lj_str_new
  R_AMD64_PC32                         0x3465 0xfffffffffffffffc  .rela.text     lj_buf_putstr_reverse
  R_AMD64_PC32                         0x346d 0xfffffffffffffffc  .rela.text     lj_buf_tostr
  R_AMD64_PC32                         0x34d1 0xfffffffffffffffc  .rela.text     lj_buf_putstr_lower
  R_AMD64_PC32                         0x34d9 0xfffffffffffffffc  .rela.text     lj_buf_tostr
  R_AMD64_PC32                         0x353d 0xfffffffffffffffc  .rela.text     lj_buf_putstr_upper
  R_AMD64_PC32                         0x3545 0xfffffffffffffffc  .rela.text     lj_buf_tostr
  R_AMD64_PC32                         0x39b9 0xfffffffffffffffc  .rela.text     lj_state_growstack
  R_AMD64_PC32                         0x39e5 0xfffffffffffffffc  .rela.text     lj_gc_step
  R_AMD64_PC32                         0x3a58 0xfffffffffffffffc  .rela.text     lj_dispatch_ins
  R_AMD64_PC32                         0x3abd 0xfffffffffffffffc  .rela.text     lj_trace_hot
  R_AMD64_PC32                         0x3aec 0xfffffffffffffffc  .rela.text     lj_dispatch_call
  R_AMD64_PC32                         0x3ba6 0xfffffffffffffffc  .rela.text     lj_dispatch_stitch
  R_AMD64_PC32                         0x3bd0 0xfffffffffffffffc  .rela.text     lj_dispatch_profile
  R_AMD64_PC32                         0x3cbb 0xfffffffffffffffc  .rela.text     lj_trace_exit
  R_AMD64_PC32                         0x3d99 0xfffffffffffffffc  .rela.text     lj_err_trace
  R_AMD64_PC32                         0x401f 0xfffffffffffffffc  .rela.text     lj_ccallback_enter
  R_AMD64_PC32                         0x4080 0xfffffffffffffffc  .rela.text     lj_ccallback_leave
  R_AMD64_PC32                         0x1e6b 0xfffffffffffffffc  .rela.text     lj_err_throw

Relocation Section:  .rela.debug_frame
    type                               offset             addend  section        symbol
  R_AMD64_32                             0x1c                  0  .rela.debug_fr .debug_frame (section)
  R_AMD64_64                             0x20                  0  .rela.debug_fr .text (section)
  R_AMD64_32                             0x44                  0  .rela.debug_fr .debug_frame (section)
  R_AMD64_64                             0x48                  0  .rela.debug_fr lj_vm_ffi_call

Relocation Section:  .rela.eh_frame
    type                               offset             addend  section        symbol
  R_AMD64_PC32                           0x12                  0  .rela.eh_frame lj_err_unwind_dwarf
  R_AMD64_PC32                           0x28                  0  .rela.eh_frame .text (section)
  R_AMD64_PC32                           0x60                  0  .rela.eh_frame lj_vm_ffi_call
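
For reference, an R_AMD64_PC32 relocation stores S + A - P as a signed 32-bit value, where S is the symbol's address, A is the addend, and P is the address of the field being patched. The addend 0xfffffffffffffffc above is -4, which accounts for the displacement being measured from the end of the 4-byte field. A minimal sketch of that calculation follows; pc32_value() is a made-up name, not a linker interface.

```c
#include <stdint.h>

/*
 * Value stored by an R_AMD64_PC32 relocation: S + A - P, which must fit
 * in a signed 32-bit field.  Returns 0 and fills *out on success, or -1
 * if the target is out of range.
 */
int
pc32_value(uint64_t S, int64_t A, uint64_t P, int32_t *out)
{
        int64_t v = (int64_t)(S - P) + A;

        if (v < INT32_MIN || v > INT32_MAX)
                return (-1);
        *out = (int32_t)v;
        return (0);
}
```

As a purely hypothetical example, if a symbol ended up at 0x5000 in the same image, the field at offset 0x63c would receive 0x5000 - 0x63c - 4 = 0x49c0. Because the field lives in .text, any such relocation that is still unresolved when the shared object is linked would have to be applied by writing into a non-writable section at load time, which is what the ld error is complaining about.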

Of note, src/lj_vm_dyn.o appears to be built with a very special assembler, hand-crafted for this project! For example, the routine lj_vmeta_equal(), which contains this relocation:

lj_meta_equal                       0x23c7      lj_vm_dyn.o

  ....

lj_vmeta_equal()
    0x23a1: 48 c1 e0 11        shlq   $0x11,%rax
    0x23a5: 48 c1 e8 11        shrq   $0x11,%rax
    0x23a9: 48 83 eb 04        subq   $0x4,%rbx
    0x23ad: 48 89 ce           movq   %rcx,%rsi
    0x23b0: 89 e9              movl   %ebp,%ecx
    0x23b2: 48 8b 6c 24 10     movq   0x10(%rsp),%rbp
    0x23b7: 48 89 55 20        movq   %rdx,0x20(%rbp)
    0x23bb: 48 89 c2           movq   %rax,%rdx
    0x23be: 48 89 ef           movq   %rbp,%rdi
    0x23c1: 48 89 5c 24 18     movq   %rbx,0x18(%rsp)
    0x23c6: e8 00 00 00 00     call   +0x0      <0x23cb>    <----- relocation here
    0x23cb: eb 81              jmp    -0x7f     <0x234e>

comes from src/vm_x64.dasc:

  |->vmeta_equal:
  |  cleartp TAB:RD
  |  sub PC, 4
  |.if X64WIN
  |  mov CARG3, RD
  |  mov CARG4d, RBd
  |  mov L:RB, SAVE_L
  |  mov L:RB->base, BASE       // Caveat: CARG2 == BASE.
  |  mov CARG2, RA
  |  mov CARG1, L:RB            // Caveat: CARG1 == RA.
  |.else
  |  mov CARG2, RA
  |  mov CARG4d, RBd            // Caveat: CARG4 == RA.
  |  mov L:RB, SAVE_L
  |  mov L:RB->base, BASE       // Caveat: CARG3 == BASE.
  |  mov CARG3, RD
  |  mov CARG1, L:RB
  |.endif
  |  mov SAVE_PC, PC
  |  call extern lj_meta_equal  // (lua_State *L, GCobj *o1, *o2, int ne)
  |  // 0/1 or TValue * (metamethod) returned in eax (RC).
  |  jmp <3

The special assembler is invoked thus:

BUILDVM   lj_vm.S
host/buildvm -m elfasm -o lj_vm.S
ASM       lj_vm.o
gcc -fPIC -O2 -fomit-frame-pointer -Wall   -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -U_FORTIFY_SOURCE  -DLUA_MULTILIB=\"lib\" -DLUA_LJDIR=\"/usr/local/share/luajit-2.1\" -fno-stack-protector -DLUAJIT_UNWIND_EXTERNAL   -c -o lj_vm_dyn.o lj_vm.S
gcc -O2 -fomit-frame-pointer -Wall   -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -U_FORTIFY_SOURCE  -DLUA_MULTILIB=\"lib\" -DLUA_LJDIR=\"/usr/local/share/luajit-2.1\" -fno-stack-protector -DLUAJIT_UNWIND_EXTERNAL   -c -o lj_vm.o lj_vm.S

It produces assembly files that it then gives to GCC; e.g., for the function shown above:

        .globl lj_vmeta_equal
        .hidden lj_vmeta_equal
        .type lj_vmeta_equal, @function
        .size lj_vmeta_equal, 44
lj_vmeta_equal:
        .byte 72,193,224,17,72,193,232,17,72,131,235,4,72,137,206,137
        .byte 233,72,139,108,36,16,72,137,85,32,72,137,194,72,137,239
        .byte 72,137,92,36,24
        call lj_meta_equal
        .byte 235,129
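
As a sanity check, the decimal .byte values can be compared against the hex bytes in the disassembly of lj_vmeta_equal() shown earlier. The sketch below does only that; the array and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes before the call, as decimal .byte values from the generated file. */
const uint8_t asm_head[] = {
        72, 193, 224, 17, 72, 193, 232, 17, 72, 131, 235, 4, 72, 137, 206, 137,
        233, 72, 139, 108, 36, 16, 72, 137, 85, 32, 72, 137, 194, 72, 137, 239,
        72, 137, 92, 36, 24
};

/* The same range (0x23a1 through 0x23c5) as hex from the disassembly. */
const uint8_t dis_head[] = {
        0x48, 0xc1, 0xe0, 0x11, 0x48, 0xc1, 0xe8, 0x11, 0x48, 0x83, 0xeb, 0x04,
        0x48, 0x89, 0xce, 0x89, 0xe9, 0x48, 0x8b, 0x6c, 0x24, 0x10,
        0x48, 0x89, 0x55, 0x20, 0x48, 0x89, 0xc2, 0x48, 0x89, 0xef,
        0x48, 0x89, 0x5c, 0x24, 0x18
};

/* Bytes after the call: eb 81, the trailing jmp. */
const uint8_t asm_tail[] = { 235, 129 };

#define CALL_REL32_LEN  5       /* e8 opcode plus 4-byte displacement */

/* Do the generated bytes match the disassembled ones? */
int
head_matches(void)
{
        return (sizeof (asm_head) == sizeof (dis_head) &&
            memcmp(asm_head, dis_head, sizeof (asm_head)) == 0);
}

/* Total routine size; should agree with the .size directive. */
size_t
vmeta_equal_size(void)
{
        return (sizeof (asm_head) + CALL_REL32_LEN + sizeof (asm_tail));
}
```

The bytes match and the sizes add up to the 44 given in the .size directive, so the fixed part of the routine is at least self-consistent. That leaves the symbolic call line as the only part the assembler and linker get to interpret, though this alone does not show where the problem lies.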

It's conceivable that the generated assembly is not quite right. More study is required to determine exactly where the bug is.