Closed happenslol closed 1 year ago
Is this reproducible with default neovim or is it possible it only happens with some combination of plugins?
If possible, knowing the line number of where the crash is happening in push_fs_result would likely be very helpful.
Stacktrace looks similar to one in https://github.com/neovim/neovim/issues/21467
I'm 99% sure it happens when I have neo-tree enabled, but I can't say for certain. However, I've tried disabling their usage of libuv, and the crashes have persisted.
How would I go about getting the line number? Loading the segfault into gdb only provides me with the stacktrace I have posted above. I'm assuming I can only get the line number out if debug information is compiled in, but I'm not sure how I would do that in this case.
Edit: Just had a look at that thread. One of the stacktraces in there is the exact same as mine, however the command posted in there (:lua require("luv").handle_get_type(newproxy())) also causes a segfault for me, albeit with a different stacktrace.
Compiling neovim from source with debug info would probably work. You could also try the instructions here if that makes anything easier:
https://github.com/NixOS/nixpkgs/pull/219400#issuecomment-1455150162
EDIT: Just tried it, and compiling neovim from source is pretty painless; you can use CMAKE_BUILD_TYPE=Debug to make sure debug info will be available. Wasn't able to reproduce this crash, though.
The :lua require("luv").handle_get_type(newproxy()) crash is different, and should be fixed by https://github.com/luvit/luv/pull/634
I'll try that tomorrow and report back. Thanks for the link!
Alright, that was quite the journey since the nix build has been broken for a few weeks on NixOS due to treesitter not being up to date in nixpkgs, but I got one step further. libluv still seems to not have debug symbols even though I built neovim-debug, but at least there's some line numbers for the neovim portion now:
#0 0x00007fe385c789ee in push_fs_result () from /nix/store/fsdy4sq9pi4ibp0p6gjzp9lgi5ap77yq-libluv-1.43.0-0/lib/libluv.so.1
#1 0x00007fe385c7e524 in luv_fs_cb () from /nix/store/fsdy4sq9pi4ibp0p6gjzp9lgi5ap77yq-libluv-1.43.0-0/lib/libluv.so.1
#2 0x00007fe385a603c2 in uv__work_done () from /nix/store/avbmp3dcrbzrckrprx48cxx2mwlh825l-libuv-1.44.2/lib/libuv.so.1
#3 0x00007fe385a6409d in uv__async_io.part () from /nix/store/avbmp3dcrbzrckrprx48cxx2mwlh825l-libuv-1.44.2/lib/libuv.so.1
#4 0x00007fe385a780d5 in uv__io_poll () from /nix/store/avbmp3dcrbzrckrprx48cxx2mwlh825l-libuv-1.44.2/lib/libuv.so.1
#5 0x00007fe385a649bc in uv_run () from /nix/store/avbmp3dcrbzrckrprx48cxx2mwlh825l-libuv-1.44.2/lib/libuv.so.1
#6 0x000000000051c267 in loop_uv_run (loop=0x7ed518 <main_loop>, ms=ms@entry=0, once=true) at /build/ab6rvrg81mvwsivc1rhdlfp07qgnsyrg-source/src/nvim/event/loop.c:65
#7 loop_poll_events (loop=0x7ed518 <main_loop>, ms=ms@entry=0) at /build/ab6rvrg81mvwsivc1rhdlfp07qgnsyrg-source/src/nvim/event/loop.c:87
#8 0x0000000000604b2d in os_breakcheck () at /build/ab6rvrg81mvwsivc1rhdlfp07qgnsyrg-source/src/nvim/os/input.c:197
#9 0x000000000055dc18 in vgetorpeek (advance=140) at /build/ab6rvrg81mvwsivc1rhdlfp07qgnsyrg-source/src/nvim/getchar.c:2378
#10 0x000000000055cfae in vpeekc () at /build/ab6rvrg81mvwsivc1rhdlfp07qgnsyrg-source/src/nvim/getchar.c:1635
#11 0x000000000068b029 in state_enter (s=s@entry=0x7ffed28cc4d0) at /build/ab6rvrg81mvwsivc1rhdlfp07qgnsyrg-source/src/nvim/state.c:61
#12 0x00000000005d4b26 in normal_enter (cmdwin=false, noexmode=false) at /build/ab6rvrg81mvwsivc1rhdlfp07qgnsyrg-source/src/nvim/normal.c:497
#13 0x0000000000456ea8 in main (argc=<optimized out>, argv=<optimized out>) at /build/ab6rvrg81mvwsivc1rhdlfp07qgnsyrg-source/src/nvim/main.c:641
Unfortunately the libluv line numbers would be the helpful bit, since push_fs_result contains a switch statement, so if the crash is happening in a particular case then it'd narrow down the possible reproductions significantly.
If you're using that nixpkgs branch, maybe adding separateDebugInfo = true; to here would give you debug info for libluv? (note that this is a total guess on my part, I have no experience with nixpkgs)
Mhm, no luck so far I'm afraid. I've tried compiling the debug symbols separately and loading them into gdb, but the nixpkgs version seems to be different since I'm getting bogus line numbers. I'm not too experienced with overriding nixpkgs either, I'll try to get some help on the forums for that. Man, nix is amazing when it works, but it makes things like these so complicated...
Thanks for your patience!
Enabling debug symbols can differ between projects, and separateDebugInfo might be one of those cases. If you can point me at instructions to enable debug symbols in libuv, we can see how to modify the nix expression together.
Yeah, libuv didn't have separateDebugInfo, but I managed to enable it myself by overriding libluv in the rust flake and setting the cmake build type as well as dontStrip (that last one took a bit to figure out..), and I have libluv with debug symbols now. Turns out my last crash was so long ago that coredumpctl already cleaned out the stack traces though, so I'll have to wait for the next crash to get you that line number :-P
Thanks a lot for the offer still! I learned a lot about overriding things in nix, and I can at least do it for separate targets now. My current way would be building libluv by itself with debug symbols, stripping them out using objcopy and then loading them dynamically in coredumpctl with gdb. Writing an overlay to modify the libluv that neovim builds with would probably be a lot easier, but I haven't done a deep dive into how overlays work yet.
The lines the backtrace is pointing to:
https://github.com/luvit/luv/blob/e2fbfba499f9481ebef6a8510b526b183233fd63/src/fs.c#L103
https://github.com/luvit/luv/blob/e2fbfba499f9481ebef6a8510b526b183233fd63/src/fs.c#L352
https://github.com/luvit/luv/blob/e2fbfba499f9481ebef6a8510b526b183233fd63/src/fs.c#L377
Let's do some analysis.
1. In the uv.fs_opendir result callback, luv_dir is created with newuserdata, luv_dir->handle->dirents is also created with newuserdata, and luv_dir->dirents_ref is set to reference dirents.
2. luv_dir->dirents_ref is unreffed in uv.fs_closedir or luv_fs_dir_gc, which lets dirents be garbage collected and become invalid.
3. With fs_readdir, luv_dir may be garbage collected before the fs_readdir request has completed.
4. Fix: ref luv_dir in fs_readdir and unref it in the readdir callback, so the dirents memory is not lost.
Reproduced:
test("fs.{open,read,close}dir ref check", function(print, p, expect, uv)
  local dir = assert(uv.fs_opendir('.', nil, 50))

  local function readdir_cb(err, dirs)
    assert(not err)
    if dirs then
      p(dirs)
      uv.fs_readdir(dir, readdir_cb)
    else
      assert(uv.fs_closedir(dir)==true)
    end
  end

  uv.fs_readdir(dir, readdir_cb)
  dir = nil
  collectgarbage()
  collectgarbage()
  collectgarbage()
end, "1.28.0")
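The lifetime problem described in the analysis, and the proposed ref/unref fix, can be sketched as a small standalone C model. This is only an illustration under stated assumptions, not luv's actual code: object_t, dir_t, obj_ref, obj_unref, collect and run_scenario are all hypothetical names, and the "garbage collector" is a plain reference counter standing in for Lua's GC and registry references.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a garbage-collected userdata: it stays alive
 * while at least one strong reference (a Lua variable or a registry ref)
 * points at it. */
typedef struct {
    int refs;        /* number of strong references */
    bool collected;  /* set once the simulated GC has reclaimed it */
} object_t;

/* Model of luv_dir: the dir holds the only reference to its dirents. */
typedef struct {
    object_t self;
    object_t dirents;
} dir_t;

static void obj_ref(object_t *o)   { o->refs++; }
static void obj_unref(object_t *o) { assert(o->refs > 0); o->refs--; }

/* Simulated collection cycle: an unreferenced dir is finalized, which
 * drops its reference to dirents (as a __gc handler would), after which
 * dirents is reclaimed as well. */
static void collect(dir_t *d) {
    if (!d->self.collected && d->self.refs == 0) {
        d->self.collected = true;
        obj_unref(&d->dirents);
    }
    if (!d->dirents.collected && d->dirents.refs == 0)
        d->dirents.collected = true;
}

/* Returns true if dirents is still valid when the readdir callback runs. */
bool run_scenario(bool pin_dir_during_request) {
    /* The script holds the dir; the dir holds dirents. */
    dir_t d = { { 1, false }, { 1, false } };

    /* fs_readdir is queued. The proposed fix pins the dir here. */
    if (pin_dir_during_request)
        obj_ref(&d.self);

    obj_unref(&d.self);  /* the script does `dir = nil` */
    collect(&d);         /* collectgarbage() runs before the callback */

    /* This is what the readdir callback would find. */
    bool dirents_valid = !d.dirents.collected;

    /* With the fix, the callback releases the pin once it is done. */
    if (pin_dir_during_request)
        obj_unref(&d.self);

    return dirents_valid;
}
```

In this model run_scenario(false) returns false, which corresponds to the callback reading freed dirents, and run_scenario(true) returns true because the extra reference keeps the dir, and therefore its dirents, alive until the callback has run. In the real library the pin would presumably be a Lua registry reference, but that detail is an assumption here.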
That reproduction produces a different stack trace for me when I run it via gdb:
#0 0x00007ffff7c321dc in uv__fs_readdir (req=<optimized out>, req=<optimized out>) at /home/ryan/Programming/luvit/luv/deps/libuv/src/unix/fs.c:610
610 dirent->name = uv__strdup(res->d_name);
#1 uv__fs_work (w=<optimized out>) at /home/ryan/Programming/luvit/luv-tmp/deps/libuv/src/unix/fs.c:1709
#2 0x00007ffff7c2a34e in worker (arg=0x0) at /home/ryan/Programming/luvit/luv-tmp/deps/libuv/src/threadpool.c:122
#3 0x00007ffff7be4609 in start_thread (arg=<optimized out>) at pthread_create.c:477
but I think the fix might solve the luv_push_dirent segfault, too (it's likely the same problem; the garbage collection is just happening at a different time).
Never mind, the stack trace is the same as the neovim one if I run it with LuaJIT (I was using PUC Lua since sometimes that makes things easier to debug):
Thread 1 "luajit" received signal SIGSEGV, Segmentation fault.
luv_push_dirent (L=L@entry=0x7ffff7fa9380, ent=0x0, table=table@entry=1) at /home/ryan/Programming/luvit/luv/src/fs.c:121
121 lua_pushstring(L, ent->name);
#0 luv_push_dirent (L=L@entry=0x7ffff7fa9380, ent=0x0, table=table@entry=1) at /home/ryan/Programming/luvit/luv/src/fs.c:121
#1 0x00007ffff7bfb1d8 in push_fs_result (L=L@entry=0x7ffff7fa9380, req=req@entry=0x7ffff7fc84d8) at /home/ryan/Programming/luvit/luv/src/fs.c:371
#2 0x00007ffff7bfb5b1 in luv_fs_cb (req=0x7ffff7fc84d8) at /home/ryan/Programming/luvit/luv/src/fs.c:401
#3 0x00007ffff7c10240 in uv__work_done (handle=0x7ffff7fbc1f0) at /home/ryan/Programming/luvit/luv/deps/libuv/src/threadpool.c:329
#4 0x00007ffff7c1407b in uv__async_io (loop=0x7ffff7fbc140, w=0x7fffffff9580, events=<optimized out>) at /home/ryan/Programming/luvit/luv/deps/libuv/src/unix/async.c:176
#5 0x00007ffff7c25ff3 in uv__io_poll (loop=loop@entry=0x7ffff7fbc140, timeout=<optimized out>) at /home/ryan/Programming/luvit/luv/deps/libuv/src/unix/linux.c:1303
#6 0x00007ffff7c14cc3 in uv_run (loop=0x7ffff7fbc140, mode=mode@entry=UV_RUN_DEFAULT) at /home/ryan/Programming/luvit/luv/deps/libuv/src/unix/core.c:447
#7 0x00007ffff7c0bc00 in luv_run (L=0x7ffff7fa9380) at /home/ryan/Programming/luvit/luv/src/loop.c:36
#8 0x00005555555ca03b in lj_BC_FUNCC () at buildvm_x86.dasc:859
#9 0x00005555555bbe03 in lua_pcall (L=0x7ffff7fa9380, nargs=<optimized out>, nresults=-1, errfunc=<optimized out>) at /home/ryan/Programming/luvit/luv/deps/luajit/src/lj_api.c:1116
#10 0x000055555555c8ab in docall (L=0x7ffff7fa9380, narg=0, clear=0) at /home/ryan/Programming/luvit/luv/deps/luajit/src/luajit.c:122
#11 0x000055555555dbd2 in handle_script (argx=<optimized out>, L=0x7ffff7fa9380) at /home/ryan/Programming/luvit/luv/deps/luajit/src/luajit.c:292
#12 pmain (L=0x7ffff7fa9380) at /home/ryan/Programming/luvit/luv/deps/luajit/src/luajit.c:550
#13 0x00005555555ca03b in lj_BC_FUNCC () at buildvm_x86.dasc:859
#14 0x00005555555bbfa1 in lua_cpcall (L=<optimized out>, func=<optimized out>, ud=<optimized out>) at /home/ryan/Programming/luvit/luv/deps/luajit/src/lj_api.c:1173
#15 0x000055555555c70e in main (argc=2, argv=0x7fffffffda48) at /home/ryan/Programming/luvit/luv/deps/luajit/src/luajit.c:581
I'm using the nightly neovim build and am regularly encountering segfaults. I haven't yet 100% narrowed down when they occur, but it mostly seems to be when files are changed while the editor is open.
This is what the stack trace looks like:
I'm not sure what else to include here, so please tell me if there's any additional information you require.