Closed klmr closed 1 year ago
Thanks @klmr for an excellent and very complete bug report. Unfortunately I can't do much with this because I don't have access to a Mac.
Can you tell me if the directory you are in has a particularly large amount of files/folders or if it is within a very large git repo? Is there anything unusual about the hardware (very old or very new?)
I think that in the case of a segfault, the fault is ultimately in Neovim itself. Neo-tree may be doing something to surface that problem, but I don't think the lua code should be able to cause a segfault. Have you checked the issues in the neovim repo?
I’ve been able to test and reproduce this on two different macOS models (both running an ARM chip, M2 — apologies, I should have mentioned this!). There’s nothing special about the folder structure. Any folder will do, including something directly in the home directory; no deep nesting, and no large subdirectory structure.
> I don't think the lua code should be able to cause a segfault
Yeah, I actually agree with this. Unfortunately I haven’t been able to find any issue that looks related.
… I’m actually puzzled by this lack of bug reports, since the behaviour is fairly disruptive and has been happening for months (that’s how long it took me to be able to narrow the issue down and make it reproducible). I’m sure other people must have stumbled across it; the only reason it took me so long was that I am mostly using Linux.
Should I cross-post the issue to the NeoVim repo?
> Should I cross-post the issue to the NeoVim repo?
I think so, after checking existing issues of course.
I suppose I could definitely see how only neo-tree could find a problem with `readdir` because we will spawn multiple asynchronous reads. Certainly only another tree plugin would behave in this way. It would be interesting to see if Nvim-tree causes segfaults as well.
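For anyone who wants to poke at the "multiple asynchronous reads" pattern outside of Neovim, here is a minimal sketch in C. It is not neo-tree's or libuv's actual code: it assumes plain POSIX threads as a stand-in for libuv's worker thread pool, and the names `scan_job`, `scan_dir`, `concurrent_scan` and `NUM_READERS` are made up for illustration. Each thread opens its own directory stream, so nothing is shared between readers.

```c
#include <dirent.h>
#include <pthread.h>
#include <string.h>

#define NUM_READERS 8 /* hypothetical value, not neo-tree's real concurrency */

/* One scan request: the directory to read and the number of entries found. */
typedef struct {
    const char *path;
    long count; /* -1 if the directory could not be opened */
} scan_job;

/* Thread body: count the entries in job->path, skipping "." and "..".
 * Every call opens and closes its own DIR stream. */
static void *scan_dir(void *arg) {
    scan_job *job = arg;
    DIR *dir = opendir(job->path);
    if (dir == NULL) {
        job->count = -1;
        return NULL;
    }
    long n = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") != 0 && strcmp(entry->d_name, "..") != 0) {
            n++;
        }
    }
    closedir(dir);
    job->count = n;
    return NULL;
}

/* Scan the same path from NUM_READERS threads at once.
 * Returns the entry count if every reader agrees; returns -1 if a thread
 * could not be started, the directory could not be opened, or the
 * readers disagree. */
long concurrent_scan(const char *path) {
    pthread_t threads[NUM_READERS];
    scan_job jobs[NUM_READERS];
    int started = 0;

    for (int i = 0; i < NUM_READERS; i++) {
        jobs[i].path = path;
        jobs[i].count = -1;
        if (pthread_create(&threads[i], NULL, scan_dir, &jobs[i]) != 0) {
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    if (started < NUM_READERS) {
        return -1;
    }
    for (int i = 1; i < NUM_READERS; i++) {
        if (jobs[i].count != jobs[0].count) {
            return -1;
        }
    }
    return jobs[0].count;
}
```

Calling `concurrent_scan(".")` in a loop from a small `main` (compiled with `-pthread`) gives a rough stress test of concurrent `readdir` calls on a given platform. A clean run here would not rule out a problem in the libuv or luv layers, since this sketch bypasses both.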
@klmr I would be curious, are you running one of the M1 ARM chips? Edit: I should really learn how to read. I am firing up a pi to see if I can recreate this on linux on ARM
This smells like a libuv issue as opposed to neovim directly (though of course, Neovim provides libuv and it is used heavily in the filesystem source within Neo-tree). I ask about the architecture because I have seen a handful of other weird issues in Neovim land related to running on a non-x86 architecture. I haven't tried yet, but I wonder if this can be recreated on something like a raspberry pi (also running ARM).
Tested this on a Raspberry Pi 4 running Manjaro and I was unable to replicate. So there must be something with the ARM architecture and how libuv is relaying instructions to the processor through Apple's kernel (all well beyond me). In any case, I believe this is below Neo-tree specifically :(
In the meantime I have tried and failed to reproduce the issue with the official NeoVim Universal build. Turns out, the issue only seems to exist with the build from MacPorts, so I will re-report this bug to MacPorts. They have their own build infrastructure, and they must have done something slightly differently.
I agree with the assessment that this is probably ultimately a libuv issue. In fact, there is a (fixed, luvit/luv#640) issue which sounds suspiciously similar: neovim/neovim#22694.
Did you check docs and existing issues?
Neovim Version (nvim -v)
NVIM v0.8.3–v0.9.1
Operating System / Version
macOS (multiple versions, incl. 12 & 13)
Describe the Bug
On macOS (but not on Linux!) I can reproducibly segfault NeoVim when Neo-Tree is opened and `follow_current_file` is enabled, by switching between different file buffers. It takes a bit of time, but after several buffer switches, NeoVim closes without a message. Via Console.app I can find that the cause of the crash is always due to an invalid pointer access (`KERN_INVALID_ADDRESS`). The invalid pointer address varies, but occasionally the addresses are wildly invalid, e.g. 0x0000000000000040; my guess therefore is that this is due to a buffer overflow which overwrites the pointer memory, rather than off-by-one errors.

The error seems to happen inside `readdir`, called from inside libuv. Here’s a typical stack trace of the crashed thread:

The function on the top of the stack isn’t always the same: sometimes it’s `_readdir_unlocked` instead of `pthread_mutex_lock`. Occasionally, the actual crash instead happens inside the calling thread in `pthread_kill`, or inside `luv_push_dirent`, with the following stack trace:

I’ve attached an exemplary macOS crash report, and I am happy to supply others on request.
Screenshots, Traceback
⬇️ nvim-2023-08-30-164412.ips.log
Steps to Reproduce
`:bn`/`:bp` etc. works as well). The behaviour is nondeterministic, so it might require several dozen buffer switches before nvim crashes. However, I have never needed more than ~50, and usually only around 10.

(The steps above aim to make the example self-contained; obviously you don’t need to create a new directory and files, it works equally well in any existing, non-empty directory.)
Instead of the self-contained `repro.lua`, the following `minimal.lua` also reproduces the issue:

Expected Behavior
No segfault occurs.
Your Configuration