Closed iovis closed 1 year ago
This is a very strange issue indeed due to this not being consistent, the same command and code is run by fzf-lua so the inconsistency is making it hard to identify what’s the source of the issue.
The common denominator between the commands that fail seems to be a lua table input into fzf_exec
, I’ll look at the code tomorrow and try to figure out what can fail overtime and why specifically fish.
Do you know what could be going on or how I could debug this further?
In the meantime if you can fins a way to consistently reproduce this with steps then I’d be able to solve it easily from my end.
I unfortunately haven't been able to find a consistent way of reproducing it. Although something weird that I saw last night when running the buffers
command (which is usually the one I use the most from the failing ones) was seeing the dreaded ^[i
for a few milliseconds and then the popups being render properly in place. I ran the command multiple times after that and I didn't see it again for that session. Not sure if it helps, but I figured I'd mention it (could there be some sort of race condition?).
If you want me to add some kind of logging or print statements somewhere you're suspicious of, just let me know. I'm guessing this may be a weird interaction of my system (and likely a mistake on my part), so it's probably going to be incredibly hard to reproduce outside of it.
I'm new to fish, so I probably messed something up. One thing I'll mention given the weird interaction I had with that sort of race condition: I managed to make my fish config significantly faster than my ZSH one (startup time being ~10ms). My guess is that this probably doesn't really matter, but I guess more info doesn't hurt
I unfortunately haven't been able to find a consistent way of reproducing it. Although something weird that I saw last night when running the
buffers
command (which is usually the one I use the most from the failing ones) was seeing the dreaded^[i
for a few milliseconds and then the popups being render properly in place. I ran the command multiple times after that and I didn't see it again for that session. Not sure if it helps, but I figured I'd mention it (could there be some sort of race condition?).If you want me to add some kind of logging or print statements somewhere you're suspicious of, just let me know. I'm guessing this may be a weird interaction of my system (and likely a mistake on my part), so it's probably going to be incredibly hard to reproduce outside of it.
I'm new to fish, so I probably messed something up. One thing I'll mention given the weird interaction I had with that sort of race condition: I managed to make my fish config significantly faster than my ZSH one (startup time being ~10ms). My guess is that this probably doesn't really matter, but I guess more info doesn't hurt
Tysm for the details! anything helps.
I'm pretty sure this is not "your fault", any shell and any setup should work regardless of circumstanes, as long as you're patient and willing to work with me on this I can assure you we will not leave a rock unturned and get to the bottom of this issue.
I was able to identify what I believe is the common ground between the failing commands, they are all using the stdin input redirection directive < /path/to/fifo
to feed contents into fzf from a FIFO generated by the mkfifo
command.
Here's where I get confused: in order to avoid shell woes the fzf command is explicitly wrapped with POSIX shell using sh -c
, as you can see below:
https://github.com/ibhagwan/fzf-lua/blob/cddb1c3b23bf6f3e56435410a28e43c2c954b350/lua/fzf-lua/fzf.lua#L178-L184
The commands that never fails (files
, etc) are not using stdin input redirection and instead using the environment variable FZF_DEFAULT_COMMAND
which is spawned directly by fzf.
According to the termopen/jobstart
help, supplying a table as a first argument avoids the shell alltogether, which means that fish isn't used in our scenario, yet somehow this happens only on fish.
:help jobstart
jobstart({cmd} [, {opts}]) *jobstart()* Spawns {cmd} as a job. If {cmd} is a List it runs directly (no 'shell'). If {cmd} is a String it runs in the 'shell', like this: :call jobstart(split(&shell) + split(&shellcmdflag) + ['{cmd}']) (See |shell-unquoting| for details.)
Althought this might happen because of mkfifo
I'm not sure if this is the case as, again, the command is called with first argument as a table which should spwan without a shell and also the strange ^]i
escape sequence seems like something more likely to happen after termopen
(where mkfifo
is called prior):
https://github.com/ibhagwan/fzf-lua/blob/cddb1c3b23bf6f3e56435410a28e43c2c954b350/lua/fzf-lua/fzf.lua#L71-L75
This leaves me to suspect a delay (as you mentioned, sometimes it does open properly after a while) while executing the fzf
command which could come from an external command which somehow uses fish
?
Although I'm still unsure why this would cause such a behavior, can you post the output of the below:
This is the equivalent of
echo $FZF_DEFAULT_OPTS
in the shell:lua print(vim.env.FZF_DEFAULT_COMMAND) :lua print(vim.env.FZF_DEFAULT_OPTS)
I'm pretty sure this is not "your fault", any shell and any setup should work regardless of circumstanes, as long as you're patient and willing to work with me on this I can assure you we will not leave a rock unturned and get to the bottom of this issue.
I really appreciate the help. I love this library and have used for more than a year now, I'll do anything I can to help! I also appreciate explaining what's going on.
Here's where I get confused: in order to avoid shell woes the fzf command is explicitly wrapped with POSIX shell using
sh -c
Smart. I agree this makes it more confusing that this weird behavior would be terminal dependent, but I'm still not knowledgeable enough of fish to be aware of possible differences. I'll mention that, because of how fish works, not only I managed to get a significant speedup on non-interactive startup of fish, I was also able to add more stuff that I couldn't with zsh —because otherwise I would be able to notice a performance impact—, which means I'm loading functions and aliases that I use a lot in interactive sessions that didn't use to be there before (I don't think this is the issue, but probably something to keep in mind). I can't easily "not load them in non-interactive" because they're autoloaded functions (instead of aliases).
This leaves me to suspect a delay (as you mentioned, sometimes it does open properly after a while) while executing the fzf command which could come from an external command which somehow uses fish?
Although I'm still unsure why this would cause such a behavior, can you post the output of the below:
My default command is using fd, a find
alternative written in rust.
print(string.format("FZF_DEFAULT_COMMAND: %s", vim.env.FZF_DEFAULT_COMMAND))
print(string.format("FZF_DEFAULT_OPTS: %s", vim.env.FZF_DEFAULT_OPTS))
FZF_DEFAULT_COMMAND: fd -H -E '.git' -E '.keep' --type file --follow --color=always
FZF_DEFAULT_OPTS: --ansi --bind=ctrl-n:page-down,ctrl-p:page-up,alt-a:select-all,alt-d:deselect-all,alt-t:toggle-all,home:first,end:last
One (even weirder) thing, I'm currently on PTO so I'm using my work computer less and this problem is much harder to manifest in my personal computers. One of them is identical to my work computer, a macbook pro from last year, and an old intel mac mini. My dotfiles are in github and the same across the board, so the biggest difference would be me running much heavier applications on my work computer (I usually just SSH into my personal ones from my iPad, and not running things like Docker or the MDM software).
I'll also mention that I use neovim within tmux almost exclusively, although I was able to reproduce outside of tmux (in case you wanted to rule it out).
I just took my work computer to see if I could reproduce again, and the problem was there immediately.
I even saw a flash of ^[i
in the files
command (but fixed itself almost instantly). The buffers
command is not working.
I unset all my FZF
environment variables (and double checked with :lua=vim.env.FZF_DEFAULT_COMMAND
) and the issue is still there. Are there any print
statements or sh -c
commands you want me to run?
I recorded a video, if you pay attention to the beginning of the files
command (it's just a few frames) you'll see the ^[i
appearing:
https://user-images.githubusercontent.com/6660202/218008518-094ca662-e1cd-4c4c-8657-18412cf9538f.mov
While I had the system in this state, I tried changing my $SHELL to zsh.
It doesn't fail... but while recording I'm just realizing I'm seeing the ^[i
too, it just gets the right output immediately after:
https://user-images.githubusercontent.com/6660202/218009328-fce05587-2096-4dd9-ad17-b7482a09b7bd.mov
Ty @iovis, this is very helpful.
I'm just realizing I'm seeing the ^[i too, it just gets the right output immediately after
I can clearly see it in both your videos, still no idea where it’s coming from but it’s good to know it’s happening in all commands so I can invalidate some of my other assumptions.
Are there any print statements or
sh -c
commands you want me to run?
Running any command with debug=true
provides debug output of the command, preview parameters and other info, none that would help us in this case though, I can create a branch with extra logging but I’m still unsure where to put these and what information I’m even looking for as I still have no hints as to the source of this thing.
I’m suspecting something in your shell configuration as this issue would have probably come up before by many others but I can’t pin point what… armed with this new information I’ll keep digging in the code and try to come up with a better idea.
I just took my work computer to see if I could reproduce again, and the problem was there immediately.
One clarification I forgot to ask, is the work computer using the same dotfiles?
If so, are your dotfiles (or at least the shell profile) online? If I can reproduce this on my system with your dotfiles I would be able to solve it much faster.
One clarification I forgot to ask, is the work computer using the same dotfiles?
If so, are your dotfiles (or at least the shell profile) online? If I can reproduce this on my system with your dotfiles I would be able to solve it much faster.
Yes, same dotfiles. You can find them here:
I managed to reproduce with a very minimal configuration (video attached):
# ~/.config/fish/config.fish
/opt/homebrew/bin/brew shellenv | source # This is the path for `nvim`, `fzf` and friends
sh -c "$(curl -s https://raw.githubusercontent.com/ibhagwan/fzf-lua/main/scripts/mini.sh)"
In the video, I ran the commands with debug=true
. It also doesn't look like ^[i
appears anymore, it may have been a red herring.
https://user-images.githubusercontent.com/6660202/218150923-90911ea6-2942-422e-a2b2-5eab8c6d3577.mp4
Ty @iovis!
I just tried the same exact setup on a OSX, neovim 0.8.3 with fzf 0.37 with your config.fish
(with starship installed) and I still can’t reproduce this annoyance lol
I’d like to try something else, I have a hunch that may or may not be true that this might be related to the new load event added with fzf 0.36, would you be able to download fzf 0.35.1 Darwin binary from fzf releases, set fzf_bin
within fzf-lua’s setup
function and see if the issue persists?
I just tried the same exact setup on a OSX, neovim 0.8.3 with fzf 0.37 with your config.fish (with starship installed) and I still can’t reproduce this annoyance lol
Can't say I'm surprised, I seem to only be able to reproduce it consistently under very heavy load.
I’d like to try something else, I have a hunch that may or may not be true that this might be related to the new load event added with fzf 0.36, would you be able to download fzf 0.35.1 Darwin binary from fzf releases, set fzf_bin within fzf-lua’s setup function and see if the issue persists?
For a second I got excited, because I managed to make buffers
work the first time, but then it failed again. I tried setting fzf_bin
directly and also just having it in my $PATH
, but no differences.
https://user-images.githubusercontent.com/6660202/218187994-bc8662c9-581d-4011-b2b8-3d5ed287a308.mp4
For context, under the same load, it works fine in ZSH:
https://user-images.githubusercontent.com/6660202/218189036-df9cab7f-25f2-4327-9846-5220a72989f7.mp4
Sorry that it's a bit hard to read with the bg being set to magenta for some reason
Also, when reproducing for fish
, I'm using this as config.fish
:
# ~/.config/fish/config.fish
set -gx XDG_CONFIG_HOME $HOME/.config
set -gx XDG_DATA_HOME $XDG_CONFIG_HOME/.local/share
set -gx XDG_CACHE_HOME $XDG_CONFIG_HOME/.cache
fish_add_path /opt/homebrew/bin
Can't say I'm surprised, I seem to only be able to reproduce it consistently under very heavy load.
What is considered "heavy load"? Are you running any background process I that I can attempt to do the same? When said command ends and load subsides does it still fail or will it work again?
Sorry that it's a bit hard to read with the bg being set to magenta for some reason
No worries, I should probably adjust the defaults in to look somewhat normal on the default colorscheme.
Also, when reproducing for fish, I'm using this as config.fish
I used the same as this was inside the config.fish
you linked above, can't rule anything out anymore but I'm not sure why this sohuld make a difference.
For context, under the same load, it works fine in ZSH
This is what’s so confusing here, there should be no point in the code where fish shell is run or matters, at all, anywhere a shell command is used is wrapped with sh -c
.
During the plug-in development I did notice some commands behave differently if $SHELL
is set to fish, for example shellescape
, perhaps I should go in this route and examine every neovim API call and it’s help to see where there’s a special case for fish. From initial search doesn’t seem like any other command is affected by the $SHELL
variable aside from termopen | jobstart
which is already wrapped in sh -c
…
What is considered "heavy load"? Are you running any background process I that I can attempt to do the same? When said command ends and load subsides does it still fail or will it work again?
I just tried stopping everything thinking it would come back to normal and it's still reproducible, so I'm now even more confused of why it's easier to reproduce on my work computer. The biggest difference with my personal computers would be the MDM software, but I still have managed to reproduce in my personal computers at some point or another (just less frequently).
During the plug-in development I did notice some commands behave differently if $SHELL is set to fish, for example shellescape, perhaps I should go in this route and examine every neovim API call and it’s help to see where there’s a special case for fish. From initial search doesn’t seem like any other command is affected by the $SHELL variable aside from termopen | jobstart which is already wrapped in sh -c…
Do you want to give me some code to run manually in my editor to see if I can pinpoint which line is the one failing? I can compare between ZSH and fish to maybe see if there's any difference
Do you want to give me some code to run manually in my editor to see if I can pinpoint which line is the one failing? I can compare between ZSH and fish to maybe see if there's any difference
I think this is the best route, next chance I’m on the computer I’ll create a debug branch and pepper the code with some “before” and “after” prints, you can then set Lazy to said branch and we can examine the output from :messages
which will hopefully be more insightful.
I think this is the best route, next chance I’m on the computer I’ll create a debug branch and pepper the code with some “before” and “after” prints, you can then set Lazy to said branch and we can examine the output from :messages which will hopefully be more insightful.
Sounds good, I appreciate the patience helping me debug this!
I think this is the best route, next chance I’m on the computer I’ll create a debug branch and pepper the code with some “before” and “after” prints, you can then set Lazy to said branch and we can examine the output from :messages which will hopefully be more insightful.
Sounds good, I appreciate the patience helping me debug this!
Can you try branch 633_debug?
It has prints before and after suspected areas of the code, paste the output of :messages
while it’s stuck and especially the message the is displayed while the window is open and hopefully we get a good hint from it.
Can you try branch 633_debug?
It has prints before and after suspected areas of the code, paste the output of :messages while it’s stuck and especially the message the is displayed while the window is open and hopefully we get a good hint from it.
fish
before win create
after win create
before raw_fzf
before tempname1
after tempname1, before tempname2
after tempname2
before mkfifo /var/folders/99/_2_c98ys0f9428nsf1_vvp440000gp/T/nvim.david/SGmKtp/1
after mkfifo /var/folders/99/_2_c98ys0f9428nsf1_vvp440000gp/T/nvim.david/SGmKtp/1
before termopen
after termopen
before fs_open /var/folders/99/_2_c98ys0f9428nsf1_vvp440000gp/T/nvim.david/SGmKtp/1
after fs_open /var/folders/99/_2_c98ys0f9428nsf1_vvp440000gp/T/nvim.david/SGmKtp/1
before contents
after contents # => THE ONE DISPLAYED WHILE THE WINDOW IS OPEN
termopen exit with 129
after raw_fzf
zsh
before win create
after win create
before raw_fzf
before tempname1
after tempname1, before tempname2
after tempname2
before mkfifo /var/folders/99/_2_c98ys0f9428nsf1_vvp440000gp/T/nvim.david/oPOVuG/1
after mkfifo /var/folders/99/_2_c98ys0f9428nsf1_vvp440000gp/T/nvim.david/oPOVuG/1
before termopen
after termopen
before fs_open /var/folders/99/_2_c98ys0f9428nsf1_vvp440000gp/T/nvim.david/oPOVuG/1
after fs_open /var/folders/99/_2_c98ys0f9428nsf1_vvp440000gp/T/nvim.david/oPOVuG/1
before contents
after contents
termopen exit with 0
after raw_fzf
Ty @iovis, this just got even stranger lol
after contents # => THE ONE DISPLAYED WHILE THE WINDOW IS OPEN
This is the code part: https://github.com/ibhagwan/fzf-lua/blob/2d675a6a387eb3d0d67ad83fa038c6ca0cf0f3a0/lua/fzf-lua/fzf.lua#L260-L264
Essentially all the fzf-lua code was executed and now neovim is waiting for the terminal job (fzf) to exit.
Another oddity is an exit code not mentioned in fzf manual:
0 Normal exit
1 No match
2 Error
130 Interrupted with CTRL-C or ESC
Another oddity is an exit code not mentioned in fzf manual:
I can reproduce the error 129
in ZSH (normal operation) when clicking outside the fzf popup in neovim or running :close
while the popup is open. So I'm not sure if it's FZF's necessarily. I think that's just related to me closing the term popup weirdly when it fails
Exit code 129 is apparently SIGHUP
which gets sent when the terminal is closing which I assume you force closed with <C-w-q>
, now the unanswered questions:
termopen
command which is sent sh -c
(without any shell) behave differently when you’re using fish
shellI’ll admit, this is quite the bug we got here lol, but at least I know now I need to focus on alternatives to redirecting the fifo into stdin
, I’ll look for alternative (perhaps tee
can help) and we can do some more testing.
Want me to run the sh -c
command directly in fish
and zsh
?
I wonder if the underlying sh -c
is inheriting some weird env
. I noticed that when I ran zsh
from a fish
env to try to reproduce from zsh
, it would fail in zsh too. I had to chsh -s /bin/zsh
to "clean it", and it would not be reproducible anymore. So I wonder if the sh -c
is somehow inheriting something from my env too
Just switched the input redirection to tail
in the debug branch, can you try to pull rebase and see if that solves the issue (I have a good feeling it might)?
@ibhagwan I think you nailed it!
I've been trying to break it with my full config and I can't, it looks rock solid. I don't even see the ^[i
flashing in those commands anymore! (I do see the ^[i
flashing for things like files
, but I know you didn't touch there).
Let me try again with clean configs and zsh and maybe heavy load again. But this looks promising!
@ibhagwan I think you nailed it!
Close but not there yet! It seems it cuts the input arbitrarily at 10 items, for example, try FzfLua bulltin
or FzfLua man_pages
, I have no idea why, yet :)
I’m still trying to find a solution for this, but otherwise it’s a solid solution, it’s also better to let fzf handle the command process using FZF_DEFAULT_COMMAND
.
Close but not there yet! It seems it cuts the input arbitrarily at 10 items, for example, try FzfLua bulltin or FzfLua man_pages, I have no idea why, yet :)
tail
defaults to 10 lines! I was just going to mention that FzfLua commands
only reported 10 lines and I was looking at the plugin code to see if there was a new paginator haha
It's the -n
option. You can test with tail -n20
@ibhagwan I just changed:
FZF_DEFAULT_COMMAND = string.format("tail %s", vim.fn.shellescape(fifotmpname))
-- to
FZF_DEFAULT_COMMAND = string.format("cat %s", vim.fn.shellescape(fifotmpname))
And it works without limiting the output, not sure if that helps
And it works without limiting the output, not sure if that helps
Ofc it helps, this is great, I’m just sure how cat handles ongoing input from a fifo file, I gonna do some more research and you might have just found our solution :)
I'm investigating in the meantime, and it seems like mkfifo
works differently than other shells:
https://github.com/fish-shell/fish-shell/issues/2867#issuecomment-204827423
https://stackoverflow.com/questions/61946995/ls-fifo-blocks-fish-shell
https://stackoverflow.com/questions/69216897/what-are-the-differences-between-fish-shell-and-bash-when-handling-named-pipes
I'm investigating in the meantime, and it seems like
mkfifo
works differently than other shells: fish-shell/fish-shell#2867 (comment) https://stackoverflow.com/questions/61946995/ls-fifo-blocks-fish-shell https://stackoverflow.com/questions/69216897/what-are-the-differences-between-fish-shell-and-bash-when-handling-named-pipes
The question is, why does this even matter, mkfifo
command is executed directly (no shell) and termopen uses sh
, so technically fish shell shouldn’t even be a factor here, very confusing.
In any event the cat
solution seems to work great, if it works perfectly for you as well I will commit the change, lmk?
The second link you posted seems to be a clue as to what may be happening here, but again, we’re using sh
… and it still doesn’t explain the inconsistency of the bug.
In any event the cat solution seems to work great, if it works perfectly for you as well I will commit the change, lmk?
It's working amazing.
The second link you posted seems to be a clue as to what may be happening here, but again, we’re using sh… and it still doesn’t explain the inconsistency of the bug.
Do you think you can give me an sh -c
command with mkfifo
that I could run directly in fish? I have a feeling something's weird with these subshells. When i ran zsh
to get an interactive zsh session from within a fish session, the error was still reproducible as if I was still in fish. It would only "clean up" if I ran a chsh -s /bin/zsh
and started over.
My hunch right now is that sh -c
may not be as isolated as we think it is...
Actually, even if termopen is using sh
, how would it know where to find fzf
if it was not inheriting at least the $PATH
? Wouldn't that mean it's not fully isolated?
I've been trying with:
sh -c 'mkfifo myfifo; (ls > myfifo &); cat < myfifo; rm myfifo'
But I couldn't see any error
In any case, thank you so much for fixing this issue and for your patience ❤️
https://github.com/ibhagwan/fzf-lua/commit/a294a8330e9b0622acc3de4e0b9ab4e853b53fa2 - I'm on a flight and the plane WiFi is blocking my git ssh connection, I had to do this from the web interface (first time), hopefully I didn't mess up anything :-)
Tysm @iovis for your patience! after much joint efforts I think we've come to the conclusion of this quite mysterious bug.
My hunch right now is that
sh -c
may not be as isolated as we think it is...
That is the only way to explain it but that would be quite the conundrum, perhaps this is isolated to sh -c
within a neovim job? That would make more sense as it's a pretty limited usecase and the reason this hadn't come up until now.
Actually, even if termopen is using sh, how would it know where to find fzf if it was not inheriting at least the $PATH? Wouldn't that mean it's not fully isolated?
Using termopen|jobstart
inherits the current env unless clear_env = true
is specified in the options, the question is what in the env vars affects input redirection?
I've been trying with: But I couldn't see any error
Idk if this rabbit hole is worth pursuing any more than what we already did :-)
In any case, thank you so much for fixing this issue and for your patience
Ty again @iovis, it's been a pleasure working with you!
https://github.com/ibhagwan/fzf-lua/issues/633#issuecomment-1426290620:
Sorry that it's a bit hard to read with the bg being set to magenta for some reason
Now fixed with https://github.com/ibhagwan/fzf-lua/commit/1df92492201ff9bf081a276c3c773a36cfee66a0, no more magenta on default colorsceme lol, beats me why linking to a cleared highlight group would choose Pmenu
as the background color but the highlight setup now checks for cleared links and resets the hl accordingly:
Much easier to read haha. Btw, the fix has been working wonderfully on my end these days, tysm for looking into this!
Info
nvim --version
: v0.8.3fzf --version
: 0.37.0 (brew)mini.sh
Description
I've been trying to diagnose this one by myself, but I can't figure it out anymore. Maybe you can help me debug this better.
So, I recently switched to Fish as my shell (from ZSH). Everything was working fine, till I started to sometimes see
:FzfLua buffers
behave strangely:No error, just that lonely
^[i
right there (it's always there when this happens). The thing is, it's not consistent (although once it starts, it'll stay there). I could reproduce with the minimal setup.It also seems like it may somehow be related to commands related to vim. These are some commands that stop working when this happens:
But other commands do work normally, even when this starts happening, like:
If I run neovim from ZSH, this doesn't happen, it works as expected.
My guess is that something is breaking somewhere, but not showing the error for some reason. I checked
messages
and stuff, but couldn't find anything.Do you know what could be going on or how I could debug this further?