junegunn / fzf

🌸 A command-line fuzzy finder
https://junegunn.github.io/fzf/
MIT License

panic: runtime error: invalid memory address or nil pointer dereference #3890

Closed michaeltraxler closed 6 days ago

michaeltraxler commented 1 week ago

Checklist

Output of fzf --version

0.53.0 (c4a9ccd)

OS

Shell

Problem / Steps to reproduce

When typing: kill ** the result is:

 % kill **panic: runtime error: invalid memory address or nil pointer dereference                                                                                               (23.06. 15:26:13) !49725
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x59462b]

goroutine 1 [running]:
main.exit(0x2, {0x0?, 0x0?})
    github.com/junegunn/fzf/main.go:43 +0x2b
main.main()

LangLangBart commented 1 week ago

It's possible the issue has already been fixed in the current master version.

Ref: #3863

If you'd like, you can try downloading the repository, running make, and checking the binary in the target folder.

michaeltraxler commented 1 week ago

Thanks a lot! Yes, after rebuilding fzf the crash is gone. But in zsh with a freshly installed oh-my-zsh (I also created a new account for this), the output is `kill **input/output error`. When I press TAB, it just appends `input/output error` to the current line and reprints `kill **` on a new line. Might this also be a bug in oh-my-zsh?

LangLangBart commented 1 week ago

> Thanks a lot! Yes, after rebuilding fzf the crash is gone.

👍

> But in zsh with a freshly installed oh-my-zsh (I also created a new account for this), the output is `kill **input/output error`. When I press TAB, it just appends `input/output error` to the current line and reprints `kill **` on a new line. Might this also be a bug in oh-my-zsh?

Not able to reproduce with a minimal .zshrc file.

# my demo .zshrc

export ZSH="$HOME/.oh-my-zsh"
export PATH="$HOME/Developer/fzf/bin:$PATH"
ZSH_THEME="robbyrussell"

plugins=(fzf)

source $ZSH/oh-my-zsh.sh

What is the output of this command?

bindkey | grep -F '^I'

Did you source the fzf completion.zsh file differently?

Any other plugins installed? Try to disable them.

michaeltraxler commented 1 week ago

I made a new user and skipped oh-my-zsh and just did the recommended fzf install method:

git clone --depth 1 https://github.com/junegunn/fzf.git ~/.fzf
~/.fzf/install

I then copied the newly compiled fzf (the one without the segmentation fault) to ~/bin, and I get the same error here: `kill **input/output error`

michaeltraxler commented 1 week ago

The output for the fzf-only user is:

l10% bindkey | grep -F '^I'
"^I" fzf-completion
"^[^I" self-insert-unmeta

For the oh-my-zsh user it is the same.

By the way, ctrl-r, ctrl-t and alt-c always work very well (also with the original binary).

LangLangBart commented 1 week ago

Can you try a zsh session without reading your .zshrc file?

# new  session
zsh -f
source ~/.fzf/shell/completion.zsh
kill **
# issue still there ?

EDIT:

The next step would be to try to debug it.

zsh -f
source ~/.fzf/shell/completion.zsh
typeset -ft fzf-completion __fzf_defaults __fzf_comprun __fzf_extract_command _fzf_feed_fifo _fzf_complete _fzf_complete_kill _fzf_complete_kill_post 
kill **
# share verbose output …

michaeltraxler commented 1 week ago

I'm sorry, part of this is due to my testing procedure. If I test the following way, coming from the root user:

l10:~ # su - test3
kill **

If I use:

su -P - test3 -c 'zsh'

I get the input/output error.

But this still doesn't solve the original issue with the normal account.

LangLangBart commented 1 week ago

Can you try the debug snippet with the typeset -ft … command to enable xtrace in the functions of the completion.zsh file? It would be interesting to know which command causes the input/output error.


> If I use:
>
> su -P - test3 -c 'zsh'

The default version of su under macOS doesn't provide the -P/--pty flag.

# default 'su' man page under macOS
man =(curl https://raw.githubusercontent.com/apple-oss-distributions/shell_cmds/main/su/su.1)

The util-linux formula includes the su command, but it isn't supported on macOS[^1].

[^1]: util-linux — Homebrew Formulae

michaeltraxler commented 1 week ago

Thanks for your help!

The first thing to do was to log on to the account without residual environment variables, which stay when using just zsh -f. So, I used ssh localhost -t 'zsh -f' to get a clean environment.
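A lighter-weight way to drop residual environment variables is `env -i`, sketched below in POSIX sh. This is only an approximation of the ssh approach above: it clears variables but does not create a new login session or pseudo-terminal. The helper name `clean_run` and the choice to pass through only HOME, TERM and PATH are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: run a command with an almost empty environment.
# Assumption: only HOME, TERM and PATH are passed through; everything else is dropped.
clean_run() {
  env -i HOME="${HOME:-/}" TERM="${TERM:-dumb}" PATH="$PATH" "$@"
}

# Hypothetical usage (not executed here):
#   clean_run zsh -f
```

Inside the resulting shell, the same `source` and `typeset -ft` steps can then be repeated.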

Then source ~/.fzf/shell/completion.zsh didn't work (no ctrl-r, for example); I had to use source ~/.fzf.zsh

The output with typeset ... is here:

l10% kill **+fzf-completion:1> local tokens cmd prefix trigger tail matches lbuf d_cmds
+fzf-completion:2> setopt localoptions noshwordsplit noksh_arrays noposixbuiltins
+fzf-completion:6> tokens=( kill '**' )
+fzf-completion:7> [ 2 -lt 1 ']'
+fzf-completion:12> cmd=+fzf-completion:12> __fzf_extract_command 'kill **'
+__fzf_extract_command:1> local token tokens
+__fzf_extract_command:2> tokens=( kill '**' )
+__fzf_extract_command:3> token=kill
+__fzf_extract_command:4> token=kill
+__fzf_extract_command:5> [[ "$token" -regex-match [[:alnum:]] && ! "$token" -regex-match "=" ]]
+__fzf_extract_command:6> echo kill
+__fzf_extract_command:7> return
+fzf-completion:12> cmd=kill
+fzf-completion:15> trigger='**'
+fzf-completion:16> [ -z '**' -a '*' '=' ' ' ']'
+fzf-completion:19> [[ 'kill **' = *kill\*\* ]]
+fzf-completion:24> lbuf='kill **'
+fzf-completion:25> tail='**'
+fzf-completion:28> [ 2 -gt 1 -a '**' '=' '**' ']'
+fzf-completion:29> d_cmds=( cd pushd rmdir )
+fzf-completion:31> [ -z '**' ']'
+fzf-completion:31> prefix=''
+fzf-completion:32> [[ '' = *\$\(* ]]
+fzf-completion:32> [[ '' = *\<\(* ]]
+fzf-completion:32> [[ '' = *\>\(* ]]
+fzf-completion:32> [[ '' = *:=* ]]
+fzf-completion:32> [[ '' = *`* ]]
+fzf-completion:35> [ -n '**' ']'
+fzf-completion:35> lbuf='kill '
+fzf-completion:37> eval 'type _fzf_complete_kill > /dev/null'
+(eval):1> type _fzf_complete_kill
+fzf-completion:38> prefix='' +fzf-completion:38> eval _fzf_complete_kill 'kill\ '
+(eval):1> _fzf_complete_kill 'kill '
+_fzf_complete_kill:1> _fzf_complete -m '--header-lines=1' --preview 'echo {}' --preview-window down:3:wrap --min-height 15 -- 'kill '
+_fzf_complete:1> setopt localoptions ksh_arrays
+_fzf_complete:3> local args rest str_arg i sep
+_fzf_complete:4> args=( -m '--header-lines=1' --preview 'echo {}' --preview-window down:3:wrap --min-height 15 -- 'kill ' )
+_fzf_complete:5> sep=''
+_fzf_complete:6> i=0
+_fzf_complete:7> [[ -m = -- ]]
+_fzf_complete:6> i=1
+_fzf_complete:7> [[ '--header-lines=1' = -- ]]
+_fzf_complete:6> i=2
+_fzf_complete:7> [[ --preview = -- ]]
+_fzf_complete:6> i=3
+_fzf_complete:7> [[ 'echo {}' = -- ]]
+_fzf_complete:6> i=4
+_fzf_complete:7> [[ --preview-window = -- ]]
+_fzf_complete:6> i=5
+_fzf_complete:7> [[ down:3:wrap = -- ]]
+_fzf_complete:6> i=6
+_fzf_complete:7> [[ --min-height = -- ]]
+_fzf_complete:6> i=7
+_fzf_complete:7> [[ 15 = -- ]]
+_fzf_complete:6> i=8
+_fzf_complete:7> [[ -- = -- ]]
+_fzf_complete:8> sep=8
+_fzf_complete:9> break
+_fzf_complete:12> [[ -n 8 ]]
+_fzf_complete:13> str_arg=''
+_fzf_complete:14> rest=( 'kill ' )
+_fzf_complete:15> args=( -m '--header-lines=1' --preview 'echo {}' --preview-window down:3:wrap --min-height 15 )
+_fzf_complete:23> local fifo lbuf cmd matches post
+_fzf_complete:24> fifo=/tmp/fzf-complete-fifo-31493
+_fzf_complete:25> lbuf='kill '
+_fzf_complete:26> cmd=+_fzf_complete:26> __fzf_extract_command 'kill '
+__fzf_extract_command:1> local token tokens
+__fzf_extract_command:2> tokens=( kill )
+__fzf_extract_command:3> token=kill
+__fzf_extract_command:4> token=kill
+__fzf_extract_command:5> [[ "$token" -regex-match [[:alnum:]] && ! "$token" -regex-match "=" ]]
+__fzf_extract_command:6> echo kill
+__fzf_extract_command:7> return
+_fzf_complete:26> cmd=kill
+_fzf_complete:27> post=_fzf_complete_kill_post
+_fzf_complete:28> type _fzf_complete_kill_post
+_fzf_complete:30> _fzf_feed_fifo /tmp/fzf-complete-fifo-31493
+_fzf_feed_fifo:1> rm -f /tmp/fzf-complete-fifo-31493
+_fzf_feed_fifo:2> mkfifo /tmp/fzf-complete-fifo-31493
+_fzf_complete:31> matches=+_fzf_feed_fifo:3> cat
+_fzf_complete:31> matches=+_fzf_complete:34> _fzf_complete_kill_post
+_fzf_complete:32> FZF_DEFAULT_OPTS=+_fzf_complete:32> __fzf_defaults --reverse ' '
+__fzf_defaults:3> echo '--height 40% --bind=ctrl-z:ignore --reverse'
+_fzf_complete:31> matches=+_fzf_complete:34> tr '\n' ' '
+_fzf_complete_kill_post:1> awk '{print $2}'
+__fzf_defaults:4> cat ''
+__fzf_defaults:5> echo '  '
+_fzf_complete:32> FZF_DEFAULT_OPTS=$'--height 40% --bind=ctrl-z:ignore --reverse\n  ' FZF_DEFAULT_OPTS_FILE='' +_fzf_complete:32> __fzf_comprun kill -m '--header-lines=1' --preview 'echo {}' --preview-window down:3:wrap --min-height 15 -q ''
+__fzf_comprun:1> [[ "$(type _fzf_comprun 2>&1)" -regex-match function+__fzf_comprun:1> type _fzf_comprun
+__fzf_comprun:1> [[ "$(type _fzf_comprun 2>&1)" -regex-match function ]]
+__fzf_comprun:3> [ -n '' ']'
+__fzf_comprun:11> shift
+__fzf_comprun:12> fzf -m '--header-lines=1' --preview 'echo {}' --preview-window down:3:wrap --min-height 15 -q ''
input/output error
+_fzf_complete:31> matches=''
+_fzf_complete:35> [ -n '' ']'
+_fzf_complete:38> rm -f /tmp/fzf-complete-fifo-31493
+fzf-completion:39> zle reset-prompt

EDIT: If I only use source ~/.fzf/shell/completion.zsh the output is the same...

LangLangBart commented 1 week ago

> +__fzf_comprun:3> [ -n '' ']'
> +__fzf_comprun:11> shift
> +__fzf_comprun:12> fzf -m '--header-lines=1' --preview 'echo {}' --preview-window down:3:wrap --min-height 15 -q ''
> input/output error
> +_fzf_complete:31> matches=''
> +_fzf_complete:35> [ -n '' ']'

The __fzf_comprun function, along with its arguments, is invoked from the _fzf_complete function.

https://github.com/junegunn/fzf/blob/cc2b2146ee30ad38f3ed62c43bb211af48f88d2f/shell/completion.zsh#L236-L244
https://github.com/junegunn/fzf/blob/cc2b2146ee30ad38f3ed62c43bb211af48f88d2f/shell/completion.zsh#L107-L121

This command with the input from < "$fifo" seems to cause the input/output error message, which I am still unable to reproduce on macOS.

FZF_DEFAULT_OPTS='--height 40% --bind=ctrl-z:ignore --reverse' \
  fzf -m --header-lines=1 --preview 'echo {}' --preview-window down:3:wrap --min-height 15 -q ''

Are you aware if the kill ** command worked in previous versions of fzf and only stopped working recently? If you're unsure, could you test this with older versions of the fzf binary to see if the problem persists? If the issue doesn't occur in older versions, it would be helpful to share the oldest working fzf release or even perform a bisect [^1][^2] analysis on the fzf source code.
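For anyone who has not used bisect before, a manual session might be started roughly as sketched here. The helper name `start_bisect` and its three arguments (repository path, known-good revision, known-bad revision) are hypothetical placeholders; marking each step by hand is assumed because the failure only shows up when pressing TAB in an interactive zsh.

```shell
#!/bin/sh
# Sketch: start a manual bisect between a known-good and a known-bad revision.
# All three arguments are placeholders supplied by the caller.
start_bisect() {
  repo=$1 good=$2 bad=$3
  git -C "$repo" bisect start &&
    git -C "$repo" bisect bad "$bad" &&
    git -C "$repo" bisect good "$good"
}

# Hypothetical usage (not executed here):
#   start_bisect ~/.fzf <good-rev> <bad-rev>
#   (cd ~/.fzf && make)            # rebuild, then retry the completion in zsh
#   git -C ~/.fzf bisect good      # or: git -C ~/.fzf bisect bad
#   git -C ~/.fzf bisect reset     # when finished
```

After each `good` or `bad` mark, git checks out the next candidate commit until only one remains.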

[^1]: Git - git-bisect Documentation
[^2]: Git - Debugging with Git

michaeltraxler commented 1 week ago

This is all so weird... Again, some observations: with a new user and only fzf installed, kill ** works in bash, but in zsh it completely blocks the shell; ctrl-z and ctrl-c do not work. The following processes are running and stuck:

test4     4379  0.0  0.0   5612  4008 pts/82   TN   14:59   0:00 tr \n
test4     4382  0.0  0.0   9076  5820 pts/82   TN   14:59   0:00 awk {print $2}
test4     4384  0.0  0.0 1822044 6272 pts/82   TNl  14:59   0:00 fzf -m --header-lines=1 --preview echo {} --preview-window down:3:wrap --min-height 15 -q

A kill -9 4379 (the "tr" process) releases the shell.

Exactly the same behavior occurs on a different Tumbleweed system. If I do the same on openSUSE Leap 15.5, it works!

So, some external tools (which are newer on Tumbleweed) are doing something different?

michaeltraxler commented 1 week ago

The problem can be easily reproduced with a container:

The following is the build file:

FROM registry.opensuse.org/opensuse/tumbleweed:latest
RUN zypper ref
RUN zypper -vn dup -l
RUN zypper in -y fzf zsh awk 

If you write this to opensuse_tumbleweed_fzf.txt, then this will build the container:

podman build -t opensuse_fzf -f opensuse_tumbleweed_fzf.txt

Then it can be run:

podman run --rm -ti opensuse_fzf:latest /usr/bin/zsh

Inside zsh, one can enable fzf completion and key bindings:

source /usr/share/fzf/shell/key-bindings.zsh; source /usr/share/fzf/shell/completion.zsh

Then kill ** <TAB> will show the problem.

michaeltraxler commented 1 week ago

I could find an older executable, 0.52.0, on a different Tumbleweed system.
This executable causes the same problem on a new Tumbleweed. On the Tumbleweed "Release: 20240507", it works somehow: not with "**", but with kill<space><TAB>, which then shows the processes in fzf. With "**", it only shows all files recursively...

LangLangBart commented 1 week ago

> will show the problem.

Thanks for the setup. I used docker and was able to reproduce it inside the container.

[!NOTE] I never managed to reproduce the issue on my normal macOS version outside of a Docker container. Even after trying with coreutils tools, the input/output error never occurred for me. It must be a platform-specific bug.


Minimal reproduction

Dockerfile

---

```docker
FROM registry.opensuse.org/opensuse/tumbleweed:latest
RUN zypper refresh
RUN zypper --verbose --non-interactive dist-upgrade --auto-agree-with-licenses
RUN zypper install --no-confirm fzf awk git make go vim zsh
# Clone the fzf repository and build the binary
RUN git clone https://github.com/junegunn/fzf.git /fzf && \
    cd /fzf && \
    make && make install
ENV PATH="/fzf/bin:$PATH"
```

---

brew install --cask docker
docker build --tag opensuse_fzf --file Dockerfile .
docker run --rm --tty --interactive opensuse_fzf:latest /usr/bin/zsh

run the function below

test_fail() {
  PS4='%B%F{0}+ %D{%T:%3.} %2N:%i%f%b '
  setopt localoptions xtrace verbose
  fifo=$(mktemp -u)
  mkfifo "$fifo"
  (
    echo "Boom" >"$fifo" &
  )
  matches=$(fzf <"$fifo")
  rm -f "$fifo"
}

test_fail

output

+ 03:24:33:752 test_fail:3 fifo=
+ 03:24:33:753 test_fail:3 mktemp -u
+ 03:24:33:752 test_fail:3 fifo=/tmp/tmp.yNSG6LtONw 
+ 03:24:33:756 test_fail:4 mkfifo /tmp/tmp.yNSG6LtONw
+ 03:24:33:761 test_fail:8 matches=
+ 03:24:33:762 test_fail:6 echo Boom
+ 03:24:33:762 test_fail:8 fzf
input/output error
+ 03:24:33:761 test_fail:8 matches='' 
+ 03:24:33:774 test_fail:9 rm -f /tmp/tmp.yNSG6LtONw
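
The FIFO hand-off that test_fail exercises can be sketched in plain POSIX sh. This is an illustrative sketch only: `cat` stands in for fzf (an assumption, so that the example runs without fzf or a terminal), and `feed_and_read` is a hypothetical helper name. Because `cat` only reads the FIFO and writes to the command substitution, this does not reproduce the input/output error; it only shows how data travels from the background writer through the FIFO into `matches`.

```shell
#!/bin/sh
# Illustrative only: `cat` stands in for fzf, so this shows the data flow,
# not the input/output error itself.
feed_and_read() {
  fifo=$(mktemp -u)            # pick an unused path, as test_fail does
  mkfifo "$fifo" || return 1
  # Writer runs in the background; it blocks until a reader opens the FIFO.
  printf '%s\n' "$1" >"$fifo" &
  # Reader: the command substitution captures whatever arrives through the FIFO.
  matches=$(cat <"$fifo")
  wait                         # reap the background writer
  rm -f "$fifo"
  printf '%s\n' "$matches"
}

feed_and_read "Boom"
```

Unlike test_fail, the writer here is started directly with `&` rather than inside a `( … )` subshell, so the sketch does not exercise the subshell behavior discussed in this thread.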

Workaround

--- a/shell/completion.zsh
+++ b/shell/completion.zsh
@@ -199,9 +199,9 @@ _fzf_dir_completion() {
 }

-_fzf_feed_fifo() (
+_fzf_feed_fifo() {
   command rm -f "$1"
   mkfifo "$1"
-  cat <&0 > "$1" &
-)
+  cat <&0 > "$1" &|
+}

 _fzf_complete() {

Evolution of the error message by fzf

up to but not including e8405f40fe2eb3675f1cb4f69e825eff5f13f269

source /fzf/shell/completion.zsh
# Failed to read /dev/tty
kill **22R

since e8405f40fe2eb3675f1cb4f69e825eff5f13f269 and up to but not including 83b603390683d49ff75b72d142b4dba4b5186d73

source /fzf/shell/completion.zsh
kill **22R

since 83b603390683d49ff75b72d142b4dba4b5186d73 and up to but not including 94c33ac020fd7e846073cabffa5cf13386f3c852

source /fzf/shell/completion.zsh
# panic: runtime error: invalid memory address or nil pointer dereference
# [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x556fc85c0583]
# 
# goroutine 1 [running]:
# main.exit(0x2, {0x0?, 0x0?})
#         /home/abuild/rpmbuild/BUILD/fzf-0.53.0/main.go:43 +0x23
# main.main()
#         /home/abuild/rpmbuild/BUILD/fzf-0.53.0/main.go:100 +0x3ea
kill **;22R

since 94c33ac020fd7e846073cabffa5cf13386f3c852 and up to but not including 2326c74eb241d4667545eae3deaded4a06710521

source /fzf/shell/completion.zsh
kill **22R

since 2326c74eb241d4667545eae3deaded4a06710521

source /fzf/shell/completion.zsh
kill **input/output error

junegunn commented 1 week ago

@LangLangBart Thanks a lot for the repro. Interestingly, the bash version works fine.

[image]

junegunn commented 1 week ago

> Workaround
>
> • remove the sub shell and use a &| (disowned job)

I can confirm that it fixes the problem. Do you think we should do this?

LangLangBart commented 1 week ago

> Do you think we should do this?

I have not yet found a satisfactory explanation that justifies the change beyond "it works", and I would like to spend more time examining the source code.

junegunn commented 6 days ago

That's a reasonable approach.

Possibly related: https://github.com/wavetermdev/waveterm/pull/608/files / https://github.com/wavetermdev/waveterm/issues/630

But if I recall correctly, the reason I used a subshell in _fzf_feed_fifo was simply to suppress job control messages. Disowning has the same effect, and it fixes this problem, so there's no reason not to use the method. I'll apply the suggested workaround. Thanks.

LangLangBart commented 6 days ago

I could not reproduce the issue on macOS because it stems from a patch that the openSUSE zsh package received approximately 7 months ago[^1]. This patch, titled pipe-less-and-signals-handling.patch[^2], was included in revision 103.

If one were to check out revision 102 of the zsh package and try running fzf completion there, no issue would occur.

---

Instructions for Building zsh Revision 102

Using `osc`[^3] to work with the `Open Build Service` and a `Docker` container.

Create an account on: [build.opensuse.org](https://build.opensuse.org/)

Create an `oscrc` file containing your username and password.

```cfg
[general]
apiurl=https://api.opensuse.org

[https://api.opensuse.org]
user=
pass=
credentials_mgr_class=osc.credentials.PlaintextConfigFileCredentialsManager
```

```docker
FROM registry.opensuse.org/opensuse/tumbleweed:latest
RUN zypper refresh
RUN zypper --verbose --non-interactive dist-upgrade --auto-agree-with-licenses

# some tools
RUN zypper install --no-confirm fzf git vim zsh

# Install OSC and other tools
RUN zypper install --no-confirm osc build obs-service-format_spec_file \
    patterns-devel-base-devel_basis hostname sudo tar zstd diffutils

# Copy the OSC configuration file with username and password
COPY oscrc /root/.config/osc/oscrc

RUN osc -A https://api.opensuse.org checkout openSUSE:Factory/zsh && cd $_ && osc up --revision 102
WORKDIR openSUSE:Factory/zsh

# directory and shell level
ENV PROMPT="%~ %L > "
```

Start building the Docker image and run it with the `--privileged` flag.

**NOTE:** The `osc build` command cannot be run during `docker build …` because it would cause a permission error when attempting to mount `proc`. Therefore, we perform it after executing `docker run …`.

```sh
mount: /var/tmp/build-root/standard-x86_64/.mount/proc: permission denied.
       dmesg(1) may have more information after failed mount system call.
```

```
docker build --tag opensuse_osc --file Dockerfile_osc .
docker run --tty --interactive --rm --privileged opensuse_osc:latest /usr/bin/zsh
```

Build the `zsh` package for revision 102, start a new session, and notice that the issue no longer occurs.

```sh
# process will take ~5min
osc build

# start the `zsh` session
/var/tmp/build-root/standard-x86_64/usr/bin/zsh -f

# test sourcing and completion with fzf, no issue anymore
source <(fzf --zsh)
kill **
```

---

[!NOTE] FYI: The issue doesn't occur on openSUSE Leap (predecessor to openSUSE Tumbleweed) because Leap runs a different zsh version (5.6). This is also why opensuse/tumbleweed seems to be the only Linux distro to have this issue. Tested on Ubuntu, Alpine, etc.


Alternative solutions to a06745826a4cba4f578a69258f9def75c59530fc

[^1]: Revisions of zsh - openSUSE Build Service
[^2]: File pipe-less-and-signals-handling.patch of Package zsh - openSUSE Build Service
[^3]: osc, the Command Line Tool | User Guide
[^4]: zsh/code/3c7489: Improve process group handling in pipelines.

LangLangBart commented 6 days ago

@junegunn Since the patch from openSUSE is based solely on the development version of the zsh[^1][^2] source code, I would suggest keeping the change for now, as future zsh releases will likely behave the same way.

[^1]: zsh / Code / Commit [188c5c]
[^2]: zsh / Code / Commit [61610e]

junegunn commented 6 days ago

@LangLangBart Thank you very much for the great analysis.