vim / vim

The official Vim repository
https://www.vim.org
Vim License
36.63k stars 5.46k forks source link

Vim hangs mid-completion when used with channel-based listener_add callback #4493

Closed myitcv closed 5 years ago

myitcv commented 5 years ago

Describe the bug

This was spotted in the context of govim - I haven't managed to find a smaller reproduction.

The setup is as follows, main.go initially populated with:

package main

func main() {
    mai
}

We are in insert mode, with the cursor at position (4,5).

Now we trigger the omnifunc with <C-x><C-o>; in the context this results in a single completion result, main, which duly completes to give us:

package main

func main() {
    main
}

The sequence from the Vim channel log corresponding to this is:

  2.274961 SEND on 0(in): '[32,["function","function:GOVIM_internal_Complete",[1,""]]]
'
  2.275044 on 0: Reading JSON
  2.275056 : looking for messages on channels
  2.275062 on 0: Waiting for up to 30000 msec
  2.275680 RECV on 0(out): '[0,[26,"expr","{\"bufnum\": bufnr(\"\"), \"line\": line(\".\"), \"col\": col(\".\")}"]]
'
  2.275698 : looking for messages on channels
  2.275755 on 0: Getting JSON message 0
  2.275769 on 0: Invoking channel callback <SNR>46_define
  2.275963 SEND on 0(in): '[33,["callback",26,["",{"bufnum":1,"col":5,"line":4}]]]
'
  2.276028 : looking for messages on channels
  2.276040 on 0: Waiting for up to 30000 msec
  2.287999 RECV on 0(out): '[32,["",1]]
'
  2.288016 on 0: Getting JSON message 32
  2.288228 SEND on 0(in): '[34,["function","function:GOVIM_internal_Complete",[0,"mai"]]]
'
  2.288256 on 0: Reading JSON
  2.288262 : looking for messages on channels
  2.288264 on 0: Waiting for up to 30000 msec
  2.288571 RECV on 0(out): '[34,["",[{"abbr":"main()","word":"main","info":""}]]]
'
  2.288581 on 0: Getting JSON message 34
  2.288756 SEND on 0(in): '[35,["function","function:GOVIM_internal_BufChanged",[1,4,5,0,[{"lnum":4,"col":4,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":4,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":5,"added":0,"end":5,"lines":["\tmain"]}]]]]
'
  2.288795 on 0: Reading JSON
  2.288800 : looking for messages on channels
  2.288803 on 0: Waiting for up to 30000 msec
  2.289967 RECV on 0(out): '[35,["",null]]
'
  2.289991 on 0: Getting JSON message 35
  2.292119 SEND on 0(in): '[36,["function","function:GOVIM_internal_BufChanged",[1,4,5,0,[{"lnum":4,"col":5,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":4,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":4,"added":0,"end":5,"lines":["\tmain"]},{"lnum":4,"col":5,"added":0,"end":5,"lines":["\tmain"]}]]]]
'
  2.292177 on 0: Reading JSON
  2.292200 : looking for messages on channels
  2.292209 on 0: Waiting for up to 30000 msec
  2.293328 RECV on 0(out): '[36,["",null]]
'
  2.293358 on 0: Getting JSON message 36
  2.296341 : looking for messages on channels

~Note:~

~Otherwise,~ All is well at this point.

If, still in insert mode, we now hit backspace to remove the n on line 4, all is fine: we see the following in the Vim channel log:

  3.130369 SEND on 0(in): '[37,["function","function:GOVIM_internal_BufChanged",[1,4,5,0,[{"lnum":4,"col":5,"added":0,"end":5,"lines":["\tmai"]},{"lnum":4,"col":4,"added":0,"end":5,"lines":["\tmai"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tmai"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tmai"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tmai"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tmai"]},{"lnum":4,"col":4,"added":0,"end":5,"lines":["\tmai"]}]]]]
'
  3.130454 on 0: Reading JSON
  3.130467 : looking for messages on channels
  3.130473 on 0: Waiting for up to 30000 msec
  3.132323 RECV on 0(out): '[37,["",null]]
'
  3.132343 on 0: Getting JSON message 37
  3.134817 : looking for messages on channels

All is still well at this point.

If we then hit backspace again, the following sequence occurs:

  5.571337 SEND on 0(in): '[38,["function","function:GOVIM_internal_Complete",[1,""]]]
'
  5.571422 on 0: Reading JSON
  5.571434 : looking for messages on channels
  5.571440 on 0: Waiting for up to 30000 msec
  5.572045 RECV on 0(out): '[0,[27,"expr","{\"bufnum\": bufnr(\"\"), \"line\": line(\".\"), \"col\": col(\".\")}"]]
'
  5.572064 : looking for messages on channels
  5.572086 on 0: Getting JSON message 0
  5.572092 on 0: Invoking channel callback <SNR>46_define
  5.572284 SEND on 0(in): '[39,["callback",27,["",{"bufnum":1,"col":4,"line":4}]]]
'
  5.572761 SEND on 0(in): '[40,["function","function:GOVIM_internal_BufChanged",[1,4,5,0,[{"lnum":4,"col":4,"added":0,"end":5,"lines":["\tma"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tma"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tma"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tma"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tma"]}]]]]
'
  5.572784 on 0: Reading JSON
  5.572789 : looking for messages on channels
  5.572793 on 0: Waiting for up to 30000 msec
  5.574954 RECV on 0(out): '[38,["",1]]
'
  5.574977 : looking for messages on channels
  5.575034 on 0: Getting JSON message 38
  5.575044 on 0: Dropping message 38 without callback
  5.575049 : looking for messages on channels
  5.575055 on 0: Waiting for up to 30000 msec
  5.575830 RECV on 0(out): '[40,["",null]]
'
  5.575845 on 0: Getting JSON message 40
  5.575955 : looking for messages on channels
  5.576027 on 0: Waiting for up to 30000 msec

Vim then hangs, only for 30 secs later (I have my channel timeout at 30secs so this corresponds):

 35.587927 on 0: Timed out
 35.588509 SEND on 0(in): '[41,["function","function:GOVIM_internal_Complete",[0,"\tma"]]]
'
 35.588638 on 0: Reading JSON
 35.588679 : looking for messages on channels
 35.588765 on 0: Waiting for up to 30000 msec
 35.589272 RECV on 0(out): '[41,["",[{"abbr":"main()","word":"main","info":""},{"abbr":"make(t Type, size ...int)","word":"make","info":"Type"}]]]
'
 35.589295 on 0: Getting JSON message 41
 35.589759 SEND on 0(in): '[42,["function","function:GOVIM_internal_BufChanged",[1,4,5,0,[{"lnum":4,"col":4,"added":0,"end":5,"lines":["\tma{\u0001V"]},{"lnum":4,"col":5,"added":0,"end":5,"lines":["\tma{\u0001V"]},{"lnum":4,"col":6,"added":0,"end":5,"lines":["\tma{\u0001V"]}]]]]
'
 35.589903 on 0: Reading JSON
 35.589979 : looking for messages on channels
 35.590039 on 0: Waiting for up to 30000 msec
 35.591493 RECV on 0(out): '[42,["",null]]
'
 35.591551 on 0: Getting JSON message 42
 35.592308 : looking for messages on channels

govim does not appear to be waiting/doing anything during this 30sec period; it seems that Vim is waiting on something... but it's unclear to me what that is.

Full log can be found at https://gist.github.com/myitcv/b99443a1e597898a2978af9e25f6206f

To Reproduce

I can reproduce this using govim; I haven't managed to find a smaller repro.

Expected behavior

No hang.

Screenshots

If applicable, copy/paste the text or add screenshots to help explain your problem.

Environment (please complete the following information):

VIM - Vi IMproved 8.1 (2018 May 18, compiled Jun  5 2019 11:02:31)
Included patches: 1-1467
Compiled by myitcv@myitcv
Huge version with GTK2 GUI.  Features included (+) or not (-):
+acl               +extra_search      +mouse_netterm     -tag_old_static
+arabic            -farsi             +mouse_sgr         -tag_any_white
+autocmd           +file_in_path      -mouse_sysmouse    -tcl
+autochdir         +find_in_path      +mouse_urxvt       +termguicolors
-autoservername    +float             +mouse_xterm       +terminal
+balloon_eval      +folding           +multi_byte        +terminfo
+balloon_eval_term -footer            +multi_lang        +termresponse
+browse            +fork()            -mzscheme          +textobjects
++builtin_terms    +gettext           -netbeans_intg     +textprop
+byte_offset       -hangul_input      +num64             +timers
+channel           +iconv             +packages          +title
+cindent           +insert_expand     +path_extra        +toolbar
+clientserver      +job               -perl              +user_commands
+clipboard         +jumplist          +persistent_undo   +vartabs
+cmdline_compl     +keymap            +postscript        +vertsplit
+cmdline_hist      +lambda            +printer           +virtualedit
+cmdline_info      +langmap           +profile           +visual
+comments          +libcall           +python            +visualextra
+conceal           +linebreak         -python3           +viminfo
+cryptv            +lispindent        +quickfix          +vreplace
+cscope            +listcmds          +reltime           +wildignore
+cursorbind        +localmap          +rightleft         +wildmenu
+cursorshape       +lua               -ruby              +windows
+dialog_con_gui    +menu              +scrollbind        +writebackup
+diff              +mksession         +signs             +X11
+digraphs          +modify_fname      +smartindent       -xfontset
+dnd               +mouse             +startuptime       +xim
-ebcdic            +mouseshape        +statusline        +xpm
+emacs_tags        +mouse_dec         -sun_workshop      +xsmp_interact
+eval              -mouse_gpm         +syntax            +xterm_clipboard
+ex_extra          -mouse_jsbterm     +tag_binary        -xterm_save
   system vimrc file: "$VIM/vimrc"
     user vimrc file: "$HOME/.vimrc"
 2nd user vimrc file: "~/.vim/vimrc"
      user exrc file: "$HOME/.exrc"
  system gvimrc file: "$VIM/gvimrc"
    user gvimrc file: "$HOME/.gvimrc"
2nd user gvimrc file: "~/.vim/gvimrc"
       defaults file: "$VIMRUNTIME/defaults.vim"
    system menu file: "$VIMRUNTIME/menu.vim"
  fall-back for $VIM: "/home/myitcv/usr/vim/share/vim"
Compilation: gcc -c -I. -Iproto -DHAVE_CONFIG_H -DFEAT_GUI_GTK  -pthread -I/usr/include/gtk-2.0 -I/usr/lib/x86_64-linux-gnu/gtk-2.0/include -I/usr/include/gio-unix-2.0 -I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/atk-1.0 -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/gdk-pixbuf-2.0 -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/pango-1.0 -I/usr/include/harfbuzz -I/usr/include/pango-1.0 -I/usr/include/fribidi -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/uuid -I/usr/include/freetype2 -I/usr/include/libpng16   -g -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1       
Linking: gcc   -L/usr/local/lib -Wl,--as-needed -o vim   -lgtk-x11-2.0 -lgdk-x11-2.0 -lpangocairo-1.0 -latk-1.0 -lcairo -lgdk_pixbuf-2.0 -lgio-2.0 -lpangoft2-1.0 -lpango-1.0 -lgobject-2.0 -lglib-2.0 -lfontconfig -lfreetype -lSM -lICE -lXpm -lXt -lX11 -lXdmcp -lSM -lICE  -lm -ltinfo -lnsl   -ldl  -L/usr/lib/x86_64-linux-gnu -lluajit-5.1  -L/home/myitcv/.pythons/2.7.15/lib/python2.7/config -lpython2.7 -lpthread -ldl -lutil -lm -Xlinker -export-dynamic       

Additional context Add any other context about the problem here.

brammool commented 5 years ago

This is too complicated to debug. Please add some debug statements to find out where those duplicated changes come from (they might actually be valid and caused by autocompletion), and where the hang is happening.

myitcv commented 5 years ago

This is too complicated to debug

Agreed.

First, the good news. I just realised the multiple changes are not actually identical; they differ in the 'col' value (which is not of interest to me, hence my eye skipped over that detail). So the calls to the listener_add callback function look ok, including the multiple changes. Updated the description to reflect this.

I further managed to bisect the hang issue I'm seeing to the following setting in my .vimrc:

set bs=2

With set bs=2, I can recreate the hang as described above. What's important about that setup is:

Without that setting (which I think is a default of set bs=) I cannot reproduce the hang.

Using the simple .vimrc at the end of this comment which simulates a non-channel based omnifunc and listener_add callback I can't reproduce the hang, either with set bs=2 or otherwise.

So it seems that the hang is some function of the fact govim has channel based omnifunc and listener_add callbacks. The only thing I noticed from my report in the description above is that the listener_add callback is called concurrently with the completion function:

First the completion function is triggered:

  5.571337 SEND on 0(in): '[38,["function","function:GOVIM_internal_Complete",[1,""]]]

then the listener_add callback is called:

  5.572761 SEND on 0(in): '[40,["function","function:GOVIM_internal_BufChanged",[1,4,5,0,[{"lnum":4,"col":4,"added":0,"end":5,"lines":["\tma"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tma"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tma"]},{"lnum":4,"col":2,"added":0,"end":5,"lines":["\tma"]},{"lnum":4,"col":3,"added":0,"end":5,"lines":["\tma"]}]]]]

then the completion function returns:

  5.574954 RECV on 0(out): '[38,["",1]]

then the listener_add callback returns:

  5.575830 RECV on 0(out): '[40,["",null]]

I don't know whether that is significant or not, but such a scenario is obviously impossible in the VimScript based setup that follows.

set nocp
set noswapfile
set completeopt=menu,longest
set bs=2
function DoComplete(findstart, base)
  if a:findstart == 1
    return 0
  else
    if a:base == "mai"
      return [{"abbr":"main()","word":"main","info":""}]
    elseif a:base == "ma"
      return [{"abbr":"main()","word":"main","info":""},{"abbr":"make(t Type, size ...int)","word":"make","info":"Type"}]
    endif
  endif
endfunction
set omnifunc=DoComplete
function EchoChanges(bufnr, start, end, added, changes)
  redir >> /tmp/listener.log | echom a:changes | redir END
endfunction
call listener_add("EchoChanges")
myitcv commented 5 years ago

Just to make precise a finding in the previous comment:

myitcv commented 5 years ago

I now have a minimal reproduction of this, outside of govim.

Key finding is that this "hang" is triggered by the channel-based complete function synchronously calling back into Vim before it (the complete function) returns.

The reproduction files can be found in the following gist:

https://gist.github.com/myitcv/f968c2f9f073121675ca44699d42fae4

Requires a working Go installation: https://golang.org/dl/

Run with:

$ vi -u complete.vim

Then:

Notice at this point that Vim has hung, and will continue to hang until the 30 sec channel timeout is triggered.

See my resulting /tmp/vim_channel.log here: https://gist.github.com/myitcv/2c9644def065e80a4ce3ba155778690c

brammool commented 5 years ago

I now have a minimal reproduction of this, outside of govim.

Key finding is that this "hang" is triggered by the channel-based complete function synchronously calling back into Vim before it (the complete function) returns.

The reproduction files can be found in the following gist:

https://gist.github.com/myitcv/f968c2f9f073121675ca44699d42fae4

Requires a working Go installation: https://golang.org/dl/

Run with:

$ vi -u complete.vim

Then:

  • enter insert mode (i)
  • type mai
  • trigger omni-complete via <C-x><C-o> - this should leave the first line reading main
  • still in insert mode, hit backspace to remove the n - this should leave the first line reading mai
  • still in insert mode, hit backspace to remove the i - this should leave the first line reading ma

Notice at this point that Vim has hung, and will continue to hang until the 30 sec channel timeout is triggered.

See my resulting /tmp/vim_channel.log here: https://gist.github.com/myitcv/2c9644def065e80a4ce3ba155778690c

I'm trying to understand what happens exactly. It seems that the response to message 9 is dropped instead of handled:

This must result from the ch_evalexpr() call: 2.868934 SEND on 0(in): '[9,{"Comment":"Complete findstart","Calls":[["redraw",""]],"Delay":"100ms"}]

And it gets back the response about 100 msec later: 2.969833 RECV on 0(out): '[9,""]

But then: 2.969884 on 0: Getting JSON message 9 2.969893 on 0: Dropping message 9 without callback

It appears channel_read_json_block() is used recursively, and then ch_block_id is overwritten. This function was never written to handle a recursive call.

This happens because you made your server send a "redraw" while Vim is in channel_read_json_block(), and this redraw triggers invoking listener callbacks, which then does another ch_evalexpr().

I suppose the solution is to make channel_read_json_block() work resursively. You could work around it by avoiding ch_evalexpr() in the listener callback, use an asynchronous ch_sendexpr() and another callback.

Making a test for this is very tricky, since it is so timing sensitive. Thus I'll just fix the problem, and let you verify it.

-- The coffee just wasn't strong enough to defend itself -- Tom Waits

/// Bram Moolenaar -- Bram@Moolenaar.net -- http://www.Moolenaar.net \\ /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\ \\ an exciting new programming language -- http://www.Zimbu.org /// \\ help me help AIDS victims -- http://ICCF-Holland.org ///

myitcv commented 5 years ago

I suppose the solution is to make channel_read_json_block() work resursively

I think so, yes. Because otherwise, at least (selfishly) from my perspective/understanding, it makes things much harder to work with/reason about.

Making a test for this is very tricky, since it is so timing sensitive. Thus I'll just fix the problem, and let you verify it.

Totally fine with that.

govim has fairly good integration tests, so we'd likely run into any issues quite quickly.

The logging we get from the Vim channel logs and govim makes tracking these things down relatively straightforward without need for further debugging statements etc. It's creating minimal reproductions which, as you say, is so much harder.

myitcv commented 5 years ago

This was fixed in https://github.com/vim/vim/commit/38ea784fecf7921dca83ddc75fe9cb40708521b2