shell-pool / shpool

Think tmux, then aim... lower
Apache License 2.0
1.14k stars 18 forks source link

Key bindings don't work inside neovim session #71

Open someplaceguy opened 2 months ago

someplaceguy commented 2 months ago

What happened

When I run neovim inside a shpool session, the shpool key bindings don't work.

I've tried the default key binding for detaching, as well as changing the default to Ctrl-a d and Ctrl-b d, but no matter which key binding I use, neovim still interprets the key binding rather than shpool.

The key bindings work fine once I exit neovim.

What I expected to happen

I expected the session to detach when I press shpool's detach key binding.

To Reproduce

  1. Start the shpool user service
  2. Run shpool attach -f test
  3. Run neovim inside the shpool test session
  4. Observe how the shpool detach key binding doesn't work

Version info

shpool 0.6.2

Logs

Nothing appears in the log when I press the key binding inside neovim. If I exit neovim and press the key binding, then I do see:

daemon.info shpool[2094806]: 2024-07-07T21:18:03.763689Z  INFO ThreadId(57) client->shell{s="test" cid=12}: Detach keybinding action fired
daemon.info shpool[2094806]: 2024-07-07T21:18:03.763736Z  INFO ThreadId(57) client->shell{s="test" cid=12}:action_detach: new
daemon.info shpool[2094806]: 2024-07-07T21:18:03.834319Z  INFO ThreadId(56) reader{s="test" cid=12}: disconnect, shutting down client stream
(...)
ethanpailes commented 2 months ago

Can you post some full daemon logs as well as client logs with the verbosity cranked up? From that log snippet we can tell that the keybinding engine is correctly recognizing the keybinding, but it's not enough to figure out why the detach action isn't working as expected.

To do that, you can run shpool -l /tmp/shpool-daemon.log -vv -s /tmp/shpool.sock daemon to run the daemon, then shpool -l /tmp/shpool-attach.log -vv -s /tmp/shpool.sock attach test then reproduce the bug and upload the two logs from /tmp/shpool*log

someplaceguy commented 2 months ago

From that log snippet we can tell that the keybinding engine is correctly recognizing the keybinding

I'm sorry if I wasn't clear: the log messages that I've posted only appear when I press the keybinding outside of a neovim session.

If I press the keybinding inside neovim, then nothing appears in the log.

ethanpailes commented 2 months ago

Oh yeah I see that now, sorry didn't read your post closely enough. I would still like to get the full logs I requested. There isn't much I can do to diagnose the issue without them since disconnect works fine for me when I'm in a neovim session.

someplaceguy commented 2 months ago

I've attached the requested log files. For this session, I was using the following config file:

prompt_prefix = "[$SHPOOL_SESSION_NAME]"
session_restore_mode = { lines = 30000 }

[[keybinding]]
binding = "Ctrl-b d"
action = "detach"

Accordingly, I've reproduced the bug by starting neovim, then pressing ^B and then pressing d.

Thanks!

shpool-attach.log shpool-daemon.log

ethanpailes commented 2 months ago

I don't see ^B in the daemon log the way I do when I launch a verbose daemon locally and mash Ctrl-b, so I'm thinking different control codes might be getting generated somehow. Currently, we just have a hardcoded table that I populated experimentally, but it might be that we ought to be doing something else.

Unfortunately, the only thing I can think of that could change control codes is different terminal emulators, and it looks like you are using alacritty the same way that I am

2024-07-08T13:28:15.638954Z  INFO ThreadId(05) handle_conn{cid=1}:handle_attach:spawn_subshell:inject_env: injecting TERM into shell Some("alacritty")

The biggest difference I see in our setups is that you are using fish while I'm using bash, though when I manually launch fish and enter neovim inside it the detach keybinding still works.

someplaceguy commented 2 months ago

I don't see ^B in the daemon log the way I do when I launch a verbose daemon locally and mash Ctrl-b

I believe the line in the daemon log that corresponds to me pressing Ctrl-B is this one:

2024-07-08T13:28:21.464759Z TRACE ThreadId(07) reader{s="test" cid=1}: read pty master len=37 '^B        '

That's what I see when I use bat to see the log file. However, when I use vim to see the log file, that line looks quite different:

2024-07-08T13:28:21.464759Z TRACE ThreadId(07) reader{s="test" cid=1}: read pty master len=37 '^[[?25l^[[63;222H^B        ^[[1;5H^[[?25h'

So it appears that bat is interpreting those ANSI escape codes when showing the file, which is why it looks cleaner.

The client-side log seems to have a similar log entry. When I use bat I see:

2024-07-08T13:28:21.464868Z DEBUG ThreadId(05) sock->stdout: chunk='^B        ' kind=Data len=37

When I use vim I see:

2024-07-08T13:28:21.464868Z DEBUG ThreadId(05) sock->stdout: chunk='^[[?25l^[[63;222H^B        ^[[1;5H^[[?25h' kind=Data len=37

That said, I have no idea why all these ANSI codes are being sent...

Unfortunately, the only thing I can think of that could change control codes is different terminal emulators, and it looks like you are using alacritty the same way that I am

I am indeed using alacritty and fish and I don't know why my system behaves differently either.

As you've noticed, my shell environment has TERM=alacritty. I also have COLORTERM=truecolor, but I don't know if that has any significance regarding this issue. Nothing else appears to be relevant as far as I can see...

ethanpailes commented 2 months ago

Oh yeah you're right about that ^B line. The problem is definitely the characters between the ^B and d char then.

I don't set COLORTERM=truecolor, so perhaps that is the problem. I did try setting it myself but detach still worked for me.

isaksamsten commented 1 month ago

I can reproduce this with Neovim (0.10) and Kitty, iTerm2, WezTerm but not in Terminal.app. My guess is that it is related to CSI u

ethanpailes commented 1 month ago

That's interesting. I don't have a mac to test with right now, but I should be able to get to one tomorrow and I'll see if I can reproduce with that setup then (though I guess I can probably try Kitty and WezTerm on my linux box).

@someplaceguy what OS are you using? I'd been assuming you were on linux, but now I'm wondering if alacritty could generate different control codes on different OSes for compatibility reasons or something like that.

someplaceguy commented 1 month ago

@someplaceguy what OS are you using?

@ethanpailes I'm using NixOS Linux 24.05.

isaksamsten commented 1 month ago

I can reproduce with Alacritty as well (version 0.14-dev). Alacritty implemented the kitty keyboard protocol in December 2023 so I tried an older version (0.12) and in which I cannot reproduce the issue.

EDIT: so my guess now is that it is related to the kitty keyboard protocol :-)

ethanpailes commented 1 month ago

I'm still on alacritty 0.12, so that makes sense.

ethanpailes commented 1 month ago

Reading through https://sw.kovidgoyal.net/kitty/keyboard-protocol/#quickstart I've started to develop a working theory about what is going on here.

I suspect that neovim supports the kitty keybinding protocol, which means that when it puts the terminal in alternate screen mode, it sends a control code asking to switch to the kitty protocol (CSI > 1 u), which causes the terminal to switch modes to the new protocol. When it exits, it issues a reset code (CSI < u) which causes the terminal to go back to the way things are. An interesting test here might be opening up neovim inside shpool in a terminal that supports the kitty protocol, then kill -9ing neovim to prevent it from resetting the terminal and seeing if the detach keybinding remains broken.

The fix for this from shpool's perspective is that we need to introduce some sort of notion of keybinding mode. To start off with, this will look like

enum KeybindingMode {
    Kitty(Bindings),
    Default(Bindings),
}

where Default is the extremely unprincipled table of escape codes that I put together by logging observed codes (if anyone knows how to do this better, please chime in). Come to think of it, Default implies an inappropriate level of competance. Maybe we should call it ObservedValuesLetsHopeThisWorks instead.

Each mode will have its own binding engine, and we will switch between the active engine when we observe the shell program we have launched emitting a protocol switch code. Exactly how we will do this is not yet clear to me, since I'm not sure how kitty protocol enabled programs like neovim determine if their mode switch control sequence actually worked. If they all do add-hoc tests, rather than there being some sort of well-defined handshake, this could get pretty hairy. Another complication is that the keybinding engine currently just scanns over the input stream, and mode switch bindings seem like they go in the other opposite direction, so we may need to attach a scanning trie to the output stream as well and coordinate between it and the keybinding engine. Otherwise if this was all just in the input stream, we could just introduce a new ModeSwitch(Arc<KeybindingMode>) branch of the Action enum.

ethanpailes commented 1 month ago

It looks like the terminal switching codes are a little more complex than the quickstart guide explains: https://sw.kovidgoyal.net/kitty/keyboard-protocol/#progressive-enhancement. The codes the quickstart guide mentions mean "push the disambiguated escape code mode to the stack" and "pop off the stack".

It seems we will need to scan for and interpret these flags control codes, as well as implement a stack structure of our own to mirror the one in kitty enabled terminals, and since the kitty protocol calls for seperate stacks for the main and alternate screen we will need to track transitions between those as well.

One thing that's not clear to me is how the = mode set command interacts with the stack. Does it just clobber the top value in the stack, or is there a seperate entry for the currently active mode that sometimes is directly set and sometimes points to the value on the top of the mode stack. The docs don't say.

ethanpailes commented 1 month ago

I'm also realizing that we'll probably need some new scanning infrastructure, since simple tries don't handle parsing CSI sequences very well. The main problem is that CSI control sequences contain numbers that need to extracted and processed. The right tool for the job here is really an online regular expression engine with capture group support, but we can get away with just extending our trie based matcher with support for character classes, capture groups, and eager kleene star, which should allow us to continue to use a DFA without resorting to the subset construction and blowing up our ability to track capture groups. This is starting to feel like something where I should look for a library.

ethanpailes commented 1 month ago

Looks like we might be able to use regex-cursor which is a fork of the pikevm from regex-automata that should allow streaming. It seems pretty much perfectly designed for the usecase, though we'll have to keep an eye on it in order to switch to regex-automata once it gets upstreamed.

Edit: looking at the API and README of regex-cursor a bit, it seems like it is not currently designed for handling streaming chunks of data due to lookaround concerns. It is really only able to work on discontinuous in-memory data structures.

ethanpailes commented 1 month ago

Thinking about it some more, an NFA simulation is going to be a lot slower than the DFA we can have if we do something tailored to our particular usecase, and having to deal with all the lookaround and unicode baggage of the rust regex ecosystem will probably wind up being more troulbe than it is worth. If there was something we could use out of the box, that would be one thing, but it seems like I would probably have to send patches to regex-cursor in order to get it working for us, and it doesn't seem worth it to patch a fork of the real upstream crate in order to get a non-optimal and more-complex solution to our actual problem.

ethanpailes commented 1 month ago

Actually, we can probably just create a Parser from the vte crate and attach it to the output stream. It contains a terminal-escape-sequence focused state machine that should do what we want for data coming out of the shell process. We already depend on vte, so it won't even be a new dep. I'm not sure if that helps us with escape sequences being generated by the terminal though, and while in kitty mode those contain numbers we need to handle.

Aetf commented 1 month ago

Without looking deeply into it, IMHO having the matcher depend on terminal state machine sounds like the proper way to do this. Otherwise we might run into similar problems down the road with some other arcane terminal states...

I need to take this into account for #46. I wonder if it's possible to generalize the Parser concept to other terminal crates.

ethanpailes commented 1 month ago

Yeah, handling this in the in-memory terminal would be one advantage of doing something homebrew