sharkdp / bat

A cat(1) clone with wings.
Apache License 2.0
49.49k stars 1.25k forks source link

Keyboard arrow keys not working for scrolling from STDIN, rather shows keycode of the key pressed #1006

Open yedhink opened 4 years ago

yedhink commented 4 years ago

What version of bat are you using? bat 0.15.0

Describe the bug you encountered: Using keys to scroll through the long STDIN is emitting keycodes, rather than actually scrolling down. Some terminal details:-

terminal
--------
kitty 0.17.4 created by Kovid Goyal

exports(relevant ones)
---------------------
export BAT_PAGER="less -RF"
export EDITOR="nvim"
export TERMINAL="kitty"

Context

I was trying to pass the stdin from the node/npm package makemehapi to bat.

npx makemehapi print | bat

The content was a lengthy output. Thus i tried scrolling with arrow keys like i usually do. But currently the keycodes of the key pressed is displayed rather than actually scrolling down.

Content with paging and error - keycodes of arrow keys displayed at bottom

28:23:18_2020-05-18_1366x768

Content with paging and --show-all and error - keycodes of arrow keys displayed at bottom

15:24:18_2020-05-18_1366x768

Things I have tried

...

Describe what you expected to happen? ... To be able to scroll through the long content using arrow keys or even Page Down/UP keys without having to press ENTER for each line.

How did you install bat? yay -S bat


Info.sh

system

$ uname -srm Linux 5.6.13-arch1-1 x86_64

bat

$ bat --version bat 0.15.0

$ env BAT_PAGER=less -RF

bat_config

$ cat /home/ikigai/.config/bat/config

--pager="less -RF"

bat_wrapper

No wrapper script for 'bat'.

bat_wrapper_function

No wrapper function for 'bat'.

No wrapper function for 'cat'.

tool

$ less --version less 551 (PCRE regular expressions)

Full STDOUT of npx makemehapi print

  Create a server which responds to requests to /?name=Handling using a  
  template located at templates/index.html which outputs the following HTML:  

     <html>  
         <head><title>Hello Handling</title></head>  
         <body>  
             Hello Handling  
         </body>  
     </html>  

 ─────────────────────────────────────────────────────────────────────────────  
  ##HINTS  

  This exercise requires you to install the vision module, which is a hapi  
  plugin for rendering templates. You'll need to register the plugin in your  
  code in order to render your templates:  

     var Vision = require('vision');  

     await server.register(Vision);  

  The view key can be used to define the template to be used to generate the  
  response.  

     handler: {  
         view: "index.html"  
     }  

  server.views() is the server method that we use to configure the templates  
  used on our server. This method receives a configuration object in which  
  we can set different engines based on file extension. This object can also  
  set a directory path for your templates.  

     server.views({  
         engines: {  
             html: require('handlebars')  
         },  
         path: Path.join(__dirname, 'templates')  
     });  

  In this exercise, we'll be using Handlebars. To install handlebars:  

     npm install handlebars  

  With Handlebars templates, you can render a variable directly in HTML by  
  surrounding the variable with curly braces, e.g. {{foo}}.  

  The template receives some information from the request. For example, the  
  query parameters that were passed in via the URL are available in the  
  query object. These parameters can then be used in the template.  Query  
  params get automatically parsed and aren't declared in the route path.  

     <div>{{query.paramName}}</div>  
sharkdp commented 4 years ago

Thank you for the detailed report!

the keycodes of the key pressed is displayed rather than actually scrolling down.

I can confirm this.

tried bat --pager=never still no use

Oh, that's wrong. This tries to start a pager named "never" :smile:. Try bat --paging=never instead. You will see that it does not suffer from the same issue (well... there is no pager that could react to the arrow keys).

Interestingly, piping to less -FR (which is what bat calls if the version is as new as yours) works just fine:

npx makemehapi print | less -FR

So it really seems like bat is messing things up here.

I don't really know what's going on here. I was expecting makemehapi to print some terminal control characters (like: clear screen, position cursor), but they would have to be present in the output of bat -A, and I don't see anything.

The bug can not be reproduced if the output is first piped to a file. This works just fine:

npx makemehapi print > /tmp/output
bat /tmp/output

npx prints a lot of stuff on stderr, including control characters (for progress bars). But removing that doesn't help either:

npx makemehapi print 2> /dev/null | bat
yedhink commented 4 years ago

@sharkdp I just tried the following test cases:-

npx makemehapi print | less -Fr  # this works just fine. arrow keys work
npx makemehapi print | less -FR  # arrow keys work for me here too

-R is Like -r, but only ANSI "color" escape sequences are output in "raw" form.

But with bat setting after trying to use -r like export BAT_PAGER="less -Fr" and in ~/.config/bat/config like --pager "less -Fr", the error persists in this case.

A weird thing is trying the below command results in a very weird result full of keycodes:-

npx makemehapi print | bat --pager 'less -F'

-F or --quit-if-one-screen : Causes less to automatically exit if the entire file can be displayed on the first screen.
34:52:08_2020-05-20_1366x768

If we add a -A too with above command, then it results in a total mess. So is the problem fully with the makemehapi print is what I'm wondering.

How makemehapi print works?

I think this is how it works:-

Source code relevant blocks:-

I'm not sure how helpful this is. But still doing the best i can. Please mention me if anymore help is needed. Thanks.

sharkdp commented 4 years ago

One more thing I found out: this works perfectly fine:

unbuffer makemehapi print | bat

This probably means that something weird is going on with buffering/flushing when makemehapi prints its output.

Note that makemehapi prints a different output based on whether or not it is printing to an interactive terminal (or to a pipe)... something that bat also does. Using unbuffer, the process is tricked into thinking it writes to an interactive terminal.

Another weird thing is that (the interactive version of) makemehapi print writes an unfinished escape code (\x1b[39 without the closing m) at the end of its output. It also doesn't write a trailing newline.

sharkdp commented 4 years ago

A weird thing is trying the below command results in a very weird result full of keycodes:-

npx makemehapi print | bat --pager 'less -F'

That is expected. Those are the ANSI escape sequences for colorization. Even if there is no syntax highlighting, bat still prints a colorized grid and line numbers.

yedhink commented 4 years ago

@sharkdp The unbuffer trick works just fine. So do you think this issue should be closed? I mean if it's a single isolated issue part of a package like makemehapi and their way of buffering, then is it something bat can be made to adapt to, given that npx makemehapi print | less -FR works fine?

sharkdp commented 4 years ago

So do you think this issue should be closed?

I don't know. I would definitely like to understand what's going on here :smile:. Even if this seems like something that doesn't occur very frequently.

is it something bat can be made to adapt to, given that npx makemehapi print | less -FR works fine?

I don't know. We would first need to understand what's going on. Maybe by trying to build a minimal example program that would show the same behavior.

yedhink commented 4 years ago

That makes sense. Lets leave the issue open till a proper solution is figured out.

chocolateboy commented 4 years ago

OS: Linux (Arch) | bat: 0.15.4

I have the same issue with the output of several Node.js (v14.9.0) programs, e.g.

Doesn't happen with all Node.js programs, though. e.g. these are fine:

It's not caused by a specific third-party dependency because, e.g., pidtree doesn't have any.

sharkdp commented 4 years ago

@chocolateboy Thank you for the information. I can verify this for prettier. Haven't tested any of the other tools. But that's a clear indication that we should try to find out what is going on.

Side note (CC @eth-p): prettybat is not affected by this, even though it is using prettier. Probably because the output is stored in a temporary file.

Enselic commented 3 years ago

I came up with the following when looking for a minimal node program. Seems to be something intrinsic to node, and somehow timing/buffer-size specific, because I can reproduce the problem with this:

NODE_DISABLE_COLORS=1 node -e 'for (var i = 0; i <= 10000; i++) { console.log(i); }' | bat

but if I change the end condition to either 1000 or 100000, i.e. either higher or lower, I can not reproduce! However, taking node out of the equation, all numbers work fine: 1000, 10000 and 100000:

for i in $(seq 0 10000); do echo $i; done | bat

So, very strange. Do you see the same thing? Maybe the number used to reproduce depends on how fast your machine is or something weird like that. For the record, I use the following versions, on a MacBook Pro from 2018.

% node --version
v12.18.4
% bat --version
bat 0.17.1
Enselic commented 3 years ago

I can reproduce this without any bat code whatsoever, so this bug probably does not belong in this project.

It really seems to be something timing dependent, because if I change the sleep from 10ms to 1ms, the bug goes away. Likewise, if I change the sleep from 10ms to 100ms, the bug goes away.

It also seems to be node dependent, because if I stop using node, the bug goes away. I tested with the latest node version v15.3.0 but the bug is still present even if using that version.

It would be very helpful for to know if it is easy for others to reproduce as per below. If others also easily can reproduce using the below method, the next step is perhaps to see if it can be reproduced in some language other than Rust, to know if this is a rust-specific problem. If not, a bug should perhaps be filed against the Node project...

Anyway, here is how I can easily reproduce without any bat code. Does the below code work for you too to reproduce?

main.rs

fn main() {
    std::thread::sleep(std::time::Duration::from_millis(10));

    let child = std::process::Command::new("less")
        .stdin(std::process::Stdio::piped())
        .spawn()
        .unwrap();

    let mut child_stdin = child.stdin.unwrap();

    // Press Ctrl + C to exit via SIGINT
    let mut i = 0;
    loop {
        i += 1;
        use std::io::Write;
        child_stdin.write_all(format!("{}\n",i).as_bytes()).unwrap();
    }
}

Compile (I use rustc 1.48.0 (7eac88abb 2020-11-16)):

rustc main.rs

Reproduce:

node -e '1' | ./main
Enselic commented 3 years ago

Sorry for somewhat spamming this bug report, but I've started to obsess about getting to the bottom of this bug.

Latest news is that this seems to be a regression in node v12.5.0. I can't reproduce with node v12.4.0, but can easily reproduce in node v12.5.0.

Looking at the changelog of node v12.5.0 I found this very suspicious bug fix: https://github.com/nodejs/node/pull/24260 src: restore stdio on program exit

They are messing around with file descriptors 0-2, i.e. stdin among other things. And since they ostensibly do it upon exiting node, it would explain the dependence upon specific timing (the exit of node) and thus the need to have sleeps in the repro code.

I'll see if I can come up with a convincing bug report for the node project. To be continued...

sharkdp commented 3 years ago

Sorry for somewhat spamming this bug report, but I've started to obsess about getting to the bottom of this bug.

I love it.. keep investigating :smile: :+1:

obfusk commented 3 years ago

Does the below code work for you too to reproduce?

Not for me (rustc 1.48.0, node v12.19.0). But the node for loop + bat does.

Enselic commented 3 years ago

Sorry for my slow pace, but I have a full-time job and a family, so spare time is a bit limited ;)

After quite a bit of research and debugging involving custom builds with custom debug prints in both node and less, I now have a firm grasp of what's going on. I can explain all the behavior we see. But first the main point:

Summary =======

This issue should be fixable, or at least greatly mitigated, by making bat start its pager much earlier.

Details =====

Primer on input from terminals

Terminals can be in many modes, but the most common are commonly called "raw mode" and "canonical mode". In "raw mode", a read() call on a terminal file descriptor will return what was typed as soon as a key is pressed, and typed characters will not be echoed back. In "canonical mode", the read() call wait for a newline character to be typed before returning what was typed, and characters are echoed back to the terminal as they are typed.

Note that the terminal mode is per-terminal. This means that if two processes use the same terminal, the terminal mode is shared, even if different file descriptors in different processes are used.

How less uses terminal modes

When less starts, it will open a file descriptor to /dev/tty, i.e. the current terminal. Then it stores the current terminal mode in a global variable, and then change the terminal mode to "raw mode". Then it starts waiting for key presses with read(). Changing to "raw mode" is what makes it possible to scroll in less with individual KeyDown presses. Without "raw mode", it would be necessary to press KeyDown and then Return to scroll down one line. Just before the less process quits, it will reset the terminal mode to what it was when less was started. This is necessary since the terminal mode is per-terminal, and not per-process. It is simply a necessary restoration of global, shared state.

How node uses terminal modes

When node starts, it will also store the current terminal mode. It will do it for stdin and stderr, but since those are bound to the current terminal, it will be the same terminal as less is using. Then, right before node exists, it will reset the terminal mode to what it was when node started, since it might have changed the modes due to JS scripts. This is good practice and it should not stop doing that. As mentioned above, less also behaves like this.

How the original problem occurs

The problem occurs like this. Note that less and node operates independently, so this is all very racy.

  1. node reads current terminal mode as "canonical mode" and stores that to restore it later
  2. less reads current terminal mode as "canonical mode" and stores that to restore it later
  3. less changes current terminal mode to "raw mode"
  4. node exits and restore current terminal mode to "canonical mode"
  5. Now when less calls read(), the method will block until a newline is typed.
  6. less exits and restore current terminal mode to "canonical mode" (which it already is in)

How making bat start less quicker will fix the problem

Assuming bat can start less as good as quick as pure less, this is what will happen:

  1. less reads current terminal mode as "canonical mode" and stores that to restore it later
  2. less changes current terminal mode to "raw mode"
  3. node reads current terminal mode as "raw mode" and stores that to restore it later
  4. node exits and restore current terminal mode to "raw mode"
  5. All is well. When less calls read(), it will return as soon as e.g. KeyDown is pressed, since terminal mode is still "raw mode"
  6. less exits and restore current terminal mode to "canonical mode"

Explanation of reproducibility examples

The reason node -e 'for (var i = 0; i <= 10000; i++) { console.log(i); }' | bat fails is the same reason as above.

The reason node -e 'for (var i = 0; i <= 1000; i++) { console.log(i); }' | bat' is NOT buggy is because node starts and exits fully BEFORE less even starts. So node can't mess up the terminal mode for less, since they run in series and not in parallel.

The reason node -e 'for (var i = 0; i <= 100000; i++) { console.log(i); } | bat' is NOT buggy is... false. It is actually buggy. It's just that you have to scroll down enough lines (in my case 88000) to make node exit. If you don't scroll down, node will wait for the stdout buffer to leave room for more data, and not quit. As soon as node has no more data to write, it will quit, and the bug will occur.

The reason node -e 'for (var i = 0; i <= 20000; i++) { console.log(i); }' | less is not buggy is that less starts quick enough to change the terminal mode before node reads it. Since less is quick enough to change the terminal mode node will restore the terminal mode to "raw mode" when it exits, since that was what it was when it started. Since less was so quick... This is the working scenario that I describe above.

You will get the same bug with pure less if you just delay the start of less. Try this and scroll down to the bottom, and you will see that pure less also is "buggy": node -e 'for (var i = 0; i <= 20000; i++) { console.log(i); }' | (sleep 1; less) (assuming your system doesn't have big enough stdin buffer with room for all 20000 lines from node).

The reason that std::thread::sleep(std::time::Duration::from_millis(10)); is needed is to make less start slow enough to give node time to read the current terminal mode as "canonical mode". If you remove the sleep, less is able to start and change terminal mode before node reads it the first time, so everything works. I.e. the working scenario I describe above.

Closing words of this comment

Ok, that was a mouthful... Let me know if you have any questions, I will gladly elaborate on the details or answer any other questions you might have.

I think the next step here is a proof-of-concept patch that makes bat start less much quicker, which should solve the issue. Maybe I'll take a stab at it. We'll see :)

sharkdp commented 3 years ago

Excellent analysis - thank you very much!

When node starts, it will also store the current terminal mode. It will do it for stdin and stderr, but since those are bound to the current terminal, it will be the same terminal as less is using. Then, right before node exists, it will reset the terminal mode to what it was when node started, since it might have changed the modes due to JS scripts. This is good practice and it should not stop doing that.

Hm. But we haven't seen similar problems with other programs. I would rather expect node not to change terminal modes at all if I run a program like

for (var i = 0; i <= 20000; i++) { console.log(i); }

As mentioned above, less also behaves like this.

Ok, but it has a good reason to change the terminal mode.

Enselic commented 3 years ago

Well, I actually prefer the node behavior over e.g the python3 behavior. If you for example run

python3 -c 'import tty; import sys; tty.setraw(sys.stdin)'

it will mess up your terminal because python3 will not restore the terminal mode when it exits. You have to restore it manually with e.g.

stty cooked    # aka 'canonical'

Although I must admit that one can make a good argument for that the python3 behavior is more "correct"...

On the surface of it, yes, it seems unnecessary for node to store and restore the terminal mode when you only do console.log, but it's probably tricky to implement terminal mode restoration in a robust way if you want it to only happen when it's really needed. Either way, if we want bat to "behave" like less in terms of terminal modes, there is no getting around that we need to start less much faster.

I took a shot at it but it's a hard nut to crack. Even if you for proof-of-concept temporarily get rid of the slow things that happens before Command::spawn() such as assets_from_cache_or_binary() and retrieve_less_version() and a handful of other things, it is not enough. I got it down to 2 ms, but then it only sometimes work. You probably need to get down to the ~< 1 ms second range from start of bin/bat/main() to Command::spawn() for it to work reliably.

Right now my best bet for this is to make a fast-path to Command::spawn(), for example by bypassing all clap parsing to get just the stuff you need to know in order for properly spawning the pager, but it might also be possible by just super-optimizing the code paths we have now, although I am skeptical about that.

jgsqware commented 3 years ago

Hi, I got the same issue today with the AWS cdk cli when piping the output of cdk synth to bat

staranto commented 1 year ago

Curious to know if there's any new thinking on this issue. I'm running into it with --

: uname -a Linux pop-os 6.2.6-76060206-generic #202303130630~1685473338~22.04~995127e SMP PREEMPT_DYNAMIC Tue M x86_64 x86_64 x86_64 GNU/Linux

: bat --version bat 0.23.0 (871abd2)

: node --version v20.0.0

academician commented 2 months ago

Just for additional information, I experienced the same issue when running buf config migrate --diff | bat -l Diff on a project I'm working on. This is on Ubuntu Noble under WSL on Windows 11.

This is the buf CLI tool for working with Protocol Buffers, which is written in Go, not Node - so I thought it seemed this bug wasn't restricted to Node tools. The unbuffer workaround mentioned above also made the problem go away.

However, I realized I had installed the buf CLI with npm - so it was effectively running it wrapped in Node tooling (npx). After uninstalling the npm version and trying both source-built and pre-built tarball versions of buf, I found that the issue disappeared. So this is almost certainly something to do with Node, and probably specifically npx.

That's as far as I got, since I solved my own issue, but hopefully this is helpful data point. I wonder if there's any way to detect buffered input piped to bat and account for it, or if output buffering just has to be disabled first - in which case, the unbuffer workaround just needs to be documented.