gitext-rs / git-dive

Dive into a file's history to find root cause
Apache License 2.0

Re-implement git's pager support #24

Closed epage closed 2 years ago

epage commented 2 years ago

core.pager

This setting determines which pager is used when Git pages output such as log and diff. You can set it to more or to your favorite pager (by default, it’s less), or you can turn it off by setting it to a blank string:

$ git config --global core.pager ''

If you run that, Git will page the entire output of all commands, no matter how long they are.

https://www.git-scm.com/book/en/v2/Customizing-Git-Git-Configuration

Though it looks like there might be a pager.<cmd>?

See https://github.com/git/git/blob/7c46ea0ded70c921aedf22ced1afeea2a2a9ed26/builtin.h#L110

epage commented 2 years ago

[-p | --paginate | -P | --no-pager]

This is on the top-level command, so I assume it'll be passed down to us via environment variables. We need to confirm.

EDIT:

$ git env | rg GIT
GIT_EXEC_PATH=/usr/lib/git-core

$ git --no-pager env | rg GIT
GIT_EXEC_PATH=/usr/lib/git-core
GIT_PAGER=cat

From man git:

       -p, --paginate
           Pipe all output into less (or if set, $PAGER) if standard output is a terminal. This overrides the pager.<cmd>
           configuration options (see the "Configuration Mechanism" section below).

       -P, --no-pager
           Do not pipe Git output into a pager.

I guess we are supposed to read GIT_PAGER if present and then ignore pager.<cmd>

And the CLI overrides GIT_PAGER

$ GIT_PAGER=less git -P env | rg GIT
GIT_EXEC_PATH=/usr/lib/git-core
GIT_PAGER=cat

$ git -c core.pager="" env | rg GIT
GIT_CONFIG_PARAMETERS='core.pager'=''
GIT_EXEC_PATH=/usr/lib/git-core

NOTE: PAGER does not get forwarded to GIT_PAGER

$ PAGER=less git env | rg GIT
GIT_EXEC_PATH=/usr/lib/git-core

epage commented 2 years ago

GIT_PAGER controls the program used to display multi-page output on the command line. If this is unset, PAGER will be used as a fallback.

https://git-scm.com/book/en/v2/Git-Internals-Environment-Variables

We'll need to check if git will handle this for us or if we need to handle it directly

epage commented 2 years ago

git seems to start the pager, redirect its own stdout/stderr to the pager process, and then leave it running, cleaning it up on process exit. There is also some signal handling in there.

https://github.com/git/git/blob/master/pager.c#L107

Git sets a value in its own environment (GIT_PAGER_IN_USE) that acts as a global flag tracking whether the pager is in use.

One use for this: when git checks whether color is enabled after stdout/stderr have been re-mapped, it also checks this variable and, if it is set, just assumes color is supported.

https://github.com/git/git/search?q=pager_in_use
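A minimal sketch of how we might mirror that check on our side. The function names and the accepted boolean spellings are assumptions for illustration (they only approximate git's boolean parsing), and the environment value and TTY state are passed in rather than read directly so the logic stays easy to test.

```rust
/// Interpret the value of GIT_PAGER_IN_USE as a boolean.
/// The accepted spellings are an approximation, not git's exact rules.
pub fn pager_in_use(value: Option<&str>) -> bool {
    match value.map(|v| v.to_ascii_lowercase()) {
        Some(v) => matches!(v.as_str(), "1" | "true" | "yes" | "on"),
        None => false,
    }
}

/// Decide whether to emit color: either stdout is a terminal, or stdout
/// has been re-mapped to a pager and we assume the pager handles color.
pub fn want_color(stdout_is_tty: bool, pager_in_use_env: Option<&str>) -> bool {
    stdout_is_tty || pager_in_use(pager_in_use_env)
}
```

A caller would pass something like `std::env::var("GIT_PAGER_IN_USE").ok().as_deref()` for the second argument.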

epage commented 2 years ago

For columns/lines, it sounds like git prefers the env variables to the auto-detection

/*
 * Return cached value (if set) or $COLUMNS environment variable (if
 * set and positive) or ioctl(1, TIOCGWINSZ).ws_col (if positive),
 * and default to 80 if all else fails.
 */

https://github.com/git/git/blob/master/pager.c#L153
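A rough sketch of that fallback order, without the caching step. `term_columns` and its `detect` parameter are hypothetical: `detect` stands in for whatever terminal-size query we end up using (the ioctl in git's case), and the `COLUMNS` parsing here is stricter than a C `atoi`.

```rust
/// Pick a column count: a positive $COLUMNS wins, then a positive
/// auto-detected width, then 80.
pub fn term_columns(
    columns_env: Option<&str>,
    detect: impl FnOnce() -> Option<u16>,
) -> u16 {
    columns_env
        .and_then(|v| v.trim().parse::<u16>().ok())
        .filter(|&n| n > 0)
        // Only query the terminal if the env variable was unusable.
        .or_else(|| detect().filter(|&n| n > 0))
        .unwrap_or(80)
}
```

Taking the detector as a closure keeps the environment lookup and the terminal query out of the function, so the precedence itself can be checked in isolation.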

epage commented 2 years ago

git seems to have a compile-time constant for pager configuration: PAGER_ENV="LESS=FRX LV=-c"

https://github.com/git/git/blob/master/pager.c#L76

https://github.com/git/git/blob/c50926e1f48891e2671e1830dbcd2912a4563450/contrib/buildsystems/CMakeLists.txt#L235

For the pager itself, the precedence seems to be

  1. GIT_PAGER
  2. pager.<cmd> then core.pager
  3. PAGER
  4. DEFAULT_PAGER (default is less but can be overridden)

If it's still not set or if it's set to cat, git will just skip the pager

https://github.com/git/git/blob/master/pager.c#L49
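The precedence above could be sketched like this. `PagerSources` and `resolve_pager` are hypothetical names, the caller is assumed to have already read the environment and config, and treating an empty string the same as "not set" is an assumption based on the `core.pager ''` example from the book.

```rust
/// Values gathered by the caller; every field name here is illustrative.
#[derive(Default)]
pub struct PagerSources<'a> {
    pub git_pager: Option<&'a str>,  // $GIT_PAGER
    pub cmd_pager: Option<&'a str>,  // pager.<cmd> config value
    pub core_pager: Option<&'a str>, // core.pager config value
    pub pager: Option<&'a str>,      // $PAGER
}

/// Stand-in for git's compile-time DEFAULT_PAGER.
pub const DEFAULT_PAGER: &str = "less";

/// Returns the pager command to run, or `None` to skip paging.
pub fn resolve_pager<'a>(src: &PagerSources<'a>) -> Option<&'a str> {
    let chosen = src
        .git_pager
        .or(src.cmd_pager)
        .or(src.core_pager)
        .or(src.pager)
        .unwrap_or(DEFAULT_PAGER);
    // An empty value or a literal `cat` both mean "no pager".
    if chosen.is_empty() || chosen == "cat" {
        None
    } else {
        Some(chosen)
    }
}
```

This also covers the `--no-pager` case observed earlier, since git hands that down to us as `GIT_PAGER=cat`.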

The pager command (e.g. content of GIT_PAGER) is spawned via a shell: https://github.com/git/git/blob/master/pager.c#L102
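A minimal sketch of the shell-based launch, assuming a POSIX `sh` is on `PATH` (git's own shell lookup, especially on Windows, is more involved). `pager_command` is a hypothetical helper that only builds the command; spawning it and writing to its stdin is left to the caller.

```rust
use std::process::{Command, Stdio};

/// Build a command that runs the pager string through the shell, so
/// values like `LESS=FRX less -S` are interpreted the way git would.
pub fn pager_command(pager: &str) -> Command {
    let mut cmd = Command::new("sh");
    cmd.arg("-c")
        .arg(pager)
        // Our output gets written to the pager's stdin.
        .stdin(Stdio::piped());
    cmd
}
```

Passing the whole string as a single `-c` argument is what lets inline environment assignments and quoting in the pager value work.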

epage commented 2 years ago

Going to contrast this with bat (since I am somewhat familiar with it) in case we move in the direction of making general code.

Bat reads pager information from

  1. --config
  2. BAT_PAGER
  3. PAGER
  4. default

It then parses the value with shell_words::split to separate the bin from the args (so no shell expressions, including inline env variable assignments that would override the caller's environment)

If the pager came from PAGER, it forces the use of less in place of more and most due to compatibility issues, and in place of bat to avoid recursion (but it trusts the user if BAT_PAGER or --config is used).

https://github.com/sharkdp/bat/blob/master/src/pager.rs

It then looks up the binary path with grep_cli::resolve_binary, which is mostly used to force the use of PATH on Windows, skipping bins in the current directory.

If the pager is less, then LESSCHARSET=UTF-8 is set in the environment

If the pager is less and either the source is PAGER or there are no arguments, custom arguments will be used

bat seems to focus on -X for making -F work. Another side effect of -X is that it leaves the last screen of text around, while without it the screen is reset to what it was before. Leaving the screen around clutters things up but makes it easier to reference text you were looking at after the program quits. Unsure which motivation drives git's behavior.

https://github.com/sharkdp/bat/blob/5114c0189d9c9a99312aa82f1d7217109c1ae28d/src/output.rs
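To make the rules above concrete, here is a sketch of my reading of them, not bat's actual code. `PagerSource`, `effective_pager`, and `custom_less_args` are hypothetical; the input is assumed to be already split into words, and the custom less arguments are a parameter because the exact flags are not pinned down here.

```rust
use std::path::Path;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum PagerSource {
    Config,
    EnvBatPager,
    EnvPager,
    Default,
}

/// Returns the binary followed by its arguments, or `None` if `words` is empty.
pub fn effective_pager(
    source: PagerSource,
    words: &[&str],
    custom_less_args: &[&str],
) -> Option<Vec<String>> {
    let (&bin, args) = words.split_first()?;
    let name = Path::new(bin)
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or(bin);
    let from_pager_env = source == PagerSource::EnvPager;
    // Only second-guess the user when the value came from the generic PAGER.
    let replaced = from_pager_env && matches!(name, "more" | "most" | "bat");
    let (bin, name) = if replaced { ("less", "less") } else { (bin, name) };

    let mut out = vec![bin.to_string()];
    if name == "less" && (from_pager_env || args.is_empty()) {
        out.extend(custom_less_args.iter().map(|s| s.to_string()));
    } else {
        out.extend(args.iter().map(|s| s.to_string()));
    }
    Some(out)
}
```

If we go the general-code route, the `PagerSource` distinction seems like the part worth keeping: how much to trust a value depends on where it came from.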

epage commented 2 years ago

For our purposes, I think it makes sense to set LESSCHARSET=UTF-8 as we are doing everything in UTF-8

epage commented 2 years ago

Was curious now how other commands handle things

from man man

   -P pager, --pager=pager
          Specify which output pager to use. By default, man uses pager, falling back to cat if pager is not found or is not executable. This option overrides the $MANPAGER environment variable, which in turn overrides the $PAGER environment variable. It is not used in conjunction with -f or -k.

          The value may be a simple command name or a command with arguments, and may use shell quoting (backslashes, single quotes, or double quotes). It may not use pipes to connect multiple commands; if you need that, use a wrapper script, which may take the file to display either as an argument or on standard input.

The code

So the help calls out that PAGER is at least shell quoted but doesn't mention being evaluated by the shell and supporting setting env variables in it. It looks like man works by piping several commands together, so I didn't see if they invoke a shell or not.

The code for the launching commands is at https://libpipeline.nongnu.org/

epage commented 2 years ago

Implemented in dcb3e9c947fcee6c3a911d0dfa044d2b447b3f23

epage commented 1 year ago

Some more things to dig into:

Text viewer for use by Git commands (e.g., less). The value is meant to be interpreted by the shell. The order of preference is the $GIT_PAGER environment variable, then core.pager configuration, then $PAGER, and then the default chosen at compile time (usually less).

When the LESS environment variable is unset, Git sets it to FRX (if LESS environment variable is set, Git does not change it at all). If you want to selectively override Git’s default setting for LESS, you can set core.pager to e.g. less -S. This will be passed to the shell by Git, which will translate the final command to LESS=FRX less -S. The environment does not set the S option but the command line does, instructing less to truncate long lines. Similarly, setting core.pager to less -+F will deactivate the F option specified by the environment from the command-line, deactivating the "quit if one screen" behavior of less. One can specifically activate some flags for particular commands: for example, setting pager.blame to less -S enables line truncation only for git blame.

Likewise, when the LV environment variable is unset, Git sets it to -c. You can override this setting by exporting LV with another value or setting core.pager to lv +c.
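A small sketch of the "only set it if unset" behavior described for LESS and LV. `pager_env_defaults` is a hypothetical helper, the `PAGER_ENV` value is the one quoted earlier in this thread, and the whitespace split is a simplification (it does not handle quoted values).

```rust
/// Value of git's compile-time constant as quoted earlier in this thread.
pub const PAGER_ENV: &str = "LESS=FRX LV=-c";

/// Return the `KEY=VALUE` defaults from `spec` whose key is not already
/// set according to `is_set`; these are the ones to add to the pager's env.
pub fn pager_env_defaults<'a>(
    spec: &'a str,
    is_set: impl Fn(&str) -> bool,
) -> Vec<(&'a str, &'a str)> {
    spec.split_whitespace()
        .filter_map(|pair| pair.split_once('='))
        // Never override something the user already exported.
        .filter(|(key, _)| !is_set(key))
        .collect()
}
```

A caller could pass `|k| std::env::var_os(k).is_some()` as `is_set` and apply each returned pair with `Command::env` before spawning the pager.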