Open grodin opened 1 year ago
Thanks for reporting. Interestingly, I haven't been able to replicate this. For a concrete example, I tried with xfce4-terminal v0.8.10 sized 52 x 24:
Similar problem for me on Kitty (Fedora 36 on Sway).
1mNAME0m
ls - list directory contents
1mSYNOPSIS0m
1mls 22m[4mOPTION24m]... [4mFILE24m]...
1mDESCRIPTION0m
List information about the FILEs (the current directory by default). Sort entries alphabetically if none of 1m-cftuvSUX 22mnor 1m--sort 22mis
specified.
@SaElAh Yours looks like a different problem with ANSI escape codes, not with unicode characters. Please search the issue tracker if this has been reported and open a new ticket otherwise.
I cannot reproduce this either.
What is your locale? Maybe it's related to that?
Seems like I don't even get Unicode characters in the first place:
▶ LANG=C MANPAGER="sh -c 'col -bx | grep Warnings | hexdump -C'" man man
00000000 20 20 20 20 20 20 20 20 20 20 20 20 20 20 64 65 | de|
00000010 66 61 75 6c 74 20 69 73 20 22 6d 61 63 22 2e 20 |fault is "mac". |
00000020 20 53 65 65 20 74 68 65 20 22 57 61 72 6e 69 6e | See the "Warnin|
00000030 67 73 22 20 6e 6f 64 65 20 69 6e 20 69 6e 66 6f |gs" node in info|
00000040 20 67 72 6f 66 66 20 20 66 6f 72 20 20 61 20 20 | groff for a |
00000050 6c 69 73 74 20 20 6f 66 20 20 61 76 61 69 6c 61 |list of availa|
00000060 62 6c 65 20 20 77 61 72 6e 69 6e 67 0a |ble warning.|
0000006d
I have groff 1.22.4
Currently also an issue on Alacritty on Arch Linux (running man ls).
@ChocolateOverflow Have you tried MANROFFOPT="-c" as suggested in the readme?
I had the same problem and this helped.
@christoph-heinrich Yeah, MANROFFOPT="-c" seems to fix my issue.
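For anyone landing here later, the fix confirmed above can be made persistent with two exports in a shell startup file. This is a minimal sketch, assuming a man implementation that honors MANROFFOPT (man-db does) and that bat and col are on the PATH; the MANPAGER value is the one quoted elsewhere in this thread.

```shell
# Ask groff (via man) to use overtyping instead of ANSI SGR escape sequences.
# Assumption: the installed man honors MANROFFOPT.
export MANROFFOPT="-c"

# Strip the overtyping with col, then let bat highlight the plain text.
export MANPAGER="sh -c 'col -bx | bat -l man -p'"
```

After sourcing the file, man ls should go through bat without stray escape fragments, provided the problem really was caused by SGR sequences.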
Sorry for not replying to this for ages!
I have LANG=en_GB.UTF-8.
Running LANG="C" MANPAGER="sh -c 'col -bx | bat -l man -p'" man man displays the expected output, so it's clearly a locale-related issue. I'm not sure if it's expected to need to use LANG="C", but aliasing man='LANG=C man' is a usable workaround.
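The alias workaround can also be written as a shell function, which some setups handle more predictably than aliases (for example in scripts, where aliases are often not expanded). This is only a sketch of the same idea. Note that if LC_ALL is set in the environment it takes precedence over LANG, so the override would have no effect in that case.

```shell
# Wrapper that forces the C locale for man only, leaving the rest of the
# session untouched. "command" bypasses the function so it does not recurse.
man() {
    LANG=C command man "$@"
}
```

With this defined, man ls behaves like LANG=C man ls.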
FWIW, I stumbled across something similar today — and I can see how, in *my* case, the Problem Exists Between the Keyboard And Chair...
(I'm just documenting it here in case it helps anybody else, as well as for posterity — i.e., when I run into the same problem again in six months, this will show up when I Google it, hehe!)
Anyway, I'm used to doing, e.g.:
# Run a command and save its output:
bash% someCmd > /tmp/out.1
# Then making some changes and re-running:
bash% someCmd > /tmp/out.2
# So I can:
bash% diff /tmp/out.{1,2}
# Which was fine until I ran:
bash% less /tmp/out.1
As a Minimal Reproducible Example, say I have two files named, e.g., /tmp/one.1 and /tmp/two.2:
bash% printf '\033[31mRed\033[m\n' > /tmp/one.1
bash% od -c /tmp/one.1
0000000 033 [ 3 1 m R e d 033 [ m \n
0000014
### Prepend a backslash...
bash% printf '\\\033[31mRed\033[m\n' > /tmp/two.2
bash% od -c /tmp/two.2
0000000 \ 033 [ 3 1 m R e d 033 [ m \n
0000015
Note that /tmp/two.2 is the same as /tmp/one.1 except it has a preceding \ backslash before the escape character...
Now, if I run:
### Sanitize environment...
bash% unset BAT_STYLE BAT_THEME; export BAT_CONFIG_PATH=/dev/null
bash% cat /tmp/one.1 | bat # Works
───────┬──────────────────────────────────
│ STDIN
───────┼──────────────────────────────────
1 │ Red
───────┴──────────────────────────────────
bash% cat /tmp/two.2 | bat # Works
───────┬──────────────────────────────────
│ STDIN
───────┼──────────────────────────────────
1 │ \Red
───────┴──────────────────────────────────
bash% bat /tmp/one.1 # Works
───────┬──────────────────────────────────
│ File: /tmp/one.1
───────┼──────────────────────────────────
1 │ Red
───────┴──────────────────────────────────
bash% bat /tmp/two.2 # Not What I Was Expecting!
───────┬──────────────────────────────────
│ File: /tmp/two.2
───────┼──────────────────────────────────
1 │ \[0m[31mRed
───────┴──────────────────────────────────
### However...
bash% bat -l txt /tmp/two.2 # Works
───────┬──────────────────────────────────
│ File: /tmp/two.2
───────┼──────────────────────────────────
1 │ \Red
───────┴──────────────────────────────────
### And...
bash% cat /tmp/one.1 | bat -l troff # Gives The Funny Output 💡
───────┬──────────────────────────────────
│ File: STDIN
───────┼──────────────────────────────────
1 │ \[0m[31mRed
───────┴──────────────────────────────────
So my mistake was using .<digit> for something other than "nroff -man" files! «grin»
PS: I will add that nroff -man /usr/share/man/man1/bash.1 | bat -l man gives me some funny:
SEE ALSO
Bash Reference Manual, Brian Fox and Chet Ramey
The Gnu Readline Library, Brian Fox and Chet Ramey
The Gnu History Library, Brian Fox and Chet Ramey
Portable Operating System Interface [0m4m(POSIX) Part 2: Shell and Utili‐
ties, IEEE
sh[0m24m(1), ksh[0m24m(1), csh[0m24m(1)
_______emacs[0m24m(1), vi[0m24m(1)
_______readline[0m24m(3)
output under macOS... mandoc does better there, but only colors "SEE" instead of "SEE ALSO" (and the latter does *not* like the x^Hx pseudo-bold hack!), but I don't trust my understanding of -l man to know whether or not *I'm* the one doing it wrong... again! :-}
I did some investigating a week ago, and it appears the man/nroff/groff implementation used by most Linux distros has switched to emitting ANSI escape sequences by default instead of overtyping (the pseudo-bold hack).
The man syntax definition doesn't handle ANSI escape sequences and bat's ANSI parsing doesn't work across highlighting regions, which is likely why you're encountering broken sequences. You'll want to pass -c to nroff to have it revert back to using overtyping.
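To check which of the two conventions a given pipeline is producing, counting escape and backspace bytes is usually enough. The helper below is a hypothetical sketch (count_byte is not part of bat or groff), and the two samples are synthetic stand-ins rather than captured man output.

```shell
# Count occurrences of a single byte on stdin. The byte is given as a
# tr escape such as '\033' (ESC) or '\b' (backspace).
count_byte() {
    tr -cd "$1" | wc -c | tr -d ' '
}

# Synthetic samples (assumed, not real man output):
printf '\033[1mNAME\033[0m\n' | count_byte '\033'   # bold via SGR escapes, prints 2
printf 'N\bNA\bAM\bME\bE\n'   | count_byte '\b'     # bold via overtyping, prints 4
```

A non-zero ESC count on real output suggests SGR sequences are being emitted, while a non-zero backspace count suggests overtyping.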
macOS's mandoc still uses overtyping by default, which is why it behaves a bit better there. I'm on my phone, so I can't test this myself, but try piping into col -bx before piping to bat. That will remove the overtyping, which should help determine if your issue is caused by the backspace character.
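To make the overtyping concrete: bold is written as a character, a backspace, and the same character again, and underline as an underscore, a backspace, and the character. The sketch below removes those pairs with sed. strip_overstrike is a hypothetical helper that only approximates what col -b does for this case. It does not expand tabs (the -x part) or handle other control characters.

```shell
# Drop every "<char><backspace>" pair, keeping only the final overstruck
# character. The backspace is produced with printf for portability.
strip_overstrike() {
    bs=$(printf '\b')
    sed "s/.$bs//g"
}

printf 'N\bNA\bAM\bME\bE\n' | strip_overstrike   # overstruck bold, prints NAME
printf '_\bf_\bi_\bl_\be\n' | strip_overstrike   # overstruck underline, prints file
```

If output cleaned this way renders correctly in bat, the backspace characters were the likely culprit.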
What steps will reproduce the bug?
man bat (or other manpages) with MANPAGER="sh -c 'col -bx | bat -l man -p'" on a terminal with a width small enough that man hyphenates some words.
What happens?
man output such as
What did you expect to happen instead?
The output should be:
How did you install bat?
Occurs with bat v0.22.1 installed by brew on Ubuntu 22.04 and v0.19 installed on the same system via apt.
bat version and environment
> bat --diagnostic
Software version
bat 0.22.1
Operating system
Linux 5.15.0-60-generic
Command-line
Environment variables
System Config file
Could not read contents of '/etc/bat/config': No such file or directory (os error 2).
Config file
Could not read contents of '/home/jscdev/.config/bat/config': No such file or directory (os error 2).
Custom assets metadata
Could not read contents of '/home/jscdev/.cache/bat/metadata.yaml': No such file or directory (os error 2).
Custom assets
'/home/jscdev/.cache/bat' not found
Compile time information
Less version
More details
A partial workaround I've discovered is to run man with --no-hyphenation|--nh but there are still some Unicode code points that are making it to the output. Here's a snippet of MANPAGER="sh -c 'col -bx | bat -l man -p'" man --nh man
and then
MANPAGER="bat -A" man --nh man
I've checked that it's not caused by less by running with BAT_PAGER set and empty.
Terminal emulator is alacritty, but I can't see what difference that would make.