vaeth / eix

eix can access Gentoo portage ebuild information and description very quickly (using a local cache). It can also be used to access information on installed packages, local settings, and local and external overlays, and informs about changes in the tree.
GNU General Public License v2.0

Symbol in the eix output does not match the one used in the man page ( ˆ != ^ ) #123

Closed: kamikadoYukio closed this issue 1 month ago

kamikadoYukio commented 1 month ago

Would you please consider changing this entry

Output

       1.0*ilvsˆfmpbstuidP{tbz2,gpkg:3,pak:2}
       ...someDescription

       5.0-r3(5.0R3)ˆf     or     5.0-r3:5.0R3ˆf
       ...someDescription

to

Output

       1.0*ilvs^fmpbstuidP{tbz2,gpkg:3,pak:2}
       ...someDescription

       5.0-r3(5.0R3)^f     or     5.0-r3:5.0R3^f
       ...someDescription

The reason is that it would help other users find the output section quickly, instead of reading through the whole document just to learn what versionNum^someChar means. The eix output uses ^ while the man page uses ˆ, two different symbols ( ˆ != ^ ). This makes searching very hard, because the man page is not consistent with the actual output ( ^ ).

^ - https://kbdlayout.info/how/%5E

ˆ - https://kbdlayout.info/how/%CB%86
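
That the two symbols really are different characters can be checked by comparing their bytes. The following is a minimal sketch and not part of eix; the helper name hex_of is hypothetical. The second character is written as the octal escapes \313\206, which is the UTF-8 encoding of U+02C6 (the same bytes that appear as %CB%86 in the link above), so the snippet does not depend on the encoding of the script file.

```shell
#!/bin/sh
# Hypothetical helper: print the bytes of its argument as lowercase hex.
hex_of() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

caret='^'                        # ASCII circumflex accent, U+005E
modifier=$(printf '\313\206')    # modifier letter circumflex accent, U+02C6

hex_of "$caret";    echo         # prints 5e
hex_of "$modifier"; echo         # prints cb86

# A literal search for "^f" finds nothing in text that contains the
# modifier letter instead; grep -c counts matching lines and prints 0 here.
printf '5.0-r3:5.0R3%sf\n' "$modifier" | grep -c -F '^f' || true
```

The last command illustrates the search problem: a user who types ^f, as seen in the eix output, gets no hit in a rendering that shows ˆf.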

This was discussed here: https://forums.gentoo.org/viewtopic-t-1170971-highlight-.html

My best regards

vaeth commented 1 month ago

In the manpage as well as in eix, the symbol ^ (the ASCII symbol, not the UTF-8 symbol) is specified.

If your manpage or terminal program displays this symbol differently in one of these cases, it is IMHO the fault of the manpage program (or manpage formatter) or terminal program.

That being said, if you want to change the symbol in the eix output, you can do so locally by setting FORMAT_RESTRICTSEPARATOR correspondingly in e.g. some file in /etc/eixrc; see the output of eix --dump|grep FORMAT_RESTRICTSEPARATOR. However, by default, I will change neither the manpage nor eix to a non-ASCII symbol.

Chiitoo commented 1 month ago

I suppose this is a localisation issue of some kind, somewhere.

While the sources [1] seem to indeed use a caret, it becomes a circumflex for me as well when viewing the manual normally.

If I do something like LANG="" man eix, then the caret is there instead.

My usual LANG is set to en_GB.UTF-8.

  1. https://github.com/vaeth/eix/blob/main/manpage/en-eix.1.in#L1084

kamikadoYukio commented 1 month ago

By checking the link provided (https://github.com/vaeth/eix/blob/main/manpage/en-eix.1.in#L1084), I can find the ^ without issues.

And yes, with LANG="" man eix, I can find the character by typing \^.

My eselect locale is set to en_US.utf8, and make.conf has LC_MESSAGES=C.utf8. Any ideas how to set the man page encoding without adding LANG before man eix?

kamikadoYukio commented 1 month ago

Firstly, I'd like to apologize for reopening this issue, as it seems the source was found. It was not a locale issue but more likely a groff issue.

I'd like to bring this excerpt from man 7 groff_man_style to your attention:

Notes
       Some tips on troubleshooting your man pages follow.

       o Some ASCII characters look funny or copy and paste wrong.
              On devices with large glyph repertoires, like UTF-8-capable
              terminals and PDF, several keyboard glyphs are mapped to code
              points outside the Unicode basic Latin range because that
              usually results in better typography in the general case.
              When documenting GNU/Linux command or C language syntax,
              however, this translation is sometimes not desirable.

              To get a "literal"...   ...should be input.
              --------------------------------------------
                                  '   \(aq
                                  -   \-
                                  \   \(rs
                                  ^   \(ha
                                  `   \(ga
                                  ~   \(ti
              --------------------------------------------

By editing the eix.1 file and replacing ^ (ASCII) with \(ha, I was able to get ^ (circumflex accent) instead of ˆ (modifier letter circumflex accent), even though my locale was set to UTF-8 and eix.1 is ASCII-encoded.

Another way to verify this is by running: echo "5.0-r3(5.0R3)\(haf or 5.0-r3:5.0R3^f"|groff -Tutf8|sed '/^$/d'
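
The manual edit described above can also be sketched with sed. This is only an illustration on a sample line, not the change that was actually applied to the eix sources, and the function name caret_to_ha is hypothetical. A blanket replacement over a whole man page source could also touch carets that are meant differently, so the result would need to be reviewed by hand.

```shell
#!/bin/sh
# Hypothetical helper: replace every ASCII ^ read from stdin with the
# groff special-character escape \(ha.
caret_to_ha() {
    sed 's/\^/\\(ha/g'
}

printf '%s\n' '5.0-r3(5.0R3)^f     or     5.0-r3:5.0R3^f' | caret_to_ha
# prints: 5.0-r3(5.0R3)\(haf     or     5.0-r3:5.0R3\(haf
```

If groff is installed, piping the converted line into groff -Tutf8 (as in the command above) should show the plain ^ in the rendered text.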

Best regards.

vaeth commented 1 month ago

The PR will get merged into eix-0.36.9.