simonmichael / hledger

Robust, fast, intuitive plain text accounting tool with CLI, TUI and web interfaces.
https://hledger.org
GNU General Public License v3.0

Pager ignores local encoding #2275

Open thielema opened 1 month ago

thielema commented 1 month ago

The following applies to hledger/HEAD (fa3676df7da0d18ede79a8fb87558d2ed6e81f8f).

I have a journal encoded in Latin-1. I can view it like so:

$ iconv -f latin1 hledger-journal
~ monthly from 2024-01-01  Periodic
    Forderung   42 Euro
    Erlös

2024-01-01 Bla
    Bank  23 Euro
    Forderung

Please note the umlaut "ö" in "Erlös".

For short output, the local encoding is respected. Long output, however, is encoded in UTF-8, even when it is written to a file.

$ LANG=de_DE hledger print -f hledger-journal -e 2024-02-10 --forecast >hledger-journal-printed 
$ iconv -f latin1 hledger-journal-printed
2024-01-01 Bla
    Bank              23 Euro
    Forderung

2024-02-01 Periodic
    Forderung         42 Euro
    Erlös

This is the expected result. LANG=de_DE switches to Latin-1 encoding for this one run of hledger.

Now a longer output:

$ LANG=de_DE hledger print -f hledger-journal -e 2024-10-10 --forecast >hledger-journal-printed 
$ iconv -f latin1 hledger-journal-printed
2024-01-01 Bla
    Bank              23 Euro
    Forderung

2024-02-01 Periodic
    Forderung         42 Euro
    Erlös

2024-03-01 Periodic
    Forderung         42 Euro
    Erlös

...

I don't think the encoding should depend on the output length. When output is redirected to a file, no pager should be involved at all.

My preferred solution would be to always use the local encoding, as was the case until now.
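
The two encodings discussed above can be told apart by looking at the raw bytes rather than at what the terminal renders: Latin-1 stores "ö" as the single byte 0xF6, while UTF-8 stores it as the two bytes 0xC3 0xB6. A minimal sketch of such a check follows; the helper name `oe_encoding` is hypothetical and not part of hledger, and it only looks for "ö", so it is a heuristic for this particular journal rather than a general encoding detector.

```shell
# Hypothetical helper, not part of hledger: report how "ö" is encoded in a file.
# Latin-1 stores "ö" as the single byte 0xF6; UTF-8 stores it as 0xC3 0xB6.
oe_encoding() {
    utf8_oe=$(printf '\303\266')    # octal escapes for bytes 0xC3 0xB6
    latin1_oe=$(printf '\366')      # octal escape for byte 0xF6
    # LC_ALL=C makes grep compare raw bytes instead of decoded characters.
    if LC_ALL=C grep -qF "$utf8_oe" "$1"; then
        echo utf-8
    elif LC_ALL=C grep -qF "$latin1_oe" "$1"; then
        echo latin-1
    else
        echo unknown
    fi
}
```

Running `oe_encoding hledger-journal-printed` after each of the two `hledger print` commands above would show whether the short and the long output really differ in encoding.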

simonmichael commented 1 month ago

That’s certainly surprising behaviour. Thanks for the report.

Since enabling this feature I’ve documented the logic in runPager. It’s probably still too weak at detecting when to use a pager (compared to, for example, our ANSI colour detection, which is aware of pipes).

It sounds like there might also be a problem with the encoding being lost when a pager is legitimately used.
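
The pipe-aware behaviour mentioned above can be illustrated at the shell level: a pager is only useful when standard output is a terminal, and redirected output should pass through byte for byte. This is a minimal sketch of that idea, not hledger's actual runPager logic; the function name `maybe_page` is hypothetical, and falling back to `less` when `PAGER` is unset is an assumption.

```shell
# Hypothetical wrapper, not hledger's runPager: page only when stdout is a terminal.
maybe_page() {
    if [ -t 1 ]; then
        # Interactive terminal: hand the text to the user's pager.
        ${PAGER:-less}
    else
        # Redirected to a file or pipe: pass the bytes through untouched.
        cat
    fi
}
```

With such a check, `some-command | maybe_page > out.txt` never starts a pager, so whatever encoding the producing command chose reaches the file unchanged.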