ledger / vim-ledger

Vim plugin for Ledger
GNU General Public License v2.0
372 stars 55 forks source link

Align currency with no decimal point correctly #160

Closed RyanGibb closed 3 months ago

RyanGibb commented 3 months ago

The docs say:

If an amount has no decimal point, the imaginary decimal point to the right of the least significant digit will align.

But this was not the behaviour I observed when calling LedgerAlign, there seemed to be an off by one error. For example, with g:ledger_align_at = 50, I would get:

2024-07-13 test
  Expense                                 £10000
  Expense                                     £10.10
  Assets

This PR subtracts one from pos if there is no decimal point to give:

2024-07-13 test
  Expense                                  £10000
  Expense                                     £10.10
  Assets

Which is what I expected.

I've made this PR with what corrects this for me, but I'm not familiar with the codebase so if I've overlooked something any feedback or advice is welcome.

alerque commented 3 months ago

I see the problem (and thanks for the contribution), but I don't think this is the right solution. Yet.

I don't typically use anything without a decimal separator so I hadn't noticed this before. I tried it is some of my ledgers and the "off by one" result is sometimes off by one, sometimes off by 2, and sometimes off by 3. So basically the whole thing is screwed.

For example if you switch to $ in your example I get a different result. That suggests that we're somehow completely miscalculating characters, but it isn't like the pound sign is a double-byte character either so I don't know how that is happening. I also tried it with negative amounts, and got yet a different alignment.

Rather than a 1 character stop gap I think we need to figure out why we're not calculating the width right at all.

RyanGibb commented 3 months ago

Rather than a 1 character stop gap I think we need to figure out why we're not calculating the width right at all.

Agreed! I'll try and dig into the root cause of this when I find some time.

RyanGibb commented 3 months ago

I think this is due to variable unicode character byte length. See:

2024-07-13 test
  Expense                                  $10000
  Expense                                     $10.10
  Expense                                 £10000
  Expense                                     £10.10
  Expense                                €10000
  Expense                                     €10.10
  Expense                                      10.10
  Assets

Note that in UTF-8 $ is one byte, £ is two, and € is three.

matchend returns the byte index, however.

If we change:

    if pos < 0
      " Find the position after the first digits
      let pos = matchend(rhs, '\m\d[^[:space:]]*')
    endif

To (stolen from decimalpos):

      let pos = matchend(rhs, '\m\d[^[:space:]]*')
      if pos > 0
        let pos = strchars(rhs[:pos])
      endif

This appears to fix it:

2024-07-13 test
  Expense                                  $10000
  Expense                                     $10.10
  Expense                                  £10000
  Expense                                     £10.10
  Expense                                  €10000
  Expense                                     €10.10
  Expense                                      10.10
  Expense                                       0
  Expense                                      1 XXX @ $1
  Assets
RyanGibb commented 3 months ago

Numbers with a space after them are still mis-aligned by 1, though:

2024-07-13 test
  Expense                                 $10000 ; comment
  Expense                                     $10.10
  Expense                                  £10000
  Expense                                     £10.10
  Expense                                      1 XXX @ $1
  Assets
RyanGibb commented 3 months ago

This seems to be because matchend returns the index after the match, which will include an extra char when calling strchars if there is a space after it.

So:

      let pos = matchend(rhs, '\m\d[^[:space:]]*') - 1
      if pos >= 0
        let pos = strchars(rhs[:pos])
      endif

Gives us:

2024-07-13 test
  Expense                                  $10000 ; comment
  Expense                                     $10.10 ; comment
  Expense                                  £10000   ; comment
  Expense                                     £10.10
  Expense                                  €10000
  Expense                                     €10.10
  Expense                                      10.10
  Expense                                       0 GBP
  Expense                                       1 XXX @ $1
  Assets
alerque commented 3 months ago

In my incredulity over not having noticed the pound sign being a multi-byte character it has come to my attention that in my own personal finance records (where I handle a lot of currencies but only a trivial about of travel related things in Pounds) I've been using (U+20A4 LIRA SIGN) instead of £ (U+00A3 POUND SIGN) in my ledger transactions. Yikes. I'll need to re-train my fingers for Compose-L instead of Compose=L.

Also an even further side note, apparently historically pound sterling (and ever other numbers after decimalization) were supposed to use · (U+00B7 MIDDLE DOT) as their decimal separator with accommodation for . (U+0023 FULL STOP) when the former was not supported. Even UK keyboards not having an obvious way to enter interpuncts has lead the full stop to be pretty much de facto, but I'm now disappointed to find hledger does not support interpunct as a possible decimal separator at all! I may have to go fix that.

In the mean time this sounds like a much better diagnosis and fix. I'll give it a spin when I get a minute, but thanks for digging in and finding the real issue.

Did you poke around to see if we may be making the same mistake in any other calculations?

RyanGibb commented 3 months ago

In my incredulity over not having noticed the pound sign being a multi-byte character it has come to my attention that in my own personal finance records (where I handle a lot of currencies but only a trivial about of travel related things in Pounds) I've been using (U+20A4 LIRA SIGN) instead of £ (U+00A3 POUND SIGN) in my ledger transactions. Yikes. I'll need to re-train my fingers for Compose-L instead of Compose=L.

Also an even further side note, apparently historically pound sterling (and ever other numbers after decimalization) were supposed to use · (U+00B7 MIDDLE DOT) as their decimal separator with accommodation for . (U+0023 FULL STOP) when the former was not supported. Even UK keyboards not having an obvious way to enter interpuncts has lead the full stop to be pretty much de facto, but I'm now disappointed to find hledger does not support interpunct as a possible decimal separator at all! I may have to go fix that.

Sounds like a minefield :-)

In the mean time this sounds like a much better diagnosis and fix. I'll give it a spin when I get a minute, but thanks for digging in and finding the real issue.

That's great! You're most welcome.

Did you poke around to see if we may be making the same mistake in any other calculations?

I couldn't see anything obvious, the other use of matchend uses strchars to support multi-byte strings as well.