Closed RyanGibb closed 3 months ago
I see the problem (and thanks for the contribution), but I don't think this is the right solution. Yet.
I don't typically use anything without a decimal separator so I hadn't noticed this before. I tried it is some of my ledgers and the "off by one" result is sometimes off by one, sometimes off by 2, and sometimes off by 3. So basically the whole thing is screwed.
For example if you switch ₤
to $
in your example I get a different result. That suggests that we're somehow completely miscalculating characters, but it isn't like the pound sign is a double-byte character either so I don't know how that is happening. I also tried it with negative amounts, and got yet a different alignment.
Rather than a 1 character stop gap I think we need to figure out why we're not calculating the width right at all.
Rather than a 1 character stop gap I think we need to figure out why we're not calculating the width right at all.
Agreed! I'll try and dig into the root cause of this when I find some time.
I think this is due to variable unicode character byte length. See:
2024-07-13 test
Expense $10000
Expense $10.10
Expense £10000
Expense £10.10
Expense €10000
Expense €10.10
Expense 10.10
Assets
Note that in UTF-8 $ is one byte, £ is two, and € is three.
matchend
returns the byte index, however.
If we change:
if pos < 0
" Find the position after the first digits
let pos = matchend(rhs, '\m\d[^[:space:]]*')
endif
To (stolen from decimalpos
):
let pos = matchend(rhs, '\m\d[^[:space:]]*')
if pos > 0
let pos = strchars(rhs[:pos])
endif
This appears to fix it:
2024-07-13 test
Expense $10000
Expense $10.10
Expense £10000
Expense £10.10
Expense €10000
Expense €10.10
Expense 10.10
Expense 0
Expense 1 XXX @ $1
Assets
Numbers with a space after them are still mis-aligned by 1, though:
2024-07-13 test
Expense $10000 ; comment
Expense $10.10
Expense £10000
Expense £10.10
Expense 1 XXX @ $1
Assets
This seems to be because matchend
returns the index after the match, which will include an extra char when calling strchars
if there is a space after it.
So:
let pos = matchend(rhs, '\m\d[^[:space:]]*') - 1
if pos >= 0
let pos = strchars(rhs[:pos])
endif
Gives us:
2024-07-13 test
Expense $10000 ; comment
Expense $10.10 ; comment
Expense £10000 ; comment
Expense £10.10
Expense €10000
Expense €10.10
Expense 10.10
Expense 0 GBP
Expense 1 XXX @ $1
Assets
In my incredulity over not having noticed the pound sign being a multi-byte character it has come to my attention that in my own personal finance records (where I handle a lot of currencies but only a trivial about of travel related things in Pounds) I've been using ₤
(U+20A4 LIRA SIGN) instead of £
(U+00A3 POUND SIGN) in my ledger transactions. Yikes. I'll need to re-train my fingers for Compose-L instead of Compose=L.
Also an even further side note, apparently historically pound sterling (and ever other numbers after decimalization) were supposed to use ·
(U+00B7 MIDDLE DOT) as their decimal separator with accommodation for .
(U+0023 FULL STOP) when the former was not supported. Even UK keyboards not having an obvious way to enter interpuncts has lead the full stop to be pretty much de facto, but I'm now disappointed to find hledger
does not support interpunct as a possible decimal separator at all! I may have to go fix that.
In the mean time this sounds like a much better diagnosis and fix. I'll give it a spin when I get a minute, but thanks for digging in and finding the real issue.
Did you poke around to see if we may be making the same mistake in any other calculations?
In my incredulity over not having noticed the pound sign being a multi-byte character it has come to my attention that in my own personal finance records (where I handle a lot of currencies but only a trivial about of travel related things in Pounds) I've been using
₤
(U+20A4 LIRA SIGN) instead of£
(U+00A3 POUND SIGN) in my ledger transactions. Yikes. I'll need to re-train my fingers for Compose-L instead of Compose=L.Also an even further side note, apparently historically pound sterling (and ever other numbers after decimalization) were supposed to use
·
(U+00B7 MIDDLE DOT) as their decimal separator with accommodation for.
(U+0023 FULL STOP) when the former was not supported. Even UK keyboards not having an obvious way to enter interpuncts has lead the full stop to be pretty much de facto, but I'm now disappointed to findhledger
does not support interpunct as a possible decimal separator at all! I may have to go fix that.
Sounds like a minefield :-)
In the mean time this sounds like a much better diagnosis and fix. I'll give it a spin when I get a minute, but thanks for digging in and finding the real issue.
That's great! You're most welcome.
Did you poke around to see if we may be making the same mistake in any other calculations?
I couldn't see anything obvious, the other use of matchend
uses strchars
to support multi-byte strings as well.
The docs say:
But this was not the behaviour I observed when calling
LedgerAlign
, there seemed to be an off by one error. For example, withg:ledger_align_at = 50
, I would get:This PR subtracts one from pos if there is no decimal point to give:
Which is what I expected.
I've made this PR with what corrects this for me, but I'm not familiar with the codebase so if I've overlooked something any feedback or advice is welcome.