simonmichael / hledger

Robust, fast, intuitive plain text accounting tool with CLI, TUI and web interfaces.
https://hledger.org
GNU General Public License v3.0
3.08k stars 320 forks source link

Unbalanced entries from commodity exchange #2051

Closed claudi closed 10 months ago

claudi commented 1 year ago

From https://groups.google.com/g/hledger/c/QwDqPO8nepA/m/UXS_fgBFAAAJ.

It appears using --exchange COMM can lead to unbalanced entries. For example with the input

commodity 10,000.00 NOK

P 2020-05-08 EUR 11.0885 NOK

2020-05-08 Sample
    acc1       71.58 EUR
    acc2      -35.79 EUR
    acc3

running hledger print -X NOK results in

2020-05-08 Sample
    acc1      793.71 NOK
    acc2     -396.86 NOK
    acc3     -396.86 NOK

which is unbalanced.

According to #931, and more specifically to #1430, seems like this was fixed at some point, but as of version 1.30 on Linux it does not work.

simonmichael commented 1 year ago

Thanks for the report. I think #931 fixed print's default output, but now we need to do something about the valued (-V or -X) output.

The problem can arise with particular amounts and output precisions. Eg with this example it happens with two decimal places but not with one or three:

$ hledger print -X A | hledger -f- print
2020-12-15
    assets           0.3 A
    assets           0.1 A
    income          -0.4 A

$ hledger print -X A | hledger -f- print
hledger: Error: -:1-4:
1 | 2020-12-15
  |     assets          0.28 A
  |     assets          0.14 A
  |     income         -0.43 A

This transaction is unbalanced.
The real postings' sum should be 0 but is: -0.01 A
Consider adjusting this entry's amounts, or adding missing postings.
$ hledger print -X A | hledger -f- print
2020-12-15
    assets         0.281 A
    assets         0.145 A
    income        -0.426 A

I'm not immediately sure how to fix this, any ideas ?

simonmichael commented 1 year ago

print without valuation prints the all decimal places seen in the journal, which is ok. But valuation can probably generate very many decimal places sometimes.

I think ideally it would print with the target commodity's display precision, but also add just enough extra decimals to show a balanced entry, when needed.

But will there be entries which can never be rendered as a balanced entry when valued, because of infinitely repeating/recurring decimals ?

simonmichael commented 1 year ago

Here's the example again, and the results of printing with display precisions 0-10. In this case precisions 2, 4 and 5 cause unbalanced output.

commodity 1,000.00 A

P 2020-12-10 A 72.35

2020-12-15
 assets   20.35   ;   0.281271596406358   A
 assets   10.46   ; + 0.14457498272287495 A
                  ; = 0.42584657912923296 A
 income  -30.81   ;  -0.4258465791292329  A

comment
Results with display precisions 0-10, unbalanced entries are printed
at precisions 2, 4 and 5:

$ hledger print -X A -cA0. | hledger -f- print
2020-12-15
    assets               0
    assets               0
    income               0

$ hledger print -X A -cA0.0 | hledger -f- print
2020-12-15
    assets            A0.3
    assets            A0.1
    income           A-0.4

$ hledger print -X A -cA0.00 | hledger -f- print
hledger: Error: -:1-5:
1 | 2020-12-15
  |     assets           A0.28
  |     assets           A0.14
  |     income          A-0.43

This transaction is unbalanced.
The real postings' sum should be 0 but is: A-0.01
Consider adjusting this entry's amounts, or adding missing postings.

$ hledger print -X A -cA0.000 | hledger -f- print
2020-12-15
    assets          A0.281
    assets          A0.145
    income         A-0.426

$ hledger print -X A -cA0.0000 | hledger -f- print
hledger: Error: -:1-5:
1 | 2020-12-15
  |     assets         A0.2813
  |     assets         A0.1446
  |     income        A-0.4258

This transaction is unbalanced.
The real postings' sum should be 0 but is: A0.0001
Consider adjusting this entry's amounts, or adding missing postings.

$ hledger print -X A -cA0.00000 | hledger -f- print
hledger: Error: -:1-5:
1 | 2020-12-15
  |     assets        A0.28127
  |     assets        A0.14457
  |     income       A-0.42585

This transaction is unbalanced.
The real postings' sum should be 0 but is: A-0.00001
Consider adjusting this entry's amounts, or adding missing postings.

$ hledger print -X A -cA0.000000 | hledger -f- print
2020-12-15
    assets       A0.281272
    assets       A0.144575
    income      A-0.425847

$ hledger print -X A -cA0.0000000 | hledger -f- print
2020-12-15
    assets      A0.2812716
    assets      A0.1445750
    income     A-0.4258466

$ hledger print -X A -cA0.00000000 | hledger -f- print
2020-12-15
    assets     A0.28127160
    assets     A0.14457498
    income    A-0.42584658

$ hledger print -X A -cA0.000000000 | hledger -f- print
2020-12-15
    assets     A0.281271596
    assets     A0.144574983
    income    A-0.425846579

$ hledger print -X A -cA0.0000000000 | hledger -f- print
2020-12-15
    assets     A0.2812715964
    assets     A0.1445749827
    income    A-0.4258465791
simonmichael commented 1 year ago

If we don't round at all, hledger shows the Decimal library's maximum 255 decimal places, and the amounts still do not balance (these valuations seem to be irrational numbers):

$ stack exec -- hledger print -X A | hledger -f- print
hledger: Error: -:1-5:
1 | 2020-12-15
  |     assets     A0.281271596406357982031789910158949550794747753973738769868693849343469246717346233586731167933655839668279198341395991706979958534899792674498963372494816862474084312370421561852107809260539046302695231513476157567380787836903939184519695922598479612992407
  |     assets     A0.144574982722874913614374568071872840359364201796821008984105044920525224602626123013130615065653075328265376641326883206634416033172080165860400829302004146510020732550103662750518313752591568762957843814789219073946095369730476848652384243261921216309611
  |     income    A-0.425846579129232895646164478230822391154111955770559778852798894263994471319972356599861782999308914996544574982722874913614374568071872840359364201796821008984105044920525224602626123013130615065653075328265376641326883206634416033172080165860400829302017

This transaction is unbalanced.
The real postings' sum should be 0 but is: A0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
Consider adjusting this entry's amounts, or adding missing postings.
simonmichael commented 1 year ago

I suppose we are hitting a limitation of our valuation approach: it is currently done with simple arithmetic, whereas (more) serious financial software probably considers currency's smallest subunit and avoids splitting those. So I think fixing this requires specifying and implementing subunit-aware valuation, or just document that print -X can show unbalanced entries, which you can usually fix by experimenting with different display precisions.

claudi commented 1 year ago

If we accept the information loss in the conversion we could try to work around it.

What about having print --exchange COMM show implicit amounts for, say, the last account in an entry? Or any other account that already had implicit amounts.

So, in my example the output would be something like

2020-05-08 Sample
    acc1      793.71 NOK
    acc2     -396.86 NOK
    acc3

Sketchy, but one could still use the --explicit flag to force them back.

Another idea is to allow the user to specify some "overfill" account where the ±unit would be lost? Similar to --close-acc and similar in hledger close. For an example output:

2020-05-08 Sample
    acc1             793.71 NOK
    acc2            -396.86 NOK
    acc3            -396.86 NOK
    equity:overfill    0.01 NOK

Don't know if equity is the right place for this. I guess they would tend to balance each other out, but this one is even sketchier.

the-solipsist commented 1 year ago

I personally don't think this is a 'bug' in the software (though it is a problem to be dealt with by the user) as this is a problem inherent in multi-currency accounting. Perhaps the simplest solution is to do what is currently being done: to flag an error and allow the user to solve it, This way, at least, a valid journal isn't replaced by an invalid journal via print.

Here's some discussion on how beancount handles tolerances and precision:

http://furius.ca/beancount/doc/tolerances

simonmichael commented 1 year ago

These are helpful ideas, thanks!

I agree with the principle that print should never produce unparseable output. We have worked towards this in the past.

I also think it should never fail [(once it has successfully read the input)]; anyone could get into this situation unexpectedly through normal usage, and we don't want print to feel unreliable. So let's not raise an error (1).

Keeping implicit amounts, or forcing one to be implicit if there are none (2), would allow the transaction to balance. But I think adding the imbalance to a somewhat random posting is not ok, as it can change the transaction's accounting meaning.

Adding a tolerance for imprecision (3) is what we currently do - like Beancount (and Ledger ?), except ours is based on global display precision for the commodity, whereas Beancount and Ledger calculate it as needed for each individual transaction. This is something we want to do anyway. Would it solve this issue, and how exactly ? I'm not clear on it yet.

Adding an extra equity posting to catch the imbalance (4) is also possible, and seems easy and not too objectionable for users ?

Calculating value in a subunit-aware way, that avoids generating irrational decimals (5), should also solve this issue. This sounds nice to have but it's too costly for me to take on at present.

I think I need to spend some more time reviewing hledger/Ledger/BC's handling of 3. Other comments welcome.

simonmichael commented 1 year ago

PS I'm going to call this an enhancement not a bug, since it's the next step in our long quest to print always-valid journal output.

simonmichael commented 1 year ago

Well, still at least a documentation bug. Ok.

simonmichael commented 1 year ago

Problems with adding an imbalance posting (4):

simonmichael commented 1 year ago

Related: #1964

simonmichael commented 1 year ago

Some more notes:

Essentially a balanced transaction is one where the amounts sum to zero. But it's more complicated in practice. There are two kinds of transaction balancing, mostly independent. Currently,

"Input balancing" is done by hledger at read time, converting each journal entry to a complete and balanced transaction in memory (see Transaction type). A balanced Transaction is one that that appears balanced when amounts are rendered with their commodities' display precisions, ie imprecisions not visible when displayed are ignored (see isTransactionBalanced function).

"Output balancing" is done by the print command at output time, converting each Transaction to a balanced journal entry. A balanced journal entry can be read successfully by hledger, but also its amounts appear to the human eye to sum to zero exactly (no imprecision allowed), and all amounts must be displayable with all of their decimal places (to support print's --explicit/-x flag).

(#1964 is about input balancing, #2051 is about output balancing.)

hledger uses a decimal number representation internally, with up to 255 decimal places (see Decimal). Amounts in hledger's input always have a finite number (<= 255) of decimal places. But certain hledger operations (using division) can generate amounts with excessively many or infinitely periodic decimal places, which I'll call "long decimals". Eg converting to value with an inferred reverse market price (see https://hledger.org/dev/hledger.html#finding-market-price) can generate long decimals (1 / 72.35 in an example above has a repeating cycle of 1446 decimals).

Transaction's imprecise, display-precision-guided balancing makes it practical for us humans to work with cost and value conversions, which often generate more decimal places than we'd want to deal with. It also means Transactions can hold long decimals (rounded to 255 places), and still be considered balanced.

Journal entries require finite, preferably short, decimals. So generating a balanced journal entry requires (among other things) getting rid of long decimals. (And if numbers were stored internally as ratios (fractions) like Ledger, these would need to be converted to finite decimals.)

simonmichael commented 1 year ago

PS I've had some help with my math terminology.. in fact these are not irrational numbers - wherever I've used that term, I mean "periodically repeating decimals" or just "too many decimals". I've edited the comment above.

simonmichael commented 10 months ago

Since hledger 1.32, print is smarter about decimals, and it now shows this:

$ hledger-1.32 print -X NOK
2020-05-08 Sample
    acc1      793.71483 NOK
    acc2    -396.857415 NOK
    acc3    -396.857415 NOK

So I think this has been fixed.