Performance issues for "bal -V" when having a lot of market prices

erikryb commented 5 years ago

I have a lot of market prices in my journal. Daily market prices for several commodities, about 25000 entries in total. The command "bal -V" is very slow as a result. This is the result of profiling:

    Mon Apr  1 22:15 2019 Time and Allocation Profiling Report  (Final)

       hledger +RTS -p -RTS bal assets liabilities -V

    total time  =      119.74 secs   (119737 ticks @ 1000 us, 1 processor)
    total alloc = 106,803,275,920 bytes  (excludes profiling overheads)

COST CENTRE MODULE              SRC                                      %time %alloc

amountValue Hledger.Data.Amount Hledger/Data/Amount.hs:(452,1)-(455,16)   96.2   97.0

                                                                                                                 individual     inherited
COST CENTRE                   MODULE                SRC                                         no.   entries  %time %alloc  %time %alloc
    journalApplyValue         Hledger.Cli.Utils     Hledger/Cli/Utils.hs:(127,1)-(133,22)       2481       1    0.0    0.0   96.3   97.0
     overJournalAmounts       Hledger.Data.Journal  Hledger/Data/Journal.hs:1009:1-74           2612       1    0.0    0.0   96.3   97.0
      traverseJournalAmounts  Hledger.Data.Journal  Hledger/Data/Journal.hs:(1017,1)-(1026,38)  2613       1    0.0    0.0   96.3   97.0
       amountValue            Hledger.Data.Amount   Hledger/Data/Amount.hs:(452,1)-(455,16)     2717    9921   96.2   97.0   96.3   97.0

The function "amountValue" calls the function "commodityValue", which seems to be sorting the market prices. This may be the reason it performs so slowly. Perhaps the solution would be to store the market prices internally in a data type which has faster lookup, instead of having to sort the entries from the journal every time?

EDIT: One option is having jmarketprices :: SortedList MarketPrice in the Journal type, where SortedList is from here.

simonmichael commented 5 years ago

Thanks for the report. Just curious, how many years' worth of data do you have in that one file ?

erikryb commented 5 years ago

Thanks for the report. Just curious, how many years' worth of data do you have in that one file ?

Some of the entries in that run dated back to 2001, which I realize is more than necessary for me. If I remove all entries before 2010, it still takes over one minute to run the command. I now have daily prices of 10 commodities, where two go back to 2017, two to 2013, one to 2011, and the rest to 2010.

simonmichael commented 5 years ago

@erikryb not necessary for this issue but it would be interesting to see stats output since 2001 & since 2010 too. As a one-year-at-a-time user I'm interested in what sort of performance others experience.

simonmichael commented 5 years ago

Some benchmark results, just for the record (with the same number of P records as transactions):

$ quickbench  "hledger -f examples/100x100x10-noprice.journal bal -V" "hledger -f examples/100x100x10.journal bal" "hledger -f examples/100x100x10.journal bal -V"
Running 3 tests 1 times at 2019-04-19 15:57:57 PDT:

Best times:
+-------------------------------------------------------++------+
|                                                       ||      |
+=======================================================++======+
| hledger -f examples/100x100x10-noprice.journal bal -V || 0.12 |
| hledger -f examples/100x100x10.journal bal            || 0.12 |
| hledger -f examples/100x100x10.journal bal -V         || 0.12 |
+-------------------------------------------------------++------+
$ quickbench  "hledger -f examples/1000x1000x10-noprice.journal bal -V" "hledger -f examples/1000x1000x10.journal bal" "hledger -f examples/1000x1000x10.journal bal -V"
Running 3 tests 1 times at 2019-04-19 15:45:47 PDT:

Best times:
+---------------------------------------------------------++------+
|                                                         ||      |
+=========================================================++======+
| hledger -f examples/1000x1000x10-noprice.journal bal -V || 0.30 |
| hledger -f examples/1000x1000x10.journal bal            || 0.30 |
| hledger -f examples/1000x1000x10.journal bal -V         || 0.46 |
+---------------------------------------------------------++------+
$ quickbench  "hledger -f examples/10000x1000x10-noprice.journal bal -V" "hledger -f examples/10000x1000x10.journal bal" "hledger -f examples/10000x1000x10.journal bal -V"
Running 3 tests 1 times at 2019-04-19 15:45:53 PDT:

Best times:
+----------------------------------------------------------++-------+
|                                                          ||       |
+==========================================================++=======+
| hledger -f examples/10000x1000x10-noprice.journal bal -V ||  1.20 |
| hledger -f examples/10000x1000x10.journal bal            ||  1.43 |
| hledger -f examples/10000x1000x10.journal bal -V         || 41.34 |
+----------------------------------------------------------++-------+

simonmichael commented 5 years ago

And profiterole-simplified profiles for the 10000 txn case without and with -V.

simonmichael commented 5 years ago

@erikryb SortedList sounds good, but I'm unsure how it will work with "-V chooses the most recent one, or in case of equal dates, the last-parsed one".

simonmichael commented 5 years ago

The latest commit fixes the wasteful sorting of market prices that you pointed out, and documents how they should be sorted. I think we can't use SortedList because our market prices are only partly sorted (same-date prices should remain in parse order). Benchmark results show a speedup:

$ stack exec -- quickbench "hledger.999.pre -f examples/1000x1000x10.journal bal" "hledger.999.pre -f examples/1000x1000x10.journal bal -V" "hledger.999 -f examples/1000x1000x10.journal bal -V"
Running 3 tests 1 times at 2019-04-20 14:19:48 PDT:

Best times:
+---------------------------------------------------------++------+
|                                                         ||      |
+=========================================================++======+
| hledger.999.pre -f examples/1000x1000x10.journal bal    || 0.37 |
| hledger.999.pre -f examples/1000x1000x10.journal bal -V || 0.51 |
| hledger.999 -f examples/1000x1000x10.journal bal -V     || 0.47 |
+---------------------------------------------------------++------+

$ stack exec -- quickbench "hledger.999.pre -f examples/10000x1000x10.journal bal" "hledger.999.pre -f examples/10000x1000x10.journal bal -V" "hledger.999 -f examples/10000x1000x10.journal bal -V"
Running 3 tests 1 times at 2019-04-20 14:20:04 PDT:

Best times:
+----------------------------------------------------------++-------+
|                                                          ||       |
+==========================================================++=======+
| hledger.999.pre -f examples/10000x1000x10.journal bal    ||  1.26 |
| hledger.999.pre -f examples/10000x1000x10.journal bal -V || 48.77 |
| hledger.999 -f examples/10000x1000x10.journal bal -V     || 11.37 |
+----------------------------------------------------------++-------+

I have not yet profiled the latest, perhaps it can be improved further.

simonmichael commented 5 years ago

More measurements:

$ stack exec -- quickbench -n2 -v -whledger.999.pre,hledger.999 "hledger bal -V -fexamples/1000x1000x10.journal" "hledger bal -V -fexamples/2000x1000x10.journal" "hledger bal -V -fexamples/3000x1000x10.journal" "hledger bal -V -fexamples/4000x1000x10.journal" "hledger bal -V -fexamples/5000x1000x10.journal" "hledger bal -V -fexamples/6000x1000x10.journal" "hledger bal -V -fexamples/7000x1000x10.journal" "hledger bal -V -fexamples/8000x1000x10.journal" "hledger bal -V -fexamples/9000x1000x10.journal" "hledger bal -V -fexamples/10000x1000x10.journal"
Running 10 tests 2 times with 2 executables at 2019-04-20 15:02:26 PDT:

Best times:
+-----------------------------------------++-----------------+-------------+
|                                         || hledger.999.pre | hledger.999 |
+=========================================++=================+=============+
| bal -V -fexamples/1000x1000x10.journal  ||            0.56 |        0.47 |
| bal -V -fexamples/2000x1000x10.journal  ||            1.31 |        0.80 |
| bal -V -fexamples/3000x1000x10.journal  ||            2.23 |        1.20 |
| bal -V -fexamples/4000x1000x10.journal  ||            4.15 |        2.92 |
| bal -V -fexamples/5000x1000x10.journal  ||           10.67 |        4.15 |
| bal -V -fexamples/6000x1000x10.journal  ||           10.99 |        4.80 |
| bal -V -fexamples/7000x1000x10.journal  ||           20.94 |        9.74 |
| bal -V -fexamples/8000x1000x10.journal  ||           27.87 |        8.18 |
| bal -V -fexamples/9000x1000x10.journal  ||           36.83 |       12.65 |
| bal -V -fexamples/10000x1000x10.journal ||           48.54 |       15.62 |
+-----------------------------------------++-----------------+-------------+

The old one looks superlinear (not O(n^2) as I expected), and the new one looks quicker but still superlinear. I suppose the fact that both number of transactions and number of prices are increasing complicates things a bit. Also there are some odd bumps in the performance curve so maybe these numbers can't really tell the story.

simonmichael commented 5 years ago

+------------------------------------------++-----------------+---------+
|                                          || hledger.999.pre | hledger |
+==========================================++=================+=========+
| -f examples/10000x1000x10.journal bal -V ||           44.88 |   21.38 |
| -f examples/20000x1000x10.journal bal -V ||          315.92 |   85.86 |
| -f examples/30000x1000x10.journal bal -V ||          687.99 |  184.97 |
+------------------------------------------++-----------------+---------+

simonmichael commented 5 years ago

It's still slow because we are still searching/filtering the whole price list once per posting.
It's not that easy to see how to make this faster. It can be done (eg split prices by commodity, split into yearly buckets..), but..
This is an inherently expensive operation that really we shouldn't be doing. Originally we converted the final reported amounts to value, which is a much smaller number of amounts. Since baf232d3 we convert all posting amounts to value (as of the report end date) before running reports. This was an easy way to make the feature available to all commands. But it does too much work, and it won't allow showing the value as of each period end in periodic reports (I think). Better to move value conversion back to the individual reports, even though it's more tedious coding.

simonmichael commented 5 years ago

Namely the balance, print and register commands and their variants.

simonmichael commented 5 years ago

@erikryb, let me know how that one performs in your setup. From the commit message:

Instead of converting all journal amounts to value early on, we now
convert just the report amounts to value, before rendering.
...
This should mostly fix -V's performance when there are many
transactions and prices (the price lookups could still be optimised),
and allow more flexibility for report-specific value calculations.

+------------------------------------------++-----------------+-------------------+--------------------------+
|                                          || hledger.999.pre | hledger.999.1sort | hledger.999.after-report |
+==========================================++=================+===================+==========================+
| -f examples/1000x1000x10.journal bal -V  ||            1.08 |              0.96 |                     0.76 |
| -f examples/2000x1000x10.journal bal -V  ||            1.65 |              1.05 |                     0.73 |
| -f examples/3000x1000x10.journal bal -V  ||            2.43 |              1.58 |                     0.84 |
| -f examples/4000x1000x10.journal bal -V  ||            4.39 |              1.96 |                     0.93 |
| -f examples/5000x1000x10.journal bal -V  ||            7.75 |              2.99 |                     1.07 |
| -f examples/6000x1000x10.journal bal -V  ||           11.21 |              3.72 |                     1.16 |
| -f examples/7000x1000x10.journal bal -V  ||           16.91 |              4.72 |                     1.19 |
| -f examples/8000x1000x10.journal bal -V  ||           27.10 |              9.83 |                     1.40 |
| -f examples/9000x1000x10.journal bal -V  ||           39.73 |             15.00 |                     1.51 |
| -f examples/10000x1000x10.journal bal -V ||           50.72 |             25.61 |                     2.15 |
+------------------------------------------++-----------------+-------------------+--------------------------+

There's one new limitation, not yet resolved: -V once again can pick a
valuation date in the future, if no report end date is specified and
the journal has future-dated transactions. We prefer to avoid that,
but reports currently are pure and don't have access to today's date.

erikryb commented 5 years ago

@simonmichael, the results look very promising! I will let you know when I have tested it. Thanks a lot! :smiley:

erikryb commented 5 years ago

I have since last time added daily prices for the rest of the commodities that I have, and have now 57852 price entries in total. The new commit makes a great improvement!

Before: hledger bal assets liabilities -V 301.17s user 0.76s system 100% cpu 5:01.65 total

After: hledger bal assets liabilities -V 2.77s user 0.10s system 99% cpu 2.873 total

Without -V (latest version): hledger bal assets liabilities 2.49s user 0.09s system 99% cpu 2.592 total

The difference between -V and no -V is quite small in the latest version, so performance tuning should probably be directed elsewhere for now. You may close the issue if you agree. Thanks again!

simonmichael commented 5 years ago

Presumably the difference is small because your report is not showing many numbers. I realised that with many columns, there could easily be more numbers to convert under the new mechanism. Eg in the test below,

we used to convert 20k amounts (10k txns x 2 postings)
now we convert 14 million amounts! (10k accounts x 1400 weekly columns)

$ stack exec -- quickbench -w hledger.999.pre,hledger.999.1sort,hledger.999.after-report "hledger -f examples/10000x1000x10.journal bal -W" "hledger -f examples/10000x1000x10.journal bal -WV"
Running 2 tests 1 times with 3 executables at 2019-04-24 07:22:16 PDT:

Best times:
+-------------------------------------------++-----------------+-------------------+--------------------------+
|                                           || hledger.999.pre | hledger.999.1sort | hledger.999.after-report |
+===========================================++=================+===================+==========================+
| -f examples/10000x1000x10.journal bal -W  ||           41.07 |             38.60 |                    38.85 |
| -f examples/10000x1000x10.journal bal -WV ||           87.64 |             70.65 |                    38.50 |
+-------------------------------------------++-----------------+-------------------+--------------------------+

And yet we see no slowdown at all here. Perhaps because most of the reported values are zero ? I tested with more non-zero reported amounts (20k in the balance report):

$ stack exec -- quickbench -w hledger.999.after-report "hledger -f examples/10000x1000x10.journal bal -Y" "hledger -f examples/10000x1000x10.journal bal -YV" "hledger -f examples/10000x1000x10.journal reg" "hledger -f examples/10000x1000x10.journal reg -V" "hledger -f examples/10000x1000x10.journal print" "hledger -f examples/10000x1000x10.journal print -V"
Running 6 tests 1 times with 1 executables at 2019-04-24 08:04:06 PDT:

Best times:
+---------------------------------------------------------------------++--------------------------+
|                                                                     || hledger.999.after-report |
+=====================================================================++==========================+
| hledger.999.after-report -f examples/10000x1000x10.journal bal -Y   ||                     3.14 |
| hledger.999.after-report -f examples/10000x1000x10.journal bal -YV  ||                     3.20 |
| hledger.999.after-report -f examples/10000x1000x10.journal reg      ||                     4.31 |
| hledger.999.after-report -f examples/10000x1000x10.journal reg -V   ||                     4.82 |
| hledger.999.after-report -f examples/10000x1000x10.journal print    ||                     3.35 |
| hledger.999.after-report -f examples/10000x1000x10.journal print -V ||                     3.53 |
+---------------------------------------------------------------------++--------------------------+

Still no slowdown. I'm not sure why.

erikryb commented 5 years ago

For the bal -W report most of the amounts are indeed zero. I think the number of non-zero amounts cannot be larger than the number of postings, so there should be no slowdown compared to the old way. But if you do bal -WH, with historical amounts, you may get arbitrarily many non-zero amounts. This would result in a slowdown compared to the old version. However, the balance of an account may change only when the account has a transaction, so it should still be possible to compute the values in O(#postings) time.

erikryb commented 5 years ago

Here is a comparison which shows the running time of computing daily historical balances on the file generated by runhaskell tools/generatejournal.hs 10000 1000 10.

Old version: hledger -f examples/10000x1000x10.journal bal -DVH 431.93s user 3.25s system 100% cpu 7:14.91 total

New version: hledger -f examples/10000x1000x10.journal bal -DVH 530.83s user 4.07s system 100% cpu 8:54.60 total

The difference in running time should be minimized if we convert values before adding them when doing historical or cumulative reports.

simonmichael commented 5 years ago

As noted in commit ebf5ed9, certain valuation modes are now slow again (balance --value-at=transaction, mainly) in order to get correct results. The valuation modes which can be done quickly (eg the default -V) are still quick.

simonmichael commented 5 years ago

Valuation code has changed quite a bit in latest master. @erikryb if you have a chance to rerun performance tests, that would be interesting.

simonmichael commented 5 years ago

Oh oh. These latest numbers suggest performance has regressed again:

$ stack exec -- quickbench -w hledger-1.14,hledger-1.15 "_ -f examples/10000x1000x10.journal bal -Y" "_ -f examples/10000x1000x10.journal bal -YV" "_ -f examples/10000x1000x10.journal reg" "_ -f examples/10000x1000x10.journal reg -V" "_ -f examples/10000x1000x10.journal print" "_ -f examples/10000x1000x10.journal print -V"
Running 6 tests 1 times with 2 executables at 2019-07-25 11:50:29 IST:

Best times:
+--------------------------------------------++--------------+--------------+
|                                            || hledger-1.14 | hledger-1.15 |
+============================================++==============+==============+
| -f examples/10000x1000x10.journal bal -Y   ||         3.21 |         3.30 |
| -f examples/10000x1000x10.journal bal -YV  ||        50.17 |        70.39 |
| -f examples/10000x1000x10.journal reg      ||         3.08 |         3.07 |
| -f examples/10000x1000x10.journal reg -V   ||        53.90 |       104.39 |
| -f examples/10000x1000x10.journal print    ||         2.46 |         2.42 |
| -f examples/10000x1000x10.journal print -V ||        43.85 |       110.46 |
+--------------------------------------------++--------------+--------------+

simonmichael commented 5 years ago

Profile of bal -V with 1001 amounts:

$ make quickprof-"bal -V" 
stack build --profile hledger
stack exec --profile -- hledger +RTS -P -RTS bal -V -f examples/1000x1000x10.journal >/dev/null
profiterole hledger.prof
Writing to hledger.profiterole.txt ...
Writing to hledger.profiterole.html ...
Done

    Fri Jul 26 00:40 2019 Time and Allocation Profiling Report  (Final)

       hledger +RTS -P -RTS bal -V -f examples/1000x1000x10.journal

    total time  =        1.35 secs   (1347 ticks @ 1000 us, 1 processor)
    total alloc = 1,482,942,296 bytes  (excludes profiling overheads)

COST CENTRE                         MODULE                        SRC                                                 %time %alloc  ticks     bytes

pricesAtDate.(...)                  Hledger.Data.Valuation        Hledger/Data/Valuation.hs:(121,5)-(128,43)           16.2   36.9    218 547884016
nubSortBy                           Data.List.Extra               src/Data/List/Extra.hs:(656,1)-(659,19)               8.1    6.2    109  91671272
nubSortBy.f                         Data.List.Extra               src/Data/List/Extra.hs:(657,11)-(659,19)              7.0    3.1     94  45600000
likelyExecutablesInPath             Hledger.Cli.CliOptions        Hledger/Cli/CliOptions.hs:(667,1)-(670,31)            6.2    1.2     84  18281784
getDirectoryContentsSafe            Hledger.Cli.CliOptions        Hledger/Cli/CliOptions.hs:(720,1)-(721,115)           2.4    1.4     33  21456648
priceDirectiveToMarketPrice         Hledger.Data.Valuation        Hledger/Data/Valuation.hs:(139,1)-(144,14)            2.3   11.3     31 167200000
realFracToDecimal                   Data.Decimal                  src/Data/Decimal.hs:100:1-54                          2.2    1.3     29  19651200
token                               Text.Megaparsec.Internal      Text/Megaparsec/Internal.hs:324:3-28                  1.9    1.4     26  20822464
acommodity                          Hledger.Data.Types            Hledger/Data/Types.hs:207:7-16                        1.8    0.0     24         0
sortAccountItemsLike.\              Hledger.Reports.BalanceReport Hledger/Reports/BalanceReport.hs:150:20-50            1.5    0.0     20     40000
showMixedAmountWithoutPrice.showamt Hledger.Data.Amount           Hledger/Data/Amount.hs:(677,5)-(679,59)               1.3    0.4     18   6409648
...

Cleaned up with profiterole: hledger.profiterole.txt

simonmichael commented 5 years ago

I think this call to nubSortBy is one of the biggest costs. I think it's supposed to be O(n log n). The calling function pricesAtDate seems to get called 1900 times for a balance report showing 1000 amounts. GHC reports nubSortBy being called 0 times, but I guess it's once per pricesAtDate call. nubSortBy's f subfunction is called 1900000 times. Relevant part of the profiterole profile:

                                                                      TOT   INH   IND
* Hledger.Data.Valuation pricesAtDate.(...) (1900)                   37.9  36.4  16.0
** Data.List.Extra nubSortBy (0)                                     14.5  14.5   7.7
*** Data.List.Extra nubSortBy.f (1900000)                             6.8   6.8   6.8
** Hledger.Data.Valuation priceDirectiveToMarketPrice (1900000)       4.9   3.4   3.4
*** Hledger.Data.Types acommodity (1900000)                           1.5     -     -
** Hledger.Data.Valuation pricesAtDate.(...).\ (3796200)              2.0   2.0   2.0
** Hledger.Data.Types pddate (1900000)                                 .5    .5    .5

TOT = total inherited time
INH = unique inherited time in this execution branch, excluding common subbranches
IND = individual time of this expression

simonmichael commented 5 years ago

In a -f examples/1000x1000x10.journal bal -YV benchmark taking ~1s, about 1000 account balances are shown across 3 years (3000 valuations/s). All 1000 price directives are in the past, so in effect.

In the -f examples/10000x1000x10.journal bal -YV benchmark above, 1000 account balances are shown across 28 years in ~70s (400 valuations/s). About 7000 of the price directives are in effect as of today.

I think these reports are something like: O( ( number of amounts shown ) x ( number of price directives dated before the valuation date ) ). That means 3 million (3000 x 1000) and 196 million (28000 x 7000) operations respectively for these benchmarks.

simonmichael commented 5 years ago

Good news, we are doing a lot of redundant work - re-filtering all price directives and building the price graph again for every amount. Really that need only be done once per valuation date, and I think my favourite secret weapon uglymemo might take care of this. TBD.

simonmichael commented 5 years ago

Can anyone see why the following memoisation makes it twice as slow, instead of faster ?

-- New:
-- | A memoising version of pricesAtDate_.
-- Caches the price graph for each [PriceDirective], Day pair.
pricesAtDate :: [PriceDirective] -> Day -> PriceGraph
pricesAtDate = curry (memo (uncurry pricesAtDate_))

-- Old:
-- | Convert a list of market price directives in parse order to a
-- graph of all prices in effect on a given day, allowing efficient
-- lookup of exchange rates between commodity pairs.
pricesAtDate_ :: [PriceDirective] -> Day -> PriceGraph
pricesAtDate_ pricedirectives d = PriceGraph{prGraph=g, prNodemap=m, prDeclaredPairs=dps}
  where
    ...

simonmichael commented 5 years ago

@adept, @cocreature, @hpdeifel, @mstksg, @ony, @schoettl etc., any ideas ?

simonmichael commented 5 years ago

Or was my benchmarking wrong ? Now it seems to be twice as fast. I expected more speedup.

00:02:30 ~/src/hledger$ quickbench -n2 -N2 -w hledger-1.15a,hledger-1.15b "hledger -f examples/1000x1000x10.journal bal -V" "hledger -f examples/2000x1000x10.journal bal -V"  "hledger -f examples/3000x1000x10.journal bal -V"  "hledger -f examples/4000x1000x10.journal bal -V"  "hledger -f examples/5000x1000x10.journal bal -V" 
Running 5 tests 2 times with 2 executables at 2019-07-31 00:03:52 IST:

Best times 1:
+-----------------------------------------++---------------+---------------+
|                                         || hledger-1.15a | hledger-1.15b |
+=========================================++===============+===============+
| -f examples/1000x1000x10.journal bal -V ||          0.69 |          0.58 |
| -f examples/2000x1000x10.journal bal -V ||          1.26 |          0.94 |
| -f examples/3000x1000x10.journal bal -V ||          2.05 |          1.19 |
| -f examples/4000x1000x10.journal bal -V ||          3.02 |          1.91 |
| -f examples/5000x1000x10.journal bal -V ||          4.48 |          2.27 |
+-----------------------------------------++---------------+---------------+

Best times 2:
+-----------------------------------------++---------------+---------------+
|                                         || hledger-1.15a | hledger-1.15b |
+=========================================++===============+===============+
| -f examples/1000x1000x10.journal bal -V ||          0.69 |          0.58 |
| -f examples/2000x1000x10.journal bal -V ||          1.28 |          0.92 |
| -f examples/3000x1000x10.journal bal -V ||          2.01 |          1.24 |
| -f examples/4000x1000x10.journal bal -V ||          3.22 |          1.74 |
| -f examples/5000x1000x10.journal bal -V ||          4.58 |          2.21 |
+-----------------------------------------++---------------+---------------+
00:05:10 ~/src/hledger$ quickbench -w hledger-1.15a,hledger-1.15b "hledger -f examples/10000x1000x10.journal bal -V"
Running 1 tests 1 times with 2 executables at 2019-07-31 00:05:39 IST:

Best times:
+------------------------------------------++---------------+---------------+
|                                          || hledger-1.15a | hledger-1.15b |
+==========================================++===============+===============+
| -f examples/10000x1000x10.journal bal -V ||          8.93 |          4.63 |
+------------------------------------------++---------------+---------------+

simonmichael commented 5 years ago

One more for tonight, similar to the 7/25 benchmark.

The bal -V test above prices 1000 amounts and seemed faster
the bal -VY test below prices 28000 amounts (28 yearly columns) and is clearly slower. (Why?)

The reg -V and print -V tests both price 20000 amounts I think, and seem to have improved, matching hledger 1.14.


00:16:01 ~/src/hledger$ stack exec -- quickbench -w hledger-1.14,hledger-1.15old,hledger-1.15memo "_ -f examples/10000x1000x10.journal bal -Y" "_ -f examples/10000x1000x10.journal bal -YV" "_ -f examples/10000x1000x10.journal reg" "_ -f examples/10000x1000x10.journal reg -V" "_ -f examples/10000x1000x10.journal print" "_ -f examples/10000x1000x10.journal print -V"
Running 6 tests 1 times with 3 executables at 2019-07-31 00:16:10 IST:

Best times: +--------------------------------------------++--------------+-----------------+------------------+ | || hledger-1.14 | hledger-1.15old | hledger-1.15memo | +============================================++==============+=================+==================+ | -f examples/10000x1000x10.journal bal -Y || 2.44 | 2.53 | 2.77 | | -f examples/10000x1000x10.journal bal -YV || 46.82 | 57.18 | 118.59 | | -f examples/10000x1000x10.journal reg || 2.93 | 3.01 | 3.10 | | -f examples/10000x1000x10.journal reg -V || 51.39 | 109.77 | 57.51 | | -f examples/10000x1000x10.journal print || 3.12 | 2.77 | 2.60 | | -f examples/10000x1000x10.journal print -V || 46.03 | 108.86 | 41.28 | +--------------------------------------------++--------------+-----------------+------------------+

simonmichael commented 5 years ago

Now it seems to be twice as fast. I expected more speedup.

Oh - those 1000x1000x10, 2000x1000x10... files all had 1000 accounts, so the number of valuations was not changing, only the number of transactions being parsed and summed was. And I should remember that these benchmarks are measuring that as well, not just valuation performance.

simonmichael commented 5 years ago

Aha. For the record, it has been scaling badly with (number of price directives) x (number of valuation dates). This explains the slow bal -YV benchmark above, which has 10000 directives x 28 dates. The memoisation is a pessimisation in this case because I naively used the full list of price directives as part of the memo key, and each price directive contains an Amount with its Decimal quantity, and comparing all of these on every call (800 million times) is slow. Comparing profiterole profiles was what finally revealed it.

simonmichael commented 5 years ago

Many wrong turns later.. I understand memoisation better, and compared to the old valuation code or the unmemoised new valuation code, valuation is now "blazing fast":

$ quickbench -w hledger-1.14,hledger-1.15old,hledger-1.15memo "_ -f examples/10000x1000x10.journal bal -Y" "_ -f examples/10000x1000x10.journal bal -YV" "_ -f examples/10000x1000x10.journal reg -Y" "_ -f examples/10000x1000x10.journal reg -YV"
Running 4 tests 1 times with 3 executables at 2019-08-19 01:47:09 IST:

Best times:
+-------------------------------------------++--------------+-----------------+------------------+
|                                           || hledger-1.14 | hledger-1.15old | hledger-1.15memo |
+===========================================++==============+=================+==================+
| -f examples/10000x1000x10.journal bal -Y  ||         2.43 |            2.43 |             2.44 |
| -f examples/10000x1000x10.journal bal -YV ||        44.91 |           96.34 |             6.48 |
| -f examples/10000x1000x10.journal reg -Y  ||         4.60 |           10.39 |             4.15 |
| -f examples/10000x1000x10.journal reg -YV ||        61.09 |           72.74 |             7.21 |
+-------------------------------------------++--------------+-----------------+------------------+

erikryb commented 5 years ago

Awesome! :smiley:

simonmichael / hledger

Performance issues for "bal -V" when having a lot of market prices #999