Open simonmichael opened 11 months ago
Are those all compiled with the same GHC version ? Actually no, those are built with 9.0.2, 9.2.2, 9.4.6 and 9.6.3 respectively. So it's not quite a fair test.
They are all ARM binaries, and all optimised builds (I'm fairly certain).
Profiling and some performance fixes were done as part of #2153. Summary: 1.32.3 is much faster than 1.29-1.32, but still slower than 1.25 (which is somewhat expected since it is providing more powerful features in the journal format).
$ for v in 1.25 1.26 1.28 1.29 1.30 1.31 1.32 1.32.3; do printf "$v:\t"; hledger-$v -f examples/10ktxns-10kaccts.journal stats | grep throughput; done
1.25: Run time (throughput) : 0.93s (10735 txns/s)
1.26: Run time (throughput) : 1.18s (8501 txns/s)
1.28: Run time (throughput) : 1.16s (8607 txns/s)
1.29: Run time (throughput) : 3.73s (2684 txns/s)
1.30: Run time (throughput) : 6.13s (1631 txns/s)
1.31: Run time (throughput) : 6.77s (1478 txns/s)
1.32: Run time (throughput) : 5.95s (1680 txns/s)
1.32.3: Run time (throughput) : 1.60s (6239 txns/s)
See #2153 for some recent benchmarking data. That issue fixed a major performance drop for unusually large numbers of accounts from 1.29 onward. On this one we'll continue discussing the remaining, lesser performance drop since 1.25. This one can be argued as a wish rather than a bug or regression since absence of (reasonable) performance fluctuations has never been promised. I think as part of resolving this issue though we should implement performance regression testing, and try to avoid or at least document any performance drops going forward.
Well, it's true that performance was never guaranteed, but I'd still consider it a regression, since hledger became slower, not faster since 1.25. With my ledger file, with 17k transaction and 470 accounts, I get similar results as in your last comment, runtime for balance is somewhere around ~1,5 times slower in git head than in 1.25. It may not be much, but it's enough to cross the subjective threshold of annoyance.
I know that there have been a lot of features added to hledger since 1.25 that explain the general slowdown. But then, there is this philosophical question should a feature add its weight in terms of performance even if not used? Could I get a much faster stripped down version with only basic features for a daily use, and have slower execution only when I actually use some advanced or heavier functions?
As far as I'm concerned, I can keep using 1.25 for the next few years until my journal file grows enough for me to reach the annoyance threshold and have to revisit again. Or split the file and get on with the life.
Noted @markokocic, thanks for your input as one of the (few?) users running large >10k-txn files. I agree it's arguable, and it's at least a bug in our process that it took so long to discover. But I feel like letting this one escape the dreaded black label on a technicality. There's no bounty involved since I reported it first so I'm assuming it's harmless to do so. Certainly it'll be considered a regression if it happens again, and certainly I hope we'll find ways to bring performance back to 1.25 level and beyond.
PS any ideas for simple robust automated performance testing ? I'm thinking about:
PS I recently added memory usage to the stats
output, that could be interesting to test also.
I experimented with printing CPU time rather than elapsed time but it seems unlikely to be needed.
I don't think saving results to file would help, since the run results may vary across runs. Maybe have a script that will do benchmark against ledger, hledger-1.25 and previous release which can be run on demand or on a ci.
Btw, I just checked a new stats
command. Maybe you want to split memory data to a separate line for a better readability.
Regarding speed, I finally managed to fix my journal to its format can be read directly by toht ledger and hledger, and here's the quick comparison with the latest from git.
marko@fine ~/s/cvstree> time hledger bal > /dev/null
________________________________________________________
Executed in 3.89 secs fish external
usr time 3.60 secs 1.21 millis 3.60 secs
sys time 0.31 secs 0.00 millis 0.31 secs
marko@fine ~/s/cvstree> time ledger bal > /dev/null
________________________________________________________
Executed in 382.45 millis fish external
usr time 309.59 millis 1.46 millis 308.13 millis
sys time 72.52 millis 0.49 millis 72.03 millis
I like hledger usability more, but wish it would be possible to approach ledger balance performance as close as possible :)
Also, I don't care if the ticket is a regression or whoever reported it, at least not in a sense of bounty, I'd only wish some performance back.
I tend to see this (register much faster, but print and balance a bit slower) in recent releases:
More notably, right now on this machine I can see ~25k TPS with hledger 1.25
stats
, but only ~15k TPS with 1.32. A new round of profiling could be in order.Related: