camsaul / toucan2

Successor library to Toucan with a modern and more-extensible API, more consistent behavior, and support for different backends including non-JDBC databases and non-HoneySQL queries. Currently in active beta.
Eclipse Public License 1.0

[jdbc.row] Replace memoize in lazy thunk construction with delays #192

Closed alexander-yakushev closed 1 week ago

alexander-yakushev commented 1 month ago

I'll need your feedback on this PR as it's not as clear-cut as the other ones. memoize is quite inefficient on a per-call basis (whether a hit or a miss). The cheapest (amortized) way to look up per-column thunks is to compute them eagerly and put them into a vector. This is the best thing to do for regular selects where the whole row is realized anyway. However, it may be doing unnecessary work for other types of select where only one column is extracted, or just PKs, or whatever.

Still, the overhead would only be paid once per ResultSet, not once per value cell as it is now. I'll defer to your judgement.

An alternative would be to roll a custom memoize-like structure for this (e.g. a vector in an atom).
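For reference, the two shapes being compared can be sketched roughly like this. This is an illustrative sketch only: `read-column-thunk` stands in for whatever per-column thunk constructor toucan2 uses internally, and none of these names are the actual `jdbc.row` internals.

```clojure
;; memoize-based (current): every cell read goes through
;; clojure.core/memoize's varargs + hash-map cache machinery.
(defn thunks-memoized [read-column-thunk]
  (memoize (fn [i] (read-column-thunk i))))

;; Custom "vector in an atom": lookup is a plain nth on the deref'd
;; vector; thunks are still constructed lazily, on first access.
(defn thunks-atom [read-column-thunk n-cols]
  (let [cache (atom (vec (repeat n-cols nil)))]
    (fn [i]
      (or (nth @cache i)
          (let [thunk (read-column-thunk i)]
            (swap! cache assoc i thunk)
            thunk)))))
```

The atom version trades memoize's generic cache for an indexed structure sized to the column count, which is why lookups get cheaper once amortized over a whole ResultSet.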

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 83.60%. Comparing base (3729d20) to head (4f1a199). Report is 2 commits behind head on master.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #192      +/-   ##
==========================================
+ Coverage   83.58%   83.60%   +0.01%
==========================================
  Files          37       37
  Lines        2498     2501       +3
  Branches      212      212
==========================================
+ Hits         2088     2091       +3
  Misses        198      198
  Partials      212      212
```

:umbrella: View full report in Codecov by Sentry.

alexander-yakushev commented 1 month ago

UPD: instead of fully precomputing, the lazily constructed thunks now live in a vector instead of a map, and they are wrapped in delays. This proved to be more efficient than memoize.
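A minimal sketch of the vector-of-delays shape described above (names are illustrative; the actual `jdbc.row` code differs):

```clojure
;; One delay per column: thunk construction is deferred until the
;; column is first read, and lookup is a plain vector nth.
(defn make-column-thunks [read-column-thunk n-cols]
  (mapv (fn [i] (delay (read-column-thunk i))) (range n-cols)))

;; Reading cell i forces the delay once; subsequent reads are a
;; cached deref, with no memoize-style hashing per access.
(defn column-thunk [thunks i]
  @(nth thunks i))
```

This keeps the laziness of the memoize version (unread columns never build a thunk) while making the per-cell lookup an indexed vector access plus a deref.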

Benchmarking results:

before (master)
Time per call: 8.67 ms   Alloc per call: 11,131,418b
Time per call: 9.13 ms   Alloc per call: 11,088,740b
Time per call: 8.36 ms   Alloc per call: 11,087,633b
Time per call: 8.22 ms   Alloc per call: 11,087,385b

after (this PR)
Time per call: 7.75 ms   Alloc per call: 8,450,440b
Time per call: 7.51 ms   Alloc per call: 8,449,525b
Time per call: 7.62 ms   Alloc per call: 8,448,447b