Open alexander-yakushev opened 1 month ago
Not sure this is really "non-invasive" if everything using query-reducible
and select-reducible
and the like has to be updated to use transduce
instead of reduce
(as evidenced by the tests you had to update)... I'm fairly certain this is going to require us to make changes in upstream Metabase code which makes it a breaking change. I'm honestly not sure the minor (?) performance benefits we get here are worth making the code harder to use correctly. (What are the performance benefits, btw? I would still love to see some benchmarks for these PRs)
I understand what you are saying. However, the tests that I've changed, for example, extend the method pipeline/transduce-execute-with-connection
. I'm not sure there is a hard rule regarding this, but I would expect anything with the word "transduce" in the name to call the 1-arity of the reducing function. Otherwise, it's violating the "transduce contract," so to speak.
EDIT: this may be relevant https://clojure.org/reference/transducers#_creating_transducible_processes.
A completing process must call the completion operation on the final accumulated value exactly once.
It is possible to rewrite this PR without having to resort to 1-arity if the compatibility here is crucial.
Regarding the benchmarks: I did benchmarks for multiple changes at once. Each change is not very impressive number-wise on its own, but together they chip away quite a bit. I'll do separate benchmarks for this PR and post them.
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 83.64%. Comparing base (
3729d20
) to head (dd09a66
).
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I did benchmarks of this against the current master and the allocations are 20% lower but timing results are inconclusive. Let's wait with this PR if favor of the others, and revisit it once other are merged.
Another small but non-invasive improvement. This works best when many rows are processed, but shouldn't bring too much of an overhead if one/few rows are selected.