Open raingerber opened 6 years ago
What a find! Could we implement support for the __async__
property into reducers one by one or would we need to add it to all in a single PR? (Thinking smaller PRs are easier to get done and released.)
Report: Speed: sync was faster by 98.82% (150,758Hz vs 1,779Hz)
😳
150,758 is much more than 98.82% faster than 1,779...
Wouldn’t it be (150,758 - 1,779) / 1,779 = 83.7x faster (8,370%)? 🤔
I’m not sure about this equation: https://github.com/ViacomInc/data-point/blob/master/packages/bench-trial/runner.js#L155
Edit: either fixed my math or broke it or was wrong both times.
Good point - we can modify the reducers one by one, but the first step is adding support for __async__
in the main resolveReducer function:
// simplified version of this function
function resolveReducer (manager, accumulator, reducer) {
const resolve = getResolveFunction(reducer)
// the problem is that "resolve" MIGHT return a promise, but might not
const result = resolve(manager, resolveReducer, accumulator, reducer)
return result
}
These are possible solutions (I haven't tested how they'll affect performance though):
async / await
- a simple, idiomatic solution that we don't officially support 😞
Using callbacks:
// adds an error first "done" callback
function resolveReducer (manager, accumulator, reducer, done) {
const resolve = getResolveFunction(reducer)
const result = resolve(manager, resolveReducer, accumulator, reducer)
// adding the "__async__" property to the resolve
// function instead of the reducer instance...
if (resolve.__async__) {
// in this case, "result" will be a promise
result.asCallback(done)
} else {
done(null, result)
}
}
Note that the performance boost wouldn't come directly from this change (and I'm not sure how this function's performance would be affected). The benefit would come after changing the resolve functions for reducers like map
, filter
etc. so they resolve synchronously when possible
Also, I think you have the right equation for bench-trial
; pre-approving that change 😊
Problem description:
Every reducer/entity is asynchronous, which makes them simpler to implement and reason about, but this comes with performance drawbacks. Take this for example:
This operation could be synchronous, but DataPoint will resolve it using
Promise.map
, and the result from each$a
reducer will be wrapped in a promise (which is much slower than usingArray.map
without promises).Suggested solution:
Reducers should have an
__async__
boolean property to indicate if the resolution should be asynchronous. This would apply mainly to path reducers (where__async__
would always be false), as well as reducers that can use path reducers internally. For example, this pseudo code would apply to reducer lists:There would be similar code for reducer objects,
map
,filter
,find
,assign
, andparallel
. This would also require a change to theresolveReducer
function in reducer-types/resolve.js, because we can't assume that every reducer returns a promise (and the resolve function foromit
,pick
, andconstant
should no longer wrap their return values withPromise.resolve
). This would be the most difficult change.Benchmarks
These are benchmarks from running a sync vs async version of
map('$a')
on an array with 100 elements. There are... substantial differences:NOTE: these benchmarks were done with
bench-trial
, but the math doesn't look correct -- 150,758 is much more than 98.82% faster than 1,779...