MatAtBread / fast-async


Benchmarks? #11

Closed steve-gray closed 7 years ago

steve-gray commented 7 years ago

I know this isn't an 'issue', but I've been battling with this myself. Here's my scenario:

In short, I'm using a code path that is potentially async in some places, but 99% of the time is totally sync. What I've been finding is that Promise.resolve(cachedValue) still bounces through the event queue, causing a massive slowdown. I find your module is a massive improvement over the naive Promise-based approach.
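To make the cost concrete, here's a minimal sketch (my own illustrative code, not from either project) showing why a cached value behind Promise.resolve() still pays a trip through the microtask queue:

```javascript
// Illustrative sketch: even when the value is already cached,
// Promise.resolve() delivers it on a later microtask, while a plain
// synchronous read is available immediately.
const cachedValue = 42;
const order = [];

Promise.resolve(cachedValue).then(v => order.push('promise:' + v)); // deferred
order.push('sync:' + cachedValue);                                  // immediate

// 'sync:42' lands first; 'promise:42' only arrives on the next microtask.
```

Multiplied over hundreds of thousands of cache hits, that per-hit queue bounce is exactly the overhead being described.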

Been trying to benchmark a scenario where I do 200,000 operations, and I'm finding that I'm not getting a massive uplift from fast-async. I'm wondering whether this is down to a Node 6.4 or Babel improvement, because I only get a material difference when using babel-plugin-transform-bluebird; but even then it's nowhere near as fast as a pure sync solution, so a serious amount of performance is being lost.

In short, async-to-generator with transform-bluebird seems to achieve most of the same effect; however, if I modify the code such that I remove any reference to async/await and only use the sync path, I get:

The mysteries I'm facing:

matAtWork commented 7 years ago

This is a common scenario, and not one the async/await definition deals with well.

The solution is a "PromiseCache", where the cache holds the Promise for each value rather than the value itself (I've been meaning to generalise this case into a module).

You can see a (basic) working implementation here: http://nodent.mailed.me.uk/#function%20PromiseCache(asyncFillerFunction)%20%7B%0A%09this.map%20%3D%20new%20Map()%20%3B%0A%09this.asyncFillerFunction%20%3D%20asyncFillerFunction%20%3B%0A%7D%0A%0APromiseCache.prototype.get%20%3D%20function(key)%20%7B%0A%09if%20(this.map.has(key))%0A%09%09return%20this.map.get(key)%20%3B%0A%09var%20promise%20%3D%20this.asyncFillerFunction(key)%20%3B%0A%09this.map.set(key%2Cpromise)%20%3B%0A%09return%20promise%20%3B%0A%7D%20%3B%0A%0APromiseCache.prototype.delete%20%3D%20function(key)%20%7B%0A%09return%20this.map.delete(key)%20%3B%0A%7D%20%3B%0A%0A%2F*%20use%20it%20*%2F%0A%2F%2F%20My%20async%20function%20that%20knows%20how%20to%20fill%20the%20cache%20given%20a%20key%0Aasync%20function%20fillMe(key)%20%7B%0A%20%20%20%20console.log(%22Filling%20%22%2Bkey)%3B%0A%20%20%20%20setTimeout(()%3D%3E%7B%20async%20return%20%22**%22%2Bkey%20%7D%2C500)%20%3B%0A%7D%0A%0Avar%20m%20%3D%20new%20PromiseCache(fillMe)%20%3B%0A%0Avar%20key%20%3D%20%22MYKEY%22%20%3B%0A%0Aconsole.log(%221)%20%22%2Bawait%20m.get(key))%20%3B%09%0Aconsole.log(%222)%20%22%2Bawait%20m.get(key))%20%3B%09%0A%0A%2F%2F%20Remove%20the%20%22old%22%20cached%20value%0Am.delete(key)%20%3B%09%0Aconsole.log(%223)%20%22%2Bawait%20m.get(key))%20%3B%09%0Aconsole.log(%224)%20%22%2Bawait%20m.get(key))%20%3B%09%0A%0A

Obviously, you can expose the other Map methods by delegation, or change the signature to something less general where the function that fills the cache can be directly invoked within 'set', to avoid you having to pass it in. In my (many) implementations I often store the time the cache entry was set and evict entries if they are old, or on a FIFO basis, to stop the cache growing forever.
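As a rough sketch of the time-based eviction just mentioned (my own illustrative code, with made-up names, not from nodent), each entry can record when it was set and be refetched once stale:

```javascript
// Illustrative time-based eviction for a PromiseCache (a sketch, not a
// definitive implementation).
class TimedPromiseCache {
  constructor(asyncFillerFunction, maxAgeMs) {
    this.map = new Map();
    this.asyncFillerFunction = asyncFillerFunction;
    this.maxAgeMs = maxAgeMs;
  }
  get(key) {
    const entry = this.map.get(key);
    // Serve the cached Promise while the entry is still fresh.
    if (entry && Date.now() - entry.at < this.maxAgeMs) return entry.promise;
    // Otherwise (re)fill, recording when this entry was set.
    const promise = Promise.resolve(this.asyncFillerFunction(key));
    this.map.set(key, { promise, at: Date.now() });
    return promise;
  }
  delete(key) {
    return this.map.delete(key);
  }
}
```

A FIFO cap would work the same way, evicting the oldest entry whenever `map.size` exceeds a limit instead of comparing timestamps.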

Finally, all proper "Promises" actually resolve on the next tick, which is a significant overhead if you're repeatedly using the cached values. You can avoid this by using the nodent.Thenable type as your Promise implementation via var Promise = require('nodent').Thenable. This will set Promise (locally, in the one file, not globally) to a 'Thenable' which looks and feels enough like a Promise to work with async and await, but actually resolves synchronously if possible, and is considerably faster since it doesn't wait for the next tick. There's a discussion of the options at the bottom of the section https://github.com/MatAtBread/nodent#differences-from-the-es7-specification.

Note that the implementation above won't work as expected with nodent.Thenables (since you can have multiple uses of the resolved Promise), and an alternative implementation is needed that actually caches the data, not the "Promise". If you want to see that example, I'll try and dig one out for you.

To see the differences in action, go into nodent's installation (under node_modules for your project) and type ./nodent.js tests --generators. This performs an exhaustive (and long) performance test of the various options and Promise types on a variety of test routines (in nodent/tests/semantics), so you can see the difference. On my Mac with Node 6.4.0 the results are:

                                                          (none) nodent.Thenable    nodent.Eager          native        bluebird            rsvp            when     promiscuous
                  Compiler flags            Mean             164             299             453             657             424             411             391             715
               lazyThenables,es7             100             100               -               -               -               -               -               -               -
                             es7             220             220               -               -               -               -               -               -               -
     lazyThenables,wrapAwait,es7             102             102               -               -               -               -               -               -               -
                   wrapAwait,es7             233             233               -               -               -               -               -               -               -
          lazyThenables,promises             255               -             102             242             427             214             210             169             419
                        promises             241               -             100             225             412             190             179             165             415
lazyThenables,wrapAwait,promises             297               -             106             236             454             210             193             176             703
              wrapAwait,promises             297               -             103             238             454             212             192             174             704
        lazyThenables,generators             667               -             474             649             901             617             602             589             836
                      generators             675               -             482             661             862             626             621             613             859
lazyThenables,wrapAwait,generators           696               -             522             688             882             646             638             635             862
            wrapAwait,generators             698               -             500             681             863             677             650             605             911

You'll see that nodent.Thenable is much faster in "Promise" mode than standard Promise implementations*, and that generators are pretty much useless for high performance. Among the standard Promise implementations, I find when the best; bluebird is a fine example of early optimisation being a mistake, as most of its performance gains (made by avoiding closures and using eval or new Function() to create internal bindings) have been obviated by improvements in V8, leaving bluebird as simply an over-complicated implementation of standard Promise logic.

I hope that helps - if it does, please close the issue, otherwise feel free to get back to me.

*If you're wondering how/why: nodent's async/await implementation never generates code where multiple listeners await the same result, and never generates chained Promises (they are nested instead), so much of the internal logic of a Promise isn't needed.
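A rough illustration of that difference (my own sketch, not nodent's actual generated output): a chained style allocates a fresh intermediate Promise per .then() call, whereas a nested style returns the continuation's promise from inside the callback, so nothing accumulates in a chain.

```javascript
// Chained: each .then() creates and returns a new intermediate Promise.
function chained(p) {
  return p.then(a => a + 1).then(b => b * 2);
}

// Nested: the continuation is returned from within the callback, so the
// outer result adopts the inner promise directly.
function nested(p) {
  return p.then(a => Promise.resolve(a + 1).then(b => b * 2));
}
```

Both compute the same value; the difference is purely in how many intermediate Promise objects (and their listener machinery) exist along the way.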

matAtWork commented 7 years ago

Found the nodent.Thenable version - basically an additional two lines. If you're doing a lot of sync retrieval, this is faster - although you should make sure it meets your use-cases fully.

var Promise = require('nodent').Thenable ;

function PromiseCache(asyncFillerFunction) {
    this.map = new Map() ;
    this.asyncFillerFunction = asyncFillerFunction ;
}

PromiseCache.prototype.get = async function(key) {
    if (this.map.has(key)) 
        return this.map.get(key) ;
    var promise = this.asyncFillerFunction(key) ;
    this.map.set(key,promise) ;
    var data = await promise ;
    this.map.set(key,data) ;
    return data ;
} ;

PromiseCache.prototype.delete = function(key) {
    return this.map.delete(key) ;
} ;

/* use it */
// My async function that knows how to fill the cache given a key
async function fillMe(key) {
    console.log("Filling "+key);
    setTimeout(()=>{ async return "**"+key },500) ;
}

var m = new PromiseCache(fillMe) ;

var key = "MYKEY" ;

console.log("1) "+await m.get(key)) ;   
console.log("2) "+await m.get(key)) ;   

// Remove the "old" cached value
m.delete(key) ; 
console.log("3) "+await m.get(key)) ;   
console.log("4) "+await m.get(key)) ;   
steve-gray commented 7 years ago

Hey,

This is pretty interesting stuff - I'm basically doing something like that already - but the addition of fast-async makes the difference, as it no longer smashes the event loop. To avoid the same problem I'd also started creating something that looks 'Promise-like', so you can resolve static values easily.

let count = 0;

/**
 * Sync.resolve
 *
 * Helper/cheat method to perform semi-synchronous resolutions of
 * promises when the value is known and cached. The normal promises
 * implementation will transit via the event loop - which is 'fine',
 * but also quite slow.
 */
function resolve(value) {
  return {
    // Used for 'internal code' to cheat and just pull the
    // value out and skip an async wait
    outcome: value,

    // Pseudo-promise chaining
    then: (func) => {
      const result = func(value);
      if (result && result.then) {
        return result;
      }

      // Every N iterations, bust out a real promise
      // to let the stack breathe a little. Otherwise
      // you will get an overflow fairly quickly. 1000
      // chosen due to unscientific more/less testing.
      count += 1;
      if (count === 1000) {
        count = 0;
        return Promise.resolve(result);
      }
      return resolve(result);
    },

    // To silently allow code that tries to .catch to continue
    // to run, even though a constant can never throw. If the value
    // chains a real promise, that instead will handle .catch.
    catch: () => null,
  }
}

export default {
  resolve,
};

In short, I chain thenables ad infinitum until I hit either 1000 of them or a non-thenable, then just apply the wrapper again. I then use Sync.resolve in code that's supposed to generate promises - but I'm wondering if there's any actual gain to be had from doing this now. I suspect the nodent code is doing effectively this, but is more 'compliant' with promise behaviours.

Perhaps I can ditch my code - out of curiosity, how does the nodent solution deal with very deep recursion/long chains? In the code above, I'm having to fire a real promise once in a while in order to clear out some of the call stack.

-Steve

matAtWork commented 7 years ago

The implementation of Thenable is in nodent/lib/thenable.js and nodent/lib/runtime.js lines 13-25.

Because nodent generates nested Promises, rather than chained ones, long chains aren't an issue. Similarly, it never releases Zalgo, as it always generates code like return expr.then(function()...) (i.e., because of the return there is no subsequent code to execute out of order).

In terms of deep recursion (or iteration of loops containing await), it doesn't deal with it: a loop of more than about 1000 iterations is likely to blow the stack. Theoretically, proper tail calls, when implemented, would fix this, since, again, the transpiling of await in a loop always generates a return.
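The shape being described can be sketched like this (my own illustration, not nodent's actual output): each awaited iteration becomes a return of the next step, so every continuation is in tail position.

```javascript
// Illustrative transpiled shape of `for (...) { await body(i); }`:
// every continuation is a `return`, i.e. in tail position, so proper
// tail calls would keep the stack bounded.
function loop(i, n, body) {
  if (i >= n) return Promise.resolve();
  return body(i).then(() => loop(i + 1, n, body)); // tail position
}
```

With real Promises each iteration resumes from the microtask queue, so the stack unwinds between steps anyway; with a synchronously-resolving Thenable the recursion is genuinely nested, which is why deep loops can overflow.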

In the cases where I have to use nodent-es7 (or nodent.Thenable) for performance, like you I tend to have an escape at around 1000 items. It has to be said, this is not a very portable workaround, and very deep stacks are not memory-friendly, as the stack is full of object references.
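That escape could look something like this (a hypothetical sketch, not code from either project): yield to a real Promise every so often so the accumulated synchronous stack can unwind.

```javascript
// Hypothetical escape hatch for a hot loop over synchronously-resolving
// thenables: every 1000th iteration bounces through a real Promise (and
// hence the event loop), letting accumulated stack frames unwind.
async function processAll(items, handle) {
  for (let i = 0; i < items.length; i++) {
    await handle(items[i]);
    if ((i + 1) % 1000 === 0) {
      await new Promise(resolve => setImmediate(resolve)); // let the stack breathe
    }
  }
  return items.length;
}
```

The 1000 threshold is a tuning knob, not a magic number; the trade-off is a few extra event-loop trips against bounded stack growth.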

MatAtBread commented 7 years ago

@steve-gray - I'm guessing you've got enough to work with for now, so I'll close this issue. Feel free to re-open if you need any more support.