petkaantonov / bluebird

:bird: :zap: Bluebird is a full featured promise library with unmatched performance.
http://bluebirdjs.com
MIT License
20.45k stars 2.33k forks source link

Implement Promise.using #65

Closed ralt closed 10 years ago

ralt commented 10 years ago

After having this discussion on IRC, making this a little more official.

Having a method allowing us to handle resource lifecycles is a pretty good idea, for any application communicating with a database for example.

It's basically the same concept as C#'s using or Java's try.

Here is the proposed syntax:

return Promise.using(getConnection(), function(connection) {
    return connection.query(fn).then(fn);
});

A naive (i.e. not fully working) implementation of this would be:

Promise.using = function(resource, fn) {
  // wraps it in case the resource was not promise
  var pResource = Promise.cast(resource);
  return pResource.then(fn).finally(function() {
    return pResource.then(function(resource) {
      return resource.close();
    });
  });
}

This means that we require the resource object to implement a close method, that will take care of freeing the resource. It's the same concept as Java requiring an AutoCloseable interface. The method can be dispose or whatever.

Thoughts?

benjamingr commented 10 years ago

+1 I think this is a great idea. Look into the implementation in issue #51

ralt commented 10 years ago

For reference, to support arrays of resources (code below): http://pastebin.com/tKRykm2x

function using() {
    var resources = [].slice.call(arguments, 0, arguments.length - 1);
    if (resources.length < 1) throw new TypeError();
    var fn = arguments[arguments.length - 1];
    if (typeof fn !== "function") throw new TypeError();

    var promiseForInspections = Promise.settle(resources);

    promiseForInspections.then(function(inspections) {
        return inspections.map(function(inspection) {
            if (inspection.isRejected()) {
                throw inspection.error();
            }
            return inspection.value();
        });
    }).spread(fn).finally(function() {
        //.value() never throws here
        ASSERT(promiseForInspections.isFulfilled());
        var inspections = promiseForInspections.value();
        inspections.forEach(function(inspection) {
            if (inspection.isFulfilled()) {
                var resource = inspection.value();
                try {
                    disposer(resource);
                }
                catch (e) {
                    //Not sure what to do here
                    //at the very least we should keep closing the rest of the resources
                    //but we cannot change the rejection reason because this is in .finally
                }
            }
    });
}

The question is whether it's worth it to support multiple resources. Handling just one is way more efficient, from a code point of view, and maybe performance.

benjamingr commented 10 years ago

Looking into Python's with:

Here is the with PEP everyone is encouraged to read it, it details the semantics of single resource with which is what Python had until 2.7 (3.1).

Here are the semantics of with (compound too) in 3.1

In Python, compound with acts in the following way:

with A() as a, B() as b:
    suite

Is equivalent to:

with A() as a:
    with B() as b:
        suite

This is of course not a solution since in Python A() will finish before B() starts because resource acquisition is synchronous. This is really the underlying issue:

Every other language that allows a compound using like construct does not let you acquire multiple resources at the same time. They acquire the first resource and only then start acquiring the second one and so on.


p.s. Whoops on the commit reference. That's my bad. (How did it even get here? I didn't commit it, I deleted the repo and made a new one because github refused to... nevermind)

ralt commented 10 years ago

Then I'd rather have only one resource possible. Having fake asynchronous looks like more dangerous than nesting.

spion commented 10 years ago

Multi-resource with means that your resource-allocating functions must never throw.

mwith(fn1(), fn2(), (r1, r2) => ...)

is the same as

var r1 = fn1(); var r2 = fn2(); mwith(r1, r2, (r1, r2) => ...)

This illustrates better that fn1 and fn2 are invoked before we even start executing mwith

But - what if fn2() throws? Then we will be stuck with an allocated r1, never enter mwith, and never dispose of r1.

You could argue that fn1() and fn2() would always be safe, never throwing. But even with promises you could easily end up with something that throws:

function createDbConnection(endpoint) { 
  var blah = parse(endpoint); // potentially throws
  return connectToDbPromise(blah.host, blah.port); 
}

This createDbConnection cannot be used from mwith

petkaantonov commented 10 years ago

using I imagined is like C# using not like python with. If any resource throws then it will enter the virtual finally clause that disposes the others that were allocated by then and not execute the block (fn) at all. That's what the example implementation does.

spion commented 10 years ago

@petkaantonov the same issue with multiple resources still applies. Basically, the resource allocator functions must not throw, which is hard to guarantee for all resources...

petkaantonov commented 10 years ago

What functions? resources are promises for resources or resources. If one of them rejects then the others are cleaned up

spion commented 10 years ago

Here is a realistic looking example

readConfig().then(config => 
  using(apiEndpoint(config.url), createDbConnection(config.db), (endpoint, db) =>
    endpoint.getStuff().then(withThis(db, db.writeStuff))))

Which is equivalent to

readConfig().then(config => { 
  var pEndpoint = apiEndpoint(config.url), 
        pDb = createDbConnection(config.db); /* throws */ 
  return using(pEndpoint, pDb, (endpoint, db) => 
    endpoint.getStuff().then(withThis(db, db.writeStuff)));
});

If createDbConnection throws synchronously, the using block is never entered. This means that it will be impossible to release the api endpoint connection.

A safe option is to either only accept a single promise:

readConfig().then(config => 
  using(apiEndpoint(config.url), endpoint => using(createDbConnection(config.db), db =>
    endpoint.getStuff().then(withThis(db, db.writeStuff)))));

or accept multiple functions that return promises:

readConfig().then(config => 
  using(_=> apiEndpoint(config.url), _=> createDbConnection(config.db), (endpoint, db) =>
    endpoint.getStuff().then(withThis(db, db.writeStuff))));

(or for convenience, accept both and make it polymorphic? :D)

and since using will be responsible for invoking those functions, it can handle the case where one of them throws and clean up the previously allocated resources.

petkaantonov commented 10 years ago

createDbConnection is a promise returning function, so if it's throwing synchronously that is a bug in the function (using 2 exception channels) not in using.

In practice it would almost always be a promisified callback function (using(mysql.getConnectionAsync())) which cannot throw either and is even doing better than C# using would do in similar case.

spion commented 10 years ago

And then you build a generic database lib with promises, which does

function getConnection(url) {
  var opt = parse(url);
  // adapters are promisified
  var adapter = getAdapter(opt.databaseType); 
  return adapter.getConnectionAsync();
}

If you forget to wrap this with something like Promise.method or Promise.try, it will cause leaks :D

petkaantonov commented 10 years ago

Yes but that function returns a promise while not making sure it will always reject the promise should exception happen. So the function has a very critical bug. You cannot use that function well in any case, not just when using using.

Even the dispose method could do something to break contract like calling process.exit(), that's not something using should handle.

spion commented 10 years ago

I think this mistake is far more common than process.exit, but I guess I might be fine with it, if using comes with a huge, big, giant red warning that says that your resource allocator must never throw, and also recommend the way to wrap it (using Promise.method).

Still, it would be nice if functions are also accepted. That way, if you don't know whether the resource allocator of that library follows the convention or not, you could just wrap it with an arrow lambda.

domenic commented 10 years ago

I don't think that warning sign is specific to using; it's just for promise-returning functions in general.

ralt commented 10 years ago

related. C# is exactly the same as python:

using (ResourceType r1 = e1, r2 = e2) {
}

Is exactly the same as:

using (ResourceType r1 = e1) {
    using (ResourceType r2 = e2) {
    }
}

The problem in our use case is that we still want r1 and r2 to be run in parallel. Or do we?

petkaantonov commented 10 years ago

@Ralt They can and will be acquired in parallel. If any fails, the ones that were acquired will be disposed and the block not run and the returned promise will be rejected.

ralt commented 10 years ago

@petkaantonov it may be easier to instead go in a recursive manner, like C# is doing. This way we don't have any leaking issue. It also means that the resources are not fetched in parallel. Overall, it simplifies the issue a lot.

benjamingr commented 10 years ago

@Ralt

It also means that the resources are not fetched in parallel

This makes Promise.using 50% less useful to me.

I need it to do 2 things:

benjamingr commented 10 years ago

Also, some more use cases:

Adding, this is worth a read: http://stackoverflow.com/questions/21118201/can-using-with-more-than-one-resource-cause-a-resource-leak

spion commented 10 years ago

Hey, I wrote my prototype variant here. :) Needs some more tests, but let me know what you think.

benjamingr commented 10 years ago

I'm in a test week - promise to write unit tests later :)

petkaantonov commented 10 years ago

Eco system support just isn't there and for othercases it's just a finally that's flipped around.

benjamingr commented 10 years ago

Please keep open for discussion :)

On 8 במרץ 2014, at 10:46, Petka Antonov notifications@github.com wrote:

Eco system support just isn't there and for othercases it's just a finally that's flipped around.

— Reply to this email directly or view it on GitHub.

petkaantonov commented 10 years ago

You can discuss even if it's closed :P

benjamingr commented 10 years ago

Just look at what happened with Peomise.sequence - so many dupes :p

petkaantonov commented 10 years ago

A strong use case here? http://stackoverflow.com/a/22543294/995876

benjamingr commented 10 years ago

Yes, a strong case. One of many.

joshperry commented 10 years ago

The SO post is mine and I find this discussion interesting. I think one thing that is important to cover here is that some resources are finite and others are not. The nondeterministic nature of cleanup by the garbage collector or a straight-up leak in the case of something like a pooled resource is what I think you are trying to cover here. The other kind of resource could be something like a JSON object or string of HTML from a web request, or -- like in my case -- the contents of a file.

Often a number of resources that can be allocated in parallel are brought together to do some operation, some of these resources are finite while others are not. In my case the db connection is finite while the string of the SQL script in memory is not (not strictly by my definition above anyway).

So the dichotomy with the using statement in C# and what you are trying to do here is that only finite resources are meant to be used in the using statement and, like was mentioned above, they are allocated serially, while I think the true value of this function would be in the parallel allocations without having to do finally jitsu to get your finite resources cleaned up in both success and failure cases.

joshperry commented 10 years ago

I've been spending some time thinking about this and, to solve the problems of parallel resolution and deterministic finalization in the scope of chainable promises, we need to separate the trigger for disposing of a finite resource from the allocation and consumption.

This in combination with a simple helper to wrap checking for a fulfilled promise would make it fairly obvious what is going on and also get rid of the closures required in the solution in the case of my problem posted on SO.

function execFile(db, file) {
    console.log('Connecting to ' + db);
    return Promise.props({
        client: connect('postgres://' + credentials + host + '/' + db),
        initSql: fs.readFileAsync(file, 'utf8')
    }, true)
    // The true tells 'props' to call 'binds' with this object,
    // maybe make a 'propBind' or empty 'bind' that binds to the prop?
    .spread(function(res) {
        console.log('Connected to ' + db);
        console.log('Running init script ' + file);
        return res.client.queryAsync(res.initSql);
    }).finally(function() {
        this.client.ifFulfilled(function(x) { x.end(); });
        // The callback gets the value of the promise,
        // or it could bind 'this' in the callback to the value of the promise;
        // though I'm not familiar with the performance repercussions.
    });
}

This unfortunately does not help in the case of forgetting the finally handler. One thing that could help in that respect is a weak reference promise that will call a destructor when it is garbage collected. However, we don't have finalizers, and without some sort of hook we can't really catch the case where an object is holding a finite resource at time of garbage collection; any solution that honors chaining is going to necessitate some type of 'call' to notify the resource that it can be safely released.

Something possibly like TooTallNate/node-weak could help on Node, but we would need shims on the browser side... if they exist.

Something more explicit and similar to this using idea could work, but I am not familiar enough with the codebase to know if this would be feasible to implement:

function execFile(db, file) {
    console.log('Connecting to ' + db);
    return Promise.all([
        // Wraps the promise returned by connect in a DisposablePromise
        // providing a disposer
        Promise.disposable(connect('postgres://' + credentials + host + '/' + db), function(x) { x.end(); }),
        fs.readFileAsync(file, 'utf8')
    ]).spread(function(res) {
        console.log('Connected to ' + db);
        console.log('Running init script ' + file);
        return res.client.queryAsync(res.initSql);
    }).dispose();
    // Triggers the disposer of any promises in the chain,
    // before this point, that were wrapped in a DisposablePromise
}

Again, it is still something that needs to remember to be called and, again, I don't know if something like this is even possible with chaining outside the method where this promise chain is returned or other possible use-cases. Just kind of thinking out loud here.

spion commented 10 years ago

@joshperry what are your thoughts on https://github.com/spion/promise-using ?

petkaantonov commented 10 years ago

@joshperry the proposed function would be parallel and handle cleanups with a convenient syntax, e.g. you could have done in your SO problem like so:

function execFile(db, file) {
    console.log('Connecting to ' + db);
    return using(connect('postgres://' + credentials + host + '/' + db),
                 fs.readFileAsync(file, 'utf8'),
        function(client, initSql) {
            console.log('Connected to ' + db);
            console.log('Running init script ' + file);
            return client.queryAsync(initSql);
        });
}

The disposer that knows to call .end on dbclients and do nothing for strings would have been defined elsewhere with onDispose or some such.

petkaantonov commented 10 years ago

@spion I like the idea better though that one could just say Promise.using(connect(...).disposer("end"), ...) - that will dispose the connection by calling .end() method on the resource. Wouldn't require any register disposer juggling. And if you need it in multiple places you can just define connect as return oldConnect(..).disposer("end").

petkaantonov commented 10 years ago

I like how simple the disposer .disposer(methodName), I did an initial implementation of .using in a branch here: https://github.com/petkaantonov/bluebird/commit/a85f8f448839f319e667f985da454b0f61db46af

With that the code could now be (without any outside set up assumptions):

function execFile(db, file) {
    return using(
        connect('postgres://' + credentials + host + '/' + db).disposer("end"),
        fs.readFileAsync(file, 'utf8'), function(client, initSql) {
            return client.queryAsync(initSql);
    });
}

Of course in practice you would just define connect to call disposer("end") - there is no harm in doing it even if you won't pass the promise to using.

joshperry commented 10 years ago

@spion @petkaantonov Yeah, I like how the scope is specified by the anonymous function. Makes sense. I don't love "Magic Strings" as it were, have always preferred something like .disposer(function(x) { x.end() }) but that may just be from my days writing in compiled languages...

This solution handles all my needs and I think is in-line with the reason we need promises and the need for deterministically handling the lifetime of finite resources.

petkaantonov commented 10 years ago

@joshperry in the promise-using branch disposer() currently has 2 overloads including a string and a function so you can use .disposer(function(x){x.end()}) or .disposer("end").

benjamingr commented 10 years ago

@petkaantonov this is for 2.0 right?

petkaantonov commented 10 years ago

Yeah 2.0 still needs some major documentation changes then it's good to go

petkaantonov commented 10 years ago

2.0 is done in the iterators branch, not using branch

benjamingr commented 10 years ago

Awesome, looking forward to using more using :)

On Wed, Apr 23, 2014 at 12:57 PM, Petka Antonov notifications@github.comwrote:

2.0 is done in the iterators branch, not using branch

— Reply to this email directly or view it on GitHubhttps://github.com/petkaantonov/bluebird/issues/65#issuecomment-41143750 .