petkaantonov / bluebird

:bird: :zap: Bluebird is a full featured promise library with unmatched performance.
http://bluebirdjs.com
MIT License
20.44k stars 2.33k forks source link

sequence #70

Closed tkellen closed 10 years ago

tkellen commented 10 years ago

would you accept a PR that implements sequence like when?

benjamingr commented 10 years ago

@tkellen can you show me a use case of the desired functionality with code?

petkaantonov commented 10 years ago

There is way too much overlap with reduce (As it is since 1.0.0):

Promise.reduce(tasks, function(values, task) {
    return task(arg1, arg2 ...).then(function(value) {
        values.push(value);
        return values;
    });
}, []);

If you don't need array of the results:

Promise.reduce(tasks, function(_, task) {
    return task(arg1, arg2 ...);
}, null);

Also, Imo, such approach is in appropriate with promises which are, as seen above, powerful enough not to require dedicated function for every little thing like callbacks would.

benjamingr commented 10 years ago

@petkaantonov reducing over an empty array where the pre-value is a pretty convoluted way to do this - it's not very intuitive to those who are not used to FP and it's kind of clumsy.

That said, personally I don't really see the use case of this with promises, this is quite common with libraries like async but with promises I've never had to do this.

Running things sequentially is simply what then does. If I want a dynamic number of things I can simply do:

if(addAnotherPromise){
    p = p.then(..);
}

If I want access to the same data I can use .bind in Bluebird so that solves that. Which is why I'm interested in hearing the use case and not just the feature :)

petkaantonov commented 10 years ago

@benjamingr It's not clumsy at all, please try to write equal code (which your example is not at all) without using dedicated method like sequence and you will see clumsy. You will see that .then only works well only for a hard-coded sequence.

benjamingr commented 10 years ago
var p = Promise.cast().bind({arg1:arg1,arg2:arg2,...);

tasks.forEach(function(t){
   p = p.then(t); // tasks access same arguments through `this`   
})

What about something like the above?

petkaantonov commented 10 years ago

Where is the array of the results? You are already requiring a statement and you don't even have an array yet.

What you have was equal to:

Promise.reduce(tasks, function(_, task) {
    return task(arg1, arg2 ...);
}, null);
benjamingr commented 10 years ago

That's what I aimed for. Anyway, I'm interested in the use case OP has here.

tkellen commented 10 years ago

Perhaps I can use reduce for this. I'll give it a shot. The use-case is running migrations on a database--I need the tables created in a specified order or the foreign keys will blow up.

For code, see tgriesser/knex/issues/148

tkellen commented 10 years ago

Ping.

spion commented 10 years ago

No need for FP:

Waterfall:

function sequence(tasks) {
  var current = Promise.cast();
  for (var k = 0; k < tasks.length; ++k) {
    current = current.then(tasks[k]);
  }
  return current;
}

Sequence without passing (waterfalling) results to the next item

function sequence(tasks) {
  var current = Promise.cast();
  for (var k = 0; k < tasks.length; ++k) {
    current = current.thenReturn().then(tasks[k]);
  }
  return current.thenReturn();
}

Waterfall but returning all results instead of the last one

function sequence(tasks) {
  var current = Promise.cast(), results = [];
  for (var k = 0; k < tasks.length; ++k) {
    results.push(current = current.then(tasks[k]));
  }
  return Promise.all(results);
}

Sequence without waterfalling, returning all results:

function sequence(tasks) {
  var current = Promise.cast(), results = [];
  for (var k = 0; k < tasks.length; ++k) {
    results.push(current = current.thenReturn().then(tasks[k]));
  }
  return Promise.all(results);
}

Is it necessary to make this a part of the promise library? If yes, which one? :P

tkellen commented 10 years ago

To my mind, sequence should return values in the same manner as all, but the order of execution should be guaranteed. I've now run into like 10 different cases where I need this while developing with Knex and Bookshelf.

spion commented 10 years ago

Oh I see your example. Yeah, using higher-order functions is your only option.

Personally I would simply use .then in your example:

return Promise.cast()
.then(drop('test_snapshot_file'))
.then(drop('test_snapshot'))
...

as I don't feel that there is much benefit from using a specialized sequence function in that particular case (when the sequence is already known)

However, that particular solution won't return all the results (but will definitely bail at the first encountered error)

spion commented 10 years ago

I remember wanting this too initially. At that point @domenic persuaded me that the utility grab-bag way isn't really the promise way. The idea is that a promise library would try to give you a few powerful primitives on top of which you can easily implement your own control flow serving a particular need well. The examples above are a proof that the easily part is true, and that the number of possible variations is indeed big.

xixixao commented 10 years ago

+1 for sequencing function, this example is made up:

Promise.reduce(["file1.txt", "file2.txt", "file3.txt"], function(total, fileName) {
    return fs.readFileAsync(fileName, "utf8").then(function(contents) {
        return total + parseInt(contents, 10);
    });
}, 0).then(function(total) {
    //Total is 30
});

Much more often I just need

Promise.sequence(["file1.txt", "file2.txt", "file3.txt"], function(fileName) {
    return fs.readFileAsync(fileName, "utf8");
}).then(function(files) {
    //files is an array of results
});

Though not with files, this is contrived as well of course.

tkellen commented 10 years ago

@spion The benefit of the specialized sequence method is having an easy way of getting an array of results from an array of operations that must be completed in sequence.

I still think this is a hugely needed feature.

tkellen commented 10 years ago

Out of curiosity, why is it that race is a valid utility, but sequence is not?

petkaantonov commented 10 years ago

Why didn't reduce work for you? It even has same amount of loc when compared to you using a first-class sequence function:

exports.down = function (knex) {
  // stupid helper method required to run these in sequence
  var drop = function(table) {
    return function() {
      return knex.schema.dropTableIfExists(table);
    }
  };
  return sequence([
    drop('test_snapshot_file'),
    drop('test_snapshot'),
    drop('test_snapshot_set'),
    drop('event'),
    drop('password_reset_token'),
    drop('oauth_client'),
    drop('access_token'),
    drop('user')
  ])
};
exports.down = function (knex) {
  return Promise.reduce([
    'test_snapshot_file',
    'test_snapshot',
    'test_snapshot_set',
    'event',
    'password_reset_token',
    'oauth_client',
    'access_token',
    'user'
  ], function(values, table) {
    return knex.schema.dropTableIfExists(table).then(function(value) {
      values.push(value);
    }).return(values);
  }, []);
};
tkellen commented 10 years ago

To my mind, this communicates intent a heck of a lot more clearly:

exports.down = function (knex, Promise) {
  // stupid helper method required to run these in sequence
  var drop = function(table) {
    return function() {
      return knex.schema.dropTableIfExists(table);
    }
  };
  return Promise.sequence([
    drop('test_snapshot_file'),
    drop('test_snapshot'),
    drop('test'),
    drop('event'),
    drop('access_token'),
    drop('account')
  ]).then(function (values) {
  // values
  });
};
petkaantonov commented 10 years ago

Those will be executed in parallel because you have called .dropTableIfExists for each item in the array before .sequence is even called.

tkellen commented 10 years ago

Ah, right you are. Updated the comment. In any case, I will concede that this usage is essentially sugar for reduce, but plenty of the existing collection methods are sugar as well.

spion commented 10 years ago

This is a subjective opinion, but I feel that Promise.race deserves to be there alongside everything else that cannot be implemented without a deferred (or a promise constructor)

Another reason: most promise methods operate on collections of promises - as far as I can see none work on collections of functions.

But thats all just bikeshedding. I think the real questions are:

I understand that its kind of awkward for a promise library not to have any straightforward method to execute things sequentially. But I also wonder what adding a single method would lead to...

petkaantonov commented 10 years ago

I see it like array.sum() vs array.reduce((a, b) => a + b, 0), except much worse. In this case you are not even getting the brevity and you need the awkward helper function. It's an extremly powerless function that still has awkwardness when used for its one and only use case.

benjamingr commented 10 years ago

I agree with Petka but not with his argument. Common functions that are very useful for one use case should still be implemented if that use case is very very common.

petkaantonov commented 10 years ago

@benjamingr Ah yes. I assume uncommon use cases like .sum and .sequence :P

benjamingr commented 10 years ago

I actually think so, I've rarely had to use anything like .sequence and when I did .reduce was a better fit. (Thanks for the recent breaking change btw :P )

As for sum, most languages actually implement it. .Sum in C# or sum() in Python.

Although what I'd really want to do is just array.reduce(:+) like in Ruby :D So, while I agree with you, it seems most common programming languages don't (yes, even C++ although their .reduce just defaults to +).

tkellen commented 10 years ago

@petkaantonov okay, how about this:

exports.down = function (knex, Promise) {
  return Promise.sequence([
    knex.schema.dropTableIfExists.bind(knex.schema, 'test_snapshot_file'),
    knex.schema.dropTableIfExists.bind(knex.schema, 'test_snapshot'),
    knex.schema.dropTableIfExists.bind(knex.schema, 'test'),
    knex.schema.dropTableIfExists.bind(knex.schema, 'event'),
    knex.schema.dropTableIfExists.bind(knex.schema, 'access_token'),
    knex.schema.dropTableIfExists.bind(knex.schema, 'account')
  ]).then(function (values) {
  // values
  });
};
spion commented 10 years ago

Some suggestions

var tables = ['test_snapshot_file', 'test_snapshot', 'test', 'event', 
              'access_token', 'account'];

Promise.reduce(function(previous, table) { 
  var p = knex.schema.dropTableIfExists(table);
  return Promise.all(previous.concat([p]));
}), tables, []);

Generalizing to a higher order functional abstraction

function sequentially(f) {
  return function(acc, el) {
    return Promise.all(acc.concat([f(el)]));
  }
}

then

Promise.reduce(sequentially(knex.schema.dropTableIfExists), tables, []);
tkellen commented 10 years ago

That's a nice method too. I think we've established that there are numerous ways to skin this cat with clever hacks utilizing reduce. The point I'm trying to make is that it would be nice if one of these clever hacks was available in bluebird.

xixixao commented 10 years ago

@spion I would go with the definition I used in my example, with the function argument defaulting to identity. If you need the result from previous promise, you should use reduce, imo.

Promise.sequence(tasks) // Sequence without waterfalling, returning all results
spion commented 10 years ago

@xixixao your example, with sequence

Promise.sequence(["file1.txt", "file2.txt", "file3.txt"], function(fileName) {
    return fs.readFileAsync(fileName, "utf8");
}).then(function(files) {
    //files is an array of results
});

with reduce:

Promise.reduce(["file1.txt", "file2.txt", "file3.txt"], function(items, fileName) {
    return items.concat([fs.readFileAsync(fileName, "utf8")])
}).then(function(files) {
    //files is an array of results
});

I don't see any benefit there. Maybe performance? No idea.

Sequence is only maybe useful with a heterogeneous functions and arguments, like in https://github.com/tgriesser/knex/issues/148#issuecomment-31872029 ... @tkellen are the return values of createTable actually useful there?

Personally, I might be convinced that there is a need if I see an example that

In that case, a sequence function that returns all the results may be useful. The only remaining doubts I'd have would be:

I'd probably be convinced its useful, but I'd still feel it belongs in a separate module. By the way, here is a separate module that implements it in a promise-library-agnostic way

xixixao commented 10 years ago

@spion Aren't you missing an empty array? Which is exactly why I'd argue that sequence is better for the user of the library, especially the ones who know for loops but not FP.

spion commented 10 years ago

@xixixao thats not a valid argument - I forgot it because I wrote the example in 30 seconds without even checking it. Also - why are you reading files in sequence anyway? I think a better example would do a lot to improve the argument there.

By the way, here is what I fear would happen if going the utility grab-bag way:

Lets say there is a sequence method.

Promise.sequence(["file1.txt", "file2.txt", "file3.txt"], function(fileName) {
    return fs.readFileAsync(fileName, "utf8");
}).then(function(files) {
    //files is an array of results
});

But now I want to send the result of reading the previous file to the function too - what would I do then? I need yet another utility function.

Promise.sequenceWaterfall(["file1.txt", "file2.txt", "file3.txt"], function(previousResult, nextName) {
  return someFunction(previousResult, nextName);
}).then(function(files) {
    //files is an array of results
});

Now what if I want to limit parallelism? Of course, yet another utility function.

Promise.mapLimit(["file1.txt", "file2.txt", "file3.txt"], function(fileName) {
    return fs.readFileAsync(fileName, "utf8");
}).then(function(files) {
    //files is an array of results
});

Oh and what about mapWaterfallLimit?

Or perhaps promise streams which can deal with potentially endless lists of objects with backpressure support. Should those also be a part of Bluebird? Why not?

This is an exponential explosion of built in helpers, as seen in caolan's async. Exponential because if you have just 3 options with 2 states, you get 2^3 = 8 helpers.

Now I'll take the other approach:

Normal:

current = Promise.cast(); 
Promise.map(files, function(fileName) {
  return current = current.then(function() { 
    return fs.readFileAsync(fileName, "utf8"); });
  }); 
}).then(function(files) {

});

Waterfallish

current = Promise.cast(); 
Promise.map(files, function(fileName) {
  return current = current.then(function(previous) { 
    return fs.readFileAsync(previous, fileName, "utf8"); 
  });
}).then(function(files) {

});

Parallel with limit and waterfalling

var queued = [], parallel = 3; 
Promise.map(files, function(fileName) {
  var mustComplete = Math.max(0, queued.length - parallel + 1);
  var current = Promise.some(queued, mustComplete).then(function(previous) { 
    return fs.readFileAsync(previous, fileName, "utf8"); 
  });
  queued.push(current);
  return current;
}).then(function(results) {

});

Its still very compact and understandable, even in the parallel-with-limit case, and it sure is a lot more flexible. Its practically the same code. And no reduce was used either. Frankly, sequence is just crude and limited, but of course, if your use case demands it often enough its trivial enough to make a utility function for that specific use case.

xixixao commented 10 years ago

My example is very bad, I just wanted to point out originally how artificial the example for reduce is (so 2 bad examples). I agree it is crude and limited, and it would just be a convenience. My use cases where simple serial async procedures (I need to do a, b, c, each being async, in that order). If you want to keep the API minimal, then I'd agree to not include it.

ashtuchkin commented 10 years ago

Please take my vote for adding the sugar methods, especially for 'parallel-with-limit' cases. I'm using it quite regularly (f.ex. http requests for large list of addresses, or when I need to do lots of db requests) and its hard to believe that the code above is in any way compact and understandable. Compare with something like:

Promise.mapLimit(files, 3, fs.readFileAsync)
.then(function(results) {

}); 
benjamingr commented 10 years ago

I just wanted to say I think something like WithDegreeOfParallelism like @ashtuchkin suggested sounds a lot better than .sequence

hiddentao commented 10 years ago

Reading the discussion so far it sounds like .reduce() is the way to do things if you wish to perform the same async operation sequentially across multiple inputs. But what if I have differing operations I wish to perform in sequence and outputs don't feed inputs (i.e. no need for waterfall)? then I think it would be nice to have a convenience abstraction like sequence ready to use.

matthiasg commented 10 years ago

Though I am also not a big fan of adding too many helpers .. i would see .sequence as a good addition. It would help people to find a common usecase in the documentation. .sequence should be very limited, possible only have very few (if any) options. possible only the degree of parallelism. THe documentation of that simple convenience method could the include some of the great examples of this thread for more elaborate use cases... possible those use cases could also be put in another module ... having a higher-level module might make sense anyway for most or even all higher order functions, but that higher-level module should be mentioned in the documentation too then ...

great discussion btw :) its important to limit the surface area of any API or things can get unwieldy quickly..

benjamingr commented 10 years ago

@matthiasg the argument against it is mainly that promises are already started so limiting their concurrency from the promise library makes little sense in the current model.

@petkaantonov believes that the limit should be performed by the actor creating the promises and not the promise library.

I think it makes sense in methods like the non-static .map.

I think everyone agrees that the use case exists and is a real one.

matthiasg commented 10 years ago

@bejamingr good point from an architecture standpoint internally .. I would also agree its just convenience and should be put into an additional module. its a little like leveldown and levelup etc … still a mention in the API documentation would be helpful to help potential users understand ..

I really like the bluebird implementation so anything to make it easy to get started for new users is helpful… people are still using the Q library and others. If we had an abstraction layer on top that might not even rely on bluebird per se but on the Promises Spec itself , the promise libraries would have a much higher degree of competition and speed of evolution should be higher (in theory since users can switch more simply)

cheers

From: benjamingr Sent: ‎Friday‎, ‎February‎ ‎28‎, ‎2014 ‎11‎:‎36‎ ‎AM To: petkaantonov/bluebird Cc: mgt576@gmail.com

@matthiasg the argument against it is mainly that promises are already started so limiting their concurrency from the promise library makes little sense in the current model.

@petkaantonov believes that the limit should be performed by the actor creating the promises and not the promise library.

I think it makes sense in methods like the non-static .map.

I think everyone agrees that the use case exists and is a real one.

— Reply to this email directly or view it on GitHub.

petkaantonov commented 10 years ago

Sequence would be worst addition as in practice it will just turn out to be a less powerful .reduce.

Everyone here is probably thinking that they would be able to just do Promise.sequence(promises...)but that's not true and in fact impossible - you will need exactly as much boilerplate as you would need with .reduce.

benjamingr commented 10 years ago

That's not what anyone here thinks at all...

On 28 בפבר 2014, at 18:09, Petka Antonov notifications@github.com wrote:

Sequence would be worst addition as in practice it will just turn out to be a less powerful .reduce.

Everyone here is probably thinking that they would be able to just do Promise.sequence(promises...)but that's not true and in fact impossible - you will need exactly as much boilerplate as you would need with .reduce.

— Reply to this email directly or view it on GitHub.

matthiasg commented 10 years ago

I at last was really only referring to the use case of NOT forwarding results to the next job .. just to X things in order at limited concurrency .. use case e.g. querying x http requests or in my case sending x http posts requests which...

benjamingr commented 10 years ago
Promise.map([1,2,3,4,5,6]).sequence(num => scrape("url?page="+num))

Sounds a lot more likely

On 28 בפבר 2014, at 18:09, Petka Antonov notifications@github.com wrote:

Sequence would be worst addition as in practice it will just turn out to be a less powerful .reduce.

Everyone here is probably thinking that they would be able to just do Promise.sequence(promises...)but that's not true and in fact impossible - you will need exactly as much boilerplate as you would need with .reduce.

— Reply to this email directly or view it on GitHub.

petkaantonov commented 10 years ago

That code doesn't make any sense :P map without mapper function? calling sequence without an array?

benjamingr commented 10 years ago

I'm on mobile, that map was to illustrate it's the result of a mapping, not actual code.

I'll have a better example when I get home :)

On 28 בפבר 2014, at 18:17, Petka Antonov notifications@github.com wrote:

That code doesn't make any sense :P map without mapper function? calling sequence without an array?

— Reply to this email directly or view it on GitHub.

matthiasg commented 10 years ago

thats actually a good example ... of course reduce CAN be used .. but the verb reduce is just not right .. but then i think we are mostly in agreement question is only where to put 'sequence'... might make sense as an optional addin ala chai.js or other libs to keep the fluency ... or with something like Promise.map(...).then(sequenceOf(scapeUrl)) ... where sequence would be a function returning a function.... that way the promise object would not be polluted

the extended example from @benjamingr would be this i hope

Promise.map([1,2,3,4,5,6],buildUrlFromNumber).sequence(scrapeUrl)

function buildUrlFromNumber(id){
  return 'http://test/'+d;
}

function scrapeUrl(url)
{
   ... http get url ... write to database ... return promise of that 
}
matthiasg commented 10 years ago

the idea with functions returning functions would look like this .. personally i like this, since these functions are reusable (in a functional programming kind of way (e.g. when you know where to put them) , composable, dont pollute the Promise object .. and could be each either have their own module or at least be in very small external modules...


// in strict sequence with no passing of results
Promise.resolve(urls).then(sequenceOf(function(url) {
  return doYourThingWithUrl(url);
})).then(function() {
  //.. done
});

//  with parallelism

Promise.resolve(urls).then(inParallel(10, function(url) {
  return doYourThingWithUrl(url);
})).then(function() {
  // ... done
});

function sequenceOf(functionToCall) {
  return function(tasks) {
    var current = Promise.cast();
    return Promise.map(tasks, function(task) {
      return current = current.then(function() {
        return functionToCall(task);
      });
    });
  };
};

function inParallel(parallelism, functionToCall) {
  return function(tasks) {
    var queued = [];
    return Promise.map(tasks, function(task) {
      var mustComplete = Math.max(0, queued.length - parallelism + 1);
      var current = Promise.some(queued, mustComplete).then(function() {
        return functionToCall(task);
      });
      queued.push(current);
      return current;
    });
  };
};
matthiasg commented 10 years ago

i encapsulated my previous comment in https://github.com/matthiasg/bluebird-as

markstos commented 10 years ago

Thanks for publishing those utilities, @matthiasg.

marclar commented 10 years ago

In case it's useful, here's how it finally made sense to me (note I'm not keeping track of the results of each individual promise):

//Edited; steps are functions that return promises, not an array of promises as originally written
var steps = [function1, function2, function3];
var chain = Promise.cast();
while(steps.length){
    chain = chain.then(steps.shift());
}