lists of futures

Raffiki commented 10 years ago

trying to get my head around the abstraction of lists of futures.

ex.: getListOfUsers().chain(getUserDetails)

where the function getUserDetails maps the list of data to a list of futures.

Could you give me a hand on how to handle this or change my implementation?

fwiw, like the libraries very much. Thank you

robotlolita commented 10 years ago

Assuming that these functions have types:

getListOfUsers :: () -> Future [User]
getUserDetails :: [User] -> [Future UserDetails]

You'll need to compute a single Future from the list of Futures in order to return that value to the chain operation. There's a common operation for this on all monads called sequence, which takes a list of monads and returns a monad containing a list of the results of those computations. My control.monads has an implementation.

It works basically like this:

You have a list of monadic computations that, when resolved, each return an a, so [Monad(a)]
You can fold over that list, starting from a monad holding the empty list ([]), and run one computation at a time, appending each result to the accumulated list.
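
A minimal sketch of that fold, to make the two steps concrete. The tiny Future below is only a stand-in so the snippet runs on its own (it assumes the same reject-then-resolve argument order used elsewhere in this thread), and this sequence is an illustration, not necessarily how control.monads implements it:

```javascript
// Minimal stand-in for data.future, just enough to run this sketch.
// Assumption: fork takes the reject callback first, then resolve.
function Future(computation) { this.fork = computation }

Future.of = function(value) {
  return new Future(function(reject, resolve) { resolve(value) })
}

Future.prototype.chain = function(f) {
  var self = this
  return new Future(function(reject, resolve) {
    self.fork(reject, function(value) { f(value).fork(reject, resolve) })
  })
}

Future.prototype.map = function(f) {
  return this.chain(function(value) { return Future.of(f(value)) })
}

// sequence :: [Future a] -> Future [a]
// Folds left to right, starting from a Future of the empty list. Each step
// waits for the accumulated list, then runs the next Future and appends
// its value, so the computations run one at a time.
function sequence(futures) {
  return futures.reduce(function(acc, future) {
    return acc.chain(function(values) {
      return future.map(function(value) { return values.concat([value]) })
    })
  }, Future.of([]))
}

// Usage: nothing runs until the resulting Future is forked.
var collected = null
sequence([Future.of(1), Future.of(2), Future.of(3)]).fork(
  function(error) { throw error },
  function(values) { collected = values }
)
```

Because each step chains onto the previous one, a rejection anywhere short-circuits the rest of the list.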

In your code, this could be used like so:

getListOfUsers().chain(function(users) {
  return sequence(getUserDetails(users))
})

// Or using function composition
getListOfUsers().chain(compose(sequence, getUserDetails))
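
The compose used above isn't defined anywhere in this thread. Assuming it is ordinary right-to-left composition of two functions, a sketch would look like this (double and increment are hypothetical placeholders standing in for sequence and getUserDetails):

```javascript
// compose(f, g) builds a function that applies g first and then f.
function compose(f, g) {
  return function(x) { return f(g(x)) }
}

// Plain functions used only to show the order of application.
function double(n) { return n * 2 }
function increment(n) { return n + 1 }

var incrementThenDouble = compose(double, increment)
var composed = incrementThenDouble(5)   // same as double(increment(5))
var explicit = double(increment(5))
```

So compose(sequence, getUserDetails) is the same function as the explicit function(users) { ... } version, just without naming the argument.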

This computes the user details sequentially, however. You might want to compute them in parallel. control.monads doesn't have this operation, though I'm planning on writing one, and it'll probably end up in control.concurrent. Computing in parallel is slightly more complex than computing sequentially:

You have a list of monadic computations.
You return a Future saying that you'll give back a list of the result of those computations eventually.
You run (fork) all of the monadic computations at the same time, and store each result in its place in the resulting list. Once all of them have been computed, the Future is resolved.

An implementation of this could be:

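var Future = require('data.future')

// parallel :: [Future a] -> Future [a]
// Forks every Future at once and collects the values in input order.
function parallel(monads) {
  return new Future(function(reject, resolve) {
    var pending = monads.length
    var result = new Array(pending)
    var resolved = false

    // With nothing to wait for, resolve straight away.
    if (pending === 0) {
      resolve(result)
      return
    }

    monads.forEach(compute)

    function compute(monad, index) {
      monad.fork(
        function(error) {
          // The first failure rejects the whole Future; later outcomes are ignored.
          if (resolved) return
          resolved = true
          reject(error)
        }, function(value) {
          if (resolved) return
          result[index] = value
          if (--pending === 0) {
            resolved = true
            resolve(result)
          }
        }
      )
    }
  })
}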

And you could just replace the sequence operation with the parallel operation, without changing anything else in your code, to get the speed-up of performing the monadic computations in parallel :)

I hope this answers your question. If there's anything that isn't clear yet, just ask away :)

Raffiki commented 10 years ago

yip, get it! thanks

robotlolita commented 10 years ago

Sorry, the previous snippet of parallel I provided was wrong, due to the pure semantics of data.future. You'll need to fork the monad to actually run the computation. Future 2.0 also removes the memoisation by default, so the updated snippet accounts for that and doesn't fulfil the future twice.

I've implemented this, along with timeout and non-deterministic choices in the core.async library.

The good thing about this first-class treatment and the pure I/O part is that your timeout is just data:

var search = parallel(query('foo'), query('bar'))
var failingSearch = nondeterministicChoice(search, timeout(1000))

failingSearch
  .map(doSomethingWithTheResultsOfFooAndBar)
  .orElse(doSomethingWithTheTimeout)
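
To make "your timeout is just data" a bit more concrete, here is a rough sketch of how a timeout and a first-to-settle choice could be built on a Future with the reject-then-resolve fork used in this thread. This is not the library's implementation: Future is a minimal stand-in, and timeout and firstToSettle are hypothetical versions written for illustration only.

```javascript
// Minimal stand-in for data.future.
// Assumption: fork takes the reject callback first, then resolve.
function Future(computation) { this.fork = computation }

Future.of = function(value) {
  return new Future(function(reject, resolve) { resolve(value) })
}

// Hypothetical timeout: a Future that rejects after `ms` milliseconds.
// Building it starts no timer; the timer starts only when it is forked.
function timeout(ms) {
  return new Future(function(reject, resolve) {
    setTimeout(function() {
      reject(new Error('Timed out after ' + ms + 'ms'))
    }, ms)
  })
}

// Hypothetical choice: forks every Future and settles with whichever
// rejects or resolves first; later outcomes are ignored.
function firstToSettle(futures) {
  return new Future(function(reject, resolve) {
    var settled = false
    function once(callback) {
      return function(value) {
        if (settled) return
        settled = true
        callback(value)
      }
    }
    futures.forEach(function(future) {
      future.fork(once(reject), once(resolve))
    })
  })
}

// Usage: the quick Future wins, so the timeout's later rejection is ignored.
var outcome = null
firstToSettle([Future.of('results'), timeout(10)]).fork(
  function(error) { outcome = 'failed: ' + error.message },
  function(value) { outcome = 'ok: ' + value }
)
```

Since timeout(10) is an ordinary Future value, it can be passed around, stored, or combined like any other. This sketch does not cancel the losing timer; it only ignores its outcome.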

folktale / data.task

lists of futures #5