HenrikBengtsson / future.apply

:rocket: R package: future.apply - Apply Function to Elements in Parallel using Futures
https://future.apply.futureverse.org

finer control over future_lapply() #60

Open MLopez-Ibanez opened 4 years ago

MLopez-Ibanez commented 4 years ago

I'd like to implement the following using futures, but it doesn't seem possible yet?

HenrikBengtsson commented 4 years ago

Apply a function over a list of objects and get a list of futures. ...

This is by design. The future.apply API mimics the base R "apply" API as far as possible - neither more nor less than that. So from the "outside", the only difference the developer sees is that the function names start with a future_ prefix. This way there are no surprises about what the future.apply package is meant to do.
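As a minimal sketch of that one-to-one correspondence, the toy example below runs the same call through lapply() and future_lapply(). The input X, the function body, and the choice of plan(sequential) are illustrative assumptions only; any other backend could be swapped in.

```r
library(future)
library(future.apply)

# Assumption for illustration: a sequential backend, so the example
# runs anywhere without starting background workers.
plan(sequential)

X <- 1:4

# Base R version
y_base <- lapply(X, FUN = function(x) x + 1)

# future.apply version: same arguments, only the function name differs
y_future <- future_lapply(X, FUN = function(x) x + 1)

# Both calls return a plain list of values, not futures
stopifnot(identical(y_base, y_future))
```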

Now, I do mention in the README under 'Roadmap' that:

  1. Consider additional future_*apply() functions and features that fit in this package but don't necessarily have a corresponding function in base R. Examples of this may be "apply" functions that return futures rather than values, mechanisms for benchmarking, and richer control over load balancing.

This is also touched upon in Issue #32 and Issue #44, and possibly elsewhere too. However, it's far from obvious what such an API should look like and what it should or should not support. It might also be better suited for another package. There's a risk in opening up the current API with features that don't exist in base R, e.g. it might be confusing and the existing API might be used in the wrong way. I see this with just future()/value() and %<-%, where people attempt to do y %<-% future(...) and end up in a trial'n'error mess.
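To make the future()/value() versus %<-% distinction concrete, here is a small sketch of the two styles used separately. The expression sum(1:10) and the variable names are placeholders chosen for illustration, and plan(sequential) is assumed only so the example runs without workers.

```r
library(future)

# Assumption for illustration: sequential backend
plan(sequential)

# Explicit API: create the future, then ask for its value
f <- future({ sum(1:10) })
v1 <- value(f)

# Implicit API: the future assignment operator does both steps,
# so the right-hand side is a plain expression, not a future() call
v2 %<-% { sum(1:10) }

stopifnot(identical(v1, v2))
```

Mixing the two, as in y %<-% future(...), wraps one future inside another, so y most likely ends up holding a future object rather than the computed value.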

You can always do:

fs <- lapply(X, FUN = function(x) future({
  ...
}))

to create your own futures. This wouldn't give you chunking ("load balancing") - you'd get one future per element in X. You could hack together some approach where you use chunks <- future_lapply(seq_along(X), FUN = function(idxs) { ... }) to figure out what the chunks are and what .Random.seed each element should use, but that's rather tedious.
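A minimal sketch of the one-future-per-element approach, assuming a toy input X and a squaring function as stand-ins for the real workload. It creates the futures with lapply(), polls them with resolved(), and collects the results with value(); there is no chunking and no parallel-safe random number handling here.

```r
library(future)

# Assumption for illustration: sequential backend; with, e.g., a
# multisession plan the futures would run in background R sessions.
plan(sequential)

X <- 1:5

# One future per element of X (no chunking / load balancing)
fs <- lapply(X, FUN = function(x) future({
  x^2
}))

# Non-blocking check of which futures have finished
done <- vapply(fs, FUN = resolved, FUN.VALUE = logical(1))

# Blocking collection of the results, one future at a time
vs <- lapply(fs, FUN = value)
```

Because the caller holds the list fs, it can decide when and in which order to collect values, which is the kind of finer control future_lapply() deliberately does not expose.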

Building your own map-reduce functions for future will be much easier when the future.chunks package is available. This is mentioned in Issue #59. But it'll be a while before I get some solid time to work on that.

Be able to cancel futures ...

Termination of futures is currently not supported by the Future API. This is something that needs to be implemented in the future package before anything can be done higher up. Getting a consistent API for terminating futures is not easy because it depends on the backend used. Such a feature would most likely have to be optional, i.e. it might or might not work depending on backend and context. This further complicates how it can be used in cases like the one you propose. See https://github.com/HenrikBengtsson/future/issues/93 for more details.