futureverse / future.apply

R package: future.apply - Apply Function to Elements in Parallel using Futures
https://future.apply.futureverse.org

When stopping with an error, return the item that caused the error #101

Open Kodiologist opened 2 years ago

Kodiologist commented 2 years ago

Consider, for example,

future.apply::future_lapply(1:10, function(x)
    if (x == 8) stop("o no") else x + 1)

The error produced is

Error in ...future.FUN(...future.X_jj, ...) : o no

which doesn't provide any hint that 8 was the guilty input.

I think this is related to but smaller than #32.

HenrikBengtsson commented 2 years ago

Hi. Just to clarify, future.apply tries to reproduce the functionality of the base R apply family, which in this case gives:

> y <- lapply(1:10, function(x) if (x == 8) stop("o no") else x + 1)
Error in FUN(X[[i]], ...) : o no

Although I can make a few guesses, it's not 100% clear to me exactly what you're asking for. In your subject "When stopping with an error, return the item that caused the error" you use "return". Is that what you mean? If so, something like:

y <- lapply(1:10, function(x) tryCatch({
  if (x == 8) stop("o no") else x + 1
}, error = identity))

will do it (also for future_lapply()), i.e.

> str(y)
List of 10
 $ : num 2
 $ : num 3
 $ : num 4
 $ : num 5
 $ : num 6
 $ : num 7
 $ : num 8
 $ :List of 2
  ..$ message: chr "o no"
  ..$ call   : language doTryCatch(return(expr), name, parentenv, handler)
  ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
 $ : num 10
 $ : num 11

... or do you mean the error message should have more clues on which iteration failed?
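For the "return" reading, the failed elements can also be located after the fact. The following is only a minimal sketch built on the `tryCatch()` pattern above; the variable name `failed` is chosen here for illustration.

```r
# Collect errors as values instead of stopping (same pattern as above)
y <- lapply(1:10, function(x) tryCatch({
  if (x == 8) stop("o no") else x + 1
}, error = identity))

# Indices of the results that are condition objects of class "error"
failed <- which(vapply(y, inherits, logical(1), what = "error"))

# The offending inputs and their error messages
inputs <- (1:10)[failed]
msgs <- vapply(y[failed], conditionMessage, character(1))
print(data.frame(index = failed, input = inputs, message = msgs))
```

The cost of this approach is that every iteration runs to completion before any error is noticed, which differs from how `lapply()` stops at the first failure.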

Kodiologist commented 2 years ago

Sorry I was imprecise. I'm not sure it's necessary that the value be literally returned, no. I just mean that the error message should have more clues on which iteration failed.

Kodiologist commented 2 years ago

Or perhaps the failed iteration could be identified post-hoc with traceback, if that's possible to implement.

HenrikBengtsson commented 1 year ago

Some more thoughts on this below ...

Kodiologist wrote: "I just mean that the error message should have more clues on which iteration failed."

To stay sane, but also to make it 100% clear what to expect from it, the future.apply package aims at emulating the base-R apply functions. So, adding such a feature would in principle require adding it to the base-R functions, which raises the question of what the error message should look like there.

For instance, instead of:

Error in FUN(X[[i]], ...) : o no

should it say:

Error in FUN(X[[i]], ...) : o no
where X[[i]] is: int 8

To be clear, I don't know the answer to this, but that would be the start of supporting something similar in future.apply.
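Until something like that exists, a user-side wrapper can approximate the message format. This is a rough sketch only: `with_input_context()` is a hypothetical helper, not part of base R or future.apply, and the wording of the appended line is an assumption modeled on the mock-up above.

```r
# Hypothetical helper: wraps FUN so that an error is re-signalled with a
# compact description of the input that triggered it.
with_input_context <- function(FUN) {
  force(FUN)
  function(x, ...) {
    tryCatch(
      FUN(x, ...),
      error = function(e) {
        # Summarize the input the way str() would print it, e.g. " int 8"
        desc <- paste(utils::capture.output(utils::str(x)), collapse = "\n")
        stop(sprintf("%s\nwhere the input is:%s", conditionMessage(e), desc),
             call. = FALSE)
      }
    )
  }
}

f <- function(x) if (x == 8) stop("o no") else x + 1

# try() is used here only so that the example script keeps running
res <- try(lapply(1:10, with_input_context(f)), silent = TRUE)
cat(res)
```

Because the error is re-signalled from inside the handler, the original call and traceback are lost; that trade-off is part of why this is a workaround rather than a proposal for the package. Passing the wrapped function to `future_lapply()` should presumably behave the same way, but that is an assumption.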

Kodiologist wrote: "Or perhaps the failed iteration could be identified post-hoc with traceback, if that's possible to implement."

Maybe one approach would be to try to emulate what we get with:

options(error = function() if (interactive()) utils::recover())

For example,

> y <- lapply(1:10, function(x) if (x == 8) stop("o no") else x + 1)
Error in FUN(X[[i]], ...) : o no

Enter a frame number, or 0 to exit   

1: lapply(1:10, function(x) if (x == 8) stop("o no") else x + 1)
2: FUN(X[[i]], ...)
3: #1: stop("o no")
4: (function () 
if (interactive()) utils::recover())()

Selection: 1
Called from: lapply(1:10, function(x) if (x == 8) stop("o no") else x + 1)
Browse[1]> str(i)
 int 8
Browse[1]> str(X[[i]])
 int 8
Browse[1]> 

When trying the same with future.apply, we get:

> y <- future.apply::future_lapply(1:10, function(x) if (x == 8) stop("o no") else x + 1)
Error in ...future.FUN(...future.X_jj, ...) : o no

Enter a frame number, or 0 to exit   

 1: future.apply::future_lapply(1:10, function(x) if (x == 8) stop("o no") else
 2: future_xapply(FUN = FUN, nX = nX, chunk_args = X, args = list(...), get_chu
 3: withCallingHandlers({
    values <- local({
        oopts <- options(future.r
 4: (function() {
    oopts <- options(future.rng.onMisuse.keepFuture = FALSE)

 5: value(fs)
 6: value.list(fs)
 7: resolve(y, result = TRUE, stdout = stdout, signal = signal, force = TRUE)
 8: resolve.list(y, result = TRUE, stdout = stdout, signal = signal, force = TR
 9: signalConditionsASAP(obj, resignal = FALSE, pos = ii)
10: signalConditions(obj, exclude = getOption("future.relay.immediate", "immedi
11: stop(condition)
12: (function () 
if (interactive()) utils::recover())()

There is currently no way to "reach into" the call stack the same way in this case. To do that, the future framework would have to be able to record and inject a fake call stack to work with. I don't know how to do that, but I know that recording the call stack can be very expensive when running in parallel, e.g. lots of objects would have to be sent back from the parallel worker.
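A much cheaper, user-side approximation is to record only a text summary of the calls, plus the input, at the moment the error is signalled. The sketch below is not how the future framework works and does not allow browsing frames; `with_call_trace()` and the `trace` and `input` fields are hypothetical names used for illustration.

```r
# Hypothetical helper: returns the error object, annotated with the input
# and a one-line-per-call text summary of the stack at the point of failure.
with_call_trace <- function(FUN) {
  force(FUN)
  function(x, ...) {
    trace <- NULL
    tryCatch(
      withCallingHandlers(
        FUN(x, ...),
        error = function(e) {
          # A calling handler runs before the stack unwinds, so sys.calls()
          # still includes the frames that led to the error
          calls <- as.list(sys.calls())
          trace <<- vapply(calls, function(cl) deparse(cl, nlines = 1L)[1L],
                           character(1))
        }
      ),
      error = function(e) {
        # Attach only small, serializable pieces to the condition object
        e$trace <- trace
        e$input <- x
        e
      }
    )
  }
}

f <- function(x) if (x == 8) stop("o no") else x + 1
y <- lapply(1:10, with_call_trace(f))

bad <- y[[8]]
str(bad$input)
print(bad$trace)
```

Only character strings and the single input travel back with the result, which avoids shipping whole environments, at the price of losing the ability to inspect variables in each frame as `utils::recover()` allows.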