HenrikBengtsson / Wishlist-for-R

Features and tweaks to R that I and others would love to see - feel free to add yours!
https://github.com/HenrikBengtsson/Wishlist-for-R/issues
GNU Lesser General Public License v3.0

parallel: error object produced on a background PSOCK cluster is not accessible on master #57

Open HenrikBengtsson opened 6 years ago

HenrikBengtsson commented 6 years ago

Issue

The original error object produced while evaluating an expression on a background PSOCK cluster is not accessible to the master process. Instead, the master process only sees a regenerated 'simpleError' that carries the same condition message.

Example

library("parallel")

cl <- makeCluster(1L, type = "PSOCK")
print(class(cl))
# [1] "SOCKcluster" "cluster"

res <- tryCatch(clusterEvalQ(cl, {
  stop("boom")
}), error = identity)
print(res)
# <simpleError in checkForRemoteErrors(lapply(cl, recvResult)): one node produced an error: boom>
print(class(res))
# [1] "simpleError" "error"       "condition"
stopifnot(inherits(res, "error"), inherits(res, "simpleError"))

res <- tryCatch(clusterEvalQ(cl, {
  stop(structure(list(message = "boom"), class = c("MyError", "error", "condition")))
}), error = identity)

print(res)
# <simpleError in checkForRemoteErrors(lapply(cl, recvResult)): one node produced an error: boom>
print(class(res))
# [1] "simpleError" "error"       "condition"
str(res)
# List of 2
#  $ message: chr "one node produced an error: boom"
#  $ call   : language checkForRemoteErrors(lapply(cl, recvResult))
#  - attr(*, "class")= chr [1:3] "simpleError" "error" "condition"

stopifnot(inherits(res, "error"), inherits(res, "MyError"))  ## <== THIS FAILS!
# Error: inherits(res, "MyError") is not TRUE

stopCluster(cl)

Troubleshooting

This is because the error is regenerated from the original message, without preserving the original error object:

checkForRemoteErrors <- function(val)
{
    count <- 0
    firstmsg <- NULL
    for (v in val) {
        if (inherits(v, "try-error")) {
            count <- count + 1
            if (count == 1) firstmsg <- v
        }
    }
    ## These will not translate
    if (count == 1)
        stop("one node produced an error: ", firstmsg, domain = NA)
    else if (count > 1)
        stop(count, " nodes produced errors; first error: ", firstmsg, domain = NA)
    val
}

However, since the transferred error is of class try-error, the original error should be available in attribute "condition", e.g.

> res <- try(stop(structure(list(message = "boom"), class = c("MyError", "error", "condition"))))
Error : boom

> str(res)
Class 'try-error'  atomic [1:1] Error : boom

  ..- attr(*, "condition")=List of 1
  .. ..$ message: chr "boom"
  .. ..- attr(*, "class")= chr [1:3] "MyError" "error" "condition"

> attr(res, "condition")
<MyError: boom>

> str(attr(res, "condition"))
List of 1
 $ message: chr "boom"
 - attr(*, "class")= chr [1:3] "MyError" "error" "condition"
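
If the "condition" attribute did reach the master, the check could re-signal the original condition instead of building a new simpleError. Below is a minimal sketch of that idea. The function name check_for_remote_errors2 is hypothetical and not part of the parallel package, it stops at the first error rather than counting them, and it is exercised on local try() results rather than on values received from a real cluster:

```r
## Hypothetical variant of checkForRemoteErrors(); NOT part of 'parallel'.
## Simplification: stops at the first error instead of counting them.
check_for_remote_errors2 <- function(val) {
  for (v in val) {
    if (inherits(v, "try-error")) {
      cond <- attr(v, "condition")
      ## Re-signal the original condition when it was preserved
      if (inherits(cond, "condition")) stop(cond)
      ## Otherwise fall back to a plain error carrying the message
      stop("one node produced an error: ", v, domain = NA)
    }
  }
  val
}

my_error <- structure(
  list(message = "boom", call = NULL),
  class = c("MyError", "error", "condition")
)

## Assumption: simulate a worker result that kept try() semantics
val <- list(try(stop(my_error), silent = TRUE))

res <- tryCatch(check_for_remote_errors2(val), error = identity)
stopifnot(inherits(res, "error"), inherits(res, "MyError"))
```

With this variant, a handler such as tryCatch(..., MyError = function(e) ...) on the master would match, provided the worker had sent the condition along.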

EDIT 2018-02-21: There was a typo in my code resulting in invalid error objects.

HenrikBengtsson commented 6 years ago

UPDATE: It turns out that the original error object is lost already on the worker side - it is never sent to the master process. From https://github.com/wch/r-source/blob/R-3-4-branch/src/library/parallel/R/worker.R#L33-L40:

                ## This uses the message rather than the exception since
                ## the exception class/methods may not be available on the
                ## master.
                handler <- function(e) {
                    success <<- FALSE
                    structure(conditionMessage(e),
                              class = c("snow-try-error","try-error"))
                }

That source code comment also explains that this is not a mistake but a deliberate decision. However, it wouldn't hurt to at least pass the original condition object along, e.g. as an attribute.
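
Until something like that exists, one user-level workaround is to catch the error on the worker and return the condition object as a regular value, so that it is serialized back like any other result, and then re-signal it on the master. This is only a sketch of that workaround, not a change to parallel itself. As the source comment warns, it assumes any class-specific methods for the condition are also available on the master:

```r
library(parallel)

cl <- makeCluster(1L, type = "PSOCK")

## Evaluate on the worker, but return any error as an ordinary value so
## that the condition object itself is sent back to the master
vals <- clusterEvalQ(cl, {
  tryCatch({
    stop(structure(list(message = "boom", call = NULL),
                   class = c("MyError", "error", "condition")))
  }, error = identity)
})

stopCluster(cl)

## On the master: re-signal the first returned error, if any
res <- tryCatch({
  for (v in vals) if (inherits(v, "error")) stop(v)
  vals
}, error = identity)

stopifnot(inherits(res, "error"), inherits(res, "MyError"))
```

Because the worker never raises an uncaught error here, checkForRemoteErrors() sees no "try-error" and passes the values through unchanged.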