DavisVaughan / furrr

Apply Mapping Functions in Parallel using Futures
https://furrr.futureverse.org/

error: external pointer is not valid is not a particularly user actionable message #258

Open · twest820 opened 1 year ago

twest820 commented 1 year ago

I've got a future_map() call which runs fine with plan(multisession, workers = 1) but fails with

Error in (function (.x, .f, ..., .progress = FALSE)  : ℹ In index: 1.
Caused by error:
! external pointer is not valid

for workers > 1.

From the point at which the error is thrown, I know the problem lies somewhere in 158 lines of code that use 10 different packages. I've narrowed it down by deleting code from the function passed to future_map(); the first problematic line encountered is a call to terra::crop(), but there may be others. It would be nice if furrr could be more specific, for example by indicating which R workspace variable the problematic pointer belongs to, why the pointer isn't valid, and what to do to make it valid.

I suspect this error might be raised from somewhere under the globals package but, as there's no stack trace available, I have no way of confirming. (I get no hits for the error text in either furrr or globals, but it looks like GitHub search might be having a hard time because the string contains the word "not".)
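An aside not from the original report: one way to at least see which call fails on the worker is to convert the error into plain data inside the mapped function, so it comes back as a result rather than as an opaque relayed error. A minimal sketch, where inputs and do_work() are hypothetical stand-ins for the real arguments and function body:

library(furrr)

plan(multisession, workers = 2)

results <- future_map(inputs, function(x) {   # 'inputs' stands in for the real .x
  tryCatch(
    do_work(x),                               # stand-in for the real 158-line body
    error = function(e) {
      # Return the error as data so the failing call is visible in the main session
      list(message = conditionMessage(e),
           call    = deparse(conditionCall(e)))
    }
  )
})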

DavisVaughan commented 1 year ago

That error comes from base R, not from furrr.

This vignette explains in more detail https://future.futureverse.org/articles/future-4-non-exportable-objects.html

You could try setting options(future.globals.onReference = "error"), as recommended here, to figure out where it comes from: https://future.futureverse.org/articles/future-4-non-exportable-objects.html#protect-against-non-exportable-objects
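For illustration (not part of the original comment), a minimal sketch of what that looks like, using terra's bundled example raster; with the option set, the failure is raised in the main session at export time and names the offending global, instead of surfacing later as an opaque error on a worker:

library(future)
library(terra)

plan(multisession, workers = 2)
options(future.globals.onReference = "error")  # scan globals for external pointers

r <- rast(system.file("ex/elev.tif", package = "terra"))  # SpatRaster wraps a C++ pointer

f <- future({
  crop(r, ext(5.9, 6.1, 49.6, 49.8))  # 'r' is captured as a global of the future
})
# Expected to fail immediately with something along the lines of:
# "Detected a non-exportable reference ('externalptr') in one of the globals ('r' ...)"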

twest820 commented 1 year ago

Thanks! Unfortunately, options(future.globals.onReference = "error") seems to be tripped up by other code and errors out on something that doesn't look like it should be a problem.

Error in (function (.x, .f, ..., .progress = FALSE)  : ℹ In index: 1.
Caused by error:
! Detected a non-exportable reference (‘externalptr’) in the value (of class ‘LAS’) of the resolved future

This appears to be triggered by the LAS object returned from a call to lidR::clip_roi(). However, those objects are manipulated only within the scope of the function passed to furrr, are held only in memory, and thus never need to be transferred between workers (or written to disk). So there shouldn't be any need to export them.

DavisVaughan commented 1 year ago

I'd have to see a minimal example. furrr and future seem to be trying to export that object to your workers.
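Not stated in the thread, but the earlier message says the reference was found in the value of the resolved future, so one way this can happen even when the LAS lives only inside the mapped function is for it to end up as the function's return value. A sketch, with ctg and rois as hypothetical stand-ins for the real catalog and regions of interest, of returning only plain data instead:

library(furrr)
library(lidR)

plan(multisession, workers = 2)

results <- future_map(seq_len(nrow(rois)), function(i) {
  las <- clip_roi(ctg, rois[i, ])    # LAS object; this is what gets flagged
  cloud_metrics(las, ~mean(Z))       # return plain numbers, so no LAS crosses back
})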

DavisVaughan commented 1 year ago

I think you can set options(future.debug = TRUE) to get a very detailed log

twest820 commented 1 year ago

I'd have to see a minimal example.

Only a one line difference from #259.

library(dplyr)
library(furrr)
library(progressr)
library(sf)
library(terra)

plan(multisession, workers = 16)

simpleFeatureCollection = st_read("simpleFeatureCollection.gpkg")
mediumSizeRaster = rast("twoGBraster.tif") # global in this case

with_progress({
  progressBar = progressor(steps = nrow(simpleFeatureCollection))

  future_map(simpleFeatureCollection$ID, function(polygonID)
  {
    regionOfInterestPolygon = (simpleFeatureCollection %>% filter(ID == polygonID))[1]
    rasterRegionOfInterest = crop(mediumSizeRaster, regionOfInterestPolygon)

    # <do computationally intensive things>

    progressBar() # <update message>
  })
})
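Not part of the example above, but a common way to keep the pointer-backed SpatRaster from being exported in a pattern like this is to send only the file path and open the raster inside the mapped function, so each worker reads from the file itself (progress reporting omitted for brevity):

library(dplyr)
library(furrr)
library(sf)
library(terra)

plan(multisession, workers = 16)

simpleFeatureCollection = st_read("simpleFeatureCollection.gpkg")
rasterPath = "twoGBraster.tif"  # export the path (a plain string), not the SpatRaster

future_map(simpleFeatureCollection$ID, function(polygonID)
{
  mediumSizeRaster = rast(rasterPath)  # open the file on the worker
  regionOfInterestPolygon = (simpleFeatureCollection %>% filter(ID == polygonID))[1]
  rasterRegionOfInterest = crop(mediumSizeRaster, regionOfInterestPolygon)

  # <do computationally intensive things>
})

terra also provides wrap() and unwrap() for making a SpatRaster serialisable, which may be an alternative if re-opening the file on each worker isn't practical.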

I think you can set options(future.debug = TRUE) to get a very detailed log

Indeed! Unfortunately I'm not spotting anything illuminating, nor do I see anything in the other future options that would write out the failed FutureResult so I could get at the call stack and at least a minimal hint at which symbol is NULL. This seems to have been, at least partially, a known limitation of future for some time (https://github.com/HenrikBengtsson/future/issues/478). I guess the workaround would be to write against GDAL in C++ with the task parallel library to get a normal amount of diagnostic information, though maybe having this issue will help a little with prioritizing work over in future.

[11:03:35.758] MultisessionFuture started
[11:03:35.758] - Launch lazy future ... done
[11:03:35.759] run() for ‘MultisessionFuture’ ... done
[11:03:35.759] resolve() on list ...
[11:03:35.759]  recursive: 0
[11:03:35.759]  length: 2
[11:03:35.759] 
[11:03:35.999] receiveMessageFromWorker() for ClusterFuture ...
[11:03:35.999] - Validating connection of MultisessionFuture
[11:03:36.000] - received message: FutureResult
[11:03:36.000] - Received FutureResult
[11:03:36.000] - Erased future from FutureRegistry
[11:03:36.000] result() for ClusterFuture ...
[11:03:36.000] - result already collected: FutureResult
[11:03:36.001] result() for ClusterFuture ... done
[11:03:36.001] signalConditions() ...
[11:03:36.001]  - include = ‘immediateCondition’
[11:03:36.001]  - exclude = 
[11:03:36.001]  - resignal = FALSE
[11:03:36.001]  - Number of conditions: 1
[11:03:36.001] signalConditions() ... done
[11:03:36.002] receiveMessageFromWorker() for ClusterFuture ... done
[11:03:36.002] Future #2
[11:03:36.002] result() for ClusterFuture ...
[11:03:36.002] - result already collected: FutureResult
[11:03:36.002] result() for ClusterFuture ... done
[11:03:36.002] result() for ClusterFuture ...
[11:03:36.002] - result already collected: FutureResult
[11:03:36.002] result() for ClusterFuture ... done
[11:03:36.002] signalConditions() ...
[11:03:36.003]  - include = ‘immediateCondition’
[11:03:36.003]  - exclude = 
[11:03:36.003]  - resignal = FALSE
[11:03:36.003]  - Number of conditions: 1
[11:03:36.003] signalConditions() ... done
[11:03:36.003] signalConditionsASAP(MultisessionFuture, pos=2) ...
[11:03:36.003] - nx: 2
[11:03:36.003] - relay: TRUE
[11:03:36.004] - stdout: TRUE
[11:03:36.004] - signal: TRUE
[11:03:36.004] - resignal: FALSE
[11:03:36.004] - force: TRUE
[11:03:36.004] - relayed: [n=2] FALSE, FALSE
[11:03:36.004] - queued futures: [n=2] FALSE, FALSE
[11:03:36.004]  - until=1
[11:03:36.004]  - relaying element #1
[11:03:36.004] - relayed: [n=2] FALSE, FALSE
[11:03:36.005] - queued futures: [n=2] FALSE, TRUE
[11:03:36.005] signalConditionsASAP(NULL, pos=2) ... done
[11:03:36.005]  length: 1 (resolved future 2)
[11:03:46.083] receiveMessageFromWorker() for ClusterFuture ...
[11:03:46.083] - Validating connection of MultisessionFuture
[11:03:46.084] - received message: FutureResult
[11:03:46.084] - Received FutureResult
[11:03:46.085] - Erased future from FutureRegistry
[11:03:46.085] result() for ClusterFuture ...
[11:03:46.085] - result already collected: FutureResult
[11:03:46.085] result() for ClusterFuture ... done
[11:03:46.085] signalConditions() ...
[11:03:46.085]  - include = ‘immediateCondition’
[11:03:46.085]  - exclude = 
[11:03:46.085]  - resignal = FALSE
[11:03:46.086]  - Number of conditions: 1
[11:03:46.086] signalConditions() ... done
[11:03:46.086] receiveMessageFromWorker() for ClusterFuture ... done
[11:03:46.086] Future #1
[11:03:46.086] result() for ClusterFuture ...
[11:03:46.086] - result already collected: FutureResult
[11:03:46.086] result() for ClusterFuture ... done
[11:03:46.086] result() for ClusterFuture ...
[11:03:46.086] - result already collected: FutureResult
[11:03:46.086] result() for ClusterFuture ... done
[11:03:46.087] signalConditions() ...
[11:03:46.087]  - include = ‘immediateCondition’
[11:03:46.087]  - exclude = 
[11:03:46.087]  - resignal = FALSE
[11:03:46.087]  - Number of conditions: 1
[11:03:46.087] signalConditions() ... done
[11:03:46.088] signalConditionsASAP(MultisessionFuture, pos=1) ...
[11:03:46.088] - nx: 2
[11:03:46.088] - relay: TRUE
[11:03:46.088] - stdout: TRUE
[11:03:46.088] - signal: TRUE
[11:03:46.088] - resignal: FALSE
[11:03:46.088] - force: TRUE
[11:03:46.088] - relayed: [n=2] FALSE, FALSE
[11:03:46.089] - queued futures: [n=2] FALSE, TRUE
[11:03:46.089]  - until=1
[11:03:46.089]  - relaying element #1
[11:03:46.089] result() for ClusterFuture ...
[11:03:46.089] - result already collected: FutureResult
[11:03:46.089] result() for ClusterFuture ... done
[11:03:46.089] result() for ClusterFuture ...
[11:03:46.089] - result already collected: FutureResult
[11:03:46.089] result() for ClusterFuture ... done
[11:03:46.089] signalConditions() ...
[11:03:46.090]  - include = ‘immediateCondition’
[11:03:46.090]  - exclude = 
[11:03:46.090]  - resignal = FALSE
[11:03:46.090]  - Number of conditions: 2
[11:03:46.090] signalConditions() ... done
[11:03:46.090] result() for ClusterFuture ...
[11:03:46.090] - result already collected: FutureResult
[11:03:46.090] result() for ClusterFuture ... done
[11:03:46.090] signalConditions() ...
[11:03:46.091]  - include = ‘immediateCondition’
[11:03:46.091]  - exclude = 
[11:03:46.091]  - resignal = FALSE
[11:03:46.091]  - Number of conditions: 2
[11:03:46.091] signalConditions() ... done
[11:03:46.091] result() for ClusterFuture ...
[11:03:46.091] - result already collected: FutureResult
[11:03:46.091] result() for ClusterFuture ... done
[11:03:46.091] signalConditions() ...
[11:03:46.092]  - include = ‘condition’
[11:03:46.092]  - exclude = ‘immediateCondition’
[11:03:46.092]  - resignal = TRUE
[11:03:46.092]  - Number of conditions: 2
[11:03:46.092]  - Condition #1: ‘RngFutureWarning’, ‘FutureWarning’, ‘warning’, ‘RngFutureCondition’, ‘FutureCondition’, ‘condition’
[11:03:46.092]  - Condition #2: ‘purrr_error_indexed’, ‘rlang_error’, ‘error’, ‘condition’
Error in (function (.x, .f, ..., .progress = FALSE)  : 
  ℹ In index: 1.
Caused by error in `.External()`:
! NULL value passed as symbol address
[11:03:46.120] signalConditions() ... done
[11:03:46.120] - relayed: [n=2] FALSE, FALSE
[11:03:46.121] - queued futures: [n=2] TRUE, TRUE
[11:03:46.121] signalConditionsASAP(MultisessionFuture, pos=1) ... done