rvalavi / blockCV

The blockCV package creates spatially or environmentally separated training and testing folds for cross-validation to provide a robust error estimation in spatially structured environments. See
https://doi.org/10.1111/2041-210X.13107
GNU General Public License v3.0

Error: "Initialization of plan() failed" when using spatialAutoRange #10

Closed mmfink closed 4 years ago

mmfink commented 4 years ago

The call:

spAR <- spatialAutoRange(rasterLayer = layerStk,
                         sampleNumber = 5000,
                         doParallel = TRUE,
                         nCores = 6)

where layerStk is a raster::stack of 12 input rasters. These are very large, continuous-value rasters.

Error message: Error: Initialization of plan() failed, because the test future used for validation failed. The reason was: Unexpected result (of class ‘NULL’ != ‘FutureResult’) retrieved for MultisessionFuture future (label = ‘future-plan-test’, expression = ‘NA’): . This suggests that the communication with MultisessionFuture worker (‘SOCKnode’ #1) is out of sync.

Additional output:

List of 2
 $ node_idx: int 1
 $ node    :List of 5
  ..$ con         : 'sockconn' int 3
  .. ..- attr(*, "conn_id")=<externalptr> 
  ..$ host        : chr "localhost"
  .. ..- attr(*, "localhost")= logi TRUE
  ..$ rank        : int 1
  ..$ rshlogfile  : NULL
  ..$ session_info:List of 6
  .. ..$ r      :List of 15
  .. .. ..$ platform      : chr "x86_64-w64-mingw32"
  .. .. ..$ arch          : chr "x86_64"
  .. .. ..$ os            : chr "mingw32"
  .. .. ..$ system        : chr "x86_64, mingw32"
  .. .. ..$ status        : chr ""
  .. .. ..$ major         : chr "3"
  .. .. ..$ minor         : chr "6.3"
  .. .. ..$ year          : chr "2020"
  .. .. ..$ month         : chr "02"
  .. .. ..$ day           : chr "29"
  .. .. ..$ svn rev       : chr "77875"
  .. .. ..$ language      : chr "R"
  .. .. ..$ version.string: chr "R version 3.6.3 (2020-02-29)"
  .. .. ..$ nickname      : chr "Holding the Windsock"
  .. .. ..$ os.type       : chr "windows"
  .. ..$ system :List of 8
  .. .. ..$ sysname       : chr "Windows"
  .. .. ..$ release       : chr "Server 2012 R2 x64"
  .. .. ..$ version       : chr "build 9600"
  .. .. ..$ nodename      : chr "*****"
  .. .. ..$ machine       : chr "x86-64"
  .. .. ..$ login         : chr "***"
  .. .. ..$ user          : chr "***"
  .. .. ..$ effective_user: chr "***"
  .. ..$ libs   : chr [1:2] "****" "****"
  .. ..$ pkgs   : NULL
  .. ..$ pwd    : chr "H:/HOTR_models"
  .. ..$ process:List of 1
  .. .. ..$ pid: int 19532
  ..- attr(*, "class")= chr "SOCKnode"

Do you have any suggestions for how to trouble-shoot this? I have successfully used parallel processing on this machine before, using packages parallel, snow, and doSNOW, but I have not worked with future before. @rvalavi

rvalavi commented 4 years ago

@mmfink thanks for your comment! Never had this error before. Are both future and future.apply packages updated?

Did you try it without parallel processing?
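One way to check whether the problem lies in the future setup rather than in blockCV is to start a small multisession plan by hand and resolve a trivial future. This is only a minimal sketch: the worker count of 2 and the 1 + 1 expression are arbitrary choices for illustration, not anything blockCV itself runs.

```r
library(future)

# Report the installed versions of the two packages in question
pkg_versions <- vapply(
  c("future", "future.apply"),
  function(p) as.character(packageVersion(p)),
  character(1)
)
print(pkg_versions)

# Start two background R sessions (arbitrary count, for illustration only)
plan(multisession, workers = 2)

# Resolve a trivial future; value() blocks until the worker returns
f <- future(1 + 1)
res <- value(f)
print(res)

# Shut the workers down again by switching back to sequential processing
plan(sequential)
```

If plan() already fails here with the same message, the cause is probably in the worker setup on the machine rather than in the rasters or in spatialAutoRange.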

mmfink commented 4 years ago

@rvalavi Thinking versions might play into it, I upgraded R to 3.6.3 yesterday and updated all packages, but still had the error. I did try running it without parallel and it ran for ~6 hours before running out of memory, which tells me parallel would also run out of memory (only much faster), though the parallel process isn't even getting to that point. No resources are used before the error message happens. It doesn't seem like the workers are even getting initialized.

Anyway, given the memory issues (these really are ridiculously large rasters), I see I need to find a dataset that will run without parallelization first before further trouble-shooting can happen.

Please stand by. :-)

rvalavi commented 4 years ago

@mmfink I think the problem is memory. If your machine runs out of memory on the sequential run, the parallel run cannot succeed either, because it needs several times more memory.
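A rough back-of-the-envelope estimate shows why. The sketch below assumes every cell is held in memory as an 8-byte double and that each worker ends up with its own copy of the data; both are simplifying assumptions. The raster dimensions are hypothetical, since the thread does not give them; only the 12 layers and 6 cores come from the original call.

```r
# Hypothetical raster dimensions (not stated in the thread)
n_rows <- 50000
n_cols <- 50000

# From the original call: 12 layers in the stack, nCores = 6
n_layers  <- 12
n_workers <- 6

# Assumption: values are stored as 8-byte doubles once read into memory
bytes_per_value <- 8

# Memory needed to hold the full stack once, in GiB
gb_sequential <- n_rows * n_cols * n_layers * bytes_per_value / 1024^3

# Assumption: each worker holds its own full copy
gb_parallel <- gb_sequential * n_workers

cat(sprintf("one copy: %.1f GiB, %d workers: %.1f GiB\n",
            gb_sequential, n_workers, gb_parallel))
```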

One way to get around it is to downsample your rasters, e.g. new_r <- raster::sampleRegular(r, size = 5e5, asRaster = TRUE), then pass the downsampled rasters to the function. These are easy to handle, but the sampling itself might be a pain, as the raster package is very slow. A replacement for the raster package, terra, is under development (it's on CRAN now). I'm not sure how stable it is yet, but if regular sampling is too slow in raster you might want to try it.
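Applied to a whole stack, the regular-sampling suggestion could look roughly like the sketch below. It builds a small synthetic stack so that it runs on its own; the 400 x 400 layers, the make_layer helper and size = 1e4 are made up for illustration, and with real data layerStk would be the existing stack and size something like the 5e5 mentioned above.

```r
library(raster)

set.seed(1)

# Hypothetical helper: build one synthetic layer filled with random values
make_layer <- function() {
  r <- raster(nrows = 400, ncols = 400)
  values(r) <- runif(ncell(r))
  r
}

# Stand-in for the real stack of large input rasters
layerStk <- stack(make_layer(), make_layer(), make_layer())

# Regularly sample each layer down to roughly `size` cells,
# keeping the result as a raster rather than a vector of values
small_layers <- lapply(
  unstack(layerStk),
  sampleRegular,
  size = 1e4,
  asRaster = TRUE
)

# Reassemble the downsampled layers into a stack
smallStk <- stack(small_layers)

print(ncell(layerStk))
print(ncell(smallStk))
```

The downsampled stack can then be passed as rasterLayer in the spatialAutoRange call from the first post in place of the original stack.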

mmfink commented 4 years ago

@rvalavi Yep, I can't dispute the memory issue, and my tests with smaller (though different) datasets have not recreated the error. So, closing the issue. Thanks for your time!