r-lidar / lidR

Airborne LiDAR data manipulation and visualisation for forestry application
https://CRAN.R-project.org/package=lidR
GNU General Public License v3.0

Running future with catalog_apply crashes R session after second iteration. #422

Closed joshualerickson closed 3 years ago

joshualerickson commented 3 years ago

First off, top-notch package! I've been having a lot of fun with it and it's going to help with some common workflow hurdles for sure. But I'm having trouble debugging an RStudio crash after running multiple catalog_apply functions with future::multisession. The reprex actually works when run with reprex; however, when I run it manually in the RStudio session I get a crash after I run the second catalog_apply function. It does the same thing if I bring in other LAScatalogs/LAS files... Not sure what's going on. I tried adding plan(sequential) after running the first chunk, but that doesn't seem to help. Any help would be much appreciated!

library(lidR)
#> Warning: package 'lidR' was built under R version 4.0.3
#> Loading required package: raster
#> Loading required package: sp
library(future)
#> Warning: package 'future' was built under R version 4.0.3
#> 
#> Attaching package: 'future'
#> The following object is masked from 'package:raster':
#> 
#>     values

LASfile <- system.file("extdata", "example.laz", package="rlas")
las <- readLAScatalog(LASfile)

LASfile2 <- system.file("extdata", "Topography.laz", package="lidR")
las2 <- readLAScatalog(LASfile2)
#> Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO"): Discarded
#> datum NAD83_Canadian_Spatial_Reference_System in CRS definition

be = function(cluster, ...)
{
  las <- readLAS(cluster)
  if (is.empty(las)) return(NULL)
  las  <- grid_terrain(las, res = 1, algorithm = tin())
  bbox <- extent(cluster)
  be <- crop(las, bbox)
  return(be)
}

plan(multisession, workers = 2L)
be_result <- catalog_apply(las, be)

#> Chunk 1 of 1 (100%): state ✓

plan(sequential)
plan(multisession, workers = 2L)
be_result2 <- catalog_apply(las2, be)
#> Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO"): Discarded
#> datum NAD83_Canadian_Spatial_Reference_System in CRS definition

#> Chunk 1 of 1 (100%): state ⚠

Created on 2021-04-12 by the reprex package (v0.3.0)

Jean-Romain commented 3 years ago

I ran your code. It worked. Please give me more information about your R version, OS and everything that might be useful
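
For reports like this one, a small helper can collect the requested details in one go. The sketch below uses only base R; `env_report` is a hypothetical name (not part of lidR or future), and the check on the `RSTUDIO` environment variable assumes RStudio sets it to "1" in its own sessions.

```r
# Hypothetical helper: gather the details maintainers usually ask for.
env_report <- function(pkgs = c("lidR", "future")) {
  pkg_version <- function(p) {
    # NA when the package is not installed, so the report never errors
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }
  list(
    r_version = R.version.string,
    os        = Sys.info()[["sysname"]],
    gui       = .Platform$GUI,                         # e.g. "Rgui" or "RStudio"
    rstudio   = identical(Sys.getenv("RSTUDIO"), "1"), # assumption: set by RStudio
    packages  = vapply(pkgs, pkg_version, character(1))
  )
}

report <- env_report()
str(report)
```

Because the GUI is reported separately from the R version, a difference between RGui and RStudio shows up directly in the output.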

Jean-Romain commented 3 years ago

Considering that be is essentially grid_terrain, I guess the following should fail as well:

plan(multisession, workers = 2L)
be_result <- grid_terrain(las, tin())
be_result2 <- grid_terrain(las2, tin())
joshualerickson commented 3 years ago

Thanks for the quick response! Below is the session info. I just played around with the code

works

plan(multisession, workers = 2L)
be_result <- grid_terrain(las, tin())
be_result2 <- grid_terrain(las2, tin())

and it works just fine unless I do this below, i.e. add plan(multisession, workers = 2L).

crashes

plan(multisession, workers = 2L)
be_result <- grid_terrain(las, tin())
plan(multisession, workers = 2L)
be_result2 <- grid_terrain(las2, tin())

Same thing with the reprex. I've not had this happen with plan() before so I'm a little unsure what's going on. Thanks again.

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] future_1.21.0 lidR_3.1.2    raster_3.4-5  sp_1.4-5     

loaded via a namespace (and not attached):
  [1] RSelenium_1.7.7          leafem_0.1.3             colorspace_1.4-1        
  [4] ellipsis_0.3.1           class_7.3-17             leaflet_2.0.3           
  [7] rgdal_1.5-23             streamstats_0.0.1.9011   base64enc_0.1-3         
 [10] proxy_0.4-25             listenv_0.8.0            DT_0.15                 
 [13] lubridate_1.7.9          xml2_1.3.2               codetools_0.2-16        
 [16] splines_4.0.2            extRemes_2.0-12          Lmoments_1.3-1          
 [19] doParallel_1.0.15        jsonlite_1.7.2           semver_0.2.0            
 [22] png_0.1-7                shiny_1.6.0.9000         readr_1.3.1             
 [25] compiler_4.0.2           httr_1.4.2               assertthat_0.2.1        
 [28] Matrix_1.2-18            fastmap_1.1.0            lazyeval_0.2.2          
 [31] later_1.1.0.1            htmltools_0.5.1.9000     tools_4.0.2             
 [34] ggmap_3.0.0.903          igraph_1.2.5             gtable_0.3.0            
 [37] glue_1.4.2               lmom_2.8                 binman_0.1.1            
 [40] RANN_2.6.1               dplyr_1.0.2              rappdirs_0.3.3          
 [43] maps_3.3.0               Rcpp_1.0.6               RNetCDF_2.4-2           
 [46] vctrs_0.3.4              ape_5.4-1                nlme_3.1-149            
 [49] climateR_0.0.4           iterators_1.0.12         crosstalk_1.1.1         
 [52] nhdplusTools_0.4.0       gbRd_0.4-11              stringr_1.4.0           
 [55] globals_0.14.0           rbibutils_1.4            rvest_0.3.6             
 [58] mime_0.10                lifecycle_1.0.0          XML_3.99-0.5            
 [61] distillery_1.1           MASS_7.3-52              zoo_1.8-8               
 [64] scales_1.1.1             hms_0.5.3                promises_1.2.0.1        
 [67] parallel_4.0.2           RColorBrewer_1.1-2       yaml_2.2.1              
 [70] memoise_1.1.0            gridExtra_2.3            ggplot2_3.3.2           
 [73] latticeExtra_0.6-29      stringi_1.5.3            wildlandhydRo_0.1.0     
 [76] dygraphs_1.1.1.6         foreach_1.5.0            e1071_1.7-6             
 [79] leaflet.extras_1.0.0     caTools_1.18.0           zip_2.1.1               
 [82] RgoogleMaps_1.4.5.3      Rdpack_2.1               rlang_0.4.10            
 [85] pkgconfig_2.0.3          bitops_1.0-6             lattice_0.20-41         
 [88] purrr_0.3.4              sf_0.9-8                 htmlwidgets_1.5.3.9000  
 [91] tidyselect_1.1.0         parallelly_1.24.0        lmomRFA_3.3             
 [94] plyr_1.8.6               magrittr_2.0.1           geojsonsf_2.0.0         
 [97] R6_2.5.0                 generics_0.1.0           DBI_1.1.1               
[100] pillar_1.4.6             wdman_0.2.5              fitdistrplus_1.1-1      
[103] units_0.7-1              xts_0.12-0               survival_3.2-3          
[106] dataRetrieval_2.7.6.9002 tibble_3.0.3             crayon_1.4.1            
[109] KernSmooth_2.23-17       plotly_4.9.2.9000        jpeg_0.1-8.1            
[112] grid_4.0.2               data.table_1.14.0        snotelr_1.0.4           
[115] forcats_0.5.0            digest_0.6.27            classInt_0.4-3          
[118] webshot_0.5.2            xtable_1.8-4             tidyr_1.1.2             
[121] httpuv_1.5.5             openssl_1.4.3            munsell_0.5.0           
[124] fst_0.9.4                lfstat_0.9.4             viridisLite_0.3.0       
[127] askpass_1.1             
Jean-Romain commented 3 years ago

I can't see why or how it could be related to lidR. What about:

library(future)

plan(multisession, workers = 2L)
f1 = future({1+1})
f1 = value(f1)
plan(multisession, workers = 2L)
f2 = future({1+1})
f2 = value(f2)
joshualerickson commented 3 years ago

At home now, I'll run it in the morning and get back with you. I think you're right though and most likely something on my end (OS)... I'll be curious to see if that crashes as well 🤞

joshualerickson commented 3 years ago

I was hoping it wouldn't work, but it did... I looked at the future package issues and couldn't find anything similar. I'm out of ideas right now. I'll just use plan(multisession, workers = 2L) once in the session when using lidR and maybe later on (when updating R version, packages, etc.) it will disappear :sunglasses:. I've been using furrr in another package and that doesn't have this same issue. I'm wondering if it has to do with the 'apply' function? But I'm a total noob so I'm not sure if that's a good lead :confused:? Have a good one!

works

library(future)

plan(multisession, workers = 2L)
f1 = future({1+1})
f1 = value(f1)
plan(multisession, workers = 2L)
f2 = future({1+1})
f2 = value(f2)
Jean-Romain commented 3 years ago

I will test your first script on Windows later, but right now I'm on a day off.

Jean-Romain commented 3 years ago

Can you retry the example that fails with set_lidr_threads(1)?

joshualerickson commented 3 years ago

It still crashes; however, plan(sequential) works instead of double calling plan(multisession, workers = 2L).

crashes

library(lidR)
library(future)
LASfile <- system.file("extdata", "example.laz", package="rlas")
las <- readLAScatalog(LASfile)

LASfile2 <- system.file("extdata", "Topography.laz", package="lidR")
las2 <- readLAScatalog(LASfile2)

plan(multisession, workers = 2L)
set_lidr_threads(1)

be_result <- grid_terrain(las,res = 1,  tin())

plan(multisession, workers = 2L)
set_lidr_threads(1) #with or without still crashes

be_result2 <- grid_terrain(las2,res = 1, tin())

works

plan(multisession, workers = 2L)
set_lidr_threads(1)

be_result <- grid_terrain(las,res = 1,  tin())

plan(sequential)
set_lidr_threads(1)

be_result2 <- grid_terrain(las2,res = 1, tin())
Jean-Romain commented 3 years ago

Tested on Windows 7 with R 4.0.3 in my virtual box and it worked. I cannot reproduce. Can you reproduce on other computers?

joshualerickson commented 3 years ago

I've got Windows 10; I don't know if that would affect anything... I'll give it a try on my wife's computer (Mac) and see what happens. I installed R 4.0.5 thinking that might help and reinstalled future and lidR, but with no success. Thanks for your help and time!

Jean-Romain commented 3 years ago

Can't test on W10. So far I have successfully reproduced every single Windows-specific issue in my virtual box with W7. Maybe @bi0m3trics you can try on W10?

bi0m3trics commented 3 years ago

Just got home from being in the woods... you know, the place we sense remotely ;) I'll see what I can see this evening and post back here.

bi0m3trics commented 3 years ago

Still testing a few things, but one thing I've noticed is that the offending code (the one provided above that crashes on Windows 10) seems to work fine in RGui (4.0.5) on Windows 10 but crashes every time in RStudio (1.4.1103). @joshualerickson Can you confirm it doesn't crash when you just run it in RGui?

This behavior makes me think it's actually RStudio, but before I check it against other versions of RStudio I'd like to know whether it works in RGui for @joshualerickson. If so, what version of RStudio are you running when it crashes?

Jean-Romain commented 3 years ago

Thanks @bi0m3trics for the feedback. Very informative. I updated RStudio from 1.3.something to 1.4.1106 on Linux. It works. Will try on Windows later.

bi0m3trics commented 3 years ago

I can confirm that the offending code runs in R 4.0.5 using RGui on both Windows 10 build 19042 and build 16299 (the latter machine's getting reimaged tomorrow) without crashing, and when I switch to RStudio 1.4.1103 on either build it crashes. However, if I drop back to RStudio 1.3.1093 (on 16299, didn't try it on 19042) it works fine in both RStudio and RGui. Sounds like an RStudio bug, but we won't know until @joshualerickson responds...

Jean-Romain commented 3 years ago

I updated RStudio from 1.1.46 to 1.4.1106 on W7 and I confirm it crashed.

Notice that, while there is indeed a problem, in practice there is no need to call plan twice. The following works and doesn't crash:

library(lidR)
library(future)
LASfile <- system.file("extdata", "example.laz", package="rlas")
las <- readLAScatalog(LASfile)

plan(multisession, workers = 2L)

be_result <- grid_canopy(las, res = 1,  p2r())
be_result2 <- grid_canopy(las, res = 1,  p2r())
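
Since one call to plan() is enough, a script that may be run several times in the same session can check the active strategy before setting it again. This is a minimal sketch, assuming the future package is installed; `ensure_multisession` is a hypothetical helper, and it relies on plan() without arguments returning the current strategy and on nbrOfWorkers() reporting its worker count.

```r
library(future)

# Hypothetical helper: only (re)set the plan when it actually has to change.
ensure_multisession <- function(workers = 2L) {
  current <- plan()  # plan() without arguments returns the active strategy
  if (inherits(current, "multisession") && nbrOfWorkers() == workers) {
    return(invisible(FALSE))  # already configured, leave the workers alone
  }
  plan(multisession, workers = workers)
  invisible(TRUE)
}

changed_first  <- ensure_multisession(2L)  # sets the plan if it differs
changed_second <- ensure_multisession(2L)  # same strategy and workers: no-op

plan(sequential)  # release the background workers when done
```

The helper returns TRUE only when it changed the plan, which makes it easy to log whether a script triggered a new set of workers.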
joshualerickson commented 3 years ago

@bi0m3trics I'm running RStudio version 1.4.1106, which seems to be the problem. RGui works just fine but RStudio crashes. @Jean-Romain Calling plan once works for me and will most likely be all I need in what I do, but I'm curious about others that may want to do nested parallelism on different scripts? The code below still crashes. Thanks again to both of you for your time and the great package!!!


library(lidR)
library(future)
LASfile <- system.file("extdata", "example.laz", package="rlas")
ctg <- readLAScatalog(LASfile)

myfun = function(cluster, ...)
{
  las <- readLAS(cluster)
  if (is.empty(las)) return(NULL)
  las  <- normalize_height(las, tin())
  tops <- tree_detection(las, lmf(2))
  bbox <- extent(cluster)
  tops <- crop(tops, bbox)
  return(tops)
}

plan(multisession, workers = 2L)
set_lidr_threads(2L)
catalog_apply(ctg, myfun, ws = 5)

#for another script....
plan(multisession, workers = 3L)
set_lidr_threads(1L)
catalog_apply(ctg, myfun, ws = 5)
Jean-Romain commented 3 years ago

Now that I'm able to reproduce the crash, I can investigate further to figure out where it fails and see whether I can reproduce it independently of lidR.

> but I'm curious about others that may want to do nested parallelism on different scripts?

I'm not sure I understand what you mean. Nested parallelism with future is not supported (yet) in lidR. If you mean using both OpenMP and future, that does not seem to be the problem here.

joshualerickson commented 3 years ago

I'm most likely not understanding the process... but I was thinking that some functions use OpenMP and future with different set-ups, e.g. https://rdrr.io/cran/lidR/man/lidR-parallelism.html 'Nested parallelism - part 2'. If you change those set-ups, which can depend on whether you process by chunk or use native parallelism, then it's possible this would still be a scenario where you would call plan(multisession) twice. Sorry if I'm way off and not making this clear; it's most likely my novice understanding of parallelism...

Jean-Romain commented 3 years ago

plan(something) allows several chunks to be processed at a time, either using multiple cores on a single machine, multiple remote machines, or several machines on an HPC.

set_lidr_threads() controls OpenMP. Some (but not all) algorithms are natively parallel, meaning that when you call my_method(las, algorithm()) it runs on multiple cores on the local machine at the C++ level. Our problem here does not seem to be related to OpenMP.

The nested parallelism described in part 2 means that you can use future to process e.g. 2 chunks at a time and use the 2 remaining cores with OpenMP. This is relevant only if you are using a parallelized algorithm. Also, if you ask for 4 and 4 on a 4-core machine it won't work: there is an internal safeguard that disables OpenMP.

Nested parallelism with future is e.g. when you have 2 remote computers, each with 4 cores. You may want to use something like plan(list(remote, multisession)) to send 4 chunks at a time to each of the two computers. But this is not supported in lidR yet. With 2 remote computers, each computer will process one chunk at a time.
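
The arithmetic behind combining future workers with OpenMP threads can be sketched with a small helper. This is an illustrative assumption about how one might budget cores, not lidR's internal check; `split_cores` is a hypothetical name, and the commented usage lines rely only on plan() from future and set_lidr_threads() from lidR.

```r
# Hypothetical helper: given the cores on the machine and the number of
# future workers (chunks processed at once), return how many OpenMP threads
# each worker can use without oversubscribing the machine.
split_cores <- function(total_cores, workers) {
  stopifnot(total_cores >= 1L, workers >= 1L)
  workers <- min(workers, total_cores)          # never more workers than cores
  threads <- max(1L, total_cores %/% workers)   # cores available to each worker
  list(workers = as.integer(workers), threads = as.integer(threads))
}

budget <- split_cores(total_cores = 4L, workers = 2L)
str(budget)

# Usage sketch (requires lidR and future; not run here):
# future::plan(future::multisession, workers = budget$workers)
# lidR::set_lidr_threads(budget$threads)
```

With 4 cores and 2 workers each worker gets 2 threads; asking for 4 workers on the same machine leaves 1 thread per worker, which is the situation where OpenMP brings no additional benefit.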

joshualerickson commented 3 years ago

Thanks @Jean-Romain! Of note, I can't reproduce the crash with readLAS instead of readLAScatalog.

HenrikBengtsson commented 3 years ago

Just a quick comment after skimming through this issue:

Not that it'll solve the problem, but it might be a tad easier to troubleshoot if you parallelize with a single background worker. You can do this by using:

plan(cluster, workers = 1L)
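
A minimal sketch of that suggestion, assuming the future package is installed: with one background worker, any crash involves a single extra R process, and comparing process IDs is one way to check that the code really ran outside the main session.

```r
library(future)

plan(cluster, workers = 1L)      # one background R process on this machine

f <- future(Sys.getpid())        # evaluated in the worker
worker_pid <- value(f)
main_pid   <- Sys.getpid()

# Different PIDs mean the future ran in the background worker
ran_in_background <- !identical(worker_pid, main_pid)
print(ran_in_background)

plan(sequential)                 # stop the worker when finished
```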
Jean-Romain commented 3 years ago

The current version of RStudio is 1.4.1717. Do you still experience the crash?

joshualerickson commented 3 years ago

I'll have some time tomorrow or Wed to give it a shot.

joshualerickson commented 3 years ago

Hey @Jean-Romain, I can confirm that the code below works on:

$mode
[1] "desktop"

$version
[1] ‘1.4.1717’

$release_name
[1] "Juliet Rose"

Doesn't Crash!

library(lidR)
library(future)
LASfile <- system.file("extdata", "example.laz", package="rlas")
las <- readLAScatalog(LASfile)

LASfile2 <- system.file("extdata", "Topography.laz", package="lidR")
las2 <- readLAScatalog(LASfile2)

plan(multisession, workers = 2L)
set_lidr_threads(1)

be_result <- grid_terrain(las,res = 1,  tin())

plan(multisession, workers = 2L)
set_lidr_threads(1) #with or without, this used to crash

be_result2 <- grid_terrain(las2,res = 1, tin())