Closed joshualerickson closed 3 years ago
I ran your code. It worked. Please give me more information about your R version, OS and everything that might be useful
Considering that be is grid_terrain, I guess the following should fail as well:
plan(multisession, workers = 2L)
be_result <- grid_terrain(las, tin())
be_result2 <- grid_terrain(las2, tin())
Thanks for the quick response! Below is the session info. I just played around with the code
works
plan(multisession, workers = 2L)
be_result <- grid_terrain(las, tin())
be_result2 <- grid_terrain(las2, tin())
and it works just fine unless I do the following, i.e. add a second plan(multisession, workers = 2L).
crashes
plan(multisession, workers = 2L)
be_result <- grid_terrain(las, tin())
plan(multisession, workers = 2L)
be_result2 <- grid_terrain(las2, tin())
Same thing with the reprex. I've not had this happen with plan() before, so I'm a little unsure what's going on. Thanks again.
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] future_1.21.0 lidR_3.1.2 raster_3.4-5 sp_1.4-5
loaded via a namespace (and not attached):
[1] RSelenium_1.7.7 leafem_0.1.3 colorspace_1.4-1
[4] ellipsis_0.3.1 class_7.3-17 leaflet_2.0.3
[7] rgdal_1.5-23 streamstats_0.0.1.9011 base64enc_0.1-3
[10] proxy_0.4-25 listenv_0.8.0 DT_0.15
[13] lubridate_1.7.9 xml2_1.3.2 codetools_0.2-16
[16] splines_4.0.2 extRemes_2.0-12 Lmoments_1.3-1
[19] doParallel_1.0.15 jsonlite_1.7.2 semver_0.2.0
[22] png_0.1-7 shiny_1.6.0.9000 readr_1.3.1
[25] compiler_4.0.2 httr_1.4.2 assertthat_0.2.1
[28] Matrix_1.2-18 fastmap_1.1.0 lazyeval_0.2.2
[31] later_1.1.0.1 htmltools_0.5.1.9000 tools_4.0.2
[34] ggmap_3.0.0.903 igraph_1.2.5 gtable_0.3.0
[37] glue_1.4.2 lmom_2.8 binman_0.1.1
[40] RANN_2.6.1 dplyr_1.0.2 rappdirs_0.3.3
[43] maps_3.3.0 Rcpp_1.0.6 RNetCDF_2.4-2
[46] vctrs_0.3.4 ape_5.4-1 nlme_3.1-149
[49] climateR_0.0.4 iterators_1.0.12 crosstalk_1.1.1
[52] nhdplusTools_0.4.0 gbRd_0.4-11 stringr_1.4.0
[55] globals_0.14.0 rbibutils_1.4 rvest_0.3.6
[58] mime_0.10 lifecycle_1.0.0 XML_3.99-0.5
[61] distillery_1.1 MASS_7.3-52 zoo_1.8-8
[64] scales_1.1.1 hms_0.5.3 promises_1.2.0.1
[67] parallel_4.0.2 RColorBrewer_1.1-2 yaml_2.2.1
[70] memoise_1.1.0 gridExtra_2.3 ggplot2_3.3.2
[73] latticeExtra_0.6-29 stringi_1.5.3 wildlandhydRo_0.1.0
[76] dygraphs_1.1.1.6 foreach_1.5.0 e1071_1.7-6
[79] leaflet.extras_1.0.0 caTools_1.18.0 zip_2.1.1
[82] RgoogleMaps_1.4.5.3 Rdpack_2.1 rlang_0.4.10
[85] pkgconfig_2.0.3 bitops_1.0-6 lattice_0.20-41
[88] purrr_0.3.4 sf_0.9-8 htmlwidgets_1.5.3.9000
[91] tidyselect_1.1.0 parallelly_1.24.0 lmomRFA_3.3
[94] plyr_1.8.6 magrittr_2.0.1 geojsonsf_2.0.0
[97] R6_2.5.0 generics_0.1.0 DBI_1.1.1
[100] pillar_1.4.6 wdman_0.2.5 fitdistrplus_1.1-1
[103] units_0.7-1 xts_0.12-0 survival_3.2-3
[106] dataRetrieval_2.7.6.9002 tibble_3.0.3 crayon_1.4.1
[109] KernSmooth_2.23-17 plotly_4.9.2.9000 jpeg_0.1-8.1
[112] grid_4.0.2 data.table_1.14.0 snotelr_1.0.4
[115] forcats_0.5.0 digest_0.6.27 classInt_0.4-3
[118] webshot_0.5.2 xtable_1.8-4 tidyr_1.1.2
[121] httpuv_1.5.5 openssl_1.4.3 munsell_0.5.0
[124] fst_0.9.4 lfstat_0.9.4 viridisLite_0.3.0
[127] askpass_1.1
I can't see why or how it could be related to lidR. What about:
library(future)
plan(multisession, workers = 2L)
f1 = future({1+1})
f1 = value(f1)
plan(multisession, workers = 2L)
f2 = future({1+1})
f2 = value(f2)
At home now, I'll run it in the morning and get back with you. I think you're right though and most likely something on my end (OS)... I'll be curious to see if that crashes as well 🤞
I was hoping it wouldn't work, but it did... I looked at the future package issues and couldn't find anything similar. I'm out of ideas right now. I'll just call plan(multisession, workers = 2L) once per session when using lidR, and maybe later on (when updating R version, packages, etc.) it will disappear :sunglasses:. I've been using furrr in another package and that doesn't have this same issue. I'm wondering if it has to do with the 'apply' function? But I'm a total noob so not sure if that's a good lead :confused:? Have a good one!
works
library(future)
plan(multisession, workers = 2L)
f1 = future({1+1})
f1 = value(f1)
plan(multisession, workers = 2L)
f2 = future({1+1})
f2 = value(f2)
I will test your first script on Windows later, but right now I'm on a day off.
Can you retry the example that fails with set_lidr_threads(1)?
It still crashes; however, using plan(sequential) for the second call works, instead of calling plan(multisession, workers = 2L) twice.
crashes
library(lidR)
library(future)
LASfile <- system.file("extdata", "example.laz", package="rlas")
las <- readLAScatalog(LASfile)
LASfile2 <- system.file("extdata", "Topography.laz", package="lidR")
las2 <- readLAScatalog(LASfile2)
plan(multisession, workers = 2L)
set_lidr_threads(1)
be_result <- grid_terrain(las,res = 1, tin())
plan(multisession, workers = 2L)
set_lidr_threads(1) #with or without still crashes
be_result2 <- grid_terrain(las2,res = 1, tin())
works
plan(multisession, workers = 2L)
set_lidr_threads(1)
be_result <- grid_terrain(las,res = 1, tin())
plan(sequential)
set_lidr_threads(1)
be_result2 <- grid_terrain(las2,res = 1, tin())
Tested on Windows 7 R 4.0.3 in my virtual box and it worked. I cannot reproduce. Can you reproduce on other computers ?
I've got Windows 10; I don't know if that would affect anything... I'll give it a try on my wife's computer (Mac) and see what happens. I installed R 4.0.5 thinking that might help and reinstalled future and lidR, but with no success. Thanks for your help and time!
Can't test on W10. So far I successfully reproduced every single windows specific issue in my virtual box with W7. Maybe @bi0m3trics you can try on W10 ?
Just got home from being in the woods... you know, the place we sense remotely ;) I'll see what I can see this evening and post back here.
Still testing a few things, but one thing I've noticed is that the offending code (the one provided above that crashes on Windows 10) seems to work fine in RGui (4.0.5) on Windows 10 but crashes every time in RStudio (1.4.1103). @joshualerickson Can you confirm it doesn't crash when you just run it in RGui?
This behavior makes me think it's actually RStudio, but before I check it against other versions of RStudio I'd like to know if it works in RGui for @joshualerickson. If so, what version of RStudio are you running when it crashes?
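A small sketch of how the front end could be recorded alongside the session info when comparing RGui and RStudio. It assumes that RStudio sets the RSTUDIO environment variable to "1" in its R sessions; the env_report list and its field names are hypothetical and only for illustration.

```r
# Detect whether this R session was started by RStudio.
# Assumption: RStudio sets the RSTUDIO environment variable to "1".
in_rstudio <- identical(Sys.getenv("RSTUDIO"), "1")

# Hypothetical summary to paste into a bug report.
env_report <- list(
  r_version  = R.version.string,
  os         = Sys.info()[["sysname"]],
  in_rstudio = in_rstudio
)

print(env_report)
```

Inside RStudio, RStudio.Version() additionally reports the IDE version; it is not available in RGui, which is why the sketch relies on the environment variable instead.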
Thanks @bi0m3trics for the feedback. Very informative. I updated RStudio from 1.3.something to 1.4.1106 on Linux. It works. Will try on Windows later.
I can confirm that the offending code runs in R 4.0.5 using RGui on both Windows 10 build 19042 and build 16299 (the latter machine's getting reimaged tomorrow) without crashing, and when I switch to RStudio 1.4.1103 on either build it crashes. However, if I drop back to RStudio 1.3.1093 (on 16299, didn't try it on 19042) it works fine in both RStudio and RGui. Sounds like an RStudio bug, but we won't know until @joshualerickson responds...
I updated RStudio from 1.1.46 to 1.4.1106 on W7 and I confirm it crashed.
Notice that, while there is indeed a problem, in practice there is no need to call plan twice. The following works and doesn't crash:
library(lidR)
library(future)
LASfile <- system.file("extdata", "example.laz", package="rlas")
las <- readLAScatalog(LASfile)
plan(multisession, workers = 2L)
be_result <- grid_canopy(las, res = 1, p2r())
be_result2 <- grid_canopy(las, res = 1, p2r())
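The same "set the plan once, reuse it" pattern can be sketched with future alone, without lidR. This is only an illustration: the toy expressions stand in for real work and are not part of the original report.

```r
library(future)

# Set the plan once for the whole session.
plan(multisession, workers = 2L)

# Several futures reuse the same pool of background workers.
f1 <- future(1 + 1)
f2 <- future(2 + 2)
results <- c(value(f1), value(f2))

# Return to sequential processing when done.
plan(sequential)

print(results)
```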
@bi0m3trics I'm running RStudio version 1.4.1106, which seems to be the problem. RGui works just fine but RStudio crashes. @Jean-Romain Calling plan once works for me and will most likely be one call for me in what I do, but I'm curious about others that may want to do nested parallelism on different scripts? The code below still crashes. Thanks again for both your time and great package!!!
library(lidR)
library(future)
LASfile <- system.file("extdata", "example.laz", package="rlas")
ctg <- readLAScatalog(LASfile)
myfun = function(cluster, ...)
{
las <- readLAS(cluster)
if (is.empty(las)) return(NULL)
las <- normalize_height(las, tin())
tops <- tree_detection(las, lmf(2))
bbox <- extent(cluster)
tops <- crop(tops, bbox)
return(tops)
}
plan(multisession, workers = 2L)
set_lidr_threads(2L)
catalog_apply(ctg, myfun, ws = 5)
#for another script....
plan(multisession, workers = 3L)
set_lidr_threads(1L)
catalog_apply(ctg, myfun, ws = 5)
Now that I'm able to reproduce it, I will be able to investigate further, figure out where it fails, and see whether or not I can reproduce it independently of lidR.
> but I'm curious about others that may want to do nested parallelism on different scripts?
I'm not sure I understand what you mean. Nested parallelism with future is not supported (yet) in lidR. If you mean using both OpenMP and future, that does not seem to be the problem here.
I'm most likely not understanding the process... but I was thinking that some functions use OpenMP and future with different set-ups, e.g. https://rdrr.io/cran/lidR/man/lidR-parallelism.html 'Nested parallelism - part 2'. If you change those set-ups, which can be contingent on whether to chunk or natively use parallelism, then it's possible this would still be a scenario where you would use plan(multisession) twice. Sorry if I'm way off and not making this clear, as it's most likely my novice understanding of parallelism...
plan(something) allows processing several chunks at a time, either using multiple cores on a single machine, using multiple remote machines, or using several machines on an HPC.
set_lidr_threads() controls OpenMP. Some (but not all) algorithms are natively parallel, meaning that if you can do my_method(las, algorithm()) it runs on multiple cores on the local machine at the C++ level. Our problem here does not seem to be related to OpenMP.
Nested parallelism described in part 2 explains that you can use future to process e.g. 2 chunks at a time and use the 2 remaining cores with OpenMP, which is relevant only if you are using a parallelized algorithm. Also, if you use 4 and 4 on a 4-core machine it won't work. There is an internal safeguard that disables OpenMP.
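As a rough illustration of that budgeting, here is a sketch that splits a fixed number of cores between future workers and OpenMP threads. The helper threads_per_worker() is hypothetical (it is not part of lidR or future), and the rule "workers times threads should not exceed the core count" is an assumption drawn from the explanation above, not lidR's internal check.

```r
# Hypothetical helper, not part of lidR: how many OpenMP threads each
# future worker can use so that workers * threads <= cores.
threads_per_worker <- function(cores, workers) {
  stopifnot(cores >= 1, workers >= 1)
  max(1L, as.integer(floor(cores / workers)))
}

cores   <- 4L  # assumed 4-core machine, as in the example above
workers <- 2L  # chunks processed at a time by future
threads <- threads_per_worker(cores, workers)

# With lidR and future loaded, these values would be applied as:
# plan(multisession, workers = workers)
# set_lidr_threads(threads)
print(threads)
```

With 4 workers on 4 cores the helper falls back to a single thread per worker, which matches the spirit of the "4 and 4 won't work" remark.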
Nested parallelism with future is e.g. when you have 2 remote computers, each with 4 cores. You may want to use something like plan(list(remote, multisession)) to send 4 chunks at a time to each of the two computers. But this is not supported in lidR yet. In the case of 2 remote computers, each computer will process one chunk at a time.
Thanks @Jean-Romain! Of note, I can't reproduce the 'crash' with readLAS instead of readLAScatalog.
Just a quick comment after skimming through this issue:
Not that it'll solve the problem, but it might be a tad easier to troubleshoot if you parallelize with a single background worker. You can do this by using:
plan(cluster, workers = 1L)
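A minimal sketch of what this setup gives you, using only future: the single worker is a separate R process, so the code still goes through the background-worker path while only one worker is involved. The variable names are illustrative.

```r
library(future)

# One background worker in a separate R process.
plan(cluster, workers = 1L)

main_pid   <- Sys.getpid()
worker_pid <- value(future(Sys.getpid()))

# Shut the worker down again.
plan(sequential)

print(c(main = main_pid, worker = worker_pid))
```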
The current version of RStudio is 1.4.1717. Do you still experience the problem?
I'll have some time tomorrow or Wed to give it a shot.
Hey @Jean-Romain I can confirm that the code below works on
$mode
[1] "desktop"
$version
[1] ‘1.4.1717’
$release_name
[1] "Juliet Rose"
Doesn't Crash!
library(lidR)
library(future)
LASfile <- system.file("extdata", "example.laz", package="rlas")
las <- readLAScatalog(LASfile)
LASfile2 <- system.file("extdata", "Topography.laz", package="lidR")
las2 <- readLAScatalog(LASfile2)
plan(multisession, workers = 2L)
set_lidr_threads(1)
be_result <- grid_terrain(las,res = 1, tin())
plan(multisession, workers = 2L)
set_lidr_threads(1) #with or without still crashes
be_result2 <- grid_terrain(las2,res = 1, tin())
First off, top notch package! I've been having a lot of fun with it and it's going to help some common workflow hurdles for sure. But, I'm having trouble debugging an RStudio crash after running multiple catalog_apply functions with future::multisession. The reprex actually works when running with reprex; however, when I run it manually in the RStudio session I get a crash after I run the second catalog_apply function. Does the same thing if I bring in other LasCatalogs/LAS files... Not sure what's going on? Tried to add plan(sequential) after running the first chunk but that doesn't seem to help. Any help would be much appreciated!
Created on 2021-04-12 by the reprex package (v0.3.0)