prioritizr / wdpar

Interface to the World Database on Protected Areas
https://prioritizr.github.io/wdpar
GNU General Public License v3.0

Port error #41

Closed RussellGrayxd closed 2 years ago

RussellGrayxd commented 2 years ago

Hi,

While running a loop to download PAs for multiple countries, the code fails with a port-in-use error:

Reprex:

library(wdpar)

# make a list of countries
countries <- c("Vietnam","Malaysia","Laos")

# loop over countries to get all shapefile data
for (i in countries) {
  mlt_raw_pa_data <- ?wdpa_fetch(i, wait = TRUE, download_dir = getwd())
}

After the first file downloads, the function stops with:

Error in wdman::phantomjs(verbose = FALSE) : PhantomJS signals port = 4567 is already in use.

I believe this is an issue with PhantomJS or Node.js. I've seen similar port issues with RSelenium.

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] mapview_2.10.4 rgdal_1.5-21 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7
[6] purrr_0.3.4 readr_2.1.0 tidyr_1.1.4 tibble_3.1.6 ggplot2_3.3.5.9000
[11] tidyverse_1.3.1 sp_1.4-6 wdpar_1.3.2 sf_1.0-5

loaded via a namespace (and not attached):
[1] bitops_1.0-7 fs_1.5.0 satellite_1.0.4 lubridate_1.8.0 webshot_0.5.2
[6] httr_1.4.2 tools_4.0.3 backports_1.3.0 utf8_1.2.2 R6_2.5.1
[11] KernSmooth_2.23-20 DBI_1.1.1 colorspace_2.0-2 raster_3.4-8 withr_2.4.2
[16] tidyselect_1.1.1 processx_3.5.2 leaflet_2.0.4.1 curl_4.3.2 compiler_4.0.3
[21] leafem_0.1.6 cli_3.1.0 rvest_1.0.2 xml2_1.3.2 caTools_1.18.2
[26] scales_1.1.1 classInt_0.4-3 askpass_1.1 proxy_0.4-26 rappdirs_0.3.3
[31] digest_0.6.28 wdman_0.2.5 base64enc_0.1-3 pkgconfig_2.0.3 htmltools_0.5.2
[36] dbplyr_2.1.1 fastmap_1.1.0 htmlwidgets_1.5.4 rlang_0.4.12 readxl_1.3.1
[41] rstudioapi_0.13 generics_0.1.1 jsonlite_1.7.2 crosstalk_1.2.0 magrittr_2.0.1
[46] Rcpp_1.0.7.2 munsell_0.5.0 fansi_0.5.0 lifecycle_1.0.1 terra_1.4-22
[51] stringi_1.7.5 yaml_2.2.1 grid_4.0.3 crayon_1.4.2 semver_0.2.0
[56] lattice_0.20-41 haven_2.4.3 hms_1.1.1 ps_1.6.0 pillar_1.6.4
[61] codetools_0.2-18 stats4_4.0.3 reprex_2.0.1 XML_3.99-0.8 glue_1.5.0
[66] packrat_0.7.0 modelr_0.1.8 png_0.1-7 vctrs_0.3.8 tzdb_0.2.0
[71] cellranger_1.1.0 gtable_0.3.0 openssl_1.4.5 assertthat_0.2.1 binman_0.1.2
[76] broom_0.7.10 countrycode_1.3.0 e1071_1.7-9 class_7.3-17 RSelenium_1.7.7
[81] units_0.7-2 ellipsis_0.3.2

jeffreyhanson commented 2 years ago

Hi,

Thank you very much for raising this issue and providing all these details! I'm sorry that it's not working for you. Just to check, I noticed that there was a ? before wdpa_fetch(). Did you also encounter the error when running code that did not contain the ? character? E.g., something like this:

# load package
library(wdpar)

# make a list of countries
countries <- c("Vietnam","Malaysia","Laos")

# loop over countries to get all shapefile data
dat <- list()
for (i in countries) {
  message("starting ", i)
  dat[[i]] <- wdpa_fetch(i, wait = TRUE, download_dir = getwd())
}

If you encounter the error in the first iteration of the loop, you might need to restart your computer.

Also, here are some additional details in case they help identify what's going wrong. Normally, wdpa_fetch() will start up a new PhantomJS process, use it to download the data, and then kill it (separately for each country). The error "PhantomJS signals port = 4567 is already in use." indicates that a PhantomJS web driver is already running. So, if you see this on the second iteration, it would suggest that wdpar didn't kill the process correctly after the first iteration.
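
One way to check the explanation above between loop iterations is to test whether anything is still listening on the port. The following is only a rough sketch using base R: the helper name port_in_use() is made up for illustration and is not part of wdpar or wdman, and a successful connection only shows that some process holds the port, not that it is PhantomJS.

```r
# Hypothetical helper (not part of wdpar or wdman):
# return TRUE if something accepts TCP connections on host:port.
port_in_use <- function(port, host = "127.0.0.1", timeout = 1) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, server = FALSE,
                       blocking = TRUE, open = "r+", timeout = timeout)
    ),
    # a failed connection attempt means nothing answered on the port
    error = function(e) NULL
  )
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

# 4567 is the port named in the error message
message("port 4567 in use: ", port_in_use(4567))
```

If this reports TRUE after a wdpa_fetch() call has returned, that would be consistent with a driver process that was not shut down.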

Also, I wouldn't expect this to make a difference, but could you please try using the latest version of R?

RussellGrayxd commented 2 years ago

Hey, sorry, I had copied and pasted the code after checking the function doc to see if I could manually set the port. The same error occurs when the ? is removed. I have tried on my other laptop with the newest version of R and there was no difference; the same error occurred. I tried restarting both laptops and rerunning, to no avail. Is it possible to add an argument to set a random port? That seemed to have been the solution for the RSelenium devs.

jeffreyhanson commented 2 years ago

Ah ok - thanks for clarifying that! Ok - I'll create a branch in a moment to implement the random ports idea for testing. Note that this doesn't entirely fix the issue though. Since the error is caused by the R code failing to kill the Selenium processes once it's finished with them, running the code in a loop will leave behind lots of zombie processes which, in turn, could consume lots of memory. Is that ok? Basically, this means you would probably want to restart your computer after the loop has finished running.
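
To illustrate the random ports idea, here is a minimal sketch in base R (it needs R >= 4.0.0 for serverSocket()). It is not the code from the branch: the helper name pick_free_port() and the 4000-9999 range are assumptions chosen for the example. It picks candidate ports at random and keeps the first one that can be bound.

```r
# Hypothetical helper (not from wdpar): pick a random port that can be bound.
pick_free_port <- function(candidates = 4000:9999, max_tries = 20) {
  n <- min(max_tries, length(candidates))
  # index with sample.int() so a length-one `candidates` is handled correctly
  for (port in candidates[sample.int(length(candidates), n)]) {
    srv <- tryCatch(
      suppressWarnings(serverSocket(port)),
      # binding fails if another process already holds the port
      error = function(e) NULL
    )
    if (!is.null(srv)) {
      close(srv)  # release the port again so the web driver can use it
      return(port)
    }
  }
  stop("no free port found after ", n, " tries")
}

port <- pick_free_port()
message("using port ", port)
```

The chosen port could then be handed to the web driver, for example through the port argument of wdman::phantomjs(). Note that another process could still grab the port between the check and the driver starting, so this reduces collisions rather than ruling them out.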

jeffreyhanson commented 2 years ago

Another option could involve manually killing the Selenium processes at each step in the loop (based on https://github.com/ropensci/RSelenium/issues/228#issuecomment-693735827). For example, something like this:

# load package
library(wdpar)

# make a list of countries
countries <- c("Vietnam","Malaysia","Laos")

# loop over countries to get all shapefile data
dat <- list()
for (i in countries) {
  message("starting ", i)
  dat[[i]] <- wdpa_fetch(i, wait = TRUE, download_dir = getwd())
  try(system("taskkill /im java.exe /f", intern = FALSE, ignore.stdout = FALSE))
}

RussellGrayxd commented 2 years ago

I'm not sure. In my case I'm just trying to get shapefiles for multiple countries. If there is an easier way to do this without failure then I would choose that, but if not then a little memory jab isn't going to hurt the process for me. I've got time.

RussellGrayxd commented 2 years ago

For my full list of countries (including Indonesia), this fix produces the error: ERROR: The process "java.exe" not found.
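
The "not found" message means taskkill saw no process called java.exe. Since the original error comes from wdman::phantomjs(), one thing that might be worth trying is targeting the PhantomJS process instead. The sketch below is untested against wdpar, and everything in it is an assumption: the helper names kill_command() and kill_quietly() are invented for the example, and "phantomjs" as the process name should be checked in the task manager first.

```r
# Hypothetical helpers (not part of wdpar).
# Build an OS-appropriate command for force-killing a process by name.
# The default name "phantomjs" is an assumption.
kill_command <- function(name = "phantomjs", os = .Platform$OS.type) {
  if (identical(os, "windows")) {
    sprintf("taskkill /im %s.exe /f", name)
  } else {
    # -x matches the exact process name only
    sprintf("pkill -x %s", name)
  }
}

# Run the command; a "process not found" result is reported as FALSE
# instead of interrupting a loop.
kill_quietly <- function(name = "phantomjs") {
  status <- suppressWarnings(
    system(kill_command(name), ignore.stdout = TRUE, ignore.stderr = TRUE)
  )
  invisible(status == 0)
}

# Show the command that would be run on this machine.
message(kill_command())
```

Inside the loop, kill_quietly() would take the place of the try(system("taskkill ...")) line.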

jeffreyhanson commented 2 years ago

Ah ok - thanks for trying that.

jeffreyhanson commented 2 years ago

Could you please try running the code using the wd-proc branch? I've randomized the port and tried to make the code more robust for killing the processes. You can install it with:

remotes::install_github("prioritizr/wdpar@wd-proc")
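
For readers wondering what "more robust for killing the processes" can look like in general, a common R pattern is to register the shutdown with on.exit() so that it runs even when the download step throws an error. This is only a generic sketch and not the code in the wd-proc branch. The helper name with_driver() is invented, and the driver is assumed to be a list with a stop() function, similar to the object that wdman::phantomjs() returns.

```r
# Hypothetical helper (not from wdpar): run `work` with a driver and
# always shut the driver down afterwards, even if `work` fails.
with_driver <- function(start, work) {
  driver <- start()
  # on.exit() fires on normal return and on error alike
  on.exit(try(driver$stop(), silent = TRUE), add = TRUE)
  work(driver)
}

# Demonstration with a stand-in driver, so no browser is needed.
fake_start <- function() {
  list(stop = function() message("driver stopped"))
}
result <- tryCatch(
  with_driver(fake_start, function(driver) stop("download failed")),
  error = function(e) conditionMessage(e)
)
message("result: ", result)
```

In the demonstration the stand-in driver is stopped before the error reaches the caller, which is the behaviour you would want from a real web driver in a loop.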

RussellGrayxd commented 2 years ago

Ayyy this one nailed it! Thanks for the fix!

jeffreyhanson commented 2 years ago

Awesome! Thanks for letting me know that fixed it - I'll merge those fixes into the main branch.