Closed jcvdav closed 5 years ago
Thank you! Could you please verify that Selenium and PhantomJS are installed (or try reinstalling them) using the wdman package? The wdman vignette has more information, but you might be able to do this with the following code:
# Selenium
selServ <- selenium(verbose = TRUE)
selServ$stop()
# PhantomJS
pjsDrv <- phantomjs(verbose = FALSE, check = FALSE)
pjsDrv$log()
pjsDrv$stop()
Could you also please verify if Java is installed on your system? You could do this with the following code:
Sys.which("java")
Ah - Selenium and PhantomJS were NOT installed or updated. I re-installed wdman and then ran the suggested code.
Here's what I got.
> install.packages("wdman")
> library(wdman)
> # Selenium
> selServ <- selenium(verbose = TRUE)
checking Selenium Server versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
Creating directory: C:\Users\JC\AppData\Local\binm...
Downloading binary: https://www.googleapis.com/dow...
Creating directory: C:\Users\JC\AppData\Local\binm...
Downloading binary: https://www.googleapis.com/dow...
Creating directory: C:\Users\JC\AppData\Local\binm...
Downloading binary: https://www.googleapis.com/dow...
BEGIN: POSTDOWNLOAD
checking chromedriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
Creating directory: C:\Users\JC\AppData\Local\binm...
Downloading binary: https://www.googleapis.com/dow...
Creating directory: C:\Users\JC\AppData\Local\binm...
Downloading binary: https://www.googleapis.com/dow...
Creating directory: C:\Users\JC\AppData\Local\binm...
Downloading binary: https://www.googleapis.com/dow...
BEGIN: POSTDOWNLOAD
checking geckodriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
Creating directory: C:\Users\JC\AppData\Local\binm...
Downloading binary: https://github.com/mozilla/gec...
Creating directory: C:\Users\JC\AppData\Local\binm...
Downloading binary: https://github.com/mozilla/gec...
Creating directory: C:\Users\JC\AppData\Local\binm...
Downloading binary: https://github.com/mozilla/gec...
BEGIN: POSTDOWNLOAD
checking phantomjs versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
> selServ$stop()
[1] TRUE
>
> # PhantomJS
> pjsDrv <- phantomjs(verbose = FALSE, check = FALSE)
> pjsDrv$log()
$stderr
character(0)
$stdout
character(0)
> pjsDrv$stop()
[1] TRUE
And yes, Java is installed:
> Sys.which("java")
java
"C:\\PROGRA~2\\COMMON~1\\Oracle\\Java\\javapath\\java.exe"
However, running wdpa_fetch() still returns the same error message:
> library(wdpar)
Loading required package: sf
Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
> mlt_raw_pa_data <- wdpa_fetch("Malta", wait = TRUE)
Error in checkError(res) :
Undefined error in httr call. httr output: Failed to connect to localhost port 4567: Connection refused
Hmm, could you please try restarting your computer? R sometimes leaves zombie PhantomJS processes that can cause problems (unfortunately, I haven't managed to work out how to kill these in a manner that is agnostic to the host operating system). Then open R with administrator privileges (this might require running R outside of RStudio) and try running the code below:
library(wdman)
library(RSelenium)
pjs <- wdman::phantomjs(verbose = FALSE)
rd <- RSelenium::remoteDriver(port = 4567L, browserName = "phantomjs")
rd$open(silent = FALSE)
rd$close()
pjs$stop()
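As an alternative to a full restart, stray PhantomJS processes could in principle be terminated from R by branching on the operating system. The following is only a minimal sketch and has not been tested in this thread: the helper names `phantomjs_kill_command()` and `kill_phantomjs()` are hypothetical (they are not part of wdman or wdpar), and it assumes the process is named `phantomjs` on Unix-like systems and `phantomjs.exe` on Windows.

```r
# Hypothetical helper: build the OS-specific command for terminating
# stray PhantomJS processes. Assumes the process is named "phantomjs"
# (Unix-like systems) or "phantomjs.exe" (Windows).
phantomjs_kill_command <- function(os_type = .Platform$OS.type) {
  if (identical(os_type, "windows")) {
    # /F forces termination, /T also ends child processes,
    # /IM selects processes by image name
    list(command = "taskkill", args = c("/F", "/T", "/IM", "phantomjs.exe"))
  } else {
    # -x matches the exact process name only, so the shell that
    # launches pkill is not matched by accident
    list(command = "pkill", args = c("-x", "phantomjs"))
  }
}

# Hypothetical helper: run the command and report whether it exited
# with status 0 (i.e. at least one process was found and signalled).
kill_phantomjs <- function() {
  cmd <- phantomjs_kill_command()
  status <- suppressWarnings(
    system2(cmd$command, cmd$args, stdout = FALSE, stderr = FALSE)
  )
  invisible(identical(as.integer(status), 0L))
}
```

Calling `kill_phantomjs()` before `wdman::phantomjs()` would then clear leftover processes, if any exist. Splitting the command construction from its execution keeps the OS branching easy to inspect without actually killing anything.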
I'm sorry you're experiencing these issues. I was playing around on my Windows 7 machine with a fresh R installation and wdpa_fetch was throwing the same error message you are getting, but after running the PhantomJS/RSelenium installation commands and manually creating the drivers with wdman a few times (as you did in your previous post), wdpa_fetch started working.
I suppose one option might be to add functionality to wdpa_fetch so you can tell it to use a Docker container running Selenium/PhantomJS (instead of trying to run it on the host operating system). Could you please try following this tutorial and see if you can use Docker with RSelenium to perform the web scraping: https://callumgwtaylor.github.io/blog/2018/02/01/using-rselenium-and-docker-to-webscrape-in-r-using-the-who-snake-database/
Hi @jeffreyhanson,
Thanks for looking into this. I ran the first chunk and still get some errors. This is what I got:
> library(wdman)
> library(RSelenium)
> pjs <- wdman::phantomjs(verbose = FALSE)
> rd <- RSelenium::remoteDriver(port = 4567L, browserName = "phantomjs")
> rd$open(silent = FALSE)
[1] "Connecting to remote server"
Error in checkError(res) :
Undefined error in httr call. httr output: Failed to connect to localhost port 4567: Connection refused
> rd$close()
Error in checkError(res) :
Undefined error in httr call. httr output: length(url) == 1 is not TRUE
> pjs$stop()
[1] TRUE
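The "Connection refused" message above suggests that nothing was listening on port 4567 when `rd$open()` was called, which would be the case if PhantomJS either failed to start or had not finished starting. A minimal diagnostic sketch using only base R is shown below; `port_is_open()` and `wait_for_port()` are hypothetical helper names, not functions from wdman or RSelenium.

```r
# Hypothetical helper: TRUE if a TCP connection to host:port succeeds.
port_is_open <- function(port, host = "localhost", timeout = 1) {
  con <- suppressWarnings(tryCatch(
    socketConnection(host = host, port = port, open = "r+",
                     blocking = TRUE, timeout = timeout),
    error = function(e) NULL  # a refused connection raises an error
  ))
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

# Hypothetical helper: poll the port a few times before giving up.
wait_for_port <- function(port, host = "localhost", tries = 10, delay = 0.5) {
  for (i in seq_len(tries)) {
    if (port_is_open(port, host)) {
      return(TRUE)
    }
    Sys.sleep(delay)
  }
  FALSE
}
```

Running `wait_for_port(4567L)` right after `pjs <- wdman::phantomjs(verbose = FALSE)` would show whether the driver ever starts listening. If it returns FALSE, the output of `pjs$log()` is the next place to look.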
In case it matters, I am running on Windows 10. I will try the Docker approach now and will report back soon.
So, I tried the Docker approach following the post, and ran into some other problems.
Docker Desktop cannot be installed on Windows 10 Home. Apparently this is because Hyper-V was not a thing on Windows 10 Home back in 2016. It says that I need Second Level Address Translation (SLAT).
However, if I run systeminfo I get:
Hyper-V Requirements:
VM Monitor Mode Extensions: Yes
Virtualization Enabled In Firmware: Yes
Second Level Address Translation: Yes
Data Execution Prevention Available: Yes
Which indicates not only that Hyper-V is enabled, but also that I have SLAT....
I don't think I'll be able to test it this way. Looking at #6, I may be able to use wdpa_read() and work with a fresh January 2019 download of WDPA.
I'd be happy to contribute to this project and try to solve this issue to make sure wdpa works on all systems. Feel free to close for now.
Ah ok - that's a pity - thank you very much for trying out the Docker approach and sharing your experiences.
I'm sorry, I don't have any other ideas for fixing this at the moment. One option would be to simply "guess" the download links, because they follow a standard pattern (IIRC some combination of country name, month, and year). However, this approach is unsatisfying: when protectedplanet.net releases a new version of the global data set (about once a month), it only creates country-specific subsets of the data (e.g. shapefile data just for Spain) when they are requested for download from the website. So, if we were to simply try downloading the data using a "guessed" download link, there is no guarantee that the data will be available for download (because it may not exist yet). Additionally, before trying the RSelenium R package, I originally tried using the rvest R package, but I was unable to trigger protectedplanet.net's process for creating a country-specific subset of the data using this package (perhaps some JavaScript needs to be executed in a mock browser?).
Ok, I'll close this issue now, but if you (or anyone else reading this) has any ideas for fixing this then please reopen it.
Thanks for such a great contribution. This will make using WDPA much easier. (I love wdpa_clean()!)
I am trying to replicate the process on the README, and am getting some errors:
Looking at the code, it seems like the problem may be in wdpa_url()? Line 48 calls rd <- RSelenium::remoteDriver(port = 4567L, browserName = "phantomjs"). If I execute that line and then try rd$open(silent = TRUE), I get the same error message as above.
Running pingr::is_online() returns TRUE.
This is my session info