ropensci / RSelenium

An R client for Selenium Remote WebDriver
https://docs.ropensci.org/RSelenium

Phantomjs 2.0 - Linux-x86_64 #56

Closed englianhu closed 7 years ago

englianhu commented 9 years ago

I get a permission-denied error when I follow the reference below. https://cran.r-project.org/web/packages/RSelenium/vignettes/RSelenium-headless.html

I am using the Linux build of PhantomJS 2.0 from https://github.com/ariya/phantomjs/issues/12948#issuecomment-100051959

> require(RSelenium)
> RSelenium::checkForServer()
> RSelenium::startServer()
Error in file(file, ifelse(append, "a", "w")) : 
  cannot open the connection
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
  cannot open file '/usr/lib64/R/library/RSelenium/bin/sellog.txt': Permission denied
> dir(paste0(getwd()))
 [1] "testing.Rmd"  
 [2] "testing.html" 
 [3] "testing.Rproj"
 [4] "custom.css"                                 
 [5] "datasets"                                   
 [6] "figure"                                     
 [7] "function"                                   
 [8] "LICENSE"                                    
 [9] "missfont.log"                               
[10] "phantomjs"                                  
[11] "phantomjs.exe"                              
[12] "PL.R"                                       
[13] "README.md"                                  
> pJS <- phantom()
Error in phantom() : PhantomJS binary not located.
> pJS <- phantom(paste0(getwd(),'/phantomjs'))
sh: /home/ryoeng/Scibrokes/Testing/phantomjs: Permission denied
> pJS <- phantom(paste0(getwd(),'/phantomjs.exe'))
sh: /home/usr/phantomjs.exe: Permission denied
englianhu commented 9 years ago

I tried downloading the latest selenium-server-2.47.0.jar from the official website and following the gist below, but I still get an error. https://gist.github.com/textarcana/5855427

$ 20:37:03.079 INFO - Launching a standalone Selenium Server
20:37:03.156 INFO - Java: Oracle Corporation 24.85-b03
20:37:03.156 INFO - OS: Linux 3.10.0-123.8.1.el7.x86_64 amd64
20:37:03.197 INFO - v2.47.1, with Core v2.47.1. Built from revision 411b314
20:37:03.329 INFO - Driver provider org.openqa.selenium.ie.InternetExplorerDriver registration is skipped:
registration capabilities Capabilities [{platform=WINDOWS, ensureCleanSession=true, browserName=internet explorer, version=}] does not match the current platform LINUX
20:37:03.329 INFO - Driver provider org.openqa.selenium.edge.EdgeDriver registration is skipped:
registration capabilities Capabilities [{platform=WINDOWS, browserName=MicrosoftEdge, version=}] does not match the current platform LINUX
20:37:03.329 INFO - Driver class not found: com.opera.core.systems.OperaDriver
20:37:03.330 INFO - Driver provider com.opera.core.systems.OperaDriver is not registered
20:37:03.450 WARN - Failed to start: SocketListener0@0.0.0.0:4444
Exception in thread "main" java.net.BindException: Selenium is already running on port 4444. Or some other service is.
        at org.openqa.selenium.server.SeleniumServer.start(SeleniumServer.java:492)
        at org.openqa.selenium.server.SeleniumServer.boot(SeleniumServer.java:305)
        at org.openqa.selenium.server.SeleniumServer.main(SeleniumServer.java:245)
        at org.openqa.grid.selenium.GridLauncher.main(GridLauncher.java:64)

[3]+  Exit 1                  java -jar selenium-server-standalone-2.47.1.jar
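
The BindException above says that something was already bound to port 4444 when the jar was launched; the message itself points to an existing Selenium server or another service. A minimal base R sketch for checking this before starting a second server follows. `port_in_use` is a hypothetical helper name, not part of RSelenium, and a failed connection is treated as "nothing listening", which is an assumption that ignores firewalls and timeouts.

```r
# Hypothetical helper: TRUE if something accepts connections on `port`.
port_in_use <- function(port, host = "localhost") {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = 1)
    ),
    error = function(e) NULL  # could not connect
  )
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

if (port_in_use(4444L)) {
  message("Port 4444 is busy; a Selenium server may already be running.")
} else {
  message("Port 4444 looks free.")
}
```

If the port is busy, one option is to connect to the server that is already running rather than launching another one.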
englianhu commented 9 years ago

May I know how you do that on Linux? http://johndharrison.github.io/RSOCRUG/#15

It works on RStudio Desktop on Windows, but I would rather use my Linux RStudio Server, since it lets me edit and run code remotely from anywhere.

> ## web scraping
> ## https://cran.r-project.org/web/packages/RSelenium/vignettes/RSelenium-headless.html
> ## https://github.com/eugeneware/phantomjs-bin
> pJS <- phantom(paste0(getwd(),'/phantomjs'))
> Sys.sleep(5) # give the binary a moment
> webDr <- remoteDriver(browserName = 'phantomjs')
> webDr$open(silent=TRUE)
> webDr$navigate(lnk[1]) ## for loop
> web <- webDr$getPageSource()[[1]]
> #'@ tab <- web %>% html_session %>% html_nodes('a') ##doesn't work
> tab <- readHTMLTable(htmlParse(web), header=TRUE)
> web %>% html_session %>% html_nodes('a')
Error in curl::curl_fetch_memory(url, handle = handle) : 
  URL using bad/illegal format or missing URL
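
The curl error above is consistent with `html_session()` expecting a URL, while `web` holds the page source itself. A minimal sketch, assuming the rvest package is installed, that parses the string with `read_html()` instead; the HTML literal is a made-up stand-in for `webDr$getPageSource()[[1]]`.

```r
library(rvest)  # re-exports read_html() and %>%

# Made-up stand-in for webDr$getPageSource()[[1]]: a string of HTML.
web <- "<html><body><a href='/a'>first</a><a href='/b'>second</a></body></html>"

# Parse the string directly; no HTTP request is made.
links <- web %>% read_html() %>% html_nodes("a")

# Pull out the link targets and the link text.
hrefs  <- html_attr(links, "href")
labels <- html_text(links)
print(data.frame(href = hrefs, label = labels))
```

The same pipeline should apply to the real page source, since it is also a single character string beginning with markup.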
englianhu commented 9 years ago

I tried to test whether it works with a non-headless browser. The browser doesn't pop up, and the session keeps connecting to phantomjs instead.

> RSelenium::startServer() # if needed
> ## trying another browser
> remDr <- remoteDriver(browserName = "chrome")
> remDr$open()
[1] "Connecting to remote server"
$browserName
[1] "phantomjs"
$version
[1] "2.0.0"
$driverName
[1] "ghostdriver"
$driverVersion
[1] "1.2.0"
$platform
[1] "windows-unknown-32bit"
$javascriptEnabled
[1] TRUE
$takesScreenshot
[1] TRUE
$handlesAlerts
[1] FALSE
$databaseEnabled
[1] FALSE
$locationContextEnabled
[1] FALSE
$applicationCacheEnabled
[1] FALSE
$browserConnectionEnabled
[1] FALSE
$cssSelectorsEnabled
[1] TRUE
$webStorageEnabled
[1] FALSE
$rotatable
[1] FALSE
$acceptSslCerts
[1] FALSE
$nativeEvents
[1] TRUE
$proxy
$proxy$proxyType
[1] "direct"
$id
[1] "aa754350-4575-11e5-af0f-83e2a582d052"

> RSelenium::startServer() # if needed
> ## trying default browser
> remDr <- remoteDriver()
> remDr$open()
[1] "Connecting to remote server"
$browserName
[1] "phantomjs"
$version
[1] "2.0.0"
$driverName
[1] "ghostdriver"
$driverVersion
[1] "1.2.0"
$platform
[1] "windows-unknown-32bit"
$javascriptEnabled
[1] TRUE
$takesScreenshot
[1] TRUE
$handlesAlerts
[1] FALSE
$databaseEnabled
[1] FALSE
$locationContextEnabled
[1] FALSE
$applicationCacheEnabled
[1] FALSE
$browserConnectionEnabled
[1] FALSE
$cssSelectorsEnabled
[1] TRUE
$webStorageEnabled
[1] FALSE
$rotatable
[1] FALSE
$acceptSslCerts
[1] FALSE
$nativeEvents
[1] TRUE
$proxy
$proxy$proxyType
[1] "direct"
$id
[1] "c4b87020-4575-11e5-af0f-83e2a582d052"
englianhu commented 9 years ago

It works after running chmod 777 phantomjs, as described in the reference below. http://hejianghua16.blog.163.com/blog/static/3107155320107411011391/

pJS <- phantom(paste0(getwd(),'/phantomjs'))
webDr <- remoteDriver(browserName = paste0(getwd(),'/phantomjs'))
webDr$open(silent=TRUE) # hangs here indefinitely
webDr$navigate(lnk3)
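
The chmod step mentioned above can also be done from within R. A minimal sketch using base R only; `make_executable` is a hypothetical helper name, and mode 0755 is an assumption chosen as a narrower alternative to 777. The demonstration runs on a throwaway temporary file rather than the real binary.

```r
# Hypothetical helper: mark a driver binary as executable and confirm it.
make_executable <- function(path) {
  stopifnot(file.exists(path))
  Sys.chmod(path, mode = "0755")            # rwxr-xr-x
  unname(file.access(path, mode = 1) == 0)  # 0 means execute is permitted
}

# Throwaway file standing in for ./phantomjs
demo <- tempfile("phantomjs-")
file.create(demo)
ok <- make_executable(demo)
print(ok)
```

For the real binary the call would be `make_executable(file.path(getwd(), "phantomjs"))`, run by a user who owns the file.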

But there is another error after I restart and rerun the code...

> pJS <- phantom(paste0(getwd(),'/phantomjs'))
/home/usr/phantomjs: error while loading shared libraries: libpng12.so.0: wrong ELF class: ELFCLASS32
johndharrison commented 7 years ago

Please use either rsDriver or wdman::phantomjs to install and drive phantomjs.
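
A minimal sketch of the wdman route, assuming the RSelenium and wdman packages are installed and that wdman can download the PhantomJS binary on first use. `open_phantom_session` is a hypothetical wrapper name and port 4567 is an arbitrary choice; the URL is only an example target.

```r
# Hypothetical wrapper: start PhantomJS through wdman and connect to it.
open_phantom_session <- function(port = 4567L) {
  # wdman fetches a PhantomJS binary matching the platform if needed.
  pjs <- wdman::phantomjs(port = port)
  rem_dr <- RSelenium::remoteDriver(browserName = "phantomjs", port = port)
  rem_dr$open(silent = TRUE)
  list(driver = rem_dr, server = pjs)
}

# Only run the live session interactively: it needs network access.
if (interactive()) {
  session <- open_phantom_session()
  session$driver$navigate("https://docs.ropensci.org/RSelenium")
  html <- session$driver$getPageSource()[[1]]
  session$driver$close()
  session$server$stop()  # shut the PhantomJS process down
}
```

Because wdman manages the binary itself, this approach should avoid manual chmod steps and mismatched 32-bit/64-bit builds. The rsDriver route is similar in spirit: as far as I know, `RSelenium::rsDriver(browser = "phantomjs")` returns a list whose `client` element is the remote driver.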