wch / webshot

Take screenshots of web pages from R
http://wch.github.io/webshot/

How to use `rmdshot()` on a remote server? #51

Open mbacou opened 6 years ago

mbacou commented 6 years ago

Hi, I'm trying to capture a dynamic index.Rmd document located on a remote VM. Both appshot() and rmdshot() error out when I run them from the R console via SSH or in RStudio Server. This VM has Shiny Server on port 3838 and RStudio Server on port 8787. I'm not sure which port to use or open on the remote server to make this work. Any tips?

> rmdshot("bio-profiles/index.Rmd", "test.png", delay=3)
Could not load  http://127.0.0.1:3907/
=> errors out
> rmdshot("bio-profiles/index.Rmd", "test.png", delay=3, port=3838)
=> this captures a 404 blank page
> rmdshot("bio-profiles/index.Rmd", "test.png", delay=3, port=8787)
=> this captures RStudio Server Login page

wch commented 6 years ago

If it's on a remote server, you should be using webshot -- the other two, appshot and rmdshot, will run the app/document locally and try to take a screenshot of that. The documentation could be more clear on that point.

If you can point a web browser to an address that shows the app/document, webshot should be able to screenshot it. For opening ports or forwarding ports from a VM, that will vary from system to system, and I can't really give much help.
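As a rough sketch of that suggestion: if a browser on the VM can open the document through Shiny Server, the same URL can be passed to webshot(). The helper names served_url() and shot_served() are hypothetical, and so is the assumption that Shiny Server serves the document under /bio-profiles/ on port 3838; adjust both to whatever address actually works in a browser.

```r
# Hypothetical helper: build the address of a document served locally.
# Assumes the server listens on `port` and serves the document under /<path>/.
served_url <- function(path, port = 3838) {
  sprintf("http://127.0.0.1:%d/%s/", port, path)
}

# Hypothetical wrapper: screenshot the served document with webshot().
# `delay` gives a Shiny document time to render before the capture.
shot_served <- function(path, file = "test.png", port = 3838, delay = 3) {
  webshot::webshot(served_url(path, port), file = file, delay = delay)
}

# Example call (needs webshot and PhantomJS installed):
# shot_served("bio-profiles")
```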

mbacou commented 6 years ago

These Shiny apps/docs on the VM are secured via Google Oauth and not publicly accessible. This is the reason why I am trying to create thumbnails from an R session on the VM directly. I feel that if rmdshot() is able to capture the RStudio login page on that VM, it should also be possible to capture a rendered rmarkdown doc?

wch commented 6 years ago

Oh OK, I didn't realize you were logged into RStudio on the VM. This works for me on a remote RStudio Server:

rmdshot('test/index.Rmd')

I don't know why it's not working on your setup. It's possible that your server is blocking all incoming traffic except on specific ports. If that's the case, you'll probably need to configure your server to unblock a specific port (say, 5000), and then pass that port number as the port argument of appshot/rmdshot. I'd suggest first trying to get appshot to work, since it's a bit simpler than rmdshot and there's less that could go wrong.

One thing that makes me a bit uncertain about that solution is that the traffic should be all on the VM's localhost, and localhost traffic usually isn't blocked for most firewall configurations.
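A minimal sketch of the port suggestion above, assuming port 5000 has been unblocked on the server. The helpers shot_fun_name() and shot_on_port() are hypothetical names used only for illustration; the port argument of appshot() and rmdshot() is the one mentioned in the comment.

```r
# Decide which webshot function fits the target: rmdshot() for an .Rmd
# file, appshot() for an app directory.
shot_fun_name <- function(target) {
  if (grepl("\\.Rmd$", target, ignore.case = TRUE)) "rmdshot" else "appshot"
}

# Hypothetical wrapper: 5000 is only the example port from the comment
# above and is assumed to be open on the server.
shot_on_port <- function(target, file = "test.png", port = 5000, delay = 3) {
  shot <- getExportedValue("webshot", shot_fun_name(target))
  shot(target, file, delay = delay, port = port)
}

# Example calls (need webshot and PhantomJS installed):
# shot_on_port("bio-profiles/")           # appshot() first, as suggested
# shot_on_port("bio-profiles/index.Rmd")  # then rmdshot()
```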

mbacou commented 6 years ago

Thx, I've now tested a few more Rmd documents, and it seems the issue could be that my .Rmd document reads in data from a global.R that's ignored by both rmdshot() and appshot()?

A static Rmd document:
> rmdshot("./secure/assets/index.Rmd", "test.png", delay=3)
output file: index.knit.md
Output created: /tmp/Rtmpy1eHLK/webshot1291465ccc2cc.html
=> works

Same document with runtime: shiny
> rmdshot("./secure/assets/index.Rmd", "test.png", delay=3)
output file: index.knit.md
Output created: /tmp/Rtmpy1eHLK/webshot1291465ccc2cc.html
=> works

Another document that needs global.R
> rmdshot("./secure/bio-profiles/index.Rmd", "test.png", delay=3)
Could not load  http://127.0.0.1:8341/
Error in webshot(sprintf("http://127.0.0.1:%d/", port), file = file, ...) : 
  webshot.js returned failure value: 1

Same using appshot()
> appshot("./secure/bio-profiles/", "test.png", delay=3)
Could not load  http://127.0.0.1:5812/
Error in webshot(sprintf("http://127.0.0.1:%d/", port), file = file, ...) : 
  webshot.js returned failure value: 1

wch commented 6 years ago

I'd be surprised if the global.R makes a difference. appshot() runs the application with shiny::runApp(), and rmdshot() runs the document with rmarkdown::run().
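One way to check this, sketched here as an assumption rather than something suggested in the thread, is to start the app or document with those same functions directly, so that any startup error is printed in the console instead of surfacing only as "Could not load". The helper name run_directly() and the port 5000 are hypothetical.

```r
# Hypothetical helper: run the target in the foreground with the function
# that appshot() or rmdshot() would use. The call blocks until interrupted.
run_directly <- function(target, port = 5000) {
  if (grepl("\\.Rmd$", target, ignore.case = TRUE)) {
    # Interactive document: what rmdshot() runs
    rmarkdown::run(target,
                   shiny_args = list(port = port, launch.browser = FALSE))
  } else {
    # App directory: what appshot() runs
    shiny::runApp(target, port = port, launch.browser = FALSE)
  }
}

# Example calls (need shiny and rmarkdown installed):
# run_directly("./secure/bio-profiles/")
# run_directly("./secure/bio-profiles/index.Rmd")
```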

mbacou commented 6 years ago

Hi @wch, I'm not able to track down this particular problem, but I've made a public version of the Shiny app that seems to cause the problem (when using either rmdshot() or appshot() locally, or even when trying to access it over the public network with webshot()). I've tried on my local machine and on a remote VM, and the error seems consistent. Not sure if that's any help.

# Shiny app that fails
webshot("https://data.worldcovr.com/shiny/bio-profiles/", "test.png")
# Could not load  https://data.worldcovr.com/shiny/bio-profiles/
# Error in webshot("https://data.worldcovr.com/shiny/bio-profiles/", "test.png") : 
#  webshot.js returned failure value: 1
# => errors out

# Another Shiny app that works
webshot("http://tools.harvestchoice.org/rainfall/", "test.png")
# => works

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rmarkdown_1.8.5 webshot_0.5.0  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14    lattice_0.20-35 zoo_1.8-0       digest_0.6.13   rprojroot_1.3-1 grid_3.4.3      jsonlite_1.5   
 [8] backports_1.1.2 magrittr_1.5    evaluate_0.10.2 stringi_1.1.6   xts_0.10-0      tools_3.4.3     stringr_1.2.0  
[15] yaml_2.1.16     compiler_3.4.3  htmltools_0.3.6 knitr_1.17.20

wch commented 6 years ago

It looks like there's some sort of SSL problem. I added a debug option, and with it enabled, webshot prints the following:

> webshot("http://data.worldcovr.com/shiny/bio-profiles/", "test.png", debug=T)
[info] [phantom] Starting...
[info] [phantom] Running suite: 1 step
[debug] [phantom] Successfully injected Casper client-side utilities
[info] [phantom] Step anonymous 1/1: done in 33ms.
[info] [phantom] Step anonymous 2/2: done in 47ms.
[info] [phantom] Step _step 3/7: done in 67ms.
[debug] [phantom] opening url: http://data.worldcovr.com/shiny/bio-profiles/, HTTP GET
[debug] [phantom] Navigation requested: url=http://data.worldcovr.com/shiny/bio-profiles/, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] Navigation requested: url=https://data.worldcovr.com/shiny/bio-profiles/, type=Other, willNavigate=true, isMainFrame=true
[warning] [phantom] Loading resource failed with status=fail: https://data.worldcovr.com/shiny/bio-profiles/
Could not load  https://data.worldcovr.com/shiny/bio-profiles/
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "about:blank"
Error in webshot("http://data.worldcovr.com/shiny/bio-profiles/", "test.png",  : 
  webshot.js returned failure value: 1

Searching for "Loading resource failed with status=fail", I found this Stack Overflow question, which says it's a problem with SSL: https://stackoverflow.com/questions/22461345/casperjs-status-fail-on-a-webpage

wch commented 6 years ago

@mbacou I think I've fixed it. Please try it out and let me know how it goes.

mbacou commented 6 years ago

@wch thanks very much, that fixes the issue with webshot() over SSL. I'm still unable to screenshot the same app on my VM using either appshot() or rmdshot(). Will try to run more tests and update here.