r-lib / urlchecker

Run CRAN URL checks from older versions of R
https://urlchecker.r-lib.org/
GNU General Public License v3.0
47 stars 4 forks source link

Put RStudio's Pandoc on the path during a check #10

Closed dpprdan closed 3 years ago

dpprdan commented 3 years ago

At least on Windows Pandoc (as installed by RStudio) is unlikely to be on the system PATH, which means that URLs in Rmds and READMEs are not checked. (I have had a package submission rejected due to this.)

RStudio users do have Pandoc installed and the path can be found out with Sys.getenv("RSTUDIO_PANDOC"). So this could be put on the path with withr::local_path(Sys.getenv("RSTUDIO_PANDOC")). This would add a dependency on {withr}, however. (Cf. https://github.com/r-lib/rcmdcheck/issues/109, https://github.com/r-lib/rcmdcheck/pull/132).

Please consider putting RStudio's Pandoc on the path. I could draft a PR if it helps.

Possible workaround: Run withr::with_path(Sys.getenv("RSTUDIO_PANDOC"), url_check()).

jimhester commented 3 years ago

I think probably it would be best to use rmarkdown::find_pandoc() for this and only do it conditional on the presence of the rmarkdown and withr packages.

dpprdan commented 3 years ago

I've used Sys.getenv("RSTUDIO_PANDOC") over rmarkdown::find_pandoc() in https://github.com/r-lib/rcmdcheck/pull/132, because rmarkdown::find_pandoc() effectively only adds ~/opt/pandoc/ as a search location for pandoc at the cost of its dependency:

this function searches for pandoc from the environment variable RSTUDIO_PANDOC (the RStudio IDE will set this variable to the directory of Pandoc bundled with the IDE), the environment variable PATH, and the directory ~/opt/pandoc/ https://rmarkdown.rstudio.com/docs/reference/find_pandoc.html

But maybe ~/opt/pandoc/ is more relevant than I realise. (I.e. this is just a note, either approach is fine).

I am a bit worried that it is easy to miss that certain docs are not checked, if you don't have the right packages installed (re: your suggestion to make this conditional). CRAN will check those docs either way. To put it more bluntly: what's the use of an incomplete URL check? So should there generally be messages/warnings about which docs were (not) checked?

jimhester commented 3 years ago

Ok, you could use something like this to also avoid the withr dependency, which I would prefer.

with_pandoc_path <- function(code) {
  pandoc_path <- Sys.getenv("RSTUDIO_PANDOC")
  if (nzchar(pandoc_path)) {
    old_path <- Sys.getenv("PATH")
    Sys.setenv("PATH", file.path(pandoc_path, old, sep = .Platform$path.sep))
    on.exit(Sys.setenv("PATH", old_path))
  }
  pandoc_location <- Sys.which("pandoc")
  if (!nzchar(pandoc_location)) {
    stop("pandoc not found!")
  }
  force(code)
}