yihui / knitr

A general-purpose tool for dynamic report generation in R
https://yihui.org/knitr/

VignetteEngine{knitr::knitr} cannot open file script src prism-core.min.js #2254

Closed tdhock closed 1 year ago

tdhock commented 1 year ago

Hi @yihui, first of all, thanks for this wonderful package, which I find very useful.

I have a package on CRAN that uses VignetteEngine{knitr::knitr} to build a vignette, and it recently started failing CRAN checks with the following message (on Windows only). https://cloud.r-project.org/web/checks/check_results_nc.html

Check: re-building of vignette outputs
Result: ERROR
    Error(s) in re-building vignettes:
    --- re-building 'v1-capture-first.Rmd' using knitr
    Warning in file(con, "r") :
     cannot open file 'cript src="https://cdn.jsdelivr.net/npm/prismjs@1.29.0/components/prism-core.min.js" defer></script>
    <cript src="https://cdn.jsdelivr.net/npm/prismjs@1.29.0/components/prism-core.min.js" defer></script>
    <': No such file or directory
    Error: processing vignette 'v1-capture-first.Rmd' failed with diagnostics:
    cannot open the connection
    --- failed re-building 'v1-capture-first.Rmd'

The issue seems to happen only on Windows, and only if the Rmd file has a code chunk that prints an emoji. I expected that rendering such an Rmd file on Windows would work, or at least that there would be a more informative error message, such as "Rmd with code chunks that print emoji is not supported on Windows, but v1-capture-first.Rmd has a code chunk that prints emoji, please remove."

Based on the error message above and the fact that it occurs only on Windows, I suspect that a fix to the vignette builder on your end would entail changing some use of file.path or the backslash separator to a forward slash.

On my end (as a user trying to build a vignette), a workaround is to change VignetteEngine{knitr::knitr} to VignetteEngine{knitr::rmarkdown}.

Code to reproduce the issue is below:

(vignette.Rmd <- tempfile(fileext=".Rmd"))
cat('<!--
%\\VignetteEngine{knitr::knitr}
-->\n```{r}
"a\\U0001F60E#"\n```
', file=vignette.Rmd)
tools::vignetteEngine("knitr::knitr")$weave(vignette.Rmd)
traceback()
sessionInfo()

The output I got on my system is below:

> (vignette.Rmd <- tempfile(fileext=".Rmd"))
[1] "C:\\Users\\th798\\AppData\\Local\\Temp\\Rtmpo9uKmD\\file3da45f2e511a.Rmd"
> cat('<!--
+ %\\VignetteEngine{knitr::knitr}
+ -->\n```{r}
+ "a\\U0001F60E#"\n```
+ ', file=vignette.Rmd)
> tools::vignetteEngine("knitr::knitr")$weave(vignette.Rmd)

processing file: C:\Users\th798\AppData\Local\Temp\Rtmpo9uKmD\file3da45f2e511a.Rmd

output file: file3da45f2e511a.md

Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
  cannot open file 'script src="https://cdn.jsdelivr.net/npm/prismjs@1.29.0/components/prism-core.min.js" defer></script>
script src="https://cdn.jsdelivr.net/npm/prismjs@1.29.0/components/prism-core.min.js" defer></script>
': Invalid argument
> traceback()
22: file(con, "r")
21: readLines(con, encoding = "UTF-8", warn = FALSE)
20: xfun::read_utf8(x)
19: base64_url(x, xfun::read_utf8(x), ext)
18: resolve_external(x, is_web, ext)
17: unique.default(c("AsIs", oldClass(x)))
16: I(c(t2[1], resolve_external(x, is_web, ext), t2[2]))
15: one_string(I(c(t2[1], resolve_external(x, is_web, ext), t2[2])))
14: (function (x, ext = "", embed_https = FALSE, embed_local = FALSE) 
    {
        if (ext == "css") {
            t1 = "<link rel=\"stylesheet\" href=\"%s\">"
            t2 = c("<style type=\"text/css\">", "</style>")
        }
        else if (ext == "js") {
            t1 = "<script src=\"%s\" defer></script>"
            t2 = c("<script>", "</script>")
        }
        else stop("The file extension '", ext, "' is not supported.")
        is_web = is_https(x)
        is_rel = !is_web && xfun::is_rel_path(x)
        if (is_web && embed_https && xfun::url_filename(x) == "MathJax.js") {
            warning("MathJax.js cannot be embedded. Please use MathJax v3 instead.")
            embed_https = FALSE
        }
        if ((is_rel && !embed_local) || (is_web && !embed_https)) {
            sprintf(t1, x)
        }
        else {
            one_string(I(c(t2[1], resolve_external(x, is_web, ext), 
                t2[2])))
        }
    })(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]], dots[[4L]][[1L]])
13: mapply(gen_tag, ...)
12: gen_tags(z3[i1], ifelse(js[i1], "js", "css"), embed[1], embed[2])
11: replace(z)
10: FUN(X[[i]], ...)
9: lapply(regmatches(x, m), function(z) {
       if (length(z)) 
           replace(z)
       else z
   })
8: match_replace(x, r, function(z) {
       z1 = sub(r, "\\1", z)
       z2 = sub(r, "\\2", z)
       js = z2 != ""
       z3 = paste0(z1, z2)
       i1 = !grepl("^data:.+;base64,.+", z3)
       z3[i1] = gen_tags(z3[i1], ifelse(js[i1], "js", "css"), embed[1], 
           embed[2])
       i2 = grepl(" (defer|async)(>| )", z) & js
       x2 <<- c(x2, z3[i2])
       z3[i2] = ""
       z3
   })
7: embed_resources(ret, options[["embed_resources"]])
6: xfun::in_dir(if (is_file(file, TRUE)) dirname(file) else ".", 
       embed_resources(ret, options[["embed_resources"]]))
5: mark(..., format = "html", template = template)
4: markdown::mark_html(...)
3: mark_html(out, output, ...)
2: (if (grepl("\\.[Rr]md$", file)) knit2html else if (grepl("\\.[Rr]rst$", 
       file)) knit2pandoc else knit)(file, encoding = encoding, 
       quiet = quiet, envir = globalenv(), ...)
1: tools::vignetteEngine("knitr::knitr")$weave(vignette.Rmd)
> sessionInfo()
R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/Phoenix
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.3.0   markdown_1.6     tools_4.3.0      knitr_1.42      
[5] xfun_0.39        commonmark_1.9.0 evaluate_0.20   
> 

Thanks again and hope this helps. Toby

yihui commented 1 year ago

This could be a bug in base R. Here is a minimal reproducible example:

x = '\U0001F60E abc'
m = gregexpr('a', x)
regmatches(x, m)

It should return 'a', but returns 'b' instead; m should start from 3 instead of 4. I can work around it by using perl = TRUE in gregexpr(). I'm not sure if it's worth reporting this problem to the R core team.
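For reference, a minimal sketch of the workaround mentioned above, matching with perl = TRUE. The variable names (hit, pos) are illustrative only, and this is not necessarily how the fix was implemented in the markdown package.

```r
# Illustrative sketch: names below are not taken from knitr or markdown.
x <- "\U0001F60E abc"   # one emoji, a space, then "abc"

# Use the PCRE engine (perl = TRUE) instead of the default regex engine.
m <- gregexpr("a", x, perl = TRUE)

hit <- regmatches(x, m)[[1]]   # the matched substring(s)
pos <- as.integer(m[[1]])      # match start, counted in characters

print(hit)
print(pos)
```

If the match offset is computed correctly, hit is the letter "a" and pos is the third character of x.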

yihui commented 1 year ago

Should be fixed in the markdown package now:

if (packageVersion('markdown') < '1.6.3')
  install.packages('markdown', repos = c('https://rstudio.r-universe.dev', 'https://cloud.r-project.org'))

tdhock commented 1 year ago

thanks!

cderv commented 1 year ago

My understanding is that for emoji, and multibyte characters more generally, useBytes = TRUE should be used.

x = '\U0001F60E abc'
m = gregexpr('a', x, useBytes = TRUE)
regmatches(x, m)
#> [[1]]
#> [1] "a"

I'm not sure it is a bug.

Anyway, using perl = TRUE is probably an OK workaround.
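To make the character versus byte distinction concrete, here is a small sketch, assuming the string is UTF-8 encoded (which it is when written with a \U escape). The variable names are illustrative only.

```r
# Illustrative sketch: compare character counts and byte counts for the
# same string, and see where a byte-wise match lands.
x <- "\U0001F60E abc"

n_chars <- nchar(x, type = "chars")   # number of characters (code points)
n_bytes <- nchar(x, type = "bytes")   # number of bytes in the UTF-8 encoding

# With useBytes = TRUE the match position is a byte offset, not a
# character offset.
m_bytes  <- gregexpr("a", x, useBytes = TRUE)
byte_pos <- as.integer(m_bytes[[1]])

cat("characters:", n_chars, "\n")
cat("bytes:     ", n_bytes, "\n")
cat("byte offset of 'a':", byte_pos, "\n")
```

The emoji occupies four bytes in UTF-8 but counts as one character, so the byte offset of "a" is larger than its character offset. regmatches() uses the useBytes attribute stored on the match data to extract the substring consistently.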

yihui commented 1 year ago

Thanks! My understanding is that an emoji character is not necessarily one character, which makes it tricky to deal with. I've had some pain with this some time ago in JavaScript.
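As a small illustration of that point (a sketch only; the flag emoji is an example I picked, not something from this issue), a single visible emoji can consist of more than one code point:

```r
# Illustrative sketch: a flag emoji is built from two "regional indicator"
# code points, so R counts it as two characters even though it is
# typically displayed as one symbol.
flag <- "\U0001F1FA\U0001F1F8"

code_points <- utf8ToInt(flag)              # integer code points
n_chars     <- nchar(flag, type = "chars")  # characters as counted by R
n_bytes     <- nchar(flag, type = "bytes")  # bytes in the UTF-8 encoding

print(sprintf("U+%X", code_points))
cat("characters:", n_chars, " bytes:", n_bytes, "\n")
```

So neither the byte count nor the code point count necessarily equals the number of symbols a reader sees.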

It seems that R core tends to discourage using useBytes = TRUE, so I often try to avoid it:

cderv commented 1 year ago

> It seems that R core tends to discourage using useBytes = TRUE, so I often try to avoid it:

Oh, good to know! Thank you for the blog post.
