rocker-org / rocker-versioned

Run current & prior versions of R using docker
https://hub.docker.com/r/rocker/r-ver
GNU General Public License v2.0

"pandoc: test.html: hClose: hardware fault (I/O error)" when using rocker/verse on Windows #103

Closed vnijs closed 2 years ago

vnijs commented 6 years ago

When trying to create an HTML file from any Rmarkdown file using rocker/verse on Windows, I (and my students) get the error shown below.

I think this may be a Docker-on-Windows problem rather than an Rstudio or Rmarkdown problem but wanted to see if you have any suggestions about what may be going on and how to best report this to try and get it resolved. FYI This error does not happen on Mac or Linux, nor when trying to create a PDF or Word file, nor when no local drive is mounted. I don't recall having any issues with previous versions of Docker-on-Windows.

Docker version: Docker version 18.06.1-ce-win73

Docker command: docker run --rm -p 8787:8787 -e USER="rstudio" -e PASSWORD="something" -v C:/Users/$USERNAME:/home/rstudio rocker/verse

Error:

pandoc: test.html: hClose: hardware fault (I/O error)
Error: pandoc document conversion failed with error 1
Execution halted

cboettig commented 6 years ago

That's pretty interesting. Do you still get the error if you try without linking a local volume?

(also just a note that since it sounds like you're on a local system, you can now skip the password and bind localhost):

docker run --rm  -p 127.0.0.1:8787:8787   -e DISABLE_AUTH=true  rocker/verse

vnijs commented 6 years ago

Nope. This error does not happen on Mac or Linux, nor when trying to create a PDF or Word file, nor when no local drive is mounted.

Do you think this is likely to be a Docker issue or perhaps an Rmarkdown issue?

cboettig commented 6 years ago

Can you try with Docker toolbox on windows, instead of the CE edition? (I would suspect it would work in toolbox, since that's a virtualbox linux emulator, and thus this would appear to be some bug in docker windows virtualization in the CE edition, but hard to be sure...)

vnijs commented 6 years ago

All 70+ students in my class now have Docker CE installed and I'd rather not move to Docker Toolbox at this stage. Any suggestions on how I might report this to https://github.com/docker/for-win in a reproducible manner?

cboettig commented 6 years ago

@vnijs for sure, I hear you on that one. Just wanted to confirm that the error is isolated to the CE edition (I don't have a windows box for testing).

I think Docker folks will probably bounce it back to us as a pandoc error without some more info.
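
One way to gather that extra info might be to take R and RStudio out of the picture and call pandoc directly, once on the container's own filesystem and once on the mounted Windows directory. The following is only a sketch: `try_pandoc` and the `repro.md` file name are made up for illustration, and it is an assumption that plain pandoc with `--self-contained` triggers the same `hClose` error that rmarkdown does.

```shell
#!/bin/sh
# try_pandoc DIR: write a tiny markdown file into DIR and convert it to
# self-contained HTML there. Hypothetical helper, not part of rocker.
# Returns 0 on success, 1 if pandoc fails, 2 if the check cannot be run.
try_pandoc() {
  dir="$1"
  if ! command -v pandoc >/dev/null 2>&1; then
    echo "pandoc not found"
    return 2
  fi
  printf '# Test\n\nHello.\n' > "$dir/repro.md" || return 2
  # --self-contained is meant to mirror rmarkdown's "self_contained: yes"
  if (cd "$dir" && pandoc repro.md --standalone --self-contained \
        --metadata title=repro --output repro.html); then
    echo "OK: $dir"
  else
    echo "FAILED: $dir"
    return 1
  fi
}

# Usage inside the container:
#   try_pandoc /tmp            # container filesystem
#   try_pandoc /home/rstudio   # bind-mounted Windows directory
```

If the first call succeeds and the second fails, that would point at the mounted volume rather than at rmarkdown.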

vnijs commented 6 years ago

It does work with self_contained: no. However, I do need the HTML to be self contained :)

cboettig commented 6 years ago

@vnijs thanks for confirming. No worries, I'm not trying to change your workflow, I agree this must be a bug somewhere. I'm just trying to better pinpoint the source of the bug so we can get a productive bug report with the right details to the right source (i.e. problem may still ultimately be on pandoc or RMarkdown end and not technically a Docker Windows CE issue). Since I don't have a Windows box handy to reproduce the error I appreciate you testing these things out.

My current guess is that this has something to do with the use of a temporary directory. Seems to be related to this thread: https://github.com/rstudio/rmarkdown/issues/701, which appears to still be open, though some threads seem to suggest this is a regression in Pandoc 2.x instead.

@yihui might have some further insight on rmarkdown's behavior here.

vnijs commented 6 years ago

Thanks. I did try the GitHub version of Rmarkdown but that didn't help. That issue refers to PDF, but PDF works fine. It is just self-contained HTML that doesn't work. I guess it might be related to a "network" drive issue given how Docker-on-Windows maps local drives.

yihui commented 6 years ago

Sorry, but I don't have an idea (not a Windows user, and little experience with network drives)...

nuest commented 6 years ago

I can confirm the issue, testing on Windows with latest rocker/verse.

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

loaded via a namespace (and not attached):
 [1] compiler_3.5.1  backports_1.1.2 magrittr_1.5   
 [4] rprojroot_1.3-2 htmltools_0.3.6 tools_3.5.1    
 [7] yaml_2.2.0      Rcpp_0.12.19    stringi_1.2.4  
[10] rmarkdown_1.10  knitr_1.20      stringr_1.3.1  
[13] digest_0.6.17   evaluate_0.11

Also

rstudio@bb2ce41b0ba3:~$ pandoc --version
pandoc 1.19.2.1
Compiled with pandoc-types 1.17.0.4, texmath 0.9, skylighting 0.1.1.4
Default user data directory: /home/rstudio/.pandoc
...

The workaround with self_contained: no does not work for me. Then I get

/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS test.utf8.md --to html4 --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output test.html --smart --email-obfuscation none --standalone --section-divs --template /usr/local/lib/R/site-library/rmarkdown/rmd/h/default.html --no-highlight --variable highlightjs=1 --variable 'theme:bootstrap' --include-in-header /tmp/RtmpI10Jk8/rmarkdown-str1fe366e94db.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' 
pandoc: test.html: openFile: does not exist (No such file or directory)
Error: pandoc document conversion failed with error 1
Execution halted

which seems to be a permission issue - I cannot open the file from outside of the container either. But I have no experience with sharing directories on Windows... Starting the container with

docker run --rm -p 8787:8787 -e DISABLE_AUTH=true -v C:/Users/Daniel:/home/rstudio rocker/verse

the files I create as user rstudio (test2.Rmd) actually belong to root. Weird.

rstudio@4589192786d9:~$ ls -l /home/rstudio/Desktop/
total 1426345
-rwxr-xr-x 1 root root        282 Sep 21 07:39 desktop.ini
-rwxr-xr-x 1 root root       2063 Sep 18 08:32 Docker for Windows.lnk
drwxrwxrwx 2 root root          0 Oct  1 10:27 o2r
-rwxr-xr-x 1 root root        794 Oct  2 09:18 test2.Rmd
rstudio@4589192786d9:~$ whoami
rstudio

After Googling around a bit, it seems that volume mounts "work, but not completely".
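
The ownership mismatch shown above can be checked without reading `ls -l` output by hand. This is a rough sketch only: `check_owner` and the probe file name are hypothetical, and it simply creates a file in the given directory and asks `find` whether the owner differs from the current user.

```shell
#!/bin/sh
# check_owner DIR: create a probe file in DIR and test whether it is owned
# by the current user. Hypothetical helper for illustration.
# Returns 0 if the owner matches, 1 if it differs, 2 if DIR is not writable.
check_owner() {
  dir="$1"
  probe="$dir/.owner-probe.$$"
  touch "$probe" || return 2
  # find prints the path only when the owner is NOT the current uid
  mismatch="$(find "$probe" ! -user "$(id -u)")"
  rm -f "$probe"
  [ -z "$mismatch" ]
}

# Usage inside the container:
#   check_owner /tmp            && echo "owner matches in /tmp"
#   check_owner /home/rstudio   || echo "owner differs on the mounted directory"
```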

This is the mount configuration from docker inspect using the command above:

 "Mounts": [
            {
                "Type": "volume",
                "Name": "95c91fe52b8b26a348031be5270009ce89777d3d71284a7fac902ae14abd9f38",
                "Source": "/var/lib/docker/volumes/95c91fe52b8b26a348031be5270009ce89777d3d71284a7fac902ae14abd9f38/_data",
                "Destination": "/home/rstudio/kitematic",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            },
            {
                "Type": "bind",
                "Source": "/host_mnt/c/Users/Daniel",
                "Destination": "/home/rstudio",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],

So I tried mounting to the existing volume /home/rstudio/kitematic:

 docker run --rm -p 8787:8787 -e DISABLE_AUTH=true -v D:/:/home/rstudio/kitematic rocker/verse

and also found this and tried

docker run --rm -p 8787:8787 -e DISABLE_AUTH=true -v /host_mnt/d/:/home/rstudio/ rocker/verse

But no fix...

@vnijs Have you checked output of net share c? See https://github.com/docker/for-win/issues/25


So, final attempt: installing the latest pandoc, which requires root access inside the container:

 docker run --rm -p 8787:8787 -e DISABLE_AUTH=true -e ROOT=TRUE -v C:/Users/Daniel/:/home/rstudio/kitematic rocker/verse

In the RStudio terminal:

rstudio@026a8a79287e:/tmp$ wget https://github.com/jgm/pandoc/releases/download/2.3.1/pandoc-2.3.1-linux.tar.gz
[...]
rstudio@026a8a79287e:/tmp$ sudo tar xvzf pandoc-2.3.1-linux.tar.gz --strip-components=1 -C /usr/local
[...]
rstudio@026a8a79287e:/tmp$ pandoc --version
pandoc 2.3.1
Compiled with pandoc-types 1.17.5.1, texmath 0.11.1, skylighting 0.7.2
Default user data directory: /home/rstudio/.pandoc
[...]

It works!

@vnijs Maybe you can try that out? It's probably not a solution for your students, but might help understanding the problem.

@cboettig Does that bring us closer to a solution? One could use a custom Dockerfile and install the latest Pandoc.
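
For such a custom image, the manual steps above could be wrapped in a small script and called from a `RUN` step. This is a sketch under assumptions: `version_lt` and `install_pandoc` are made-up helper names, `sort -V` requires GNU coreutils, and the download URL pattern is only known to match the 2.3.1 release used above (other releases may name their tarballs differently).

```shell
#!/bin/sh
# version_lt A B: succeeds when version A sorts strictly before version B.
# Relies on GNU sort -V (version sort).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# install_pandoc VERSION: download the release tarball and unpack it into
# /usr/local (needs root). The URL pattern is an assumption based on 2.3.1.
install_pandoc() {
  v="$1"
  wget "https://github.com/jgm/pandoc/releases/download/$v/pandoc-$v-linux.tar.gz" &&
    tar xzf "pandoc-$v-linux.tar.gz" --strip-components=1 -C /usr/local
}

# Usage (as root, e.g. in a Dockerfile RUN step):
#   current="$(pandoc --version | head -n 1 | cut -d' ' -f2)"
#   version_lt "$current" 2.3.1 && install_pandoc 2.3.1
```

The version check keeps the step from reinstalling once the bundled pandoc catches up.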

Happy to test more if there's any ideas coming up, until beginning of next week when I'm back on Windows :-)

cboettig commented 6 years ago

@nuest that's great news, very nice work tracking that down.

@yihui Would you have any insight on when we might see pandoc 2.3.1 ship with RStudio?

Otherwise we could consider bundling 2.3.1 directly into the Docker image rather than sym-linking the version shipping with RStudio.

eddelbuettel commented 6 years ago

FWIW 1.19.* is still the last version in Debian/Ubuntu. Given that some converters now impose newer versions (cough, cough), I sometimes do

edd@rob:~$ ls -l bin/pandoc
lrwxrwxrwx 1 edd edd 34 Sep  4 08:42 bin/pandoc -> /usr/lib/rstudio/bin/pandoc/pandoc
edd@rob:~$

which is of course something we could easily do in the containers (maybe from /usr/local/bin).

cboettig commented 6 years ago

@eddelbuettel yup, we already link rstudio-server's pre-packaged pandoc rather than installing the debian binary:

https://github.com/rocker-org/rocker-versioned/blob/4b5000924c8e9861cac9a33ed16c366ecaa02641/rstudio/Dockerfile#L35

but looks like we need a newer one here.

eddelbuettel commented 6 years ago

Color me confused--are you using a jurassic RStudio Server? Asking for a friend who is a user of dailies where he can... ;-)

yihui commented 6 years ago

@cboettig The preview version of RStudio ships Pandoc 2.2.1: https://www.rstudio.com/products/rstudio/download/preview/ I haven't heard plans to further upgrade that to the very latest version of Pandoc, but if you have a compelling reason, you can certainly request the RStudio IDE team to consider a later version.

cboettig commented 6 years ago

Thanks @yihui ! That might be good enough to fix it -- maybe Daniel or Vincent can test that for us on Windows?

@eddelbuettel Looks like we ship with the latest RStudio release (as per VERSIONS.md) on rocker/verse:latest:

https://github.com/rocker-org/rocker-versioned/blob/4b5000924c8e9861cac9a33ed16c366ecaa02641/rstudio/Dockerfile#L29-L31

which appears to be rstudio-server 1.1.456 and still at pandoc 1.19.2.1

So, that would give us three courses of action:

  1. Wait until next RStudio release, which should bump the pandoc version and may resolve the issue.
  2. Patch the rstudio:latest Dockerfile to install pandoc directly from the GitHub binaries.
  3. Arguably we could also/instead just bump the devel tag such that rstudio:devel would install RStudio preview as well as R-devel.

Votes for these options?

vnijs commented 6 years ago

@nuest Thanks for the detailed investigation. I can confirm that installing pandoc 2.3.1 fixes the problem on my Windows 10 machine. I will also ask my students to test this out and report back if needed.

wget https://github.com/jgm/pandoc/releases/download/2.3.1/pandoc-2.3.1-linux.tar.gz
sudo tar xvzf pandoc-2.3.1-linux.tar.gz --strip-components=1 -C /usr/local

vnijs commented 5 years ago

FYI Rstudio preview now has pandoc 2.3.1 which addresses this issue. Close?

https://www.rstudio.com/products/rstudio/download/preview/