r-lib / pak

A fresh approach to package installation
https://pak.r-lib.org

Installing to site-library with a different user fails #638

Open AlexAxthelm opened 1 month ago

AlexAxthelm commented 1 month ago

Hi, loving the utility and flexibility of pak!

I've encountered what might be an edge-case issue, but if I can find a solution, it would really help.

I'm calling pak as part of my Docker build process. When I run pak multiple times as different (Unix) users (all members of the staff group, which has write permissions on site-library), the process fails with messages similar to those reported in #310.

#12 33.01 Error:
#12 33.01 ! error in pak subprocess
#12 33.01 Caused by error in `verify_extracted_package(filename, pkg_cache)`:
#12 33.01 !
#12 33.01 '/tmp/Rtmp9AYWUh/file3614634922/src/contrib/x86_64-pc-linux-gnu-ubuntu-22.04/4.3/praise_1.0.0.tar.gz'
#12 33.01 is not a valid R package, it is an empty archive.
#12 33.01 ---
#12 33.01 Backtrace:
#12 33.01 1. pak::pak("praise")
#12 33.01 2. pak::pkg_install(pkg, ...)
#12 33.01 3. pak:::remote(function(...) get("pkg_install_do_plan", asNamespace("pak"))(...), …
#12 33.01 4. err$throw(res$error)
#12 33.01 ---

Repro Steps

I've constructed a (mostly) minimal Dockerfile that demonstrates the issue. In it, I:

  1. set the CRAN repo to Posit Package manager (binaries)
  2. Install pak
  3. Install a package, using pak
  4. Add and set a demo user (adding them to the staff group)
  5. Prove that the new user has write permissions to site-library
  6. (Attempt to) install another package.

FROM docker.io/rocker/r-ver:4.3.1 AS base

# set frozen CRAN repo and RProfile.site
RUN echo "options(repos = c(CRAN = 'https://packagemanager.posit.co/cran/__linux__/jammy/2023-10-30'))" \
  > "${R_HOME}/etc/Rprofile.site"

# Install pak
RUN Rscript -e 'install.packages("pak", repos = sprintf("https://r-lib.github.io/p/pak/stable/%s/%s/%s", .Platform$pkgType, R.Version()$os, R.Version()$arch))'

RUN whoami

RUN Rscript -e 'pak::pak("base64enc")'

# Create and use non-root user
# -m creates a home directory,
# -G adds user to staff group allowing R package installation.
RUN useradd \
      -m \
      -G staff \
      demo-user
USER demo-user
WORKDIR /home/demo-user

# prove that we have permissions on site-library
RUN whoami \
      && touch /usr/local/lib/R/site-library/foo.txt \
      && rm /usr/local/lib/R/site-library/foo.txt

RUN Rscript -e "\
    pak::pak('praise'); \
    "

In this example, praise can be installed under any of the following conditions:

Things that do not get the second package to install:

Docker Build Logs

Build Log
```
$ docker build --no-cache . -t pak_demo
#0 building with "desktop-linux" instance using docker driver
#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s
#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 932B done
#2 DONE 0.0s
#3 [internal] load metadata for docker.io/rocker/r-ver:4.3.1
#3 DONE 0.0s
#4 [1/9] FROM docker.io/rocker/r-ver:4.3.1
#4 CACHED
#5 [2/9] RUN echo "options(repos = c(CRAN = 'https://packagemanager.posit.co/cran/__linux__/jammy/2023-10-30'))" > "/usr/local/lib/R/etc/Rprofile.site"
#5 DONE 0.1s
#6 [3/9] RUN Rscript -e 'install.packages("pak", repos = sprintf("https://r-lib.github.io/p/pak/stable/%s/%s/%s", .Platform$pkgType, R.Version()$os, R.Version()$arch))'
#6 1.175 Installing package into ‘/usr/local/lib/R/site-library’
#6 1.175 (as ‘lib’ is unspecified)
#6 1.807 trying URL 'https://r-lib.github.io/p/pak/stable/source/linux-gnu/x86_64/src/contrib/../../../../../linux/x86_64/pak_0.7.2_R-4-3_x86_64-linux.tar.gz'
#6 2.270 Content type 'application/gzip' length 8038050 bytes (7.7 MB)
#6 2.377 ==================================================
#6 2.861 downloaded 7.7 MB
#6 2.861
#6 6.852 * installing *binary* package ‘pak’ ...
#6 6.924 * DONE (pak)
#6 6.982
#6 6.982 The downloaded source packages are in
#6 6.982 ‘/tmp/RtmpHREPme/downloaded_packages’
#6 DONE 7.1s
#7 [4/9] RUN whoami
#7 0.285 root
#7 DONE 0.3s
#8 [5/9] RUN Rscript -e 'pak::pak("base64enc")'
#8 5.155
#8 9.011 ✔ Updated metadata database: 3.59 MB in 9 files.
#8 9.014
#8 9.018 ℹ Updating metadata database
#8 25.54 ✔ Updating metadata database ... done
#8 25.55
#8 26.09
#8 26.12 → Will install 1 package.
#8 26.19 → Will download 1 package with unknown size.
#8 26.20 + base64enc 0.1-3 [dl]
#8 26.21
#8 26.71 ℹ Getting 1 pkg with unknown size
#8 27.80 ✔ Got base64enc 0.1-3 (x86_64-pc-linux-gnu-ubuntu-22.04) (26.09 kB)
#8 28.25 ✔ Installed base64enc 0.1-3 (155ms)
#8 28.30 ✔ 1 pkg: added 1, dld 1 (26.09 kB) [27s]
#8 DONE 28.5s
#9 [6/9] RUN useradd -m -G staff demo-user
#9 DONE 0.3s
#10 [7/9] WORKDIR /home/demo-user
#10 DONE 0.0s
#11 [8/9] RUN whoami && touch /usr/local/lib/R/site-library/foo.txt && rm /usr/local/lib/R/site-library/foo.txt
#11 0.199 demo-user
#11 DONE 0.2s
#12 [9/9] RUN Rscript -e " pak::pak('praise'); "
#12 5.353
#12 9.093 ✔ Updated metadata database: 3.59 MB in 9 files.
#12 9.097
#12 9.100 ℹ Updating metadata database
#12 25.54 ✔ Updating metadata database ... done
#12 25.54
#12 26.21
#12 26.24 → Will install 1 package.
#12 26.30 → Will download 1 package with unknown size.
#12 26.32 + praise 1.0.0 [dl]
#12 26.33
#12 26.86 ℹ Getting 1 pkg with unknown size
#12 28.19 ✔ Got praise 1.0.0 (x86_64-pc-linux-gnu-ubuntu-22.04) (16.15 kB)
#12 33.01 Error:
#12 33.01 ! error in pak subprocess
#12 33.01 Caused by error in `verify_extracted_package(filename, pkg_cache)`:
#12 33.01 !
#12 33.01 '/tmp/Rtmp9AYWUh/file3614634922/src/contrib/x86_64-pc-linux-gnu-ubuntu-22.04/4.3/praise_1.0.0.tar.gz'
#12 33.01 is not a valid R package, it is an empty archive.
#12 33.01 ---
#12 33.01 Backtrace:
#12 33.01 1. pak::pak("praise")
#12 33.01 2. pak::pkg_install(pkg, ...)
#12 33.01 3. pak:::remote(function(...) get("pkg_install_do_plan", asNamespace("pak"))(...), …
#12 33.01 4. err$throw(res$error)
#12 33.01 ---
#12 33.01 Subprocess backtrace:
#12 33.01 1. base::withCallingHandlers(cli_message = function(msg) { …
#12 33.01 2. get("pkg_install_do_plan", asNamespace("pak"))(...)
#12 33.01 3. proposal$install()
#12 33.01 4. pkgdepends::install_package_plan(plan, lib = private$library, num_workers = nw, …
#12 33.01 5. base::withCallingHandlers({ …
#12 33.01 6. pkgdepends:::handle_events(state, events)
#12 33.01 7. pkgdepends:::handle_event(state, i)
#12 33.01 8. proc$get_result()
#12 33.01 9. processx:::process_get_result(self, private)
#12 33.01 10. private$post_process()
#12 33.01 11. pkgdepends:::install_extracted_binary(filename, lib_cache, pkg_cache, lib, …
#12 33.01 12. pkgdepends:::verify_extracted_package(filename, pkg_cache)
#12 33.01 13. base::throw(pkg_error("{.path {filename}} is not a valid R package, it is an emp…
#12 33.01 14. | base::signalCondition(cond)
#12 33.01 15. global (function (e) …
#12 33.01 Execution halted
#12 ERROR: process "/bin/sh -c Rscript -e \" pak::pak('praise'); \"" did not complete successfully: exit code: 1
------
 > [9/9] RUN Rscript -e " pak::pak('praise'); ":
33.01 7. pkgdepends:::handle_event(state, i)
33.01 8. proc$get_result()
33.01 9. processx:::process_get_result(self, private)
33.01 10. private$post_process()
33.01 11. pkgdepends:::install_extracted_binary(filename, lib_cache, pkg_cache, lib, …
33.01 12. pkgdepends:::verify_extracted_package(filename, pkg_cache)
33.01 13. base::throw(pkg_error("{.path {filename}} is not a valid R package, it is an emp…
33.01 14. | base::signalCondition(cond)
33.01 15. global (function (e) …
33.01 Execution halted
------
Dockerfile:29
--------------------
  28 |
  29 | >>> RUN Rscript -e "\
  30 | >>> pak::pak('praise'); \
  31 | >>> "
  32 |
--------------------
ERROR: failed to solve: process "/bin/sh -c Rscript -e \" pak::pak('praise'); \"" did not complete successfully: exit code: 1
```

gaborcsardi commented 1 month ago

This does not seem like a pak issue to me; rather, something is wrong with the permissions in this container. If I build it without the last RUN and then run it as a non-root user, I get:

❯ docker run -ti -u demo-user sha256:10d5ac983f33d56eca94f5c051eb91366c9c5da3182ceff8fffcb28fe723721e bash
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
demo-user@7df0610dde0a:~$ R
Fatal error: cannot create 'R_TempDir'

In fact even this fails:

FROM rstudio/r-base:4.3.1-jammy AS base

RUN useradd -m -G staff demo-user
USER demo-user
WORKDIR /home/demo-user

RUN R -q -e 'getRversion()'

 => ERROR [4/4] RUN R -q -e 'getRversion()'    0.3s
------
 > [4/4] RUN R -q -e 'getRversion()':
0.246 Fatal error: cannot create 'R_TempDir'
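
For context, R's documentation for tempdir() says the per-session directory is created under the first of TMPDIR, TMP and TEMP that points to a writable directory, falling back to /tmp. A rough shell sketch for checking that as the non-root user could look like this. The function names `pick_tmp_base` and `can_create_tmpdir` and the `Rtmp-probe` prefix are made up for illustration, and the script only approximates R's lookup; it is not what R itself runs.

```shell
# Sketch only: approximate R's choice of base directory for its session temp dir.
# Returns the first of $TMPDIR, $TMP, $TEMP that is a writable directory, else /tmp.
pick_tmp_base() {
  for candidate in "${TMPDIR:-}" "${TMP:-}" "${TEMP:-}"; do
    if [ -n "$candidate" ] && [ -d "$candidate" ] && [ -w "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  printf '%s\n' /tmp
}

# Try to create (and then remove) a directory under the given base as the current user.
# "Rtmp-probe" is a hypothetical prefix, chosen only to resemble R's RtmpXXXXXX names.
can_create_tmpdir() {
  probe="$(mktemp -d "$1/Rtmp-probe.XXXXXX" 2>/dev/null)" || return 1
  rmdir "$probe"
}

base="$(pick_tmp_base)"
if can_create_tmpdir "$base"; then
  echo "$(id -un) can create a directory under $base"
else
  echo "$(id -un) cannot create a directory under $base"
  ls -ld "$base"   # show owner, group and mode of the directory that failed
fi
```

Running this in a `RUN` step placed after `USER demo-user` would show whether the temp directory itself is the part that is not writable.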
AlexAxthelm commented 1 month ago

Interesting: when I try to build your Dockerfile, it builds cleanly. I'm on an M2 MacBook Air, but with DOCKER_DEFAULT_PLATFORM set to linux/amd64 (I get the same successful build with it unset, though).

Normally I would probably call up debugonce() on the functions involved, but the calls to processx for parallelization are bypassing that. Is there a good way to get "into" the calls to pkgdepends::install_extracted_binary()?

My current working theory is that the permissions on site-library are correct-ish. I do note, though, that in this container the site-library directory has an s instead of an x in the group permissions (the setgid bit), which seems to interact with compressed files, but I don't know whether that would affect the packages in the pkgcache cache.

demo-user@37b266a7ff96:~$ ls -la $R_HOME
total 80
drwxr-xr-x  1 root root   4096 Apr 24 12:00 .
drwxr-xr-x  1 root root   4096 Apr 24 12:00 ..
drwxr-xr-x  3 root root   4096 Apr 24 12:00 bin
-rw-r--r--  1 root root  18011 Apr 24 12:00 COPYING
drwxr-xr-x  4 root root   4096 Apr 24 12:00 doc
drwxr-xr-x  1 root root   4096 Apr 24 12:00 etc
drwxr-xr-x  3 root root   4096 Apr 24 12:00 include
drwxr-xr-x  2 root root   4096 Apr 24 12:00 lib
drwxr-xr-x 32 root root   4096 Apr 24 12:00 library
drwxr-xr-x  2 root root   4096 Apr 24 12:00 modules
drwxr-xr-x 11 root root   4096 Apr 24 12:00 share
drwxrwsr-x  1 root staff  4096 May 29 10:56 site-library
-rw-r--r--  1 root root     46 Apr 24 12:00 SVN-REVISION
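
To make the `s` in `drwxrwsr-x` concrete: a lowercase `s` in the group position means the setgid bit and group execute are both set, while an uppercase `S` means setgid without group execute. A small sketch on throwaway directories (not the real site-library), assuming GNU coreutils `stat`; the helper name `mode_of` and the directory names are made up for illustration:

```shell
# Sketch only: print the symbolic mode of a path (GNU coreutils stat assumed).
mode_of() {
  stat -c '%A' "$1"
}

# Throwaway directories, so nothing under $R_HOME is touched.
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/with_x" "$demo_dir/without_x"

chmod 2775 "$demo_dir/with_x"      # setgid + group execute, like site-library above
chmod 2765 "$demo_dir/without_x"   # setgid, but no group execute

mode_of "$demo_dir/with_x"         # group triplet ends in lowercase s
mode_of "$demo_dir/without_x"      # group triplet ends in uppercase S
```

With the lowercase `s`, group members can still enter the directory, and new files created there inherit the directory's group.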
AlexAxthelm commented 1 month ago

After digging a bit more, I don't think the s permissions flag on site-library is the problem (setting permissions to 777 results in the same behavior).

I suspect that the tar.gz isn't unpacking properly, but the literal file shown in the error seems to have the correct permissions and can be unpacked. However, what is actually being checked isn't the file named in the error but the contents of pkg_cache (which I don't have a good way to inspect).
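
For reference, this is roughly how the tarball itself can be checked for emptiness from the shell. It is only a sketch on throwaway archives with made-up names (`praise-demo`, `count_tar_entries`), it assumes a tar that accepts `-T /dev/null` (GNU tar does), and it says nothing about what pkgdepends checks inside pkg_cache.

```shell
# Sketch only: count the members of a gzip-compressed tarball.
# Prints 0 for an archive with no members.
count_tar_entries() {
  tar -tzf "$1" | wc -l | tr -d ' '
}

# Build two throwaway archives to compare against; names are hypothetical.
work="$(mktemp -d)"
mkdir "$work/praise-demo"
echo "Package: demo" > "$work/praise-demo/DESCRIPTION"
tar -czf "$work/nonempty.tar.gz" -C "$work" praise-demo
tar -czf "$work/empty.tar.gz" -T /dev/null   # valid archive with no members

count_tar_entries "$work/nonempty.tar.gz"
count_tar_entries "$work/empty.tar.gz"
```

Pointing `count_tar_entries` at the path from the error message, as the same user that runs pak, would show whether the downloaded file really has no members.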

@gaborcsardi is the pkg_cache referenced there the same as pkg.package_cache_dir that I can see in pak::cache_summary()?

11. pkgdepends:::install_extracted_binary(filename, lib_cache, pkg_cache, lib, …
12. pkgdepends:::verify_extracted_package(filename, pkg_cache)
13. base::throw(pkg_error("{.path {filename}} is not a valid R package, it is an empty archive.", …
14. | base::signalCondition(cond)
15. global (function (e) …

https://github.com/r-lib/pkgdepends/blob/1ecfde9b31d84719ed331f704ff6279a8a780689/R/install-verify-binary.R#L2-L11