r-lib / pak

A fresh approach to package installation
https://pak.r-lib.org
643 stars 56 forks source link

Cannot untar downloaded local package #516

Closed pawelru closed 1 year ago

pawelru commented 1 year ago

(almost) reproducible example - please change the path to any local package sources

ref <- "pak=local::/Users/ruckip/Documents/repo/gh/r-lib/pak"
x <- pak::pkg_download(ref)
#> ℹ No downloads are needed, 1 pkg is cached
#> ✔ Got pak 0.5.1.9000 (source) (96 B)
x$fulltarget_tree
#> [1] "./src/contrib/pak_0.5.1.9000.tar.gz-t"
file.exists(x$fulltarget_tree)
#> [1] TRUE
untar(x$fulltarget_tree, list = TRUE)
#> Warning in system(cmd, intern = TRUE): running command '/usr/bin/tar -tf
#> './src/contrib/pak_0.5.1.9000.tar.gz-t'' had status 1
#> character(0)
#> attr(,"status")
#> [1] 1
sessionInfo()
#> R version 4.2.1 (2022-06-23)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur ... 10.16
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] knitr_1.43        magrittr_2.0.3    R.cache_0.16.0    R6_2.5.1         
#>  [5] rlang_1.1.1       fastmap_1.1.1     fansi_1.0.4       styler_1.10.1    
#>  [9] tools_4.2.1       pak_0.5.1         xfun_0.39         R.oo_1.25.0      
#> [13] utf8_1.2.3        cli_3.6.1         withr_2.5.0       htmltools_0.5.5  
#> [17] rprojroot_2.0.3   yaml_2.3.7        digest_0.6.31     lifecycle_1.0.3  
#> [21] processx_3.8.1    purrr_1.0.1       callr_3.7.3       vctrs_0.6.3      
#> [25] R.utils_2.12.2    fs_1.6.2          ps_1.7.5          glue_1.6.2       
#> [29] evaluate_0.21     rmarkdown_2.22    reprex_2.0.2      pillar_1.9.0     
#> [33] compiler_4.2.1    desc_1.4.2        R.methodsS3_1.8.2

Created on 2023-06-27 with reprex v2.0.2

One can ask why I want to download a package. The use case I am facing is that I am going through the vector of refs and I would like to apply the same function regardless of the ref type (GH, standard, local, etc.). At the very end I would like to get .tar.gz file for each of those refs. In https://github.com/r-lib/pak/issues/434 it is suggested to download, untar and build it.

gaborcsardi commented 1 year ago

fulltarget-tree is not a file, it is already the directory of the untarred package.

pawelru commented 1 year ago

yes - that's right I am able to run pkgbuild::build(file.path(x$fulltarget_tree, x$package)) succesfully. But now I am a little confused. For local type of ref, fulltarget-tree is a directory of the untarred package whereas for github type it is already tarred hence I need to execute untar() - is that right?

ref_local <- "pak=local::/Users/ruckip/Documents/repo/gh/r-lib/pak"
x_local <- pak::pkg_download(ref_local)
#> ℹ No downloads are needed, 1 pkg is cached
#> ✔ Got pak 0.5.1.9000 (source) (96 B)
x_local$fulltarget_tree
#> [1] "./src/contrib/pak_0.5.1.9000.tar.gz-t"
file.exists(x_local$fulltarget_tree)
#> [1] TRUE
untar(x_local$fulltarget_tree, list = TRUE) # error
#> Warning in system(cmd, intern = TRUE): running command '/usr/bin/tar -tf
#> './src/contrib/pak_0.5.1.9000.tar.gz-t'' had status 1
#> character(0)
#> attr(,"status")
#> [1] 1
# as suggested
file.exists(file.path(x_local$fulltarget_tree, x_local$package))
#> [1] TRUE
pkgbuild::build(file.path(x_local$fulltarget_tree, x_local$package))
#> ── R CMD build ─────────────────────────────────────────────────────────────────
#> * checking for file ‘/private/var/folders/m1/hrz0h_ls7gz57rc80tnj41t80000gp/T/RtmpplYbvn/reprex-1441d4bcb9230-smoky-moose/src/contrib/pak_0.5.1.9000.tar.gz-t/pak/DESCRIPTION’ ... OK
#> * preparing ‘pak’:
#> * checking DESCRIPTION meta-information ... OK
#> * running ‘cleanup’
#> * checking for LF line-endings in source and make files and shell scripts
#> * checking for empty or unneeded directories
#> * building ‘pak_0.5.1.9000.tar.gz’
#> [1] "/private/var/folders/m1/hrz0h_ls7gz57rc80tnj41t80000gp/T/RtmpplYbvn/reprex-1441d4bcb9230-smoky-moose/src/contrib/pak_0.5.1.9000.tar.gz-t/pak_0.5.1.9000.tar.gz"

ref_gh <- "r-lib/pak"
x_gh <- pak::pkg_download(ref_gh)
#> ℹ No downloads are needed, 1 pkg is cached
x_gh$fulltarget_tree
#> [1] "./src/contrib/pak_0.5.1.9000_cde6d23.tar.gz-t"
file.exists(x_gh$fulltarget_tree)
#> [1] TRUE
# as suggested -> error
file.exists(file.path(x_gh$fulltarget_tree, x_gh$package))
#> [1] FALSE
pkgbuild::build(file.path(x_gh$fulltarget_tree, x_gh$package))
#> Error: `path` must exist
head(untar(x_gh$fulltarget_tree, list = TRUE)) # however - this is working
#> [1] "r-lib-pak-cde6d23/"                                  
#> [2] "r-lib-pak-cde6d23/.Rbuildignore"                     
#> [3] "r-lib-pak-cde6d23/.github/"                          
#> [4] "r-lib-pak-cde6d23/.github/.gitignore"                
#> [5] "r-lib-pak-cde6d23/.github/workflows/"                
#> [6] "r-lib-pak-cde6d23/.github/workflows/R-CMD-check.yaml"

Created on 2023-06-27 with reprex v2.0.2

Is there any way to handle different type of refs to obtain .tar.gz file? Or I need to differentiate it on my own?

gaborcsardi commented 1 year ago
?pak::pkg_download

‘fulltarget’, if it exists, contains a packaged (via R CMD build) source R package. If ‘fulltarget_tree’ exists, it is a package tree directory, that still needs an R CMD build call.

pawelru commented 1 year ago

Yeah I saw that - it's just a matter of whether or not this needs to be untarred or not before R CMD build call. The aforementioned documentation said nothing about that. The only info I got is from issue discussion indicating that you have to always run untar() but that's not necessarily true. From my observation it seems that for GH reference we have to untar() and for local - no. Is there any chance to make it more consistent?

Putting this into code:

foo_untar <- function(x) {
    stopifnot(file.exists(x$fulltarget_tree))
    untarred_dir <- tempfile()
    dir.create(untarred_dir)
    untar(x$fulltarget_tree, exdir = untarred_dir)
    sources_dir <- list.dirs(untarred_dir, recursive = FALSE)[1]
    target_temp_file <- tempfile(fileext = ".tar.gz")
    pkgbuild::build(sources_dir, dest_path = target_temp_file, binary = FALSE, vignettes = FALSE)
    unlink(sources_dir, recursive = TRUE)
    target_temp_file
}

foo_no_untar <- function(x) {
    stopifnot(file.exists(x$fulltarget_tree))
    target_temp_file <- tempfile(fileext = ".tar.gz")
    pkgbuild::build(file.path(x$fulltarget_tree, x$package), dest_path = target_temp_file, binary = FALSE, vignettes = FALSE)
    target_temp_file
}

ref_local <- "pak=local::/Users/ruckip/Documents/repo/gh/r-lib/pak"
x_local <- pak::pkg_download(ref_local)
#> ℹ No downloads are needed, 1 pkg is cached
#> ✔ Got pak 0.5.1.9000 (source) (96 B)
foo_untar(x_local) # error
#> Warning in untar(x$fulltarget_tree, exdir = untarred_dir): '/usr/bin/tar -xf
#> './src/contrib/pak_0.5.1.9000.tar.gz-t' -C
#> '/var/folders/m1/hrz0h_ls7gz57rc80tnj41t80000gp/T//RtmpkG77AB/file1452812d39d2a''
#> returned error code 1
#> Error: `path` must exist
foo_no_untar(x_local) # OK
#> ── R CMD build ─────────────────────────────────────────────────────────────────
#> * checking for file ‘/private/var/folders/m1/hrz0h_ls7gz57rc80tnj41t80000gp/T/RtmpplYbvn/reprex-1441d349100f6-agile-pike/src/contrib/pak_0.5.1.9000.tar.gz-t/pak/DESCRIPTION’ ... OK
#> * preparing ‘pak’:
#> * checking DESCRIPTION meta-information ... OK
#> * running ‘cleanup’
#> * checking for LF line-endings in source and make files and shell scripts
#> * checking for empty or unneeded directories
#> * building ‘pak_0.5.1.9000.tar.gz’
#> [1] "/var/folders/m1/hrz0h_ls7gz57rc80tnj41t80000gp/T//RtmpkG77AB/file145282e686ea5.tar.gz"

ref_gh <- "r-lib/pak"
x_gh <- pak::pkg_download(ref_gh)
#> ℹ No downloads are needed, 1 pkg is cached
foo_untar(x_gh) # OK
#> ── R CMD build ─────────────────────────────────────────────────────────────────
#> * checking for file ‘/private/var/folders/m1/hrz0h_ls7gz57rc80tnj41t80000gp/T/RtmpkG77AB/file1452819b23948/r-lib-pak-cde6d23/DESCRIPTION’ ... OK
#> * preparing ‘pak’:
#> * checking DESCRIPTION meta-information ... OK
#> * running ‘cleanup’
#> * checking for LF line-endings in source and make files and shell scripts
#> * checking for empty or unneeded directories
#> * building ‘pak_0.5.1.9000.tar.gz’
#> [1] "/var/folders/m1/hrz0h_ls7gz57rc80tnj41t80000gp/T//RtmpkG77AB/file145283ceb326.tar.gz"
foo_no_untar(x_gh) # error
#> Error: `path` must exist

sessionInfo()
#> R version 4.2.1 (2022-06-23)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur ... 10.16
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] knitr_1.43        magrittr_2.0.3    R.cache_0.16.0    R6_2.5.1         
#>  [5] rlang_1.1.1       fastmap_1.1.1     fansi_1.0.4       styler_1.10.1    
#>  [9] tools_4.2.1       pak_0.5.1         pkgbuild_1.4.1    xfun_0.39        
#> [13] R.oo_1.25.0       utf8_1.2.3        cli_3.6.1         withr_2.5.0      
#> [17] htmltools_0.5.5   rprojroot_2.0.3   yaml_2.3.7        digest_0.6.31    
#> [21] lifecycle_1.0.3   crayon_1.5.2      processx_3.8.1    purrr_1.0.1      
#> [25] callr_3.7.3       vctrs_0.6.3       R.utils_2.12.2    fs_1.6.2         
#> [29] ps_1.7.5          glue_1.6.2        evaluate_0.21     rmarkdown_2.22   
#> [33] reprex_2.0.2      pillar_1.9.0      compiler_4.2.1    prettyunits_1.1.1
#> [37] desc_1.4.2        R.methodsS3_1.8.2

Created on 2023-06-27 with reprex v2.0.2

gaborcsardi commented 1 year ago

You mean fulltarget_tree might be a file? I forgot that, but you can call file.info(...)[["isdir"]] to see if it is a directory or not.

It would not make much sense to tar up a directory, if you probably untar it as a next step. OTOH for remote packages, we download a file, which we do not untar, in case you need a file.

pawelru commented 1 year ago

Thanks - that's a good hint. So eventually the algorithm to get targz file should look like below. Can you please confirm whether is it correct? (also: Is that intended?)

(pseudocode)

x <- pak::pkg_download(ref)
if (file.exists(x$fulltarget)) {
    # do nothing - it's already build as per docs
} else if (file.exists(x$fulltarget_tree)) {
    if (file.info(x$fulltarget_tree)$isdir) {
        build(file.path(x$fulltarget_tree, "package"))
    } else {
        untarred_dir <- untar(x$fulltarget_tree)
        build(untarred_dir)
    }
}

It would not make much sense to tar up a directory, if you probably untar it as a next step. OTOH for remote packages, we download a file, which we do not untar, in case you need a file.

I am afraid that I am not following. Can you please elaborate more? It would be actually great to avoid tar-untar but the tar action is done on pak side and to the best of my knowledge I have no way to avoid that. Is there any option / argument to pak::pkg_download()?

gaborcsardi commented 1 year ago

A locaL::path package tree is available as a directory, so we do not tar it up, as we would need to untar it anyway.

A github::user/repo package is downloaded from GH as a tar.gz file, so we do not untar it, as the pkg_download() user possibly wants it as a file.

It is not great UI, convenience is sacrificed for speed.

pawelru commented 1 year ago

OK got it - many thanks for explanation and guidance. Much appreciated.

That's exactly my point here. Already built package (targz file) has its separate field ("fulltarget") in the returned object out of pak::pkg_download but that was not the case for source-tree ("fulltarget_tree") where it's on user to identify whether or not to untar it.

I can definitely handle this on my own. I think we provide enough information for anyone else looking at this problem again in the future. My personal opinion is that some docs enhancement will be great to have - especially about the (conditional) need to untar() and possibly how to identify it. I am leaving this for your assessment.

I am going to close it. Thanks again for the help.