rticulate / import

An Import Mechanism For R
https://import.rticulate.org
Other
222 stars 14 forks source link

Add ability to load packages from any library directory #56

Closed awong234 closed 11 months ago

awong234 commented 2 years ago

Assuming 'lib' is a directory that is not defined in the .libPaths(), and package foo exists in 'lib', and function bar exists in foo, I expect:

import::from(foo, bar, .library='lib')

To import function bar from foo. At the moment this causes an error, due to getNamespaceExports being called before the namespace of foo is made available. The solution appears to be a quick reordering of the operations, as loadNamespace comes right after that, and since that uses the .library parameter it can successfully load the namespace for getNamespaceExports to reference.

Open PR in #55

Workaround currently is to explicitly add lib to the .libPaths

torfason commented 2 years ago

I'd like to include this, but the checks failed, seemingly for an unrelated reason. Pushing a new commit seems like the only way to retrigger checks after some changes by GitHub. If we can get the checks to pass on all platforms I'm in favour of merging this into master with an eye to including in the next release.

torfason commented 2 years ago

Fix looks great, currently in https://github.com/torfason/import/tree/prx/awong234-getNamespaceExports pretty ready to merge into dev.

torfason commented 2 years ago

This has been merged into dev branch (https://github.com/torfason/import/tree/dev), and all tests are passing. Marking as fixed-in-dev

torfason commented 2 years ago

Dang, @awong234, I just found that this fix (c5b503b1) breaks devtools::check() locally, even though devtools::test() work fine. Reverting this, as I've done in a01c23a5 (and skipping the new tests of this fix) fixes the check() error (a01c23a5 passes, but the previous commit, 94294b68 fails). I have not checked on CRAN, and GitHub tests have been passing, but I don't feel good about going for a release that includes this change without understanding what is going on. The errors seem to start with:

── Failure (test_import.R:51:5): test_basetemplate.R  works ────────────────────
\[1\] "[...]import\.Rcheck/tests/test_import"
── Error \(\?\?\?\): Basic scenario works ───────────────────────────────────────────
Error: there is no package called ‘knitr’
Backtrace:
  1\. testthat::expect_silent\(import::from\(knitr, normal_print\)\)
  9\. import::from\(knitr, normal_print\)
 10\. base::tryCatch\(\.\.\.\)
 11\. base tryCatchList\(expr, classes, parentenv, handlers\)
 12\. base tryCatchOne\(expr, names, parentenv, handlers\[\[1L\]\]\)
 13\. value\[\[3L\]\]\(cond\)

Do you have any idea what might be going on?

awong234 commented 2 years ago

Hi Magnus. I took a look yesterday but I couldn't figure out what's wrong. I'll keep investigating to see why this is occurring, thanks!

torfason commented 2 years ago

Sounds good. One question, just out of curiosity, were you able reproduce to this but not figure out what's wrong, or were you unable to reproduce it?

awong234 commented 2 years ago

The former -- I ran devtools::check() and saw the same general pattern of errors you indicated, that knitr is unable to be found for whatever reason.

torfason commented 2 years ago

@awong234, an idea came to me, which was to change the default value of .library from .libPaths()[1L] to .libPaths(), as in:

from <- function(.from, ..., .into = "imports",
                 .library = .libPaths(), .directory=".",
                 .all=(length(.except) > 0), .except=character(),
                 .chdir = TRUE, .character_only = FALSE)
{

This seems to fix the issue, and it "feels right" to me to make this change coupled with the change of order of operations in your original pull request. However, this had already been discussed and decided against in 2015 (see #10), which makes me reluctant to pull it in with other changes last minute before release.

So what I propose is to do the release now based on what is in dev (plus optional .S3 handling that seems ready for inclusion), and then look at this change in isolation, with the aim of a release with only this addition. If it turns out to cause regressions in some scenarios, it would be an easier downgrade than facing a trade-off between a regression and the bugs fixed in 1.3.0.

awong234 commented 2 years ago

Hi Magnus,

I think I've narrowed the issue, after debugging devtools::check. But please do assess my understanding here. Sorry for the extremely detailed response, but I think that it is important for understanding how the devtool errors relate to this issue #56 as well as #10.

Debugging the error inside devtools::check

Stepping through devtools::check, we arrive ultimately to callr::rcmd_safe by way of

devtools::check()
devtools:::check_built()
rcmdcheck:::rcmdcheck()
rcmdcheck:::do_check()

When rcmdcheck:::do_check is called, it passes all of the .libPaths() through, but when it calls callr::rcmd_safe, it layers onto it the following:

libdir <- file.path(dirname(targz), paste0(package, ".Rcheck"))
# Temporary directory, in something like C:/Users/AlecW/AppData/Local/Temp/RtmpW2cqw8/import.Rcheck
 rcmd_safe("check", cmdargs = c(basename(targz), args),
    libpath = c(libdir, libpath), ...)

This is where the error occurs, and I believe it is indeed due to the default argument for .library within import::from; because the first library in the path is now that libdir variable. When it goes to test, import looks for knitr in that directory, and will fail because knitr is not in that directory (nor are any other packages).

Reproducing the error outside of devtools::check

The error can be reproduced simply outside of devtools::check by adding the following to test_from.R on commit c5b503b:

lib = tempdir()
.libPaths(c(lib, .libPaths()))

Running test_from.R will now fail to find knitr.

Rationale

Why then does this work on the prior commit? Let's try to simplify the code in question:

getNamespaceExports before loadNamespace; the default order

lib = tempdir()
.libPaths(c(lib, .libPaths()))
getNamespaceExports('data.table')
#>   [1] "hour"               "shift"              "is.data.table"
#>   [4] "dcast.data.table"   "%flike%"            "rleid"
#>   [7] "chgroup"            "setindexv"          "copy"
#>  [10] "wday"               "shouldPrint"        ".GRP"
#>  [13] ".__C__data.table"   "year"               "fsort"
#>  [16] "rbindlist"          "getDTthreads"       "frankv"
#>  [19] "fsetdiff"           ":="                 "second"
#>  [22] ".BY"                "month"              "setkey"
#>  [25] "frank"              "tstrsplit"          "rowid"
#>  [28] "%ilike%"            "as.xts.data.table"  "tables"
#>  [31] "melt"               "fwrite"             "setorder"
#>  [34] "key"                "like"               ".Last.updated"
#>  [37] "truelength"         "setnames"           "week"
#>  [40] "frollapply"         "setDF"              "setkeyv"
#>  [43] "setorderv"          ".I"                 ".NGRP"
#>  [46] "set"                "funion"             "as.ITime"
#>  [49] ".N"                 ".__T__$:base"       "inrange"
#>  [52] "setDTthreads"       "frollmean"          ".EACHI"
#>  [55] "cube"               "IDateTime"          "%like%"
#>  [58] "setDT"              "SJ"                 "melt.data.table"
#>  [61] "isoweek"            "uniqueN"            "chorder"
#>  [64] "chmatch"            "fsetequal"          "groupingsets"
#>  [67] "transpose"          "dcast"              ".__T__$<-:base"
#>  [70] "fifelse"            "fintersect"         "getNumericRounding"
#>  [73] "indices"            ".__T__[<-:base"     "timetaken"
#>  [76] "foverlaps"          ".__C__IDate"        "fread"
#>  [79] "fcase"              "yday"               "frollsum"
#>  [82] ".__C__ITime"        "setcolorder"        "last"
#>  [85] "rowidv"             "%inrange%"          "minute"
#>  [88] "setindex"           "rleidv"             "fcoalesce"
#>  [91] ".__T__[[<-:base"    "%chin%"             "data.table"
#>  [94] ".__T__[:base"       "%between%"          "setNumericRounding"
#>  [97] "setattr"            ".SD"                "as.data.table"
#> [100] "CJ"                 "haskey"             "test.data.table"
#> [103] "rollup"             "as.IDate"           "setalloccol"
#> [106] "merge.data.table"   "between"            "mday"
#> [109] "quarter"            "alloc.col"          "key<-"
#> [112] ".rbind.data.table"  "setnafill"          "update.dev.pkg"
#> [115] "address"            "nafill"             "first"
loadNamespace('data.table', lib.loc = lib)
#> <environment: namespace:data.table>

Created on 2022-05-16 by the reprex package (v2.0.1)

loadNamespace before getNamespaceExports; the revised order, and fails to load

lib = tempdir()
.libPaths(c(lib, .libPaths()))
loadNamespace('data.table', lib.loc = lib)
#> Error in loadNamespace("data.table", lib.loc = lib): there is no package called 'data.table'
getNamespaceExports('data.table')
#>   [1] "hour"               "shift"              "is.data.table"     
#>   [4] "dcast.data.table"   "%flike%"            "rleid"             
#>   [7] "chgroup"            "setindexv"          "copy"              
#>  [10] "wday"               "shouldPrint"        ".GRP"              
#>  [13] ".__C__data.table"   "year"               "fsort"             
#>  [16] "rbindlist"          "getDTthreads"       "frankv"            
#>  [19] "fsetdiff"           ":="                 "second"            
#>  [22] ".BY"                "month"              "setkey"            
#>  [25] "frank"              "tstrsplit"          "rowid"             
#>  [28] "%ilike%"            "as.xts.data.table"  "tables"            
#>  [31] "melt"               "fwrite"             "setorder"          
#>  [34] "key"                "like"               ".Last.updated"     
#>  [37] "truelength"         "setnames"           "week"              
#>  [40] "frollapply"         "setDF"              "setkeyv"           
#>  [43] "setorderv"          ".I"                 ".NGRP"             
#>  [46] "set"                "funion"             "as.ITime"          
#>  [49] ".N"                 ".__T__$:base"       "inrange"           
#>  [52] "setDTthreads"       "frollmean"          ".EACHI"            
#>  [55] "cube"               "IDateTime"          "%like%"            
#>  [58] "setDT"              "SJ"                 "melt.data.table"   
#>  [61] "isoweek"            "uniqueN"            "chorder"           
#>  [64] "chmatch"            "fsetequal"          "groupingsets"      
#>  [67] "transpose"          "dcast"              ".__T__$<-:base"    
#>  [70] "fifelse"            "fintersect"         "getNumericRounding"
#>  [73] "indices"            ".__T__[<-:base"     "timetaken"         
#>  [76] "foverlaps"          ".__C__IDate"        "fread"             
#>  [79] "fcase"              "yday"               "frollsum"          
#>  [82] ".__C__ITime"        "setcolorder"        "last"              
#>  [85] "rowidv"             "%inrange%"          "minute"            
#>  [88] "setindex"           "rleidv"             "fcoalesce"         
#>  [91] ".__T__[[<-:base"    "%chin%"             "data.table"        
#>  [94] ".__T__[:base"       "%between%"          "setNumericRounding"
#>  [97] "setattr"            ".SD"                "as.data.table"     
#> [100] "CJ"                 "haskey"             "test.data.table"   
#> [103] "rollup"             "as.IDate"           "setalloccol"       
#> [106] "merge.data.table"   "between"            "mday"              
#> [109] "quarter"            "alloc.col"          "key<-"             
#> [112] ".rbind.data.table"  "setnafill"          "update.dev.pkg"    
#> [115] "address"            "nafill"             "first"

Created on 2022-05-16 by the reprex package (v2.0.1)

(I am using data.table instead of knitr because reprex loads the knitr namespace!)

As I note in the error traceback in #55 getNamespaceExports actually calls loadNamespace, and forcibly with a NULL lib.loc argument. loadNamespace then uses find.package to locate the package, which searches through all .libPaths. From the help doc for find.package:

lib.loc a character vector describing the location of R library trees to search through, or NULL. The default value of NULL corresponds to checking the loaded namespace, then all libraries currently known in .libPaths().

But what it cannot do (unless the arbitrary lib is in the .libPath to begin with), is search in the arbitrary lib, because getNamespaceExports has no lib.loc argument; this is the basis for this issue and the fix in #55.

Solution

Based on all of this, I think that the default argument for .library in import::from should be .libPaths() instead of .libPaths[1] because when getNamespaceExports is run, it runs loadNamespace(pkg, lib.loc = NULL) which then searches through all the libpaths anyway. But, changing to .libPaths() would allow #55 to succeed.

Please let me know what you think! Thank you so much for your patience, especially with this comment!

torfason commented 2 years ago

No, thank you! For this incredible detective work! And I could not be less sorry that you documented it so well!

I think it is especially neat that you figured out how things manage to load using the old order, and that the loading is actually silently using .libPaths() anyway, regardless of the value of the .library parameter.

And I agree with your conclusion that the correct fix is to use the "revised order", that is loadNamespace() before getNamespaceExports(), and change the default value of the parameter to .libPaths().

I think the way forward is clear, but since people are waiting for specific functionality that 1.3.0 brings, I'll focus on getting that release out, and then once that is out for a few weeks I'll push a 1.4.0 release with this as the sole change, just in case someone was relying on the .libPaths()[1L] functionality in some way.

torfason commented 2 years ago

@awong234 I've added a PR (#70)to address these issues. The intention is just to do what you specify above, i.e. re-add the changes from #55 and then changing the default of .library.

I'd very much appreciate if you could review the PR and let me know if I'm missing anything.

torfason commented 2 years ago

This has now been fixed in main.

Installations from GitHub, using pak::pak("rticulate/import") or other methods will use this version, and getting some usage of this would be great, in order to minimize the chances of any issues on release to CRAN.

torfason commented 11 months ago

This large fix is now included in the released version on CRAN, specifially in the v1.3.1 release. So I'm closing this and hoping that we now have much more robust behavior w.r.t. these things. Thanks so much for the work put into this!