rticulate / import

An Import Mechanism For R
https://import.rticulate.org
Other
222 stars 14 forks source link

support symbols in .into #53

Closed hutch3232 closed 2 years ago

hutch3232 commented 3 years ago

Say I have a script, functions.R:

foo_a <- function(){

}

Now I want to import the contents of that script, perhaps using a wrapper function, and to use a dynamic .into. The current from function doesn't like being supplied a symbol.

env_name <- "custom"
import::from(.from = "C:/Users/Public/functions.R", .into = env_name, .character_only = TRUE)
Error in into_expr[[1]] : object of type 'symbol' is not subsettable

I couldn't find a nice work-around to this. Is this possible to incorporate?

Thanks!

torfason commented 3 years ago

The import package can either import into named entries on the search path, or into arbitrary environments.

However, if what you desire is to attach to a dynamic named entry on the search path, you can do it by importing it into a new environment, and then attach it to a named entry on the search path yourself. To expand on your example:

# functions.R
foo_a <- function(){ "foo_a" }
foo_b <- function(){ "foo_b" }
foo_c <- function(){ "foo_c" }
# main.R

# Works, .into refers to a a named entry on the search path
import::from(.from = "functions.R", "foo_a", .into = "custom_const", .character_only = TRUE)
foo_a() 
  # [1] "foo_a"

# Does not work, named entries on search path cannot be dynamic
env_name <- "custom_dyn"
import::from(.from = "functions.R", "foo_b", .into = env_name, .character_only = TRUE)
  # Error in into_expr[[1]] : object of type 'symbol' is not subsettable
foo_b() 
  # Error in foo_b() : could not find function "foo_b"

# Also does not work, curly notation does not support named entries directly
import::from(.from = "functions.R", "foo_b", .into = {env_name}, .character_only = TRUE)
  # Error: into is not an environment, but {env} notation was used.
foo_b() 
  # Error in foo_b() : could not find function "foo_b"

# Works, because .into is an environment
env <- new.env()
import::from(.from = "functions.R", "foo_c", .into = {env}, .character_only = TRUE)
foo_c()     
  # Error in foo_c() : could not find function "foo_c"
env$foo_c() 
  # [1] "foo_c"
attach(env, name=env_name)
rm(env)  # Might want to clean up the env variable (does not remove the search path entry)
search() 
  # [1] ".GlobalEnv"  "custom_dyn"   "custom_const"      ...
foo_c()  
  # [1] "foo_c"

Would this work for your use case? If so, the fix would be to clarify the documentation for this, so fine to leave the issue open.

hutch3232 commented 3 years ago

Thank you for the help/examples! This will work for my use-case. Appreciate it.

torfason commented 2 years ago

Good that it helped. Given the complexities of changing the behavior of .into in a backwards compatible manner I don't think we'll do that, but see #57 for a related issue. It would be good if a fix for that documentation issue also addressed this question.

brshallo commented 2 years ago

I'm using this trick (see #57 for minimal context) in my package at brshallo/funspotr but attach() anywhere generates a NOTE (and initial rejection) from CRAN (even if within a call::r() process... cran-comments.md).

@torfason do you know of an alternative trick that doesn't depend on attach() or anything else that will trigger a problem in R CMD check ? (Assuming you still don't want to support symbols :-)

torfason commented 2 years ago

@brshallo Don't get me wrong, I would love to have .into behave in a more consistent manner :-)

I was just thinking about backwards compatibility and implementation complexities (as issues like #55 show, small things can be important when messing around with environments). But reviewing the initial issue again, the current behaviour is to error out, so changing the behaviour in that scenario will not break anything. So perhaps we can do this after all?

First step would be to be super clear about the handling of the four cases (curlies vs. not and .character_only=TRUE or not), and ensure that Future behaviour would not be a regression on Current behaviour without really good reason. We could map it out in a table:

Cases Current Future
NoCurly & NoCharacterOnly ? ?
Curly & NoCharacterOnly ? ?
NoCurly & CharacterOnly ? ?
Curly & CharacterOnly ? ?

Second step would be to write tests to clearly specify future behavior.

Third would be to implement it.

Fourth to document it.

So my current disclaimer is simply that I am unlikely to implement this myself. However, I would be happy to help if this is a big enough scratch for someone to take a stab at it. And with plans for a release near- to mid- future getting clearer, it would be fun to have this in rather than out of such a release.

brshallo commented 2 years ago

I would think think that specifying .character_only would solely enable something like the following:

import::from("dplyr", "select", .into = paste0("test_pkg"), .character_only = TRUE)

(Essentially just expand disabling non-standard evaluation to the .into parameter.)

Which would mirror how import currently handles .from and ... (which mirror how library() handles when character.only = TRUE).

NoCurly & NoCharacterOnly

Would ERROR:

import::from("dplyr", "select", .into = paste0("test_pkg"))

Curly & NoCharacterOnly

Would ERROR:

import::from("dplyr", "select", .into = {paste0("test_pkg")})

NoCurly & CharacterOnly

This though would work:

import::from("dplyr", "select", .into = paste0("test_pkg"), .character_only = TRUE)

(This is the example that I care about, or some alternative way I can do this in a way that avoids attach().)

Curly & CharacterOnly

Would ERROR

import::from("dplyr", "select", .into = {paste0("test_pkg")}, .character_only = TRUE)

HOWEVER, you could make a case that this or something similar should work:

test_pkg <- new.env()
import::from("dplyr", "select", .into = {paste0("test_pkg")}, .character_only = TRUE)

At the same time though, you could argue that { } is itself a non-standard evaluation operator so that by setting .character_only = TRUE non-standard evaluation of this operator should also be disabled... in which case this would also fail. In which case maybe you could argue that instead you should write this as (which would run fine):

test_pkg <- new.env()
import::from("dplyr", "select", .into = paste0("{test_pkg}"), .character_only = TRUE)

(I personally don't care about this environments case as much though and not supporting it in the interim doesn't seem to mess-up backwards compatibility.)

brshallo commented 2 years ago

@torfason would the fix be as simple as adding an additional if statement to from.R somewhere around here?

  # Extract the arguments
  symbols <- symbol_list(..., .character_only = .character_only, .all = .all)

  from    <-
    `if`(isTRUE(.character_only), .from, symbol_as_character(substitute(.from)))

  into_expr <- substitute(.into)
  `{env}` <- identical(into_expr[[1]], quote(`{`))

  # if {env} syntax is used, treat env as explicit env
  if (`{env}`) {
    into <- eval.parent(.into)
    if (!is.environment(into))
      stop("into is not an environment, but {env} notation was used.", call. = FALSE)
  } else {
    into    <- symbol_as_character(into_expr)
  }
torfason commented 2 years ago

Thanks for the outline of the cases!

On where to fix this, yes, I think that looks about right.

In general, I'm happy with flipping any case that is currently an error -> reasonable behavior and I'm also fine with a partial fix (not flipping every error into a reasonable behavior) especially in cases where reasonable behavior is not that clear.

My main worry was the following case (where test_pkg is not defined):

import::from("dplyr", "select", .into = test_pkg, .character_only = TRUE)

I thought this would import into the named environment test_pkg, but under your proposed function, this would have to error, because it would look for a character variable. But "luckily" this seems to result in an error anyway (regardless of the value of .character_only, which I had not expected).

I threw together a few tests that look at this, I'm just attaching the file here (text format because R is not allowed): test_into_character_only.R.txt

This passes as written for me when placed in the test_import directory and renamed to remove the .txt extension.

Incorporating a fix accompanied with tests that demonstrated that the only behavior that changed was behavior that currently results in an error could go into the project would in my opinion be much easier than I expected :-)

torfason commented 2 years ago

Hmm ...

Actually, I'm now wondering, if there is no NSE to speak of in the handling of the .into parameter (apart from checking {, is there any need for the handling to be dependent on either .character_only or {}?

Would it not just be possible to require .into to be a regular variable, and then check the type?

if (is.character(.into)) {
  # Handle as named environment, as is currently done for non-curly
} else if (is.environment(.into)) {
  # Handle as an environment variable, as is currently done for curly
} else {
  stop(".into must either be a string (denoting a named environment) an variable of type environment")
}

Feels to me that this would capture all currently legal uses of .into, and the case you desire, and allow us to remove the rules about curlies from the documentation. Three birds with one stone? Sounds too good to be true, and perhaps it is ...

torfason commented 2 years ago

@brshallo, I wonder if you have been looking at this? Even a partial implementation attempt would be quite helpful if I were to look at getting this into dev before the next release.

torfason commented 2 years ago

Did not mean to close, just to comment ...

hutch3232 commented 2 years ago

I wouldn't mind making an attempt towards resolving this. I will plan to start working on it next week unless I hear that I should hold off.

brshallo commented 2 years ago

Sorry got busy with other things and hadn't taken an action towards this yet. @hutch3232 I say go for it (presuming @torfason hasn't done anything here yet).

torfason commented 2 years ago

Nope, I had not done anything beyond posting my thoughts about checking the type of the variable and try to avoid all the NSE on .into. Would be awesome if @hutch3232 can take a stab at this, happy to work with you on this as well if you find an approach that is promising.

torfason commented 2 years ago

Suggested fix proposed by @hutch3232 in PR #66

torfason commented 2 years ago

Hi again everyone, but especially @brshallo and @hutch3232

In reviewing PR #66 submitted by @hutch3232 I kept thinking that the NSE was not really doing much for .import given that import::from(dplyr, filter, .into=dplyr_named_env) (named environments without quotes) didn't work. The tests that @hutch3232 added reinforced this, so I tried adding all the cases systematically and then implementing .into without any NSE. It seems to work pretty well, see branch prx/hutch3232/into-symbol.

This is just very quick and dirty (still a bunch of comments and commented out code in there, as well as multiple names for the same variable). But I wonder if the two of you, given that you have been thinking about this code and issue, could have a look at this branch and see (a) if the approach looks sane to you and (b) if it works for the cases that you have, and (c) check out the tests and see if there are any cases that are omitted.

If it all looks fine, I'd like to include this approach in the release because it would (a) remove the need for curlies and (b) remove all NSE handling for import, making it behave just like the other configuration parameters.

brshallo commented 2 years ago

@torfason seems to work. Did the following:

#### FIRST clone and switch to branch, i.e.:
# git clone https://github.com/rticulate/import.git
# git checkout prx/hutch3232/into-symbol

devtools::load_all()

pkg <- "dplyr"
fun <- "select"
pkg_nm <- paste0("explicitpackage:", pkg)

import::from(pkg, fun, .into = pkg_nm, .character_only = TRUE)
utils::find(fun)
#> [1] "explicitpackage:dplyr"

Also tried-out out prior approach with the curly's and verified that it also works (/is backwards compatible):

devtools::load_all()

pkg <- "dplyr"
fun <- "select"
env <- new.env()
env_nm <- paste0("explicitpackage:", pkg)

import::from(pkg, fun, .into = {env}, .character_only = TRUE)

attach(env, name = env_nm)

utils::find(fun)
#> [1] "explicitpackage:dplyr"

These are really the only use-cases I personally care about at the moment.


Reviewing other test cases and expected behavior

I did though go and check a couple of the other test-cases I mentioned above and found that some did not work quite as expected. E.g.

NoCurly & NoCharacterOnly

Would ERROR:

import::from("dplyr", "select", .into = paste0("test_pkg"))

This doesn't error under the referenced branch -- the reason I thought it should is because if you did e.g. paste0("__") as an argument in ... or .from it would error (so figured .into should mirror this behavior).

Comments

Hence, I don't know if keeping some of the NSE is needed so that .into mirrors behavior of ... and .from (except allowing curlys with .into for backwards compatibility). My understanding is that this is all so that import::from() mirrors how base::library() handles arguments.

As an aside though, I really only care about the examples I mentioned above. My comments here are just in trying to keep with the spirit of the package... (personally I'm not a big fan of how library() handles arguments, see tweet-thread -- but I get that it makes sense to mirror how library() does things).

torfason commented 2 years ago

Thanks for the tests and detailed report. It is good to see that both your focal use cases and the prior working cases that you tested now work. As for the one that you expected to fail that doesn't:

import::from("dplyr", "select", .into = paste0("test_pkg"))

I think the only reason why one would want the above to fail would be to have NSE in order for the following to work (with test_pkg as the actual name of the environment, not a variable containing the name :

import::from(dplyr, select, .into = test_pkg)

It has been supposed to work, but it doesn't work in the release branch, so no regression. Probably people find it easy enough to just add quotes.

And because of the complications involved with allowing unnamed environment to be passed to an NSE version of.into, I think it is totally worth abandoning NSE for this param.

I do agree with you on the drawbacks of NSE, even though I think .from and ... are a bit different from .into (many people could use import::from(dplyr, select) for a long time without ever having a need to specify .into). And this change makes that explicit. If this gets into the release (which is my plan), NSE is explicitly limited to .from and ..., all other params are standard evaluation only, which will simplify the code in from() a bit, and make it easier to possibly implement a import::config() function in the future, where default values for the standard-eval parameters could be set.

Note as an aside, that passing an unnamed environment, as in:

env <- new.env()
import::from(dplyr, select, .into = env)

Should now work, no .character_only needed, and no curlies. Curlies still work because they are "ignored" for regular variablese (no NSE needed for that).

I'm pulling #66 into a staging branch for a bit further cleanup of this, after which this will go into dev.

torfason commented 2 years ago

Done, this should now be fixed in dev.

torfason commented 2 years ago

All features/fixes targeted for v1.3.0 are now in dev and master has been forwarded to match dev, and the version bumped to correspond to that. All checks required for a CRAN release see to be passing, and the plan is to submit to CRAN after the weekend, unless any unknown issues are uncovered until then. In the meantime, it would be much appreciated if the people involved in the new features/fixes could check out this could install and try out the most recent version from GitHub, for example using:

pak::pak("rticulate/import")

(remotes::install_github("rticulate/import") or devtools::install_github("rticulate/import") should give the same result)

Thanks to everyone for their contributions!

torfason commented 2 years ago

Fixed in v1.3.0