klmr / box

Write reusable, composable and modular R code
https://klmr.me/box/
MIT License
829 stars 47 forks source link

Nested imports don't work when set as `R6` class method #348

Closed dereckmezquita closed 7 months ago

dereckmezquita commented 8 months ago

Error description

I am using box and R6 classes extensively, I wanted to break down some of my logic into modules.

I have this structure:

Where index calls something from module.R, and this module in turn calls something from module2.R.

This works fine when I use the functions/objects directly in index.R. As I said I wanted to break my R6 class methods apart for modularisation and I found that if I set the function/object in my R6 class as a method I get the error: Error in pet_function("dog") : could not find function "pet_function".

minimal example

index.R

box::use(R6)
box::use(./module)

Person <- R6$R6Class(
    "Person",
    public = list(
        name = NA_character_,
        initialize = function(name) {
            self$name <- name
        },
        greet = module$my_name,
        hello = module$hello_world,
        # --- these don't work ---
        my_pet_r6 = module$my_pet_r6,
        my_pet_fun = module$my_pet_fun
    )
)

person <- Person$new("John")

person$greet()
person$hello()
person$my_pet_r6() # this doesn't work
person$my_pet_fun() # this doesn't work

module$my_pet_r6() # this works
module$my_pet_fun() # this works

module.R

#' @export
hello_world <- function() {
    print("Hello world!")
}

#' @export
my_name <- function() {
    print(paste0("My name is ", self$name))
}

box::use(./module2[ Pet ])

#' @export
my_pet_r6 <- function() {
    print("I have a pet dog; he makes this noise: ")
    pet <- Pet$new("dog")

    pet$noise()
}

box::use(./module2[ pet_function ])

#' @export
my_pet_fun <- function() {
    print("I have a pet dog; he makes this noise: ")
    pet_function("dog")
}

module2.R

box::use(R6)

#' @export
Pet <- R6$R6Class(
    "Pet",
    public = list(
        species = NA_character_,
        initialize = function(species) {
            self$species <- species
        },
        noise = function() {
            # take the species name split in half and paste it reversed order
            lng <- nchar(self$species)
            half <- lng %/% 2
            call_out <- paste0(
                substr(self$species, half + 1, lng),
                substr(self$species, 1, half)
            )

            print(call_out)
        }
    )
)

#' @export
pet_function <- function(species) {
    lng <- nchar(species)
    half <- lng %/% 2
    call_out <- paste0(
        substr(species, half + 1, lng),
        substr(species, 1, half)
    )

    print(call_out)
}

R version

r$> R.version
               _                           
platform       aarch64-apple-darwin22.4.0  
arch           aarch64                     
os             darwin22.4.0                
system         aarch64, darwin22.4.0       
status                                     
major          4                           
minor          3.1                         
year           2023                        
month          06                          
day            16                          
svn rev        84548                       
language       R                           
version.string R version 4.3.1 (2023-06-16)
nickname       Beagle Scouts  


### ‘box’ version

1.1.3
dereckmezquita commented 8 months ago

I found that the error doesn't occur if instead I import the submodule into the module in my index.R, this doesn't make sense to me since it's the submodule that needs it.

It must have something do with lazy evaluation of the module when it's set in an R6 class, my best educated guess.

box::use(R6)
box::use(./module)

Person <- R6$R6Class(
    "Person",
    public = list(
        name = NA_character_,
        initialize = function(name) {
            self$name <- name
        },
        greet = module$my_name,
        hello = module$hello_world,
        # --- these don't work ---
        my_pet_r6 = module$my_pet_r6,
        my_pet_fun = module$my_pet_fun
    )
)

person <- Person$new("John")

person$greet()
person$hello()

# adding these fixes the issue
# I removed these from module.R
box::use(./module2[ Pet ])
box::use(./module2[ pet_function ])

person$my_pet_r6() # now this works
person$my_pet_fun() # now this works

# but breaks these
module$my_pet_r6() # this doesn't work
module$my_pet_fun() # this doesn't work
ArcadeAntics commented 8 months ago

This is an error due to the implementation of package:R6. When an R6 class generator is created, its argument parent_env will become the parent environment of all instance methods of that class. You can see this in Person$new, it looks like: enclos_env <- new.env(parent = parent_env, hash = FALSE). So, despite your methods my_pet_r6 and my_pet_fun each having an environment of <environment: box_namespace:module>, the environment of the instance methods is overwritten when the instance is created. You can visualize that using this:

env_structure <- function(env) {
    c(
        format.default(env),
        if (!identical(env, emptyenv()) &&
            !identical(env, .GlobalEnv) &&
            !identical(env, baseenv()) &&
            !identical(env, .BaseNamespaceEnv) &&
            !isNamespace(env))
            env_structure(parent.env(env))
    )
}

env_structure(environment(Person$public_methods$my_pet_r6))
env_structure(environment(Person$new("John")$my_pet_r6))

The only reason the other methods greet and hello work as expected is because all the variables refer to the same bindings. If, for example, you attached a package which overrides print (as many of them do), greet and hello would refer to those functions instead of the functions in the base environment.

To correctly use R6 methods, each method should have an environment equal to parent_env. This means you should either store all methods of an R6 class in the same script, or you should define the methods as wrappers of the functions you want to call:

Person <- R6$R6Class(
    "Person",
    public = list(
        name = NA_character_,
        initialize = function(name) {
            self$name <- name
        },
        ## you would need to modify 'my_name' and others to accept an argument 'self'
        greet = function() module$my_name(self),
        hello = function() module$hello_world(self),
        my_pet_r6 = function() module$my_pet_r6(self),
        my_pet_fun = function() module$my_pet_fun(self)
    )
)
klmr commented 8 months ago

Thanks @ArcadeAntics, I was going to write pretty much exactly the same.

In general it’s probably not a idea to assign named functions to methods inside R6 classes, since doing so copies the function rather than using the original reference (and that leads to unexpected changes in the parent scope of the function, as can be seen here).

Instead, as shown by @ArcadeAntics’s code, it’s a good idea to always assign unnamed functions. Those, in turn, can then call other functions (and pass self, if required — but in general I’d rather design the functions to be uncoupled to the R6 class, and pass the actually required arguments instead of self; e.g. self$name or self$species).

dereckmezquita commented 7 months ago

Thank you to both @ArcadeAntics @klmr.

That makes sense, I suspected this might be something with R6 but I wasn't sure. I think I'll be just doing as @ArcadeAntics pointed out:

Person <- R6$R6Class(
    "Person",
    public = list(
        name = NA_character_,
        initialize = function(name) {
            self$name <- name
        },
        my_pet_r6 = function() module$my_pet_r6(self),
    )
)

Funnily enough I'm writing TypeScript code today and I'm having to do the exact same thing, break up a class over multiple files - and it's more annoying and limited in TypeScript!