RConsortium / S7

S7: a new OO system for R
https://rconsortium.github.io/S7
Other
388 stars 33 forks source link

How to create a nestable R7 class? #250

Open KatherineCox opened 2 years ago

KatherineCox commented 2 years ago

Sometimes it's useful to be able to nest objects of the same type inside each other. Is there a recommended way for doing this with R7?

Naive approach

It's good that this errors; if you ever tried to instantiate this you'd end up in an infinitely recursing loop.

nested <- new_class("nested",
  properties = list(child = nested)
)
#> Error in as_properties(properties): object 'nested' not found

Created on 2022-08-14 by the reprex package (v2.0.1)

new_union()

It would be nice if this worked, allowing you to specify at least one other type that can be a "base case". Preferably while still protecting you from infinite recursion by not permitting the nested class to be the default.

nested <- new_class("nested",
  properties = list( child = new_union(NULL, nested) )
)
#> Error in lapply(x, as_class): object 'nested' not found

Created on 2022-08-14 by the reprex package (v2.0.1)

Define it twice

This doesn't error and lets me nest objects, but properties holds on to the original class definition (inner's child has no name or child properties, since the original definition of nested had no properties). I don't know what problems that might cause (and it prevents the infinite recursion problem), but I don't like the feel of it.

nested <- new_class("nested")

nested <- new_class("nested",
  properties = list( 
    name = class_character,
    child = nested )
)

inner <- nested()
inner
#> <nested>
#>  @ name : chr(0) 
#>  @ child: <nested>

outer <- nested(name = "outer", child = inner)
outer
#> <nested>
#>  @ name : chr "outer"
#>  @ child: <nested>
#>  .. @ name : chr(0) 
#>  .. @ child: <nested>

Created on 2022-08-14 by the reprex package (v2.0.1)

Use a parent class

This seems to work and I haven't thougth of anything that would go wrong. It's mildly annoying to have the extra parent class cluttering things up if I don't have any other reason for it to exist, but I can live with it :)

nested_parent <- new_class("nested_parent")

nested <- new_class("nested", parent = nested_parent,
  properties = list( 
    name = class_character,
    child = nested_parent )
)

inner <- nested()
inner
#> <nested>
#>  @ name : chr(0) 
#>  @ child: <nested_parent>

outer <- nested(name = "outer", child = inner)
outer
#> <nested>
#>  @ name : chr "outer"
#>  @ child: <nested>
#>  .. @ name : chr(0) 
#>  .. @ child: <nested_parent>

Created on 2022-08-14 by the reprex package (v2.0.1)

So, 2 requests:

  1. It'd be nice to have documentation on the recommended way to create nestable classes. Is using a parent class the way to go?
  2. Any chance some behind-the-scenes wizardry could make something like the new_union syntax work?
hadley commented 2 years ago

Hmmmm, we haven't thought about this problem yet. Do you have any examples of syntax from other OO systems?

lawremi commented 2 years ago

We could probably reuse the delayed binding stuff that enables optional dependencies. That does require some effort on the part of the developer but it is pretty clean. S4 got away with this by just using strings.

hadley commented 2 years ago

@lawremi we currently only have new_external_generic(), but we could add new_external_class() too.

lawremi commented 2 years ago

I think it would be useful for defining a method on a local generic but with an external class in the signature. Like if a package defines a common API.

KatherineCox commented 2 years ago

As I've been experimenting a bit more, I realized this is more general - I can only use one R7 class as a prop type for another if the inner one is defined first.

This feels surprising because it's not how functions work; I'm used to being able to organize my code however I want.

mmaechler commented 2 years ago

Well, it's exactly like this with the only "strict" class system in (base) R: S4. In the {Matrix} package (and also in {Rmpfr} and {copula} if I remember correctly), we have used AllClass.R and AllGeneric.R (and did not use other file names alphabetically before "All*") and then really put all (basic) class definitions in a consistent order into AllClass.R. Of course the alternative is to use explicit collation in the DESCRIPTION file, something I think is advocated by the Roxygen aficionados. I personally hate to add (and update, even if automatically) such a Collate: field in DESCRIPTION.

jl5000 commented 1 year ago

I have a related problem of trying to define two classes, each dependent on the other. I'm not sure whether resolving the collate issue above will actually make this recursion possible, but I have a problem that requires it.

class_one <- S7::new_class("class_one",
                           properties = list(
                             x = class_two
                           ))
#> Error in as_properties(properties): object 'class_two' not found

class_two <- S7::new_class("class_two",
                           properties = list(
                             y = class_one
                           ))
#> Error in as_properties(properties): object 'class_one' not found

Created on 2023-02-14 with reprex v2.0.2

hadley commented 1 year ago

I just realised that there's one easy way to solve this problem by taking advantage of the fact that S7 classes are S3 classes:

library(S7)

# Recursive
tree <- new_class("tree", properties = list(child = NULL | new_S3_class("tree")))
tree()
#> <tree>
#>  @ child: NULL
tree(child = tree())
#> <tree>
#>  @ child: <tree>
#>  .. @ child: NULL
tree(child = 1)
#> Error: <tree> object properties are invalid:
#> - @child must be <NULL> or S3<tree>, not <double>

# Mutually recursive
class_one <- new_class(
  "class_one",
  properties = list(x = NULL | new_S3_class("class_two"))
)
class_two <- new_class(
  "class_two",
  properties = list(y = NULL | new_S3_class("class_one"))
)
class_one()                       
#> <class_one>
#>  @ x: NULL
class_one(x = class_two(y = class_one()))
#> <class_one>
#>  @ x: <class_two>
#>  .. @ y: <class_one>
#>  .. .. @ x: NULL
class_one(x = 1)
#> Error: <class_one> object properties are invalid:
#> - @x must be <NULL> or S3<class_two>, not <double>

Created on 2023-09-10 with reprex v2.0.2

I can't quite decide if this is beautiful or horrible, but it is a simple and pragmatic way to solve the problem today. And if we do introduce a solution sometime in the future, the underlying implementation is likely to be quite similar, but will just require some small tweaking to get a better error message, i.e.:

class_one(x = 1)
#> Error: <class_one> object properties are invalid:
#> - @x must be <NULL> or <class_two>, not <double>

(Note that in both the examples above I think you have to make the values "optional" because otherwise I don't understand how you'd ever end the chain. Using (e.g.) NULL | class_one is slightly better than class_one | NULL because it means we don't need to supply an explicit default as the default default value is taken from the first element of a union.)

jl5000 commented 1 year ago

This works, until you decide to use the package parameter in new_class().

hadley commented 1 year ago

@jl5000 then you can use new_S3_class("pkg::class_one")

lawremi commented 1 month ago

After looking at this again, there are two potential solutions:

  1. Using reference semantics to enable back references from the properties
  2. Using a stub object that satisfies the needs of a property definition, even if it is not the fully specified class

I think the stub approach is best, since supporting references would be too much at odds with the current approach. Two examples of stubs are the "define the class twice" approach at the top of this issue and Hadley's S3 approach. I think I prefer the former, because at least then it's a proper S7 class. Either way, we should document a recommended approach.