RConsortium / S7

S7: a new OO system for R
https://rconsortium.github.io/S7
Other
399 stars 33 forks source link

Updates to default constructor #438

Closed t-kalinowski closed 1 month ago

t-kalinowski commented 1 month ago

This PR works towards resolving #376 and #404.

The key change involves the value of formals(<constructor>). Previously, all constructor argument default values were set to class_missing, and the appropriate property default was determined at instantiation.

With this PR, constructor arguments now default to prop$default, which can be either an expression or a value. This is resolved using standard promise-forcing semantics during object construction.

Additionally, all custom setters for non-dynamic properties are now invoked during object construction. Previously, this only occurred if the property value was provided as an argument. Below is a short demo of the new functionality:

library(S7)

Person <- new_class(
  name = 'Person',
  properties = list(
    # required constructor argument
    name = new_property(class_character, default = quote(expr = )),

    # optional constructor argument, read-only after construction. 
    # Defaults to the time of the call.
    birthdate = new_property(
      class = class_Date,
      default = quote(Sys.Date()),
      setter = function(self, value) {
        if (!is.null(self@birthdate))
          stop("Can't set read-only property Person@birthdate")
        self@birthdate <- value
        self
      }
    ), 

    # dynamic property, not a constructor argument
    age = new_property(class = class_any, getter = \(self) {
      Sys.Date() - self@birthdate
    })
  )
)

dput(formals(Person))
#> as.pairlist(alist(name = , birthdate = Sys.Date()))

try(Person())
#> Error in Person() : argument "name" is missing, with no default

Person('Mary')@age
#> Time difference of 0 days

Person('Mary', as.Date('1970-01-01'))@age
#> Time difference of 19974 days

p <- Person("Mary")
try(p@birthdate <- as.Date('1970-01-01'))
#> Error in (function (self, value)  : 
#>   Can't set read-only property Person@birthdate

Created on 2024-09-08 with reprex v2.1.1

hadley commented 1 month ago

Also worth noting in the news bullet that this is going to affect all existing exported constructors — you'll need to re-document to ensure that your docs match the updated usage.

t-kalinowski commented 1 month ago

I updated docs and NEWS, and added tests. I believe this is ready to merge.

The failing CI is unrelated; it looks like CRAN servers are currently down.


Last-minute addition worth noting for completeness: I updated new_property() to accept class_missing as a type (potentially as part of a union) and to accept a symbol as an arg default. This allows the following usage:

new_property(class_missing | class_character)

which is practically equivalent to:

new_property(class_character, default = quote(expr =))

I imagine most users will prefer specifying a missing argument with an informatively named helper like this:

new_property(class_character, default = rlang::missing_arg())

Please take a look at the updated tests for more examples.

mmaechler commented 1 month ago

Last-minute addition worth noting for completeness: I updated new_property() to accept class_missing as a type (potentially as part of a union) and to accept a symbol as an arg default. This allows the following usage:

new_property(class_missing | class_character)

which is practically equivalent to:

new_property(class_character, default = quote(expr =))

I imagine most users will prefer specifying a missing argument with an informatively named helper like this:

new_property(class_character, default = rlang::missing_arg())

Hmm... I'm sorry but I don't like the look of this, I've not checked the new behavior yet, hence only commenting what I see above. There are strong opinions and good reason why your "rlang::missing_arg()" is not explicitly available in R (i.e. base R), even though it has been around forever and "discussed" on R-devel / R-help lists.

As I don't understand why you think a default of "missing arg" would be preferable, I assume most users would also not understand. After all, the R way to require an argument is to not specify a default.
Yes, it is appropriate to say this is a matter of taste... but for something as "basic" as S7 that should become a fundamental package for R, I'm strongly against introducing such code / examples / ...

When I wrote earlier that I like the idea of being allow to pass a "language" object, specifically of class "call", I did not think I'd ever advocate using or even allowing quote(expr=) (which is actually not a "call" but a "symbol") but rather "real" calls such as quote( sin(x) ) as we were talking about something like a deferred / postponed evaluation.

t-kalinowski commented 1 month ago

Thank you, @mmaechler. I agree that quote(expr=) is not the prettiest user-facing UI.

The original user question was, "Can properties not have a default value?"

I believe we're in agreement that it should be possible to conveniently declare an object with a "required" property, meaning, a property whose value must be provided when the object instance is constructed. The question then becomes: What syntax in new_property() should we support that results in the auto-generated constructor having a missing default for that property? Concretely, given new_property(name = 'foo', default = ???), what can ??? be so that identical(formals(constructor)$foo, quote(expr=)) is TRUE?

Some alternatives to consider:

  1. Change the default default from NULL to be missing. This change would flip the default property default, from being the "empty" instance of the class to being required. I.e., optional property arguments would become opt-in, with all property arguments defaulting to being necessary. However, I don't think we want to pursue this approach.

  2. Accept a sentinel value, for example: new_property(default = class_missing). But this also brings complications, and to me does not seem to offer much beyond readability to the user-facing UI, compared to quote(expr=). Readability might be better addressed through a helper.

I'm eager to hear your thoughts. Any other ideas we should consider?


I would like to get this resolved and merged soon, as this PR is just the first step towards addressing other constructor-related issues, which we're actively discussing in the relevant threads (namely, alternative ways to opt-in or opt-out from having the property included as a constructor argument, ways to make a property read-only, and also, working through the proposal for accepting prototype seed arguments).

t-kalinowski commented 1 month ago

I quickly explored "Alternative 1" described in my previous comment, changing the signature of new_property() so that default defaults to missing. You can view the changes needed to make tests pass here: https://github.com/RConsortium/S7/compare/new-constructor-defaults...new-missing-prop-default-default

hadley commented 1 month ago

IMO we should merge this PR since it's a substantial improvement, but we can continue to iterate on the exact form that required properties should take.

mmaechler commented 1 month ago

I quickly explored "Alternative 1" described in my previous comment, changing the signature of new_property() so that default defaults to missing. You can view the changes needed to make tests pass here: new-constructor-defaults...new-missing-prop-default-default

Thanks a lot, Tomasz, for doing this! Now, after browsing the the required changes to the tests, I'd clearly agree that this alternative is worse than the "main" proposal (wrt which you did show the diffs). I'm definitely preferring that the user and class / property programmers does not need to specify a default in almost all cases... again following S4's new("<foo>") behavior (for all of its slots, unless there's an explicit initializer for the class), e.g., default for atomic type <foo> is just to provide <foo>(0) which is the same as <foo>().

What I'm still completely confused about is your assumption that the default for default should usually be explicitly specified by the user, and then use the "missing object" for that. I'd assume that very often the default default would make sense, e.g., if all properties of a new class/object are atomic, then no explicit defaults would be needed at all even for the whole class with these properties --- and then the same recursively, when a property is from a class which is simple enough then new_<...>() would work for that class and hence ...

t-kalinowski commented 1 month ago

Thank you, @mmaechler! I'll proceed with the merge.

Regarding your comment:

I'm still completely confused by your assumption that the default should usually be explicitly specified by the user.

To clarify a bit: I believe class authors should be able to create properties that require user-supplied values at instantiation, without excessive syntax.

I think the current default default is fine; however, the only way to currently make a property whose value is "required" by the constructor is to write a constructor by hand.

lawremi commented 1 month ago

Looks like I am a little late to the party. One general comment is that a common convention in S4 is to provide user-facing constructors that abstract the call to new(). Thus, a lot of these concerns were passed on to the developer. S7 is forging a new path when it comes to trying to create a user-friendly constructor automatically. In practice with S7, I've created a convenience constructor anyway, because I'd like to keep the actual constructor as simple as possible, i.e., it should only be concerned with constructing the object from a set of properties.

Another way to think of a "required" property would be that its default would yield an invalid object. The question then is whether the constructor should refuse to construct an invalid object. I don't think it needs to; that is the purpose of the validity function. Like in the example at the top, there should be a validity function to ensure that @name is a single string. Then, the developer could be a bit lazy and just let the default of character() fail the check. It's not quite as good an error message, but it is fairly easy for the user to infer the problem.