tidymodels / model-implementation-principles

recommendations for creating R modeling packages
https://tidymodels.github.io/model-implementation-principles/
41 stars 4 forks source link

Add note about subclassing #12

Closed alexpghayes closed 4 years ago

alexpghayes commented 5 years ago

An attempt to avoid the special hells accompanying subclassing lm, anova, or similar. Unfortunately, many of the canonical "copy this structure" functions in MASS do this and then break down the line.

Actually, I think is a really good rule of thumb that we should make sure to implement in parsnip: only create model objects from constructors/validators. We should probably also provide constructors and validators for parsnip objects in if we want them to be extensible. cc @DavisVaughan

DavisVaughan commented 5 years ago

I haven't fully thought through what subclassing would mean for model objects. I don't think I've ever done it myself actually. But they should fall under the Scalar object type as defined in this section: https://adv-r.hadley.nz/s3.html#object-styles

vctrs doesn't provide much infra here, but the constructor/validator advice still applies. I definitely think it would be good to think through what we should do in parsnip.

First thoughts are maybe base (essentially abstract in OOP terms) classes of model_spec and model_fit and then everything is built off of that by subclassing with their own constructors. The base classes could define a few things that are required when subclasses extend it. This method doesn't really let subclasses add things to the resulting .spec list, but they can add attributes through ....

new_model_spec <- function(args = list(), 
                           eng_args = list(), 
                           mode = character(), 
                           method = NULL, 
                           engine = character(), 
                           ..., 
                           subclass = character()) {

  .spec <- list(
    args = args,
    eng_args = eng_args,
    mode = mode,
    method = method,
    engine = engine
  )

  structure(.spec, ..., subclass = c(subclass, "psnp_spec"))
}

new_linear_reg <- function(args = list(), 
                           eng_args = list(), 
                           mode = character(), 
                           method = NULL, 
                           engine = character(),
                           ..., 
                           subclass = character()) {

  new_model_spec(
    args = args,
    eng_args = eng_args,
    mode = mode,
    method = method,
    engine = engine,
    ...,
    subclass = c(subclass, "psnp_linear_reg")
  )

}

new_model_spec()
#> $args
#> list()
#> 
#> $eng_args
#> list()
#> 
#> $mode
#> character(0)
#> 
#> $method
#> NULL
#> 
#> $engine
#> character(0)
#> 
#> attr(,"subclass")
#> [1] "psnp_spec"

new_linear_reg()
#> $args
#> list()
#> 
#> $eng_args
#> list()
#> 
#> $mode
#> character(0)
#> 
#> $method
#> NULL
#> 
#> $engine
#> character(0)
#> 
#> attr(,"subclass")
#> [1] "psnp_linear_reg" "psnp_spec"

Created on 2018-11-22 by the reprex package (v0.2.0).

DavisVaughan commented 5 years ago

I'll also add that parsnip already does this in some form, but it might could be iterated on to be more like advanced R's recommendation.