alexpghayes / modelling-in-r

an initial attempt to describe a grammar of modelling for r
https://alexpghayes.github.io/modelling-in-r/

The big one: model and model families #12

Closed alexpghayes closed 6 years ago

alexpghayes commented 6 years ago

The big thing that keeps coming up that seems to throw a wrench into everything is the difference between models and model families.

In some sense, you can work with a model by fitting a model family on a hyperparameter grid containing a single point. This is the approach caret takes, and I believe the one present in the current interface proposal.

I think this is a minor problem for conceptual clarity, but a major problem for implementing new models. If you're implementing a new modelling technique, it makes a lot more sense to first write a fit method for models (e.g. glmnet::glmnet) and then to write a fit method for model families (i.e. a hyperparameter selection method) that may make heavy use of fit.model.
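To make the layering concrete, here is a minimal sketch in base R. It assumes a toy ridge regression rather than any real package: `new_ridge_model()`, `new_ridge_family()`, the `fit()` generic and the single holdout split are all hypothetical names and choices made for illustration, not part of the proposal.

```r
# Hypothetical constructors: a model has one fixed lambda,
# a family has a set of candidate lambdas.
new_ridge_model <- function(lambda) {
  structure(list(lambda = lambda), class = "ridge_model")
}

new_ridge_family <- function(lambdas) {
  structure(list(lambdas = lambdas), class = "ridge_family")
}

# Hypothetical generic, defined here so the sketch is self-contained.
fit <- function(object, ...) UseMethod("fit")

# Fitting a model: the hyperparameter is fixed, only coefficients are estimated.
fit.ridge_model <- function(object, X, y, ...) {
  p <- ncol(X)
  object$coef <- solve(crossprod(X) + object$lambda * diag(p), crossprod(X, y))
  object
}

predict.ridge_model <- function(object, newdata, ...) {
  drop(newdata %*% object$coef)
}

# Fitting a family: hyperparameter selection that reuses the model-level fit.
# A single holdout split stands in for a real resampling scheme.
fit.ridge_family <- function(object, X, y, holdout, ...) {
  train <- setdiff(seq_len(nrow(X)), holdout)
  errors <- vapply(object$lambdas, function(lambda) {
    model <- fit(new_ridge_model(lambda), X[train, , drop = FALSE], y[train])
    mean((y[holdout] - predict(model, X[holdout, , drop = FALSE]))^2)
  }, numeric(1))
  best <- object$lambdas[which.min(errors)]
  # Refit the selected model on all of the data and return it.
  fit(new_ridge_model(best), X, y)
}

# Usage on simulated data.
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
y <- drop(X %*% c(1, -1)) + rnorm(100)
family <- new_ridge_family(c(0.01, 0.1, 1, 10))
selected <- fit(family, X, y, holdout = 1:25)
```

In this sketch the family-level fit returns an ordinary fitted model, so everything written for models (here, `predict()`) works on the result without extra code.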

I want this separation because I think it'll be key to selling an interface to people writing new methods, but it then raises two problems: how to construct the two kinds of object, and how type-safe those constructors should be.

One approach is to offer both new_knn_model() and new_knn_family(), but this feels sloppy. Maybe it is a good idea though. Another option would be to offer new_knn() that creates a "knn" dummy object that gets transformed into a knn_model or a knn_family based on the particular call to fit.

Maybe type safety should be the guiding ideal here? Or is a type-unsafe function okay, where specifying individual hyperparameter values results in a model but passing in a hyperparameter space object results in a model family? In that case the function signatures might seem weird.
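For comparison, the type-unsafe variant might look roughly like the sketch below. The `new_hyperparameter_space()` helper, the class names and the `k` argument are assumptions made for illustration; nothing here is an existing API.

```r
# Hypothetical hyperparameter space object: a named list of candidate values.
new_hyperparameter_space <- function(...) {
  structure(list(...), class = "hyperparameter_space")
}

# Type-unsafe constructor: the class of the result depends on the argument.
new_knn <- function(k) {
  if (inherits(k, "hyperparameter_space")) {
    # A space of candidate values gives a model family.
    structure(list(space = k), class = c("knn_family", "model_family"))
  } else {
    # A single concrete value gives a model.
    stopifnot(is.numeric(k), length(k) == 1)
    structure(list(k = k), class = c("knn_model", "model"))
  }
}

single <- new_knn(5)
many <- new_knn(new_hyperparameter_space(k = 1:10))
```

The oddity is visible in the signature: `k` is sometimes a number and sometimes a space object, and the caller cannot tell from the function name which kind of object comes back.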

In general, I think it's a good idea to discourage use of model families with singleton hyperparameter spaces.

After writing this all out, I'm starting to think that separate new_knn_model() and new_knn_family() constructors might be a good idea, with new_knn() acting as a wrapper around new_knn_family().
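A rough sketch of that arrangement, assuming `k` is the only hyperparameter and that a singleton space should trigger a warning rather than an error. The argument names, default grid and class names are all hypothetical.

```r
# Model constructor: exactly one value of k.
new_knn_model <- function(k) {
  stopifnot(is.numeric(k), length(k) == 1, k >= 1)
  structure(list(k = as.integer(k)), class = c("knn_model", "model"))
}

# Family constructor: a set of candidate values of k.
new_knn_family <- function(k = 1:10) {
  stopifnot(is.numeric(k), length(k) >= 1, all(k >= 1))
  if (length(k) == 1) {
    # Discourage families with singleton hyperparameter spaces.
    warning("Singleton hyperparameter space; consider new_knn_model() instead.")
  }
  structure(list(k = as.integer(k)), class = c("knn_family", "model_family"))
}

# Convenience wrapper: always returns a family.
new_knn <- function(...) new_knn_family(...)

model <- new_knn_model(3)
family <- new_knn()
```

Each constructor returns exactly one class here, so the function name alone tells the caller whether a later fit will involve hyperparameter selection.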

@jarvmiller thoughts?

alexpghayes commented 6 years ago

Rolling with separate model and model_family constructors for the moment, moving to main text.