mlr-org / ParamHelpers

Helpers for parameters in black-box optimization, tuning and machine learning.
https://paramhelpers.mlr-org.com

Add unstructured information to parameter #169

Open mb706 opened 7 years ago

mb706 commented 7 years ago

It would be useful to be able to attach unstructured information to a parameter in a ParamSet. A concrete example from mlr: information about which function a parameter pertains to. A similar use would be to add help text or notes about the parameter's setup. I suggest adding an argument extras to the make*Param functions, whose value is stored in the resulting Param object. For maximum flexibility, no structure should be imposed on this object; the result would be mlr-agnostic.

Example 1, information about parameter in mlr "classif.bst" learner:

# the 'learner' parameter of bst has been renamed 'Learner' to avoid a name collision
makeDiscreteLearnerParam(id = "Learner", default = "ls", values = c("ls", "sm", "tree"),
  extras = list(destination = "bst::bst(learner)"))

Example 2, adding documentation to "regr.km" learner:

makeLogicalLearnerParam(id = "jitter", default = FALSE, when = "predict",
  extras = list(help = "enables adding a very small jitter (order 1e-12) to the x-values before prediction, as `predict.km` reproduces the exact y-values of the training data points, when you pass them in, even if the nugget effect is turned on."))

(Example 1 is the most elegant solution I can think of to the problem it is meant to solve. Example 2 is a suggestion, but there may be better alternatives.)
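To illustrate how consuming code might read such metadata, here is a minimal sketch in base R. It makes several assumptions: the parameters are represented as plain lists rather than real Param objects, the extras argument does not exist in ParamHelpers at the time of writing, and getExtra is a hypothetical helper name, not part of any package.

```r
# Hypothetical parameter objects represented as plain lists; under the
# proposal these would be Param objects created by the make*Param functions.
params = list(
  Learner = list(id = "Learner", default = "ls",
    extras = list(destination = "bst::bst(learner)")),
  jitter = list(id = "jitter", default = FALSE,
    extras = list(help = "adds a very small jitter to the x-values before prediction"))
)

# Hypothetical helper: collect one named extras field across all parameters.
# Parameters that do not define the field yield NA instead of an error.
getExtra = function(params, field) {
  vapply(params, function(p) {
    value = p$extras[[field]]
    if (is.null(value)) NA_character_ else value
  }, character(1))
}

getExtra(params, "destination")
```

Because the extras are treated as opaque, the helper needs no knowledge of which fields exist; code that cares about destination reads it, and everything else ignores it.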

berndbischl commented 7 years ago

i can see your general point. you want to add more info. that's good and useful.

but how is that unstructured? why not ask to introduce specific slots for your 2 examples?

mb706 commented 7 years ago

The benefit of unstructured data would be that it could satisfy a broader need to add meta-information to parameters, if it ever arises in another context. Since the behaviour of the parameters doesn't depend on the information, imposing a structure (or even a suggestive name) on this meta information could end in frustration if someone ever wants to store some other thing alongside their parameters.

(After some thought, I'd say that having extras default to list() would be the best behaviour, possibly also requiring that extras always be a list. This way, a user could modify the extras after parameter creation without much fuss. The use case would be for learnerParamHelp in the PR I referenced above to add the parsed help info from package help pages to the help info entered manually into extras.)
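A rough sketch of that behaviour, in base R only. makeParamSketch is a hypothetical stand-in for a make*Param constructor and is not the ParamHelpers API; the field names help and parsedHelp are likewise only illustrative.

```r
# Hypothetical stand-in for a make*Param constructor; not the ParamHelpers API.
makeParamSketch = function(id, default, extras = list()) {
  # Enforce the rule that extras is always a list.
  if (!is.list(extras)) {
    stop("'extras' must be a list")
  }
  structure(list(id = id, default = default, extras = extras),
    class = "ParamSketch")
}

# Create a parameter without any extras; the field is still an empty list.
p = makeParamSketch(id = "jitter", default = FALSE)

# Because extras is guaranteed to be a list, later additions need no NULL check.
p$extras$help = "manually entered note"
p$extras = modifyList(p$extras, list(parsedHelp = "text parsed from a help page"))

names(p$extras)
```

With a NULL default instead, every piece of code that appends to extras would first have to check whether the field exists, which is the fuss the list() default avoids.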

The drawback would be that misspellings of list elements would not be caught, but that's true for most of R.
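For illustration, this is what the drawback looks like in plain R, together with one possible guard. getExtraStrict is a hypothetical helper, sketched here only to show that callers who want stricter behaviour could add it themselves.

```r
extras = list(help = "manually entered note")

# A misspelled element name is not an error; the lookup just yields NULL.
is.null(extras$hlep)

# Hypothetical guard: fail loudly when a requested element is absent.
getExtraStrict = function(extras, name) {
  if (!name %in% names(extras)) {
    stop(sprintf("no element '%s' in extras", name))
  }
  extras[[name]]
}

getExtraStrict(extras, "help")
```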