Closed berndbischl closed 5 years ago
We must have a consensus (best formally written down) on how one can deviate from the defaults of an implemented underlying learner.

@kerschke You should worry about this too.
Ok, I'll add it to the wiki.
Should this not go into tutorial / extend / own learner?
We also have to define it first....
I don't get it. Do you want something like `resetParam()`?
No. The problem is that we sometimes do this: a learner cannot be called with an empty argument list (EDIT: I mean the underlying training function of the learner cannot be called without special extra args), usually for 2 reasons: a) a certain param must be set but has no default (e.g. nnet has 'size'); b) the defaults really do not work, because they are for classification but we have a regr.foo variant.
Now we need to formally write down what we do here. Usually I set the extra needed values in `par.vals` and make a `note` in the learner, so what happens is transparent to the user. The ParamHelpers Param object also has a default slot; there I usually leave the default that is used in the underlying package and don't set the changed value we use in mlr (see the sketch below). Agree with this?
Do you see how much text I wrote? And I am NOT sure whether this is done exactly the same way for every learner. Fortunately we did not have to do this very often. But we MUST document it. It does not really need new functions, just a documented standard, and a tedious check whether we follow it.
Got it now?
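A minimal sketch of the convention just described, assuming mlr's `makeRLearnerClassif` / `makeParamSet` API; the learner name and the value `3L` are only illustrations, not an existing learner:

```r
# Sketch only: nnet::nnet() has no default for 'size', so the underlying
# training function cannot be called without extra args. Per the convention
# above, the Param keeps no default (as in the underlying package), the value
# mlr should use goes into 'par.vals', and a 'note' makes it transparent.
library(mlr)

lrn <- makeRLearnerClassif(
  cl = "classif.mynnet",  # hypothetical learner name
  package = "nnet",
  par.set = makeParamSet(
    makeIntegerLearnerParam(id = "size", lower = 0L),  # no default, as in nnet
    makeNumericLearnerParam(id = "decay", default = 0, lower = 0)
  ),
  par.vals = list(size = 3L),  # extra needed value so training always works
  properties = c("twoclass", "multiclass", "numerics", "factors"),
  name = "Neural Network (sketch)",
  short.name = "mynnet",
  note = "'size' has been set to 3 by default."
)
```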
Related: #344
The problem is that if we don't have a good specification, it is nearly impossible to have a reliable and working feasibility check. Our defaults in `par.vals` can clash with manually set parameters when creating the learner or calling `setHyperPars`.
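For illustration, assuming the stock `classif.nnet` learner (where mlr is presumed to set `size = 3` in `par.vals`), this is the kind of interaction a feasibility check has to handle:

```r
# Sketch: mlr-side defaults in par.vals meet user-set hyperparameters.
library(mlr)

lrn <- makeLearner("classif.nnet")  # par.vals is assumed to contain size = 3
getHyperPars(lrn)                   # the mlr-side default is visible here

# A manually set value must cleanly override the par.vals default,
# and the feasibility check must accept it against the ParamSet.
lrn2 <- setHyperPars(lrn, size = 5L)
getHyperPars(lrn2)
```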
The consensus so far was:

- `par.set` contains possible values, bounds, and defaults exactly as in the docs/signature of the underlying learner.
- `trainLearner`, `predictLearner`: Don't hardcode learner params. Exception: functions that can deal with several types of learning problems, like `glm`: for classification the `family` argument is hardcoded as `"binomial"` (see the sketch after this list).
- Deviating from defaults, for example because the defaults are inconvenient, unstable, temporarily buggy, ...: set the changed values in `par.vals` and add a `note`. (Then the changed param values are visible, for example if the `Learner` is printed or by `getHyperPars`.) Do not change the `default` in the `ParamSet`. Do not change defaults by adding them to the signature of `trainLearner` or `predictLearner`, because this is not visible to the user.
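As a sketch of the `glm` exception in the list above (hypothetical learner name, assuming mlr's `trainLearner` S3 interface and task accessors):

```r
# Sketch: the learning-problem type fixes 'family', so it is hardcoded in
# trainLearner instead of being exposed as a user-settable param -- for a
# classification learner there is never a reason to change it.
library(mlr)

trainLearner.classif.myglm <- function(.learner, .task, .subset, .weights = NULL, ...) {
  f <- getTaskFormula(.task)
  stats::glm(f, data = getTaskData(.task, .subset), family = "binomial", ...)
}
```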
> In `trainLearner`, `predictLearner`: Don't hardcode learner params. Exception:
Good, but I would probably explain the exception a bit better. The real reason is basically that we fix the param because there is NEVER a reason to change it. Your glm thing is then more of an example. Ah well. I think you get what I mean.
> Deviating from defaults for example because the defaults are inconvenient, unstable, ...
Or missing!
> Do not change defaults by adding them to the signature of `trainLearner` or `predictLearner` because this is not visible to the user.
Ok. I guess in general this is all nicely explained.
Add this to the tutorial and close here?
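To make the rule quoted above concrete, here is a sketch of the discouraged pattern, using nnet's `maxit` (whose package default is 100) purely as an illustration:

```r
# Discouraged sketch: deviating from the package default via the trainLearner
# signature. The changed value (maxit = 200L, vs nnet's default of 100) never
# shows up in print(lrn) or getHyperPars(lrn), so it is invisible to the user.
library(mlr)

trainLearner.classif.mynnet2 <- function(.learner, .task, .subset, .weights = NULL,
                                         maxit = 200L, ...) {
  d <- getTaskData(.task, .subset)
  nnet::nnet(getTaskFormula(.task), data = d, maxit = maxit, ...)
}
# Preferred instead: keep the signature clean and put list(maxit = 200L)
# into 'par.vals' when constructing the learner, plus a 'note'.
```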
I was trying to explain all conventions in the tutorial and some questions came up.
One of my questions is whether what we agreed on about specifying the `par.set` really makes sense:

> The `par.set` contains possible values, bounds, defaults exactly as in the docs/signature of the underlying learner.

I guess what I wrote is wrong, at least with regard to `values`, `upper`, `lower`:
For `LiblineaR` we have several mlr learners for different value pairs of `type`. The default in the `LiblineaR` function is `type = 0`; possible values for `type` are 0-7 (classif) and 11-13 (regression). `makeRLearner.classif.LiblineaRL2SVC`, for example, is based on `type = 1` or `type = 2`. So we need `values = c(1, 2)` in `makeDiscreteLearnerParam` (to make sure that checks for user-supplied params work correctly).

Does it make sense to set the `default` in `makeDiscreteLearnerParam` to an infeasible value (`0`)? For what purposes will this info ever be needed? Whenever `values` should only include options suitable for the particular learning task, the `default` can easily be infeasible. A hot candidate for this is the `family` arg.

I'm not sure what exactly is done in #344. Maybe this will resolve the problem.
I talked with Bernd.
Consensus: in the special cases mentioned above the `default` in the `par.set` should be a feasible value (and what `makeRLearner.classif.LiblineaRL2SVC` does, `default = 2L`, is correct).
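A sketch of what that consensus looks like for the LiblineaR `type` param, assuming the `makeDiscreteLearnerParam` constructor; the learner's other params are omitted:

```r
# Sketch: only the 'type' values valid for this particular learner are
# feasible, and the default is a feasible member of 'values' (2L), not
# LiblineaR's package-level default (0), which this learner never uses.
library(mlr)

ps <- makeParamSet(
  makeDiscreteLearnerParam(id = "type", values = c(1L, 2L), default = 2L)
)
```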
Cleaning up / closing old issues. I assume that this enhancement is not of interest anymore after all this time. Feel free to re-open if someone is tackling it again.