mlr-org / paradox

ParamHelpers Next Generation
https://paradox.mlr-org.com
GNU Lesser General Public License v3.0
28 stars 7 forks source link

ParamSet Conditions should respect $default #265

Open mb706 opened 4 years ago

mb706 commented 4 years ago

Suppose the following ParamSet:

> ps <- ParamSet$new(list(
  ParamLgl$new("x", default = TRUE),
  ParamLgl$new("y")))
> ps$add_dep("y", "x", CondEqual$new(TRUE))

Here the default = TRUE indicates that, if the parameter x is not given, the whole thing should behave like x is set to TRUE. E.g. x could indicate whether a certain feature is used, and y could then influence the configuration of that feature---y then depends on x being TRUE, since it does not have anything to configure if x is FALSE.

Since the "default" behaviour indicated is that the feature is usually present (everything behaves like x is TRUE) if x is not given at all, it makes sense to just set the y parameter and have the x parameter be "implicitly" TRUE.

This is not what happens, however:

> ps$values <- list(y = TRUE)
Error in (function (xs)  : 
  Assertion on 'xs' failed: The parameter 'y' can only be set if the following condition is met 'x = TRUE'. Instead the parameter value for 'x' is not set at all. Try setting 'x' to a value that satisfies the condition.

The ParamSet$check() function should do an xs = insert_named(self$default, xs) after the if(!isTRUE(ok)) block. What I am describing above should be added as a test; possibly add more tests.

(I believe one of our student assistants could solve this)

jakob-r commented 4 years ago

We already discussed that several times (e.g. #259) Remember: The default in the ParamSet is just documentation. We improved the error so the user is aware what the necessary action his to be from his side.

So the solution is:

ps$values <- list(y = TRUE, x = TRUE)

I vote for: No Bug

mb706 commented 4 years ago

The dependency is supposed to indicate that presence of a certain parameter (in the example y) is only sensible if another parameter (in this case x) is at a certain value. Suppose the x controls whether a certain kernel is used in an ML method, and y then parameterises this kernel. If the default behaviour (which we know about because of default) of the ML method is to have the kernel present (even if x is not set), then it makes sense to have y present (even when x is not set). y should only be absent if the kernel is explicitly deactivated (by setting x to FALSE in this example).

jakob-r commented 4 years ago

Your explanation was already pretty clear to me. However, it heavily relies on the statement that the default is set correctly. But as I sad: So far this is just documentation and not something we want to program on. Partly also because we don't have any way to test if this value is set correctly. So why should we bother to complicate things within mlr3/paradox if the user can just both values to fulfill the requirement.

You might be in favor for a fix within the code but it is less trivial than it appears to be. You cannot just send x = insert(defaults, values) to the feasibility check. Unfortunately the cases where this breaks are rare and somewhat complicated. I should have made a collection for such discussions. Some points

mb706 commented 4 years ago

So far this is just documentation and not something we want to program on.

That would need to change then.

Partly also because we don't have any way to test if this value is set correctly.

whatever = WhateverClass$new()
set.seed(1)
result1 = whatever$evaluate()
whatever$param_set$values %<>% insert_named(whatever$param_set$defaults, .)
set.seed(1)
result2 = whatever$evaluate()
expect_equal(result1, result2)

(possibly with some code to weed out parameters with unfulfilled requirements, but this would still be an easy autotest like function)

also note we are setting a pretty high bar here, since we don't currently have systematic tests for most of the other metainformation given in paramset, e.g. lower / upper bounds, dependencies, special_vals or even parameter type. Default values are probably among the easiest of these to test.

You cannot just send x = insert(defaults, values) to the feasibility check

I don't think it is overly complicated to have the feasibility-check respect the default values as well. The ok = (p1id %in% ns && p2id... line in ParamSet$check() would have to be expanded by another clause.

ok = (p1id %in% ns &&
  (p2id %in% ns && cond$test(xs[[p2id]]) ||
  p2id %in% self$defaults && cond$test(self$defaults[[p2id]]]))) || 
                (p1id %nin% ns)