mlr-org / mlr3tuning

Hyperparameter optimization package of the mlr3 ecosystem
https://mlr3tuning.mlr-org.com/
GNU Lesser General Public License v3.0

Remove archive output, add best result and state of instance #217

Closed: be-marc closed this 4 years ago

be-marc commented 4 years ago

Closes #206

Output before tuning

<TuningInstance>
* State:  Not tuned
* Task: <TaskClassif:iris>
* Learner: <LearnerClassifRpart:classif.rpart>
* Measures: classif.ce, classif.acc
* Resampling: <ResamplingHoldout>
* Terminator: <TerminatorEvals>
* bm_args: list()
* n_evals: 0
ParamSet: 
   id    class lower upper levels     default value
1: cp ParamDbl     0  0.05        <NoDefault>

Output after tuning

 <TuningInstance>
* State:  Tuned
* Task: <TaskClassif:iris>
* Learner: <LearnerClassifRpart:classif.rpart>
* Measures: classif.ce, classif.acc
* Resampling: <ResamplingHoldout>
* Terminator: <TerminatorEvals>
* bm_args: list()
* n_evals: 20
* Result:
   perf:
    classif.ce
    0.06
    classif.acc
    0.94
   tune_x:
    cp
    0.00173121183179319
   params:
    xval
    0
    cp
    0.00173121183179319
ParamSet: 
   id    class lower upper levels     default value
1: cp ParamDbl     0  0.05        <NoDefault> 

If we just use $eval_batch() instead of $tune(), the output looks like this (fixed with a new commit):

 <TuningInstance>
* State:  Tuned
* Task: <TaskClassif:iris>
* Learner: <LearnerClassifRpart:classif.rpart>
* Measures: classif.ce, classif.acc
* Resampling: <ResamplingHoldout>
* Terminator: <TerminatorEvals>
* bm_args: list()
* n_evals: 2
* Result:
   perf:

   tune_x:

   params:
    xval
    0
ParamSet: 
   id    class lower upper levels     default value
1: cp ParamDbl     0  0.05        <NoDefault>  

This is because no results were written to private$.result by a Tuner class. Should we omit the result output in this case?
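
For reference, a minimal sketch of how this state can be reached without a tuner. This assumes the then-current API, where $eval_batch() takes a data.table of untransformed configurations, and reuses the instance from the reprex posted later in this thread:

library(data.table)

# evaluate two configurations directly, bypassing any Tuner;
# n_evals is incremented, but private$.result stays empty
instance$eval_batch(data.table(cp = c(0.01, 0.02)))
print(instance)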

codecov-io commented 4 years ago

Codecov Report

Merging #217 into master will increase coverage by 0.21%. The diff coverage is 60%.


@@            Coverage Diff             @@
##           master     #217      +/-   ##
==========================================
+ Coverage   91.83%   92.04%   +0.21%     
==========================================
  Files          18       18              
  Lines         306      352      +46     
==========================================
+ Hits          281      324      +43     
- Misses         25       28       +3
Impacted Files       Coverage Δ
R/TuningInstance.R   91.07% <60%> (+0.06%) ↑
R/Tuner.R            85.36% <0%> (+0.07%) ↑
R/AutoTuner.R        98.5% <0%> (+0.46%) ↑


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 706f1dc...fd3ba0b.

pat-s commented 4 years ago

This is because no results were written to private$.result by a Tuner class. Should we omit the result output in this case?

IMO yes, it looks weird otherwise.


The indentation also looks a bit off here. Is this how it's supposed to be?

be-marc commented 4 years ago

We now check whether the TuningInstance was called by a Tuner* object to determine if tuning was conducted. We do this indirectly by checking !is.null(self$result$perf) instead of self$n_evals != 0, since n_evals is also increased when $eval_batch() is called without a tuner, whereas self$result$perf is only ever written by a Tuner* object.
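
A minimal sketch of what that guard might look like inside the print method (not the actual mlr3tuning source; catf() and as_short_string() are from mlr3misc):

if (!is.null(self$result$perf)) {
  # only reached when a Tuner has written a result; a bare
  # $eval_batch() call leaves self$result$perf NULL
  catf("* Result perf: %s", as_short_string(as.list(self$result$perf)))
  catf("* Result tune_x: %s", as_short_string(self$result$tune_x))
}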

@pat-s Yes, the indentation is on purpose. Do you prefer another way to structure the output?

pat-s commented 4 years ago

How about:

Do we really need $params? It's just showing the same as tune_x for the tuned params, plus the untuned ones. Maybe just append the untuned ones to the tune_x output?

After you've made changes, please also post the example output in a comment (maybe a small reprex helps), and do not edit the first comment - otherwise some comments cannot be related to the intermediate versions they refer to :)

be-marc commented 4 years ago

Do we really need $params? It's just showing the same as tune_x for the tuned params, plus the untuned ones. Maybe just append the untuned ones to the tune_x output?

tune_x shows the hyperparameters without the trafo applied, while params shows them with the trafo applied.
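
A minimal sketch of the distinction, using the paradox API of the time; the log-scale trafo here is an illustrative assumption, not taken from this PR:

library(paradox)

# search on an untransformed (log2) scale ...
ps = ParamSet$new(list(
  ParamDbl$new("cp", lower = -10, upper = -2)
))
# ... and map proposed values to the scale the learner sees:
# tune_x would show e.g. cp = -5, params would show cp = 2^-5
ps$trafo = function(x, param_set) {
  x$cp = 2^x$cp
  x
}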

pat-s commented 4 years ago

Ah, ok. Then I'd say only params? tune_x is in the object anyway, but for the printout params should be sufficient then?

be-marc commented 4 years ago

New print output:

<TuningInstance>
* State:  Tuned
* Task: <TaskClassif:iris>
* Learner: <LearnerClassifRpart:classif.rpart>
* Measures: classif.ce, classif.acc
* Resampling: <ResamplingHoldout>
* Terminator: <TerminatorEvals>
* bm_args: list()
* n_evals: 20
* Result:
   perf:
      classif.ce classif.acc
   1:       0.04        0.96
   tune_x:
              cp
   1: 0.03188852
   params:
      xval         cp
   1:    0 0.03188852
* ParamSet:
      id    class lower upper levels     default value
   1: cp ParamDbl     0  0.05        <NoDefault>

Reprex:

library(mlr3)
library(mlr3tuning)
library(paradox)

# search space: a single numeric hyperparameter of rpart
ps = ParamSet$new(list(
  ParamDbl$new("cp", lower = 0, upper = 0.05)
))

# instance with two measures and a budget of 20 evaluations
instance = TuningInstance$new(
  task = tsk("iris"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msrs(c("classif.ce", "classif.acc")),
  param_set = ps,
  terminator = term("evals", n_evals = 20)
)

tuner = TunerRandomSearch$new()
tuner$tune(instance)
print(instance)

be-marc commented 4 years ago

Ah, ok. Then I'd say only params? tune_x is in the object anyway, but for the printout params should be sufficient then?

Not sure about this. @berndbischl @pfistfl What do you think?

berndbischl commented 4 years ago

Please change:

a) Only show tune_x. That's really the result from tuning, not "params".

b) Your print code looks bad, in both code and output. Why not simply use mlr3misc::as_short_string instead of

    catf(paste0("   ", capture.output(as.data.table(rbind(self$result$perf)))), sep = "\n")

c) Also, I wouldn't double-indent things; "Result perf" / "Result tune_x" at the top level seems fine.

be-marc commented 4 years ago

This ugly code was necessary to indent the data.tables.

The output looks like this now:

<TuningInstance>
* State:  Tuned
* Task: <TaskClassif:iris>
* Learner: <LearnerClassifRpart:classif.rpart>
* Measures: classif.ce, classif.acc
* Resampling: <ResamplingHoldout>
* Terminator: <TerminatorEvals>
* bm_args: list()
* n_evals: 20
* Result perf:
   classif.ce classif.acc
1:       0.06        0.94
* Result tune_x:
           cp
1: 0.01549386
ParamSet: 
   id    class lower upper levels     default value
1: cp ParamDbl     0  0.05        <NoDefault>      

mllg commented 4 years ago
* Result perf:
   classif.ce classif.acc
1:       0.06        0.94

This is a named numeric vector; why do we print it as a data.table()? Would you consider paste(sprintf("%s: %g", names(perf), perf), collapse = ", ")?

* Result tune_x:
           cp
1: 0.01549386

Same here; I don't see why we use the data.table() printer. sprintf() + as_short_string()?
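
A minimal sketch of what both suggestions would print for the example values above (base R plus mlr3misc; the output shown is indicative):

perf = c(classif.ce = 0.06, classif.acc = 0.94)

# the suggested one-liner for a named numeric vector
paste(sprintf("%s: %g", names(perf), perf), collapse = ", ")
#> [1] "classif.ce: 0.06, classif.acc: 0.94"

# as_short_string() yields the name=value form seen later in this thread
mlr3misc::as_short_string(as.list(perf))
#> [1] "classif.ce=0.06, classif.acc=0.94"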

berndbischl commented 4 years ago

@mllg Didn't you make this PR completely obsolete with your last commit? Close here?

berndbischl commented 4 years ago

https://github.com/mlr-org/mlr3tuning/commit/1cf4b2efa2c1002b74af9b31d58427fab8925e65

be-marc commented 4 years ago

@mllg Thank you. I just used the code from https://github.com/mlr-org/mlr3tuning/commit/1cf4b2efa2c1002b74af9b31d58427fab8925e65

@berndbischl We also wanted to remove the archive output and indicate the state of the TuningInstance.

Output looks like this now:

<TuningInstance>
* State:  Tuned
* Task: <TaskClassif:iris>
* Learner: <LearnerClassifRpart:classif.rpart>
* Measures: classif.ce, classif.acc
* Resampling: <ResamplingHoldout>
* Terminator: <TerminatorEvals>
* bm_args: list()
* n_evals: 20
* Result perf: classif.ce=0, classif.acc=1
* Result tune_x: cp=0.001542, minsplit=11
ParamSet: 
         id    class lower upper levels     default value
1:       cp ParamDbl     0  0.05        <NoDefault>      
2: minsplit ParamInt    10 12.00        <NoDefault>      

berndbischl commented 4 years ago

Tests fail: instance --> self

berndbischl commented 4 years ago

@be-marc Please only push after you have passed the local tests; otherwise this back-and-forth eats up time for others. This problem you could easily have seen locally; Travis is not there so you can skip that...

be-marc commented 4 years ago

Sorry. Should work now.