mlr-org / mlr

Machine Learning in R
https://mlr.mlr-org.com

{generate,plot}HyperParsEffect{Data} and more than 2 hyperparameters: no example? #1281

Closed eddelbuettel closed 8 years ago

eddelbuettel commented 8 years ago

When trying this (much appreciated) feature, I get

Error in checkLearner(learner, "regr", props) : 
  Learner 'classif.xgboost' must be of type 'regr', not: 'classif'

That was a basic XGBoost with three parameters and partial.dep=TRUE. There are two tests in generateHyperParsEffect.R which have

  if (checkClass(interpolate, "Learner") == TRUE ||
      checkString(interpolate) == TRUE) {
    lrn = checkLearnerRegr(interpolate)
  }

meaning that classifiers won't work. I tried relaxing this to checkLearner(interpolate) but no luck so far.

schiffner commented 8 years ago

Hi,

I hope I understand your question correctly. {generate,plot}HyperParsEffect{Data} is intended to work for hyperparameters of a classification learner.

Since you set partial.dep = TRUE: What learner did you choose as partial.dep.learn in plotHyperParsEffect? Maybe the problem lies there because this must always be a regression learner (and the same holds for the interpolate learner)?

To make sure that classification with 3 hyperparameters really works, I made a small example for myself.

library(mlr)

# tune 3 hyperparameters of a polynomial-kernel SVM on a classification task
ps = makeParamSet(
  makeIntegerParam("degree", lower = 1, upper = 5),
  makeNumericParam("offset", lower = 0, upper = 5),
  makeNumericParam("C", lower = -5, upper = 5, trafo = function(x) 2^x))
ctrl = makeTuneControlRandom(maxit = 100L)
rdesc = makeResampleDesc("Holdout")
learn = makeLearner("classif.ksvm", par.vals = list(kernel = "polydot"))
res = tuneParams(learn, task = pid.task, control = ctrl, measures = acc,
  resampling = rdesc, par.set = ps, show.info = FALSE)
data = generateHyperParsEffectData(res, partial.dep = TRUE)
# partial.dep.learn must be a regression learner
plt = plotHyperParsEffect(data, x = "C", y = "degree", z = "acc.test.mean",
  plot.type = "heatmap", partial.dep.learn = "regr.earth")

Maybe this is helpful to you. Otherwise we would need a more complete code example from you.

eddelbuettel commented 8 years ago

Looking at this, I think I specified the partial.dep.learn argument wrong. If you could just add what we have here to the manual page as an example, things would be much better.

Your example surely works just fine, even over remote ssh+x11. So a big thanks!

larskotthoff commented 8 years ago

@eddelbuettel It would be great if you could post your complete original code so we can see exactly what you did and where to clarify the docs.

schiffner commented 8 years ago

Two things I noted:

- The documentation doesn't make clear that partial.dep.learn must be a regression learner.
- There is no working example for more than 2 hyperparameters (yet).

eddelbuettel commented 8 years ago

@larskotthoff Will do once I get into the office. It was pretty vanilla XGBoost use, and a rookie mistake.

@schiffner Concur on both points! Happy to send a PR in, I have the repo checked out at work anyway.

(But what is it with your habit of not deleting branches? Unwieldy...)

eddelbuettel commented 8 years ago

Here is an MRE, with apologies for not sending it in the first place:

library(mlr)

ex <- function() {
    paramset <- makeParamSet(makeDiscreteParam("nrounds", values=c(50,125,250)), #25,50,100)),
                             makeDiscreteParam("max_depth", values=c(5,7)),
                             makeDiscreteParam("eta", values=c(0.01, 0.1, 0.2)),
                             #makeDiscreteParam("gamma", values=0),
                             #makeDiscreteParam("colsample_bytree", values=c(0.4,0.6)), #,,0.8)),
                             #makeDiscreteParam("min_child_weight", values=c(1,3,5)),
                             #makeDiscreteParam("nthread", values=2),
                             makeDiscreteParam("verbose", values=0))

    ctrl <- makeTuneControlGrid()

    rdesc <- makeResampleDesc("CV", iters = 2L) #5L)

    learn <- makeLearner("classif.xgboost", predict.type="prob")

    lrn <- makeTuneWrapper(learn,
                           control = ctrl,
                           measures = list(acc, mmce),
                           resampling = rdesc,
                           par.set = paramset,
                           show.info = FALSE)
    tsk <- iris.task # instead of my real data
    res <- resample(lrn,
                    task = tsk,
                    resampling = rdesc,
                    extract = getTuneResult,
                    show.info = FALSE)
    data <- generateHyperParsEffectData(res, partial.dep=TRUE)
    plt <- plotHyperParsEffect(data,
                               x = "nrounds",
                               y = "eta",
                               z = "acc.test.mean",
                               plot.type = "heatmap",
                               interpolate = "regr.earth",
                               show.experiments = FALSE,
                               nested.agg = mean,
                               partial.dep.learn = "classif.xgboost")  
    min_plt <- min(data$data$acc.test.mean, na.rm = TRUE)
    max_plt <- max(data$data$acc.test.mean, na.rm = TRUE)
    med_plt <- mean(c(min_plt, max_plt))
    plt + ggplot2::scale_fill_gradient2(breaks = seq(min_plt, max_plt, length.out = 5),
                                        low = "red", mid = "white", high = "blue", midpoint = med_plt)
    print(plt)
}

ex()

I had tried several arguments for partial.dep.learn, but as @schiffner noted, the documentation isn't really all that fantastic on that point, and there is no working example (yet) to monkey off of. I adjusted the Subject.

Setting partial.dep.learn = "regr.earth" works perfectly fine.
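
For the record, that is the call above with only the final argument changed (interpolate was still in there):

plt <- plotHyperParsEffect(data,
                           x = "nrounds",
                           y = "eta",
                           z = "acc.test.mean",
                           plot.type = "heatmap",
                           interpolate = "regr.earth",
                           show.experiments = FALSE,
                           nested.agg = mean,
                           partial.dep.learn = "regr.earth")  # was "classif.xgboost"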

(Hm. The plot comes up with blank canvas so I left some other hoopla in there.)

schiffner commented 8 years ago

Thanks very much.

(Hm. The plot comes up with blank canvas so I left some other hoopla in there.)

I guess this is a weird side-effect of using both interpolate and partial.dep.learn simultaneously (which is not intended), and we are lacking an arg check for this.

Also paging @MasonGallo

eddelbuettel commented 8 years ago

I am also seeing weird side effects of the grid shrinking -- i.e. data$data shows the full range of values, but the plot does not. No MRE yet. Currently back to faceting by hand in ggplot().

schiffner commented 8 years ago

I am also seeing weird side effects of the grid shrinking -- i.e. data$data shows the full range of values, but the plot does not.

Mmmh, it might be the case that the regression learner (the partial.dep.learn) makes constant predictions? (Unfortunately, I just found out that internally partial.dep.learn is not passed on correctly, so regardless of what you specify a randomForest is fitted. I can do a PR to solve that.)
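
Conceptually the bug is of this shape (an illustrative sketch only, not the actual mlr source):

# illustrative only: the user-supplied learner is ignored
# in favor of a hard-coded one
lrn = checkLearnerRegr("regr.randomForest")  # bug: hard-coded
lrn = checkLearnerRegr(partial.dep.learn)    # fix: use what the user passed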

I added an example to the tutorial for 3 hyperparameters.

eddelbuettel commented 8 years ago

Now irace is busted for me on master :-/

Error in irace::irace(scenario = tuner.config, parameters = parameters) : 
  unused argument (scenario = tuner.config)

MRE straight from the manual example:

library(mlr)

iraceBug <- function() {
    ps = makeParamSet(
        makeNumericParam("C", lower = -12, upper = 12, trafo = function(x) 2^x),
        makeDiscreteParam("kernel", values = c("vanilladot", "polydot", "rbfdot")),
        makeNumericParam("sigma", lower = -12, upper = 12, trafo = function(x) 2^x,
                         requires = quote(kernel == "rbfdot")),
        makeIntegerParam("degree", lower = 2L, upper = 5L,
                         requires = quote(kernel == "polydot"))
    )
    ctrl = makeTuneControlIrace(maxExperiments = 200L)
    rdesc = makeResampleDesc("Holdout")
    res = tuneParams("classif.ksvm", iris.task, rdesc, par.set = ps, control = ctrl, show.info = FALSE)
}

iraceBug()

schiffner commented 8 years ago

Which irace version do you have? With the current GitHub version of mlr you need irace 2.0.
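
A quick way to check from R:

packageVersion("irace")  # needs to report >= 2.0 for the current mlr master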

eddelbuettel commented 8 years ago

Indeed. I was sure I had seen that update -- but it may have been on my laptop or machine at home. Sorry about that.

schiffner commented 8 years ago

Not a problem. Sorry for all the obstacles :/

masongallo commented 8 years ago

Thanks for the feedback @eddelbuettel! A few things I want to help clarify, which @schiffner explained above:

Error in checkLearner(learner, "regr", props) : Learner 'classif.xgboost' must be of type 'regr', not: 'classif'

The learner for partial.dep.learn or interpolate must always be a regression learner, since we are predicting the (numerical) performance measure. I'm happy to make this more clear in the docs.

(Hm. The plot comes up with blank canvas so I left some other hoopla in there.)

Are you sure the code you pasted runs? As posted, it should have returned an error, since classif.xgboost isn't a regression learner. That might explain the blank canvas? Even if you do use a regression learner, you will still get an error due to not having enough unique values in the response. This goes away if you search a deeper space for the hyperparameters.
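
For example (illustrative values only), widening the grid in your MRE gives the partial dependence learner more distinct performance values to fit:

# illustrative values: more grid points per axis means more unique
# observations of the performance surface
paramset <- makeParamSet(
    makeDiscreteParam("nrounds", values = c(25, 50, 100, 150, 200, 250)),
    makeDiscreteParam("max_depth", values = c(3, 5, 7, 9)),
    makeDiscreteParam("eta", values = c(0.01, 0.05, 0.1, 0.2, 0.3)),
    makeDiscreteParam("verbose", values = 0))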

I guess this is a weird side-effect of using both interpolate and partial.dep.learn simultaneously (which is not intended), and we are lacking an arg check for this.

The docs state that interpolate is ignored if partial.dep.learn is used, so this shouldn't be a problem.

@eddelbuettel if you keep in mind that visualizing hyperparameter tuning effects is sort of like a "regression" problem, I think your issues will go away. We still have to learn a function, so we need enough data.

masongallo commented 8 years ago

Thanks @schiffner for fixing the tutorial! I completely forgot to submit a PR (I even wrote full examples).

(Unfortunately, I just found out that internally partial.dep.learn is not passed on correctly, so regardless what you specify a randomForest is fitted. I can do a PR to solve that.)

Thanks for catching that! It was a silly typo and should be a really quick fix. I should have some time early tomorrow to fix it.

eddelbuettel commented 8 years ago

Are you sure the code you pasted runs?

Of course. I learned seven or eight posts ago about the required change and made it.

masongallo commented 8 years ago

Of course. I learned seven or eight posts ago about the required change and made it.

Great! The conversation wasn't perfectly clear to me, so I was just checking we're all on the same page. Let us know if you hit any other bumps!

eddelbuettel commented 8 years ago

I have. Your point on having enough data for the response surface is a good one. I am still a little unclear about some outcomes where the resulting plot has less 'range' than the data. To be followed up...

schiffner commented 8 years ago

@MasonGallo: Thanks very much for the clarifications.

About the tutorial: I just went ahead and added an example because it was an easy 5-minute thing to do. If you already had something planned out and written, please feel free to just replace the stuff I did.

I guess this is a weird side-effect of using both interpolate and partial.dep.learn simultaneously (which is not intended), and we are lacking an arg check for this.

The docs state that interpolate is ignored if partial.dep.learn is used, so this shouldn't be a problem.

I'm aware that the docs say this. To clarify what led me to believe that something strange is going on: I took @eddelbuettel's example, replaced the partial.dep.learn with "regr.randomForest", and noticed that if I leave the interpolate learner in there I get a blank plot, while leaving it out gives a reasonable-looking plot. I meant to investigate further yesterday, but ran out of time. Any clue what is happening?

set.seed(123)
# with interpolate: produces a blank plot
plotHyperParsEffect(data,
  x = "nrounds",
  y = "eta",
  z = "acc.test.mean",
  plot.type = "heatmap",
  interpolate = "regr.earth",
  show.experiments = FALSE,
  nested.agg = mean,
  partial.dep.learn = "regr.randomForest")

set.seed(123)
# without interpolate: produces a reasonable-looking plot
plotHyperParsEffect(data,
  x = "nrounds",
  y = "eta",
  z = "acc.test.mean",
  plot.type = "heatmap",
  show.experiments = FALSE,
  nested.agg = mean,
  partial.dep.learn = "regr.randomForest")

masongallo commented 8 years ago

Thanks @schiffner, now I understand what's going on. This is not an intended use case, since we are essentially telling the plotting routine to interpolate the original data but also asking for partial dependence.

To your point, we could just block this with an arg check and make the language a bit stronger in the docs.
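
Something along these lines (a sketch, not the final implementation):

# sketch: reject the unsupported combination up front
if (!is.null(interpolate) && !is.null(partial.dep.learn))
  stop("Use either interpolate or partial.dep.learn, not both!")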

eddelbuettel commented 8 years ago

OK, what I was trying to get to is now an issue in #1287. This may be behaviour 'as intended', but it is a little hard to tell.

masongallo commented 8 years ago

The fixes for the questions/issues discussed here are now merged.