boost-R / mboost

Boosting algorithms for fitting generalized linear, additive and interaction models to potentially high-dimensional data. The current relase version can be found on CRAN (http://cran.r-project.org/package=mboost).
73 stars 27 forks source link

Variable importance #29

Closed ghost closed 8 years ago

ghost commented 8 years ago

Almond and me (in coordination with Fabian Scheipl) wrote a function to extract and plot the (in-bag) risk reduction per base-learner from a fitted mboost-model. The contribution (to risk reduction) of each base-learner can be used as a measure for variable importance of the different base-learners or variables in the model.

Function varimp(object) simply returns a object of class varimp with the corresponding risk reductions per base-learner. The plot function (eventually a lattice::barchart) for these varimp objects additionally offers some visualisation options like the display in percentages or absolute values, number and order of the displayed bars and whether to focus on base-learners or variables involved in the base-leaners.

hofnerb commented 8 years ago

Thanks a lot. We currently have an issue with Travis-CI but I will try to fix this and have a look at your PR afterwards.

ghost commented 8 years ago

Just as a short information - the Travis build still includes two notes caused by the variable importance code:

plot.varimp: no visible binding for global variable ‘ylab’
plot.varimp: no visible binding for global variable ‘blearner’

We already tried Hadley Wickhams advice to solve the problem via the following line: if (getRversion() >= "2.15.1") globalVariables(c("ylab", "blearner")) but unfortunately that didn't work out. Besides the notes, we also got additional errors:

Error in registerNames(names, package, ".__global__", add) :
The namespace for package "mboostDevel" is locked; no changes in
the global variables list may be made.

Question is, can we keep it that way despite the notes?

hofnerb commented 8 years ago

No, the issue has to be resolved.

1) Please do specify all possible options of arguments like type and blorder in the function definition, e.g. type = c("blearner", "variable") and use type <- match.arg(type) in your function body.

2) Do not throw an error if the user wants to specify xlabs and ylabs. Use xlab = NULL and ylab = NULL in your function definition. If these arguments are not set, i.e., NULL, keep the default labels as you do now. Otherwise use the user specified labels. By having xlab and ylab in your function definition this should also remove the warning.

Regarding the issue with blearner I do not have a solution a.t.m. I can remember that we had similar issues before but do not remember how we've fixed them.

ghost commented 8 years ago

Ok, thanks for the advice. We implemented remarks 1) and 2) and it helped eliminating the note for ylab, of course. blearner is still an issue. This seems to be quite a common problem!? Is there maybe anyone else who we could ask for support?

hofnerb commented 8 years ago

I now seem to have found the error. type = "variable" should not work as you use the object blearner which isn't defined:

 if( type == "variable" ) {
   barchart(variable ~ reduction, groups = blearner, data = plot_data,
    horizontal = TRUE, xlab = xlab, ylab = ylab, xlim = xlim,
    scales = list(x = list(tck = c(1,0), at = seq(0,sum(x), length.out = 5))), 
    stack = TRUE, auto.key = auto.key, ...)
}

I'd guess this should be plot_data[, "blearner"]? Might that be the case?

ghost commented 8 years ago

..yes, that's it - thanks a lot!! The note for blearner is gone, too.

Actually the barchart was correctly created with the call in the post above (with groups = blearner), but it also works with groups = plot_data[, "blearner"].

Is there anything else that got your attention or can we leave the code as it is?

hofnerb commented 8 years ago

Hi @tkuehn13,

I was a bit too lazy (and had too little time) to pull in your patch bevor changing the structure of mboost (I got rid of the stupid sub folders mboostPatch and mboostDevel. Now, I cannot use your patch easily for mboostDevel anymore. Do you have any ideas how to solve this issue? I do not want to loose your change track etc by simply copy-pasting the files in the new structure...

Sorry for this. Benjamin