boost-R / mboost

Boosting algorithms for fitting generalized linear, additive and interaction models to potentially high-dimensional data. The current release version can be found on CRAN (http://cran.r-project.org/package=mboost).

warning in df2lambda() never given for %O% #26

Closed sbrockhaus closed 7 years ago

sbrockhaus commented 8 years ago

The check in df2lambda() at https://github.com/hofnerb/mboost/blob/master/pkg/mboostPatch/R/bl.R#L25,

    ## check for possible instability
    if (options("mboost_check_df2lambda")[[1]] && max(abs(X)) > 10)
        warning("Some absolute values in design matrix are greater 10. Hence, ",
                sQuote("df2lambda"), " might be numerically instable.\n  ",
                "See documentation of argument ", sQuote("by"),
                " in ?bbs for further information.",
                immediate. = TRUE)

gives a warning if the design matrix X contains absolute values > 10.
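
For reference, the condition above shows that the check is guarded by the option `mboost_check_df2lambda`. A minimal sketch of switching it off temporarily with base R's `options()` follows. It assumes the option is read exactly as in the quoted snippet and does not call any mboost function; the variable names `old` and `check_enabled` are only illustrative.

```r
## Switch the check off and keep the previous setting for later
old <- options(mboost_check_df2lambda = FALSE)

## ... fit models here; with the option set to FALSE, the `&&` in the
## quoted condition short-circuits and max(abs(X)) is never evaluated ...
check_enabled <- getOption("mboost_check_df2lambda")

## Restore whatever was set before
options(old)
```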

If %O% is used to combine two base-learners via their Kronecker product, this warning is never given. The reason is that in this case X is only a dummy matrix and not the real design matrix; see the call to df2lambda() within %O% at https://github.com/hofnerb/mboost/blob/master/pkg/mboostPatch/R/bkronecker.R#L57, where XtX is the cross-product (X'X) of the design matrix. Consequently, the check would have to be applied to XtX rather than to X.
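
To make the last point concrete, here is a rough base-R sketch. It does not call mboost internals. The matrix shapes, the all-zero placeholder standing in for the dummy matrix, and the helper names `exceeds_on_X` and `bound_from_XtX` are assumptions made purely for illustration. It also shows a caveat: XtX does not reveal max(abs(X)) exactly, only an upper bound via its diagonal.

```r
## Sketch only: plain matrices, no mboost functions are called.
x1 <- seq_len(nrow(datasets::volcano))

## Marginal design matrices (assumed shapes)
X1 <- cbind(1, x1)   # intercept + linear effect, similar to bols(x1)
X2 <- matrix(1)      # the constant second direction (ONEx)

## Full Kronecker design and its cross-product built from the margins
X   <- kronecker(X2, X1)
XtX <- kronecker(crossprod(X2), crossprod(X1))

## Both routes to the cross-product agree
stopifnot(isTRUE(all.equal(crossprod(X), XtX)))

## Hypothetical helpers, not part of mboost
exceeds_on_X   <- function(X, threshold = 10) max(abs(X)) > threshold
bound_from_XtX <- function(XtX) sqrt(max(diag(XtX)))

## Placeholder for the dummy matrix (assumed all zeros here)
dummy <- matrix(0, nrow = 2, ncol = 2)

exceeds_on_X(dummy)                  # FALSE: the check never fires
exceeds_on_X(X)                      # TRUE: the real design has values > 10
bound_from_XtX(XtX) >= max(abs(X))   # TRUE: the diagonal only gives a bound
```

Since the square root of the largest diagonal entry grows with the number of rows, a check based on XtX would probably need a different threshold than 10.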

The following MWE shows two model fits that use the same design matrix, of which only one triggers the warning:

    library(mboost)
    ### use volcano data as example
    data("volcano", package = "datasets")

    ## use one direction as covariate
    x1 <- 1:nrow(volcano)
    ## in fact the response is just a vector, not a matrix...
    vol1 <- volcano[ , 20]
    ## ... thus use only 1 for the second direction
    ONEx <- 1

    ## fit the model using %O%
    mod <- mboost(vol1 ~ bols(x1, df = 1) %O%
                    bols(ONEx, df = 1),
                  control = boost_control(nu = 0.25))

    ## df2lambda() does not give a warning
    ## have a look at the two marginal design matrices
    range(extract(mod, "design")[[1]][[1]])
    range(extract(mod, "design")[[1]][[2]])

    ## now fit the model but without %O%
    ## (same design matrix as in mod, but penalty is different)
    mod2 <- mboost(vol1 ~ bols(x1, df = 1),
                   control = boost_control(nu = 0.25))
    ## for this model df2lambda() gives a warning
    ## as the design matrix contains values > 10
    range(extract(mod2, "design")[[1]])

hofnerb commented 7 years ago

@sbrockhaus I have the slight feeling that this warning is nonsense. I remember that we did some experiments and found that df2lambda becomes unstable. However, as of today, I cannot reproduce this finding. Did you ever have the feeling that something went wrong here? If not, I would remove this piece of code and close this issue.

@fabian-s What is your feeling? You requested an option once to be able to skip this test.

fabian-s commented 7 years ago

Sorry, I don't have any intuition about this. If your tests don't reproduce the issue this warns about then it should be safe to remove it, no?

sbrockhaus commented 7 years ago

For me, it was just odd that for the same model the warning is given or not given depending on how the model is built. I have usually ignored the warning anyway and never had the feeling that I ran into any trouble because of that.