boost-R / mboost

Boosting algorithms for fitting generalized linear, additive and interaction models to potentially high-dimensional data. The current release version can be found on CRAN (http://cran.r-project.org/package=mboost).

Fix Binomial_adaboost with link functions #63

Closed by hofnerb 7 years ago

hofnerb commented 7 years ago

I was a bit too fast. I've changed link2dist() to make.link() in Binomial_adaboost (see commit 2562b07), but did not change all further occurrences of link$p, link$d and link$q.

I think it would be preferable to use make.link (with correct code afterwards). However it might be that Torsten needs the link2dist interface for his CTMs? Hence we might need to keep this.
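To make the renaming concrete, here is a minimal sketch in Python (not R, and not mboost code). It assumes the link2dist()-style object carries a distribution function, density and quantile function under the names p, d and q, and it relies on the fact that R's make.link() returns linkfun, linkinv and mu.eta. For the probit case these line up as q, p and d respectively. The dictionaries `dist_style` and `link_style` are hypothetical names used only for illustration, and `statistics.NormalDist` stands in for pnorm/dnorm/qnorm.

```python
from statistics import NormalDist

_std = NormalDist()  # standard normal; stands in for R's pnorm/dnorm/qnorm

# Assumed shape of a link2dist()-style object: p, d, q
dist_style = {"p": _std.cdf, "d": _std.pdf, "q": _std.inv_cdf}

# Shape of a make.link("probit")-style object: linkfun, linkinv, mu.eta
link_style = {
    "linkfun": dist_style["q"],  # mu -> eta, corresponds to qnorm(mu)
    "linkinv": dist_style["p"],  # eta -> mu, corresponds to pnorm(eta)
    "mu_eta": dist_style["d"],   # d mu / d eta, corresponds to dnorm(eta)
}

eta = 0.7
mu = link_style["linkinv"](eta)
roundtrip = link_style["linkfun"](mu)  # should recover eta

# Central finite difference of linkinv, to compare against mu_eta
h = 1e-6
numeric_slope = (link_style["linkinv"](eta + h)
                 - link_style["linkinv"](eta - h)) / (2 * h)
```

Under these assumptions, every remaining use of the quantile function would become a linkfun call, every use of the distribution function a linkinv call, and every use of the density a mu.eta call.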

hofnerb commented 7 years ago

I think arbitrary distributions should work again, e.g., with Binomial(link = "norm"). These are required for boosted CTMs.

@mayrandy: Can you please add code for the make.link interface from line 184 onwards (or modify the code from line 152 onwards)? Afterwards we can remove lines 101-107 (in function link2dist).

mayrandy commented 7 years ago

@hofnerb OK

mayrandy commented 7 years ago

I've now implemented the other glm-type link functions c("probit", "cloglog", "cauchit", "log") for the classical Binomial(type = "adaboost").

While doing that, I think I better understood what the classical family was doing:

In case of link = "logit" we were (and still are) using the loss described in Bühlmann and Hothorn (2007): log_2(1 + exp(-2yf)) with y in {-1, 1} and f = log(p / (1 - p)) / 2

The latter is the actual reason we are getting coefficients half the usual size, not the coding of y.
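A small numerical check of this point, written as a Python sketch rather than mboost code. With f defined as half the log-odds, the loss log_2(1 + exp(-2yf)) equals -log_2(p) for y = 1 and -log_2(1 - p) for y = -1, so it is the binomial negative log-likelihood in base 2. The halving comes from f itself: if the log-odds are linear in x, then f is linear with every coefficient halved. The function names and the coefficient values below are hypothetical.

```python
import math

def adaboost_logit_loss(y, f):
    """log_2(1 + exp(-2 y f)) for y in {-1, +1}."""
    return math.log2(1.0 + math.exp(-2.0 * y * f))

def half_log_odds(p):
    """f = log(p / (1 - p)) / 2."""
    return 0.5 * math.log(p / (1.0 - p))

p = 0.8
f = half_log_odds(p)
loss_pos = adaboost_logit_loss(+1, f)  # compare with -log2(p)
loss_neg = adaboost_logit_loss(-1, f)  # compare with -log2(1 - p)

# Hypothetical logistic-regression coefficients on the log-odds scale
beta = (0.4, 1.2)
x = 1.5
log_odds = beta[0] + beta[1] * x
p_x = 1.0 / (1.0 + math.exp(-log_odds))

f_x = half_log_odds(p_x)                           # predictor on the f scale
f_from_half_beta = 0.5 * beta[0] + 0.5 * beta[1] * x  # halved coefficients
```

Here `f_x` and `f_from_half_beta` agree, whichever of the two codings of y is used for the response.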

In case of link = "probit" the loss is the negative log-likelihood of the binomial model, -(y log(p) + (1 - y) log(1 - p)), with f = qnorm(p), and the coefficients have the usual size (although we are still giving the warning) -> see #65!

As a result, Binomial(type = "glm", link = "probit") and Binomial(type = "adaboost", link = "probit") are in fact optimizing the same loss; they lead to slightly different results only because of the different offsets and the different coding of y.