boost-R / gamboostLSS

Boosting models for fitting generalized additive models for location, shape and scale (GAMLSS) to potentially high dimensional data. The current release version can be found on CRAN (https://cran.r-project.org/package=gamboostLSS).

as.families("BEINF") not working #28

Closed hofnerb closed 7 years ago

hofnerb commented 7 years ago

The following code does not work:

library("gamboostLSS")
library("gamlss.dist")
data <- data.frame(x1 = runif(400), x2 = runif(400))
data$y <- with(data, rBEINF(400, mu = range((x1 + x2 - min(x1 + x2) + 0.001) / (diff(range(x1 + x2)) + 0.002)), 
                          sigma = sqrt(x1), nu = 0.1, tau = 0.1))
mod <- gamboostLSS(y ~ x1 + x2, data = data, families = as.families("BEINF"))
# Error in FAM$dldm(y = y, mu = FAM$mu.linkinv(f), sigma = sigma, nu = nu,  : 
#   unused arguments (nu = nu, tau = tau)

The reason can be found in

gamlss.dist::as.gamlss.family("BEINF")$dldm
#  function (y, mu, sigma) 
#  {
#      a <- mu * (1 - sigma^2)/(sigma^2)
#      b <- a * (1 - mu)/mu
#      dldm <- ifelse(((y == 0) | (y == 1)), 0, ((1 - sigma^2)/(sigma^2)) * 
#          (-digamma(a) + digamma(b) + log(y) - log(1 - y)))
#      dldm
#  }

which is a function of mu and sigma only.

How can we fix this for this family (and potentially others)? Is a change in gamlss.dist needed or can we fix this ourselves? Are there other families leading to the same or a similar issue?

@mayrandy, can you have a look at this?

mayrandy commented 7 years ago

Thanks for finding this! This should be a problem for all inflated families, e.g. BEOI():

BEINF()$nopar
# [1] 4
BEOI()$nopar
# [1] 3

But the corresponding derivative functions, e.g. BEOI()$dldm, take only mu and sigma as arguments, which leads to the error.

I think we should be able to fix this in our code, I'll take a look.
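
One way such a fix on the calling side could look, as a rough sketch only: filter the arguments so that a derivative function receives just the parameters it declares. The helper call_with_known_args and the stand-in dldm_stub are hypothetical names made up for this illustration; this is not the code that ended up in gamboostLSS.

```r
# Hypothetical helper: call f with only those named arguments that
# appear in its formal argument list; everything else is dropped.
call_with_known_args <- function(f, ...) {
  args <- list(...)
  do.call(f, args[names(args) %in% names(formals(f))])
}

# Stand-in for a derivative function that only knows y, mu and sigma
# (illustrative formula, not the BEINF score function)
dldm_stub <- function(y, mu, sigma) (y - mu) / sigma^2

# nu and tau are silently dropped instead of raising "unused arguments"
call_with_known_args(dldm_stub, y = 1, mu = 0.5, sigma = 2,
                     nu = 0.1, tau = 0.1)
```

Note that this simple filter would also drop arguments meant for a function that accepts `...`, so it only fits functions with a fixed argument list.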

hofnerb commented 7 years ago

Thanks a lot!

Perhaps there are even more families with similar properties? We could check this by comparing nopar with the number of parameters of dldm, either by simply calling the functions or by computing on

deparse(gamlss.dist::as.gamlss.family("BEINF")$dldm)[1]

for all available families in a loop?
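
A minimal sketch of such a check, using formals() rather than deparse(). The vector candidates is a hand-picked set of family names chosen for illustration, not a complete list of the families in gamlss.dist, and check_family is a hypothetical helper name.

```r
library("gamlss.dist")

# Hand-picked family names for illustration only (not exhaustive)
candidates <- c("NO", "GA", "BE", "BEOI", "BEZI", "BEINF")

# Hypothetical helper: compare the number of distribution parameters
# (nopar) with the number of parameters that dldm accepts
check_family <- function(name) {
  fam <- as.gamlss.family(name)
  # arguments of dldm, excluding the response y
  dldm_args <- setdiff(names(formals(fam$dldm)), "y")
  data.frame(family = name,
             type = fam$type,
             nopar = fam$nopar,
             n_dldm_args = length(dldm_args),
             mismatch = length(dldm_args) < fam$nopar)
}

res <- do.call(rbind, lapply(candidates, check_family))
print(res)
```

Rows with mismatch == TRUE are families whose dldm accepts fewer parameters than nopar reports, which is the situation that triggers the "unused arguments" error above.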

mayrandy commented 7 years ago

OK, all those families have FAM()$type == "Mixed", found that in my own code.

Apparently I had already implemented such cases for 3-parametric families and was only too lazy to do the same for the 4-parametric ones. Should be fixed now in "devel".

hofnerb commented 7 years ago

Merci :)