boost-R / mboost

Boosting algorithms for fitting generalized linear, additive and interaction models to potentially high-dimensional data. The current release version can be found on CRAN (http://cran.r-project.org/package=mboost).

GammaReg Family not running to completion #33

Open Nicholas-Autio-Mitchell opened 8 years ago

Nicholas-Autio-Mitchell commented 8 years ago

I have been able to perform many large runs through many datasets using the Gaussian and Binomial families; however, I am unable to perform the same test using the GammaReg family in order to predict a continuous positive response. I originally received an error message in debugging mode saying that I should decrease the learning rate, nu (hence the pretty small number in the screenshots below), because the negative gradient that was calculated was infinite. There should probably be a safeguard built into the code for such cases, which is why I believe this is a bug.
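To illustrate how an infinite negative gradient can arise at all, here is a minimal base R sketch. It assumes a Gamma likelihood with log link (mean `exp(f)`, shape `sigma`); the function name `gamma_ngradient` and the numbers are hypothetical, and the code is not taken from the mboost source.

```r
# Hypothetical stand-in for the negative gradient of a Gamma loss with log link.
# Assumption: mean = exp(f), shape = sigma. Not copied from mboost.
gamma_ngradient <- function(y, f, sigma = 1) {
  sigma * y * exp(-f) - sigma
}

y <- c(0.5, 2, 10)

# Moderate values of the additive predictor f give finite gradients.
g_ok <- gamma_ngradient(y, f = c(-1, 0, 1))

# A strongly negative predictor makes exp(-f) overflow to Inf.
g_bad <- gamma_ngradient(y, f = c(-800, 0, 1))

print(is.finite(g_ok))
print(is.finite(g_bad))
```

Under this assumption, a large step length could push `f` far enough into negative territory for `exp(-f)` to overflow, which would be consistent with the advice to decrease nu.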

Here are the error messages that I have been able to extract from RStudio: [screenshot: bereichsaufnahme 6]

The first shows the function that is encountering the error. As the source is not available, it leads me to think the problem may lie within the C code that underlies many of the models. The second is similar, from the traceback: [screenshot: bereichsaufnahme 5]

This doesn't tell me much, but maybe it helps someone else... The last is the message given at the console upon crashing:

[screenshot: bereichsaufnahme 7]

The warnings() are all the same, namely:

Warnmeldungen: 1: In optimize(riskS, interval = nuirange, y = y, fit = f, ... : NA/Inf durch größte positive Zahl ersetzt

[For those who don't speak German: "Warning messages: 1: In optimize(riskS, interval = nuirange, y = y, fit = f, ... : NA/Inf replaced by maximum positive value"]
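The same warning can be triggered with base R alone, which may help to narrow things down. The sketch below uses a made-up objective (`risk` here is hypothetical, not mboost's `riskS`) that returns `Inf` on part of the search interval, and collects the warnings that `optimize` emits.

```r
# Hypothetical objective: finite on [0, 0.5], infinite everywhere else.
risk <- function(s) if (s > 0.5) Inf else (s - 0.25)^2

# Collect the warnings instead of printing them to the console.
caught <- character(0)
res <- withCallingHandlers(
  optimize(risk, interval = c(0, 2)),
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(unique(caught))
print(res$minimum)
```

Whenever the objective is not finite, `optimize` substitutes the largest representable number, so it cannot tell candidate values apart in that region and the returned minimum should not be trusted.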

I have tested my code, which runs over many data sets, by performing only the first iteration (of ~700) in each case. This works fine and saves a file to disk after each data subset has been modelled, as expected.

I am not sure if there are special considerations to be made regarding any of the model parameters, but the error that is given appears to be more of a bug in the code.

hofnerb commented 8 years ago

This shouldn't be a bug in the C code but rather in the mboost_fit function. It might be that the nuirange (which is needed in optimize) is too small. Could you try to increase the nuirange in GammaReg(nuirange = ...)?

Furthermore, do you have a minimal working example to reproduce the error? That would be very helpful.
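A minimal example could look roughly like the sketch below. The simulated data, the coefficients, the widened `nuirange = c(0, 500)` and the `mstop`/`nu` settings are all placeholders chosen for illustration, not values from the original runs.

```r
# Simulate a positive response from a Gamma model with log link.
# All numbers here are assumptions for illustration.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
mu <- exp(0.5 + 0.3 * x1 - 0.2 * x2)
dat <- data.frame(y = rgamma(n, shape = 2, rate = 2 / mu), x1 = x1, x2 = x2)

# Fit only if mboost is installed.
if (requireNamespace("mboost", quietly = TRUE)) {
  library(mboost)
  mod <- glmboost(y ~ x1 + x2, data = dat,
                  family = GammaReg(nuirange = c(0, 500)),  # widened search range
                  control = boost_control(mstop = 100, nu = 0.1))
  print(coef(mod))
  print(nuisance(mod))  # estimated nuisance (shape) parameter
}
```

Replacing `dat` with a small subset of the real data that still triggers the error would make the problem much easier to track down.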

Nicholas-Autio-Mitchell commented 8 years ago

Apologies for not coming back to this sooner (original post 6 months ago!). This is just a quick update, but I will try to create a minimal reproducible example.

@hofnerb - your suggestion that nuirange is too small is, I believe, not the cause. Back then, I had played around with a fair few ranges, some small and some large [e.g. nuirange = c(0, 200)]. The default in the package is 0:100 as far as I can remember. I had also tried smaller ranges and even ranges extending into negative numbers, e.g. -100:100.
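As a side note on the negative ranges: the shape parameter of a Gamma distribution has to be positive, so a search interval that extends below zero cannot help. The base R sketch below makes this concrete; `gamma_risk` is a hypothetical function written from the Gamma log-likelihood with log link, not the risk function used inside mboost.

```r
# Hypothetical Gamma risk (negative log-likelihood) as a function of the
# shape parameter sigma, with mean exp(f). Not copied from mboost.
gamma_risk <- function(sigma, y, f) {
  sum(lgamma(sigma) - sigma * log(sigma) + sigma * f -
        (sigma - 1) * log(y) + sigma * y * exp(-f))
}

y <- c(0.5, 2, 10)
f <- log(c(1, 2, 8))

# A positive shape gives a finite risk.
r_pos <- gamma_risk(2, y, f)

# A negative shape gives NaN, because log(sigma) is undefined there.
r_neg <- suppressWarnings(gamma_risk(-1.5, y, f))

print(r_pos)
print(r_neg)
```

Any part of the interval below zero would therefore only produce further non-finite values for `optimize` to replace.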