gamlss-dev / gamlss

gamlss: Generalized Additive Models for Location Scale and Shape
https://CRAN.R-project.org/package=gamlss

gamlssCV incompatible with ridge regression #10

Open harakiricode opened 5 months ago

harakiricode commented 5 months ago

When ri() is used inside a gamlssCV() call, get("gamlsscall", envir = gamlss.env)$data resolves to the data() function from R's utils package instead of the data frame passed to the data argument of gamlssCV().
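To illustrate why the stored call can point at the wrong object, here is a minimal base R sketch. It assumes the call is recorded in a way similar to match.call(); record_call() and wrapper() are hypothetical helpers for illustration, not gamlss functions.

```r
# Hypothetical helper, not part of gamlss: return the call as it was written.
record_call <- function(formula, data) match.call()

# Hypothetical wrapper that forwards its own argument, which is named `data`.
wrapper <- function(formula, data) record_call(formula = formula, data = data)

direct  <- record_call(y ~ x, data = BOD)
wrapped <- wrapper(y ~ x, data = BOD)

direct$data   # the symbol BOD
wrapped$data  # the symbol data, not the data frame that was passed in
```

The recorded call only keeps the argument as written, so anything that later evaluates wrapped$data outside of wrapper() will look for an object called data.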

Minimal reproducible example:

library(gamlss)
m0 <- gamlss(y ~ ri(x.vars = c("x1", "x2", "x3", "x4", "x5", "x6")), data = usair)   ## works like a charm
m1 <- gamlssCV(y ~ ri(x.vars = c("x1", "x2", "x3", "x4", "x5", "x6")), data = usair) ## throws the error

Full traceback:

> m1<- gamlssCV(y~ri(x.vars=c("x1","x2","x3","x4","x5","x6")), data=usair)
fold 1
Error in as.data.frame.default(x[[i]], optional = TRUE) : 
  cannot coerce class ‘"function"’ to a data.frame
> traceback()
16: stop(gettextf("cannot coerce class %s to a data.frame", sQuote(deparse(class(x))[1L])), 
        domain = NA)
15: as.data.frame.default(x[[i]], optional = TRUE)
14: as.data.frame(x[[i]], optional = TRUE)
13: data.frame(eval(substitute(Data)))
12: ri(x.vars = c("x1", "x2", "x3", "x4", "x5", "x6"))
11: eval(predvars, data, env)
10: eval(predvars, data, env)
9: model.frame.default(formula = y ~ ri(x.vars = c("x1", "x2", "x3", 
       "x4", "x5", "x6")), data = data)
8: model.frame(formula = y ~ ri(x.vars = c("x1", "x2", "x3", "x4", 
       "x5", "x6")), data = data)
7: eval(mcall, sys.parent())
6: eval(mcall, sys.parent())
5: gamlss(formula = formula, sigma.formula = sigma.formula, nu.formula = nu.formula, 
       tau.formula = tau.formula, data = data, family = family, 
       control = control, ...)
4: gamlssVGD(formula = formula, sigma.formula = sigma.formula, nu.formula = nu.formula, 
       tau.formula = tau.formula, family = family, control = control, 
       data = data[rand != i, ], newdata = data[rand == i, ], ...)
3: FUN(X[[i]], ...)
2: lapply(i, fn)
1: gamlssCV(y ~ ri(x.vars = c("x1", "x2", "x3", "x4", "x5", "x6")), 
       data = usair)

zeileis commented 5 months ago

Thanks for the report! I cannot help much myself, unfortunately. I can only confirm the bug and add that this is an issue beyond gamlssCV() or gamlssVGD(). It occurs in general when you call gamlss() with ri() inside another function:

gamlss(y ~ ri(x.vars = c("x1", "x2")), data = usair)                         ## ok
fit_gamlss <- function(formula, data) gamlss(formula = formula, data = data) ## wrapper function
fit_gamlss(y ~ ri(x.vars = c("x1", "x2")), data = usair)                     ## fails
## Error in as.data.frame.default(x[[i]], optional = TRUE) : 
##   cannot coerce class '"function"' to a data.frame

And depending on what you use as the data name in the wrapper function you can also get other funny errors:

fit_gamlss <- function(formula, BOD) gamlss(formula = formula, data = BOD)
fit_gamlss(y ~ ri(x.vars = c("x1", "x2")), BOD = usair)
## Error in `[.data.frame`(Data, , x.vars) : undefined columns selected

In any case, ri() looks up the data by the name used inside the wrapper function (data or BOD above), but it does so in the environment outside that function, where the name refers to a different object or to nothing at all. I've looked at the ri() code and I'm not sure how easy it is to remedy this. Mikis @mstasinopoulos ?
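For reference, the lookup problem can be mimicked in base R without gamlss. This is only a sketch under assumptions: call_store, lookup_data(), fit(), wrap_data() and wrap_BOD() are hypothetical stand-ins, and the real ri() code may differ in its details. The built-in cars data set is used so that the example does not depend on usair.

```r
# Hypothetical stand-in for an environment such as gamlss.env.
call_store <- new.env()

# Hypothetical stand-in for a term such as ri(): it re-reads the stored call
# and evaluates its `data` element outside of the calling function.
lookup_data <- function() {
  data_expr <- get("stored_call", envir = call_store)$data
  eval(data_expr, envir = globalenv())
}

# Hypothetical stand-in for the fitting function: it stores its own call.
fit <- function(formula, data) {
  assign("stored_call", match.call(), envir = call_store)
  lookup_data()
}

# Two wrappers that differ only in the name of their data argument.
wrap_data <- function(formula, data) fit(formula = formula, data = data)
wrap_BOD  <- function(formula, BOD)  fit(formula = formula, data = BOD)

class(fit(dist ~ speed, data = cars))        # direct call: the name `cars` resolves correctly
class(wrap_data(dist ~ speed, data = cars))  # the name `data` resolves to utils::data
names(wrap_BOD(dist ~ speed, BOD = cars))    # the name `BOD` resolves to datasets::BOD
```

In the last call the lookup silently returns the built-in BOD data set, which has no speed or dist columns. That would be consistent with the "undefined columns selected" error above.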