refunders / refund

Regression with functional data
39 stars 23 forks source link

Help for family argument in pffr #113

Closed iandanilevicz closed 7 months ago

iandanilevicz commented 7 months ago

Hi,

I am having trouble with the function pffr. I have some data with a discrete output, so I wish to use a binomial or zero-inflated distribution. I also have another dataset with another positive continuous output, and a Gamma distribution should be more appropriate than Gaussian. Anyway, I can't change the argument family in the function pffr. The help is sent to the family.mgcv (http://127.0.0.1:45423/help/library/mgcv/help/family.mgcv), which provides two families: Tweedie and negbin. But none of them works as I expected. For example:

  1. Discrete output pffr_adj_fit3 = pffr(MIMS ~ age + gender + BMI, data = nhanes_ml, family="negbin") pffr_adj_fit3 = pffr(MIMS ~ age + gender + BMI, data = nhanes_ml, family=nb(theta = NULL, link = "log")) pffr_adj_fit3 = pffr(MIMS ~ age + gender + BMI, data = nhanes_ml, family=nb(link="sqrt")) pffr_adj_fit3 = pffr(MIMS ~ age + gender + BMI, data = nhanes_ml, family="binomial")

  2. Positive and continuous output pffr_adj_fit2 = pffr(MIMS ~ age + gender + BMI, data = nhanes_ave, family="tw") pffr_adj_fit2 = pffr(MIMS ~ age + gender + BMI, data = nhanes_ave, family=tw()) pffr_adj_fit2 = pffr(MIMS ~ age + gender + BMI, data = nhanes_ave, family=gammals(link=list("identity","log"),b=-7))

Somebody can provide a list of valid arguments for family except family = "gaulss"? Thank you so much!

fabian-s commented 7 months ago

Thx & sorry about that. Fixed in latest patch - please install with

devtools::install_github("refunders/refund", ref="devel")

fabian-s commented 7 months ago
library(refund) 
library(mgcv)
#> Loading required package: nlme
#> This is mgcv 1.9-0. For overview type 'help("mgcv-package")'.
data <- pffrSim(n = 100, scenario = "lin")
# extract linear predictor:
eta <- attr(data, "truth")$eta
# generate poisson response
data$YPois <- rpois(n = prod(dim(eta)), lambda = exp(eta)) |> matrix(nrow = 100, ncol = 60)
# fit poisson model:
pffr(YPois ~ xlin, family = poisson(link = "log"), data = data)
#> 
#> Family: poisson 
#> Link function: log 
#> 
#> Formula:
#> YPois ~ s(x = yindex.vec, bs = "ps", k = 20, m = c(2, 1)) + s(yindex.vec, 
#>     by = xlin, bs = "ps", k = 5, m = c(2, 1))
#> 
#> Estimated degrees of freedom:
#> 18.3  5.0  total = 24.28 
#> 
#> REML score: 14441.7
# fit tweedie model:
pffr(YPois ~ xlin, family = mgcv::tw(), data = data)
#> 
#> Family: Tweedie(p=1.01) 
#> Link function: log 
#> 
#> Formula:
#> YPois ~ s(x = yindex.vec, bs = "ps", k = 20, m = c(2, 1)) + s(yindex.vec, 
#>     by = xlin, bs = "ps", k = 5, m = c(2, 1))
#> 
#> Estimated degrees of freedom:
#> 18.3  5.0  total = 24.27 
#> 
#> REML score: 11644.24
# fit negbin model:
pffr(YPois ~ xlin, family = mgcv::nb(), data = data)
#> 
#> Family: Negative Binomial(126404.533) 
#> Link function: log 
#> 
#> Formula:
#> YPois ~ s(x = yindex.vec, bs = "ps", k = 20, m = c(2, 1)) + s(yindex.vec, 
#>     by = xlin, bs = "ps", k = 5, m = c(2, 1))
#> 
#> Estimated degrees of freedom:
#> 18.3  5.0  total = 24.28 
#> 
#> REML score: 14441.71

Created on 2024-04-15 with reprex v2.0.2

iandanilevicz commented 7 months ago

Thank yo @fabian-s ! family = mgcv::tw() works pretty well!