Closed ChrisWaller26 closed 2 years ago
The gamma distribution is a tough one. It might be best to start by testing only the severity data generated from a gamma distribution. I think a brms update added support for constant priors; I will test whether that helps with the gamma initialization problems. For example: loss ~ a1 + s1, a1 ~ 1, s1 ~ 1, nl = T
where a1 has the prior constant(8) and the family is Gamma(link = "log").
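A minimal sketch of that setup, assuming a brms version that supports `constant()` priors. The object names `sev_formula_const` and `sev_prior_const` and the `normal(0, 1)` prior on s1 are illustrative assumptions, not part of the package:

```r
library(brms)

# Non-linear formula: with a log link, log(mean loss) = a1 + s1.
sev_formula_const <- bf(loss ~ a1 + s1, a1 ~ 1, s1 ~ 1, nl = TRUE)

# Fix a1 at 8 so that only s1 is estimated.
# The normal(0, 1) prior on s1 is an assumption made for this sketch.
sev_prior_const <- c(
  prior(constant(8), class = b, nlpar = a1),
  prior(normal(0, 1), class = b, nlpar = s1)
)
```

Both objects would then be passed to `brm()` together with `family = Gamma(link = "log")`. Whether this actually fixes the initialization problem is still to be tested.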
Hi Chris,
I have been playing with the gamma distribution. Given that the gamma is a scale family, one easy solution might be to apply a scale adjustment in the non-linear term. The adjustment can probably be estimated as a function of the mean of loss, since all it needs to do is ensure that the log-likelihood can be evaluated at the initial values. Below is one example for the default inverse link and one for the log link. One additional note: the Stan code generated by brms parameterizes the gamma by its mean and shape, so s1 models the mean of the gamma (through the link function) rather than the scale parameter used by R's gamma functions.
```r
library(brms)
library(dplyr)
library(tibble)

# Simulated severity data: gamma losses whose scale depends on region.
sev_data <- tibble(
  claim_id = seq_len(1e4)
) %>%
  dplyr::mutate(
    region = sample(c("EMA", "USC"), size = dplyr::n(), replace = TRUE),
    scale = dplyr::case_when(
      region == "EMA" ~ 9,
      TRUE ~ 10
    ),
    ded = 0,
    lim_exceed = 0,
    loss = rgamma(dplyr::n(), shape = 0.5, scale = exp(scale))
  )

# Inverse link (default): divide s1 by 1000 to bring it to the scale of 1 / mean.
sev_formula <- bf(
  loss | trunc(lb = ded) + cens(lim_exceed) ~ s1 / 1000,
  s1 ~ 1 + region,
  nl = TRUE
)

# Log link: add 5 to s1 to bring it closer to log(mean).
sev_formula2 <- bf(
  loss | trunc(lb = ded) + cens(lim_exceed) ~ s1 + 5,
  s1 ~ 1 + region,
  nl = TRUE
)

brmsfit <- brm(
  sev_formula,
  data = sev_data,
  family = Gamma(),
  prior = c(
    prior(normal(0, 1), class = b, nlpar = s1)
  ),
  chains = 1,
  iter = 2000,
  warmup = 1000,
  refresh = 50,
  control = list(adapt_delta = 0.8, max_treedepth = 10)
)
brmsfit

brmsfit2 <- brm(
  sev_formula2,
  data = sev_data,
  family = Gamma(link = "log"),
  prior = c(
    prior(normal(0, 1), class = b, nlpar = s1)
  ),
  chains = 1,
  iter = 2000,
  warmup = 1000,
  refresh = 50,
  control = list(adapt_delta = 0.8, max_treedepth = 10)
)
brmsfit2
```
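The parameterization note and the scale adjustment above can be checked in base R. This is only a sketch: the names `log_offset` and `inv_divisor` are hypothetical, and deriving the adjustment from `mean(loss)` is one possible choice rather than anything brms does automatically.

```r
set.seed(1)

# Same severity setup as the EMA region above: shape 0.5, scale exp(9).
shape <- 0.5
scale <- exp(9)
loss <- rgamma(1e5, shape = shape, scale = scale)

# Mean/shape parameterization: mean = shape * scale, so rate = shape / mean.
mu <- shape * scale
rate <- shape / mu

# Both parameterizations describe the same density.
x <- c(100, 1000, 10000)
d_scale <- dgamma(x, shape = shape, scale = scale)
d_mean <- dgamma(x, shape = shape, rate = rate)
all.equal(d_scale, d_mean)

# Data-driven adjustments (assumed approach):
# log link: an additive offset close to log(mean);
# inverse link: a divisor close to the mean.
log_offset <- log(mean(loss))
inv_divisor <- mean(loss)
```

Here `log_offset` and `inv_divisor` play the role of the hand-picked `5` and `1000` in the formulas above; they only need to be close enough for the log-likelihood to be finite at the initial values.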
This is less an issue with this package than with modelling with the Gamma distribution in general.
The brms_freq_sev function technically works for Gamma severity but requires accurate initial values. The model then takes a long time to run, only to produce mostly divergent transitions, even with a very high adapt_delta and max_treedepth. I think it would be worth creating a Poisson-Gamma frequency-severity model directly in Stan, testing it, and then comparing it to the code generated by brms_freq_sev to see if there are any subtle issues causing problems with convergence.
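For that comparison it helps to have data with known parameters. The following is a rough base R sketch that simulates Poisson claim counts and Gamma severities; all names and parameter values (`n_policies`, `lambda`, `sev_shape`, `sev_mean`) are hypothetical and chosen only for illustration.

```r
set.seed(42)

n_policies <- 1000
lambda <- 2                 # assumed expected claim count per policy
sev_shape <- 0.5            # assumed gamma shape
sev_mean <- 0.5 * exp(9)    # assumed gamma mean (shape * scale)

# Frequency: number of claims per policy.
freq <- rpois(n_policies, lambda)

# Severity: one gamma loss per claim, using the mean/shape parameterization.
sev <- data.frame(
  policy_id = rep(seq_len(n_policies), times = freq),
  loss = rgamma(sum(freq), shape = sev_shape, rate = sev_shape / sev_mean)
)

# Empirical checks against the true values.
c(mean_claims = mean(freq), mean_loss = mean(sev$loss))
```

A hand-written Stan model and the brms_freq_sev fit could both be run on `freq` and `sev` and checked against `lambda`, `sev_shape`, and `sev_mean`.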