ChrisWaller26 / bayesact

A combined frequency-severity model in BRMS which allows for left-censoring at deductibles.

Model works poorly for Gamma severity #31

Closed ChrisWaller26 closed 2 years ago

ChrisWaller26 commented 3 years ago

The brms_freq_sev function technically works for Gamma severity but requires accurate initial values. The model then takes a long time to run, only to produce mostly divergent transitions, even with a very high adapt_delta and max_treedepth. I think it would be worth writing a Poisson-Gamma frequency-severity model directly in Stan, testing it, and then comparing it to the code generated by brms_freq_sev to see whether any subtle issues are causing the convergence problems.

Atan1988 commented 3 years ago

The Gamma distribution is a tough one. It might be best to start by testing only severity data generated from a Gamma distribution. I think a brms update added support for constant priors. I will test whether this helps with the Gamma initialization problems. For example: loss ~ a1 + s1, a1 ~ 1, s1 ~ 1, nl = T

where a1 can have the prior constant(8), using Gamma(link = "log").
Atan1988 commented 3 years ago

Hi Chris,

I have been playing with the Gamma distribution. Given that the Gamma is a scale family, one easy solution might be to apply a scale adjustment in the non-linear term. The adjustment can probably be estimated as a function of the mean loss, since all it needs to do is ensure that the log-likelihood can be evaluated at the initial values. Below is one example for the default inverse link and one for the log link. One additional note: it seems that brms parameterizes the Gamma with mean and shape parameters, so s1 relates to the mean of the Gamma rather than to the scale parameter used by R's gamma functions.

```r
library(dplyr)  # also provides tibble() and %>%
library(brms)

# Simulate Gamma severities whose scale differs by region
sev_data <- tibble(claim_id = seq_len(1e4)) %>%
  dplyr::mutate(
    region = sample(c("EMA", "USC"), size = dplyr::n(), replace = TRUE),
    scale = dplyr::case_when(region == "EMA" ~ 9, TRUE ~ 10),
    ded = 0,
    lim_exceed = 0,
    loss = rgamma(dplyr::n(), shape = 0.5, scale = exp(scale))
  )

# Inverse link: divide s1 by 1000 so it stays near zero
sev_formula <- bf(
  loss | trunc(lb = ded) + cens(lim_exceed) ~ s1 / 1000,
  s1 ~ 1 + region,
  nl = TRUE
)

# Log link: add 5 so s1 stays near zero
sev_formula2 <- bf(
  loss | trunc(lb = ded) + cens(lim_exceed) ~ s1 + 5,
  s1 ~ 1 + region,
  nl = TRUE
)

brmsfit <- brm(
  sev_formula,
  data = sev_data,
  family = Gamma(),
  prior = c(prior(normal(0, 1), class = b, nlpar = s1)),
  chains = 1, iter = 2000, warmup = 1000, refresh = 50,
  control = list(adapt_delta = 0.8, max_treedepth = 10)
)
brmsfit

brmsfit2 <- brm(
  sev_formula2,
  data = sev_data,
  family = Gamma(link = "log"),
  prior = c(prior(normal(0, 1), class = b, nlpar = s1)),
  chains = 1, iter = 2000, warmup = 1000, refresh = 50,
  control = list(adapt_delta = 0.8, max_treedepth = 10)
)
brmsfit2
```
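
To make the parameterization note concrete, here is a rough worked conversion between R's shape/scale form and the mean/shape form, applied to the simulated data above. It assumes the Gamma family models the mean $\mu$ with a separate shape $\alpha$, and that `EMA` is the reference level of `region`. Treat the numbers as approximate targets for the s1 coefficients, not as fitted results.

```latex
% R: rgamma(n, shape = \alpha, scale = \theta); rate \beta = 1 / \theta
\mu = \mathbb{E}[X] = \alpha\,\theta, \qquad \beta = \frac{1}{\theta} = \frac{\alpha}{\mu}

% Simulated data: \alpha = 0.5, \theta = e^{9} (EMA) or e^{10} (USC)
\mu_{\text{EMA}} = 0.5\,e^{9} \approx 4051.5, \qquad \mu_{\text{USC}} = 0.5\,e^{10} \approx 11013.2

% Log link with offset 5: \log \mu = s_1 + 5
s_1^{\text{EMA}} = 9 + \log 0.5 - 5 \approx 3.31, \qquad s_1^{\text{USC}} - s_1^{\text{EMA}} = 1

% Inverse link with scaling: 1 / \mu = s_1 / 1000
s_1^{\text{EMA}} = \frac{1000}{\mu_{\text{EMA}}} \approx 0.247, \qquad s_1^{\text{USC}} = \frac{1000}{\mu_{\text{USC}}} \approx 0.091
```

In both cases the adjustment brings s1 to within a few units of zero, which is roughly where Stan's default initial values (drawn between -2 and 2 on the unconstrained scale) start the sampler.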

ChrisWaller26 commented 2 years ago

This is less an issue with this package than with modelling the Gamma distribution in general.