avg_premium calculation

MHaringa / insurancerating

R-package for actuarial pricing


Fitting a GAM for pure premium is not documented, so I'm not sure if it's officially supported. Regardless, I gave it a try.

This may not make a huge difference if most exposures are 1, but I thought there might be a better way to calculate the aggregated pure premium before fitting the GAM.

Currently:

df <- aggregate(list(exposure = df$exposure, pure_premium = df$pure_premium), by = list(x = df$x), FUN = sum, na.rm = TRUE, na.action = NULL)

df$avg_premium <- df$pure_premium / df$exposure

I wonder if it's better to calculate avg_premium for each x by doing sum(pure_premium * exposure) / sum(exposure) instead of sum(pure_premium) / sum(exposure).
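To illustrate the difference, here is a minimal base R sketch on made-up data. It assumes each row's pure_premium is already a per-unit-exposure amount (claim amount divided by exposure); the data frame and the column names avg_unweighted and avg_weighted are hypothetical and not part of the package.

```r
# Hypothetical toy data: two policies per level of x, with partial exposures.
# Assumption: pure_premium is a per-unit-exposure amount for each row.
df <- data.frame(
  x            = c(1, 1, 2, 2),
  exposure     = c(1, 0.5, 1, 0.25),
  pure_premium = c(100, 300, 200, 400)
)

# Current approach: sum(pure_premium) / sum(exposure) for each x
agg <- aggregate(list(exposure = df$exposure, pure_premium = df$pure_premium),
                 by = list(x = df$x), FUN = sum)
agg$avg_unweighted <- agg$pure_premium / agg$exposure

# Proposed approach: sum(pure_premium * exposure) / sum(exposure) for each x
df$weighted <- df$pure_premium * df$exposure
agg_w <- aggregate(list(exposure = df$exposure, weighted = df$weighted),
                   by = list(x = df$x), FUN = sum)
agg_w$avg_weighted <- agg_w$weighted / agg_w$exposure

# Side-by-side comparison of the two results per level of x
print(merge(agg[, c("x", "avg_unweighted")], agg_w[, c("x", "avg_weighted")]))
```

For x = 2 the unweighted result is 600 / 1.25 = 480, which is larger than either policy's pure premium (200 and 400), while the exposure-weighted result is 300 / 1.25 = 240. When every exposure equals 1, both formulas give the same number.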

Separate question: How is the confidence interval in the autoplot for the GAM calculated? It seems to use aggregated data without the underlying data points. Does it use exposure for the CI, or does it look at variability in the underlying data?

Thank you for your message. You are right that the functionality for fitting a GAM for pure premium is still experimental (in the early stages of development). But I agree that it would be better to calculate avg_premium for each x by doing sum(pure_premium * exposure) / sum(exposure) instead of sum(pure_premium) / sum(exposure). I will change this and will also add a remark to the documentation that fitting a GAM for pure premium is still in the early stages of development.

Answer to your separate question: In fact, it should look at the variability in the underlying data. However, in applications to large insurance portfolios the GAM calculations may take considerable time to complete. Therefore, the data is first aggregated by x and then a GAM is fitted. Indeed, the CI is only correct if the data is not aggregated. I will add an argument to the function to choose whether the data should be aggregated or not.
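As a rough illustration of the trade-off described above, the sketch below fits a GAM with mgcv once on data aggregated by x and once on the underlying rows, then compares the standard errors that drive the confidence band. This is not the package's implementation: the simulated portfolio, the Gamma family with log link, the use of exposure as weights, and all object names are assumptions made for the example.

```r
library(mgcv)  # recommended package that ships with R

set.seed(42)
n <- 2000

# Hypothetical portfolio: x is a rating factor (e.g. age), exposure is in (0, 1]
df <- data.frame(
  x        = sample(18:80, n, replace = TRUE),
  exposure = runif(n, min = 0.2, max = 1)
)
mu <- exp(5 + 0.0005 * (df$x - 45)^2)          # assumed true mean pure premium
df$pure_premium <- rgamma(n, shape = 2, rate = 2 / mu)

# Aggregate by x using the exposure-weighted mean of pure_premium
df$weighted <- df$pure_premium * df$exposure
agg <- aggregate(list(exposure = df$exposure, weighted = df$weighted),
                 by = list(x = df$x), FUN = sum)
agg$avg_premium <- agg$weighted / agg$exposure

# Fit on aggregated data (fast) and on the underlying rows (slower)
fit_agg <- gam(avg_premium ~ s(x), family = Gamma(link = "log"),
               weights = exposure, data = agg)
fit_raw <- gam(pure_premium ~ s(x), family = Gamma(link = "log"),
               weights = exposure, data = df)

# Standard errors on the link scale for the same grid of x values
newd   <- data.frame(x = 18:80)
se_agg <- as.numeric(predict(fit_agg, newdata = newd, se.fit = TRUE)$se.fit)
se_raw <- as.numeric(predict(fit_raw, newdata = newd, se.fit = TRUE)$se.fit)

# Ratio of the two standard errors across the grid
print(summary(se_agg / se_raw))
```

A ratio close to 1 across the grid would mean the two fits give similar bands for this simulated data; values away from 1 show how far the aggregated fit's interval drifts from the one based on the underlying rows.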

avg_premium calculation #2