JonasHarnau / apc

Python Package for Age-Period-Cohort and extended Chain-Ladder Analysis
GNU General Public License v3.0

apc

This package is for age-period-cohort and extended chain-ladder analysis. It allows for model estimation and inference, visualization, misspecification testing, distribution forecasting and simulation. The package covers binomial, (generalized) log-normal, normal, over-dispersed Poisson and Poisson models. What these models have in common is a linear age-period-cohort predictor. The package uses the identification method by Kuang et al. (2008), implemented as described by Nielsen (2015), who also discusses the use of the R package apc, which inspired this package.
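To see why an identification method is needed at all, the following minimal sketch illustrates the well-known identification problem of a linear age-period-cohort predictor. It assumes a predictor of the form alpha[age] + beta[period] + gamma[cohort] + delta with period = age + cohort (0-based indices). All names and numbers are hypothetical and chosen for illustration only; this is not the package's implementation.

```python
# Hypothetical parameter values on a 4 x 4 age-cohort array (0-based indices).
alpha = [0, 1, 3, 2]            # age effects
gamma = [0, -1, 2, 5]           # cohort effects
beta = [0, 2, 1, 4, 3, 5, 2]    # period effects, period = age + cohort
delta = 10                      # level


def predictor(alpha, beta, gamma, delta, age, cohort):
    """Linear predictor for one cell; the period index is age + cohort."""
    return alpha[age] + beta[age + cohort] + gamma[cohort] + delta


def add_levels_and_trend(alpha, beta, gamma, delta, a, b, c, d):
    """Shift the levels by a, b, c and add linear trends with slope d."""
    new_alpha = [x + a + d * i for i, x in enumerate(alpha)]
    new_beta = [x + b - d * j for j, x in enumerate(beta)]
    new_gamma = [x + c + d * k for k, x in enumerate(gamma)]
    return new_alpha, new_beta, new_gamma, delta - a - b - c


def second_differences(x):
    """Second differences x[t] - 2 x[t-1] + x[t-2]."""
    return [x[t] - 2 * x[t - 1] + x[t - 2] for t in range(2, len(x))]


alpha2, beta2, gamma2, delta2 = add_levels_and_trend(
    alpha, beta, gamma, delta, a=1, b=2, c=3, d=4
)

cells = [(i, k) for i in range(len(alpha)) for k in range(len(gamma))]

# The two parameter sets differ, yet they imply the same predictor everywhere.
same_predictor = all(
    predictor(alpha, beta, gamma, delta, i, k)
    == predictor(alpha2, beta2, gamma2, delta2, i, k)
    for i, k in cells
)

# Second differences are unaffected by the added levels and linear trends.
same_curvature = (
    second_differences(alpha) == second_differences(alpha2)
    and second_differences(beta) == second_differences(beta2)
    and second_differences(gamma) == second_differences(gamma2)
)
```

Both same_predictor and same_curvature evaluate to True here: the levels and linear trends of the three time effects cannot be told apart from the data, while their second differences can.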

Latest changes

Version 1.0.2 fixes some bugs introduced by pandas 0.25.0. apc 1.0.2 now requires pandas >=0.24.0. Further, this version refactors some of the unit tests and removes deprecated behavior.

Version 1.0.1 fixes some typos and refactors production code.

Version 1.0.0 adds a number of new features. Among them are

Usage

  1. Import the package: import apc
  2. Set up a model: model = apc.Model()
  3. Attach and format the data: model.data_from_df(pandas.DataFrame)
  4. Plot data
    • Plot data sums: model.plot_data_sums()
    • Plot data heatmaps: model.plot_data_heatmaps()
    • Plot data groups of one time-scale across another: model.plot_data_within()
  5. Fit and evaluate the model
    • Fit a model: model.fit(family, predictor)
    • Plot residuals: model.plot_residuals()
    • Generate ad-hoc identified parameterizations: model.identify()
    • Plot parameter estimates: model.plot_parameters()
    • Fit a deviance table to check for valid reductions: model.fit_table()
  6. Test model for misspecification
    • R test of (generalized) log-normal against over-dispersed Poisson: apc.r_test(pandas.DataFrame, family_null, predictor)
    • Split into sub-models: model.sub_model(age_from_to, per_from_to, coh_from_to)
    • Bartlett test: apc.bartlett_test(sub_models)
    • F test: apc.f_test(model, sub_models)
  7. Form distribution forecasts: model.forecast()
  8. Plot distribution forecasts: model.plot_forecast()
  9. Simulate from the model: model.simulate(repetitions)
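
The core of the steps above might be strung together roughly as follows. This is a sketch under stated assumptions rather than a verified recipe: the family string 'od_poisson_response' and the data_format='CL' keyword are assumptions not spelled out in this README, so check the docstrings of model.fit and model.data_from_df for the values your installed version accepts. The wrapper function name is hypothetical.

```python
def fit_taylor_ashe_sketch():
    """Run the core usage steps on the bundled Taylor and Ashe triangle.

    Returns the model after fitting and forecasting, or None if the apc
    package is not installed.
    """
    try:
        import apc  # step 1: import the package
    except ImportError:
        return None

    model = apc.Model()   # step 2: set up a model
    data = apc.loss_TA()  # bundled run-off triangle, see "Included Data"

    # Step 3: attach and format the data.
    # Assumption: 'CL' is taken to denote a chain-ladder style triangle.
    model.data_from_df(data, data_format='CL')

    # Step 5: fit the model as model.fit(family, predictor).
    # Assumption: the over-dispersed Poisson family is named
    # 'od_poisson_response'.
    model.fit('od_poisson_response', 'AC')

    # Step 7: form distribution forecasts.
    model.forecast()
    return model
```

Calling fit_taylor_ashe_sketch() returns the model object, on which the plotting methods listed above (for example model.plot_residuals() or model.plot_forecast()) can then be called.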

Vignettes

The package includes vignettes that replicate the empirical applications of a number of papers.

Included Data

The following data are included in the package.

Asbestos

These data are for counts of mesothelioma deaths in the UK in age-period space. They may be modeled with a Poisson model with "APC" or "AC" predictor. The data can be loaded by calling apc.asbestos().

Source: Martinez Miranda et al. (2015).

Belgian Lung Cancer

These data include counts of deaths from lung cancer in Belgium in age-period space. This dataset includes a measure for exposure. It can be analyzed using a Poisson model with an "APC", "AC", "AP" or "Ad" predictor. The data can be loaded by calling apc.Belgian_lung_cancer().

Source: Clayton and Schifflers (1987).

Run-off triangle by Barnett and Zehnwirth (2000)

Data for an insurance run-off triangle in cohort-age (accident-development year) space. The data are pre-formatted. They are well known to require a period (calendar year) effect for modeling. They may be modeled with an over-dispersed Poisson model with "APC" predictor. The data can be loaded by calling apc.loss_BZ().

Source: Barnett and Zehnwirth (2000).
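
A period (calendar) effect acts along the diagonals of a run-off triangle. The following sketch shows how the period index follows from the cohort (accident year) and age (development year) indices, and which cells are observed as opposed to forecast targets. It assumes 1-based indices with period = cohort + age - 1 and a hypothetical triangle dimension of 4; the function name is illustrative and not part of the package.

```python
def run_off_cells(n):
    """List (cohort, age, period) for every cell of an n x n run-off square.

    Indices are 1-based, so the first accident year in its first development
    year falls into period 1.
    """
    cells = []
    for cohort in range(1, n + 1):
        for age in range(1, n + 1):
            period = cohort + age - 1
            cells.append((cohort, age, period))
    return cells


n = 4  # hypothetical dimension of the triangle
cells = run_off_cells(n)

# Cells up to the latest calendar period form the observed upper triangle.
observed = [cell for cell in cells if cell[2] <= n]

# Cells in later calendar periods form the lower triangle to be forecast.
future = [cell for cell in cells if cell[2] > n]

# The most recent diagonal: all cells that share the latest observed period.
latest_diagonal = [(cohort, age) for cohort, age, period in observed if period == n]
```

A predictor with only age and cohort effects ("AC") cannot capture anything that is specific to one of these diagonals, which is where a period effect comes in.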

Run-off triangle by Taylor and Ashe (1983)

Data for an insurance run-off triangle in cohort-age (accident-development year) space. The data are pre-formatted. They may be modeled with an over-dispersed Poisson model, for instance with "AC" predictor. The data can be loaded by calling apc.loss_TA().

Source: Taylor and Ashe (1983).
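
For orientation, the classical chain-ladder technique, whose point forecasts an over-dispersed Poisson model with "AC" predictor is known to reproduce, can be sketched in a few lines. The triangle below is a hypothetical toy example with made-up cumulative amounts, not the Taylor and Ashe data, and the functions are illustrative rather than part of the package.

```python
def development_factors(triangle):
    """Volume-weighted chain-ladder factors from a cumulative triangle.

    triangle[i] holds the cumulative amounts observed so far for accident
    year i, so later accident years have shorter rows.
    """
    n = len(triangle)
    factors = []
    for age in range(n - 1):
        # Only rows observed at both `age` and `age + 1` contribute.
        rows = [row for row in triangle if len(row) > age + 1]
        numerator = sum(row[age + 1] for row in rows)
        denominator = sum(row[age] for row in rows)
        factors.append(numerator / denominator)
    return factors


def complete_triangle(triangle):
    """Fill in the unobserved cells by applying the development factors."""
    factors = development_factors(triangle)
    n = len(triangle)
    completed = []
    for row in triangle:
        full = list(row)
        for age in range(len(row) - 1, n - 1):
            full.append(full[-1] * factors[age])
        completed.append(full)
    return completed


# Hypothetical cumulative run-off triangle (made-up numbers).
toy_triangle = [
    [100, 150, 175],
    [110, 165],
    [120],
]

factors = development_factors(toy_triangle)
completed = complete_triangle(toy_triangle)

# Reserve per accident year: forecast ultimate minus latest observed amount.
reserves = [full[-1] - row[-1] for full, row in zip(completed, toy_triangle)]
total_reserve = sum(reserves)
```

The classical technique yields point forecasts only; the distribution forecasts mentioned in this README come from the statistical model.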

Run-off triangle by Verrall et al. (2010)

Data for an insurance run-off triangle of paid amounts (units not reported) in cohort-age (accident-development year) space. The data are from Codan, the Danish subsidiary of Royal & Sun Alliance, and cover a portfolio of third-party liability from motor policies. The time units are years. Apart from the paid amounts, counts for the number of reported claims are available. The paid amounts may be modeled with an over-dispersed Poisson model with "APC" predictor. The data can be loaded by calling apc.loss_VNJ().

Source: Verrall et al. (2010).

Run-off triangle by Kuang and Nielsen (2018)

These US casualty data are from the insurer XL Group. Entries are gross paid and reported loss and allocated loss adjustment expense in 1000 USD. Kuang and Nielsen (2018) consider a generalized log-normal model with "AC" predictor for these data. The data can be loaded by calling apc.loss_KN().

Known Issues

References