casact / chainladder-python

Actuarial reserving in Python
https://chainladder-python.readthedocs.io/en/latest/
Mozilla Public License 2.0

[BUG] wrong triangle origin #505

Closed cladier closed 7 months ago

cladier commented 7 months ago

I'm not getting the behaviour I expect; the example below should speak for itself:

import chainladder as cl
import pandas as pd
from pandas import Timestamp

db = pd.DataFrame({'claimDate': {0: Timestamp('2015-05-22 00:00:00'),
  1: Timestamp('2015-05-22 00:00:00'),
  2: Timestamp('2015-05-22 00:00:00'),
  3: Timestamp('2015-05-22 00:00:00')},
 'ones': {0: 1, 1: 1, 2: 1, 3: 1},
 'viewDate': {0: Timestamp('2015-06-30 00:00:00'),
  1: Timestamp('2015-06-30 00:00:00'),
  2: Timestamp('2015-07-31 00:00:00'),
  3: Timestamp('2015-07-31 00:00:00')}})
db
cl.Triangle(
    db,
    origin="claimDate",
    development="viewDate",
    columns="ones",
    cumulative=False
)

=> Why does the origin start in 2014?

Desktop: pandas: 2.2.1 numpy: 1.24.4 chainladder: 0.8.18

jbogaardt commented 7 months ago

I think this is an issue of ambiguity. The Triangle signature asks for a lot of things already and we aim to keep as many items that can be inferred from the data as optional. The issue we're running into is that we have one accident month with two development periods. With that information, the constructor has to:

  1. Determine what the origin grain is (annual, semi-annual, quarterly, or monthly). The behavior is to assume it's annual.
  2. Determine the desired fiscal period end. The behavior is to assume it's May of each year.

So in this case, it's creating annual accident periods with a fiscal year running from June 2014 through May 2015, and it labels the origin by the beginning of that period. It does this because the trailing argument in the constructor defaults to True, which for this data implies a fiscal period end of May.
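To make the labelling concrete, here is a minimal standard-library sketch of the period arithmetic described above. It is not chainladder's implementation; `fiscal_period` is a hypothetical helper, and the only assumption is that an annual period closes in a given month and is labelled by its first month.

```python
from datetime import date


def fiscal_period(claim_date, fiscal_end_month):
    """Hypothetical helper: return ((start_year, start_month), (end_year, end_month))
    for the annual period containing claim_date when the year closes in fiscal_end_month."""
    # The period closes in the first fiscal_end_month on or after the claim month.
    if claim_date.month <= fiscal_end_month:
        end_year = claim_date.year
    else:
        end_year = claim_date.year + 1
    # The period opens in the month following the previous close.
    start_month = fiscal_end_month % 12 + 1
    start_year = end_year if fiscal_end_month == 12 else end_year - 1
    return (start_year, start_month), (end_year, fiscal_end_month)


claim = date(2015, 5, 22)
may_close = fiscal_period(claim, 5)   # May year end, as inferred with trailing=True here
dec_close = fiscal_period(claim, 12)  # December year end, as with trailing=False
print(may_close)  # ((2014, 6), (2015, 5))
print(dec_close)  # ((2015, 1), (2015, 12))
```

With a May close, the period containing 2015-05-22 opens in June 2014, which is why the origin is labelled 2014; with a December close it opens in January 2015.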


>>> tri = cl.Triangle(
...     db,
...     origin="claimDate",
...     development="viewDate",
...     columns="ones",
...     cumulative=False
... )
dtype: int64
>>> tri.origin
PeriodIndex(['2014', '2015'], dtype='period[A-MAY]', name='origin')
>>> tri.origin_grain
'Y'

You can override the trailing argument and get the desired behavior:

>>> tri = cl.Triangle(
...     db,
...     origin="claimDate",
...     development="viewDate",
...     columns="ones",
...     cumulative=False,
...     trailing=False  # add this to coerce to a December year end
... )
dtype: int64
>>> tri.origin
PeriodIndex(['2015'], dtype='period[A-DEC]', name='origin')
>>> tri
        6    7
2015  2.0  2.0
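
The column headers 6 and 7 above are development ages in months. As a rough sketch (not chainladder's code), they can be reproduced by counting months from the first month of the origin period through the valuation month, inclusive. `development_age_months` is a hypothetical helper, and the inclusive counting convention is an assumption chosen to match the output shown.

```python
from datetime import date


def development_age_months(origin_start, valuation):
    """Hypothetical helper: months from the first month of the origin period
    through the valuation month, counting both ends."""
    return (valuation.year - origin_start.year) * 12 + (valuation.month - origin_start.month) + 1


origin_start = date(2015, 1, 1)  # calendar-year origin 2015 (December year end)
valuations = [date(2015, 6, 30), date(2015, 7, 31)]
ages = [development_age_months(origin_start, v) for v in valuations]
print(ages)  # [6, 7]
```
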
cladier commented 7 months ago

When you don't know that some countries have a fiscal year starting in May, the origin labelling is indeed confusing. Makes sense now, thanks a lot!

jbogaardt commented 7 months ago

Glad this helps. Technically, the default behavior for trailing=True is to assume the latest origin month in your data as the fiscal close. I'm not sure how many companies actually use May globally - probably very few.
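
A minimal sketch of that default, assuming only what is stated above: with trailing=True the fiscal close is taken from the latest origin date in the data, and otherwise a December year end is used. `inferred_fiscal_close` is a hypothetical name for illustration, not a chainladder function.

```python
from datetime import date


def inferred_fiscal_close(origin_dates, trailing=True):
    """Hypothetical helper: month (1-12) assumed to close the fiscal year."""
    if trailing:
        # Use the month of the latest origin date in the data.
        return max(origin_dates).month
    # Otherwise fall back to a calendar (December) year end.
    return 12


claims = [date(2015, 5, 22)] * 4  # the four claim dates from the example
print(inferred_fiscal_close(claims))                  # 5
print(inferred_fiscal_close(claims, trailing=False))  # 12
```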

cladier commented 7 months ago

Hi @jbogaardt, sorry to bother you again, but there's a behaviour I don't quite get, and I suspect it's the same kind of problem:


db = pd.DataFrame({'ones': {14: 1, 15: 1, 16: 1, 17: 1},
 'claimDate': {14: Timestamp('2015-05-22 00:00:00'),
  15: Timestamp('2015-05-22 00:00:00'),
  16: Timestamp('2015-05-22 00:00:00'),
  17: Timestamp('2015-05-22 00:00:00')},
 'viewDate': {14: Timestamp('2016-01-31 00:00:00'),
  15: Timestamp('2016-01-31 00:00:00'),
  16: Timestamp('2016-02-29 00:00:00'),
  17: Timestamp('2016-02-29 00:00:00')}})

cl.Triangle(db,
    origin="claimDate",
    development="viewDate",
    columns="ones",
    cumulative=False,
    trailing=False
)

=> Why are the 4 values grouped together and not split between 2 different development ages?

I've built a bigger table where I view these same two claims at the end of each month, and I sometimes get months that are skipped and rolled into the next one, but I have no clue why this happens.

jbogaardt commented 7 months ago

This is related to #494 and is a result of changes in pandas>=2.2. We've patched the master branch to accommodate them, and it works as intended there. We just need to get a release out on PyPI.

cladier commented 7 months ago

Indeed, just installing from GitHub fixes this. Thanks a lot for all the amazing work!