The underlying issue is that a no-intercept (y = mx) model, which is what the standard chain-ladder method fits, is inappropriate for a process in which an actual observation of zero can be followed by a non-zero observation, because a no-intercept model says the following observation MUST be zero. As the OP notes, zeros at beginning ages occur frequently with excess-/re-insurance. By kicking out those beginning-zero observations and estimating the standard error (SE) from only the nonzero data, the practitioner ends up with an understated SE.
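A minimal numeric sketch of that point (made-up values, nothing from the package):

## Under y = m*x (no intercept), a zero observation can never develop away from zero:
m_hat <- 2.5        ## an illustrative age-to-age factor
c_age_12 <- 0       ## observed cumulative value of zero at age 12
m_hat * c_age_12    ## the projected age-24 value is forced to be 0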
I have experimented with replacing actual zeros with a non-zero epsilon, say $1 or 1 euro. SEs blew up, as expected and hoped. Then I experimented with adjusting epsilon until the SE looked more reasonable. Eventually I realized I was replacing one type of actuarial judgment (model error) with another judgment (parameter error) that was more complicated to explain, so I tossed the epsilon approach.
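For concreteness, the epsilon experiment looked roughly like the sketch below (x_tri is the example triangle quoted further down; the value of eps is exactly the judgment call I ended up abandoning):

eps <- 1                        ## e.g. $1 or 1 euro
eps_tri <- x_tri
eps_tri[eps_tri == 0] <- eps    ## replace the actual zero with a small positive epsilon
MackChainLadder(eps_tri)        ## runs, but the SEs are highly sensitive to the choice of eps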
The ability of R to use NA to indicate non-available data is brilliant. That’s why I made sure NAs trigger non-observations when I wrote the Clark methods.
In the long run, I am a proponent of using models with an intercept (y = mx + b) for just this situation. Since beginning zeros tend to occur only at immature ages, those may be considered too-much-work-for-the-effort “edge cases.” None of the models the OP mentions can incorporate an intercept in their current implementation.
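As a rough illustration of what an intercept buys you, one could fit a single development period with base R's lm() (just a sketch, not an existing ChainLadder function; the numbers are the 12-to-24 pairs from the OP's example triangle below):

c12 <- c(20, 0, 50)    ## cumulative at age 12
c24 <- c(80, 60, 90)   ## cumulative at age 24
fit <- lm(c24 ~ c12)   ## y = m*x + b; the intercept lets y be positive even when x is 0
coef(fit)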
By the way, the Mack method can work accurately when the selected average is the all-year-weighted average. However, ChainLadder must be modified to avoid the use of “weights” in that case.
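To illustrate with the 12-to-24 column of the OP's triangle: the all-year volume-weighted factor stays finite because the zero contributes nothing to the denominator, even though the corresponding individual factor is 60/0 = Inf:

c12 <- c(20, 0, 50)
c24 <- c(80, 60, 90)
sum(c24) / sum(c12)    ## 230 / 70, about 3.29, well-defined despite the zero
c24 / c12              ## individual factors: 4, Inf, 1.8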
In summary:
I. When the beginning age contains missing data, use NA instead of zero. The algorithms should be modified to eliminate such non-observations from the analyzed dataset.
II. When the beginning age contains an actual zero, I suggest
Dan Murphy
From: Alex Pax
Sent: Monday, July 18, 2022 1:35 PM
To: mages/ChainLadder
Subject: [mages/ChainLadder] Theoretical Understanding of How Methods Should Handle 0's (Issue #87)
Sorry if this is long-winded and indirect, but I feel like I've encountered behavior that is of general interest to most ChainLadder users, so I thought I would share it here in the hopes of clarifying some of that behavior with the developers.
I've noticed that many of the various ChainLadder methods (MackChainLadder, BootChainLadder, ClarkLDF, ClarkCapeCod, etc.) will completely fail if there is even a single 0 observed in the cumulative triangle, regardless of how big the triangle is. Zeroes in cumulative development triangles are generally uncommon. However, they do appear in certain circumstances, such as excess lines (where development may not appear until after the first development period) or when refining development triangles into shorter origin/development periods such as accident quarters or months.
Presumably this behavior of the methods is driven by the fact that the individual age-to-age factor from that observation to the next age is Inf. In practice, however, I think it is common to ignore these observations when parameterizing actuarial methods.
I've found a couple of workarounds that I outline below. I haven't been able to find a discussion of this online anywhere, so apologies if this conversation is already happening elsewhere. Is there any appetite for formalizing the way each method would handle observed zeroes as a default approach? I could see this becoming a Pandora's box of edge cases (for instance, not all zeroes can be assumed to be the same: what if you have an entire column of zeroes? An entire row? Some other configuration that causes errors?), but it would provide a lot of value, particularly when deploying methods across multiple triangles at a time.
As an example, take this triangle:
library(ChainLadder)  ## needed for as.triangle() and the methods below

x <- data.frame(origin = c(2015, 2015, 2015, 2015,
2016, 2016, 2016,
2017, 2017,
2018),
dev = c(12, 24, 36, 48,
12, 24, 36,
12, 24,
12),
value = c(20, 80, 150, 190,
0, 60, 80,
50, 90,
90))
x_tri <- as.triangle(x)
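One quick way to see where the problem comes from is to look at the individual age-to-age factors (here using ChainLadder's ata() helper): the 2016 row develops from 0 to 60, so its 12-24 factor is 60/0 = Inf.

ata(x_tri)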
Calling any of the four methods mentioned above produces an error. For MackChainLadder, ClarkLDF, and ClarkCapeCod the error output is clearly related to fitting a model to Inf observation(s). For BootChainLadder it's more opaque, but presumably related to passing Inf to the shape parameter of rgamma (you can reproduce the error message with rgamma(1, shape = Inf/Inf), for example).
MackChainLadder(x_tri)
BootChainLadder(x_tri)
ClarkLDF(x_tri, maxage = 72)
ClarkCapeCod(x_tri, maxage = 72, Premium = 50)
The MackChainLadder method has an argument, weights, that can be used to ignore the zeroes. The code below produces a warning but not an error:
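## pmin(., 1) gives the zero cell a weight of 0 and every positive cell a weight of 1: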
MackChainLadder(x_tri, weights = pmin(as.matrix(x_tri), 1))
For the BootChainLadder method, replacing zeroes with NAs seems to work in at least this example:
BootChainLadder(apply(X = x_tri,
## Apply across rows and columns:
MARGIN = c(1,2),
## Replace 0s with NA:
function(x) ifelse(x == 0, as.numeric(NA), x)))
Replacing zeroes with NA also seems to work for ClarkLDF and ClarkCapeCod:
ClarkLDF(apply(X = x_tri,
## Apply across rows and columns:
MARGIN = c(1,2),
## Replace 0s with NA:
function(x) ifelse(x == 0, as.numeric(NA), x)),
maxage = 72)
ClarkCapeCod(apply(X = x_tri,
## Apply across rows and columns:
MARGIN = c(1,2),
## Replace 0s with NA:
function(x) ifelse(x == 0, as.numeric(NA), x)),
maxage = 72,
Premium = 50)
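Since a triangle is stored as a matrix, the same zero-to-NA replacement could also be written once as a small helper (zero_to_na is just an illustrative name, not a ChainLadder function), which keeps the three calls shorter:

zero_to_na <- function(tri) {
  tri[tri == 0] <- NA   ## indexed assignment keeps the triangle's dimensions and class
  tri
}
BootChainLadder(zero_to_na(x_tri))
ClarkLDF(zero_to_na(x_tri), maxage = 72)
ClarkCapeCod(zero_to_na(x_tri), maxage = 72, Premium = 50)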
@trinostics Thanks for your thoughtful response. That all makes sense to me. Please feel free to close this issue.
It seems like the Brosius paper (https://www.casact.org/sites/default/files/database/studynotes_brosius6.pdf) would be a good method to implement for that y = mx + b framework you mentioned. When I finally finish taking exams, I'd love to be able to contribute to this package, and it seems like that could be a good place to start.
Closing per my comment above. Thanks for your response on this.
Alex
Thank you for the 1993 Brosius reference. Do you know where the CAS published it? Proceedings? Forum? I could not find it at https://www.casact.org/publications-research/library
Curious that it was written the year before my paper “Unbiased Loss Development Factors” was published in the Proceedings. His Least Squares Method is Model I in my paper.
Two observations:
Best of luck with your exams. I hope your interest in ChainLadder continues. Please stay in touch.
Dan
The Brosius paper is the first item in the current Exam 7 text references here: https://www.casact.org/exam/exam-7-estim-liabilities-valuation-erm. I think for many people this exam is the source, but someone who's involved with the exam committee might have better insight.