Closed SuzieDunham closed 4 years ago
Hi Suzie,
Can you please post the output of the command dput on your triangle? Example:
dput(triangle)
Thanks, Marco
On Wed, Nov 13, 2019, 11:05 AM SuzieDunham notifications@github.com wrote:
Is it possible to perform as.triangle on a triangle that isn't complete? I have older data where I'm only looking at the most recent ~25 years of development, so the upper left corner of my triangle is empty. When I use as.triangle on the data it seems to start the triangle at the first not null development period, which will then mess up recent accident years of the triangle. Is there an option to have the triangle start at the smallest (or first if data is ordered) origin value and development value? Example triangle for what I'd like to make: [image: image] https://user-images.githubusercontent.com/57721993/68786037-f1ca9a00-05f3-11ea-8b6d-99245706704c.png
This is the triangle that as.triangle would generate from that data: [image: image] https://user-images.githubusercontent.com/57721993/68786382-8cc37400-05f4-11ea-8996-137471a2689e.png
> as.triangle(test1,
+             dev = 'Dev_yr',
+             origin = 'AY',
+             value = 'Value')
      Dev_yr
AY      4  5  6  7  8  3  2  1
  2009  4  4  5  5  5 NA NA NA
  2010  4  4  5  5  5  3 NA NA
  2011  4  4  5  5  5  3  2 NA
  2012  4  4  5  5 NA  3  2  1
  2013  4  4  5 NA NA  3  2  1
  2014  4  4 NA NA NA  3  2  1
  2015  4 NA NA NA NA  3  2  1
  2016 NA NA NA NA NA  3  2  1
  2017 NA NA NA NA NA NA  2  1
  2018 NA NA NA NA NA NA NA  1
> dput(triangle)
structure(c(4L, 4L, 4L, 4L, 4L, 4L, 4L, NA, NA, NA, 4L, 4L, 4L, 4L, 4L, 4L, NA, NA, NA, NA, 5L, 5L, 5L, 5L, 5L, NA, NA, NA, NA, NA, 5L, 5L, 5L, 5L, NA, NA, NA, NA, NA, NA, 5L, 5L, 5L, NA, NA, NA, NA, NA, NA, NA, NA, 3L, 3L, 3L, 3L, 3L, 3L, 3L, NA, NA, NA, NA, 2L, 2L, 2L, 2L, 2L, 2L, 2L, NA, NA, NA, NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Dim = c(10L, 8L), .Dimnames = list(AY = c("2009", "2010", "2011", "2012", "2013", "2014", "2015", "2016", "2017", "2018"), Dev_yr = c("4", "5", "6", "7", "8", "3", "2", "1")), class = c("triangle", "matrix"))
execute this, please
dput(test1)
> dput(test1)
structure(list(AY = c(2009L, 2009L, 2009L, 2009L, 2009L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2013L, 2013L, 2013L, 2013L, 2013L, 2013L, 2014L, 2014L, 2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 2017L, 2017L, 2018L), Dev_yr = c(4L, 5L, 6L, 7L, 8L, 3L, 4L, 5L, 6L, 7L, 8L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 1L, 2L, 1L), Value = c(4L, 4L, 5L, 5L, 5L, 3L, 4L, 4L, 5L, 5L, 5L, 2L, 3L, 4L, 4L, 5L, 5L, 5L, 1L, 2L, 3L, 4L, 4L, 5L, 5L, 1L, 2L, 3L, 4L, 4L, 5L, 1L, 2L, 3L, 4L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 1L, 2L, 1L)), class = "data.frame", row.names = c(NA, -46L))
It looks like the output of the function is as expected (fig. 1 in your first post). Please run this on your machine and check that the output is correct.
` test1 <- structure(list(AY = c(2009L, 2009L, 2009L, 2009L, 2009L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2013L, 2013L, 2013L, 2013L, 2013L, 2013L, 2014L, 2014L, 2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 2017L, 2017L, 2018L), Dev_yr = c(4L, 5L, 6L, 7L, 8L, 3L, 4L, 5L, 6L, 7L, 8L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 1L, 2L, 1L), Value = c(4L, 4L, 5L, 5L, 5L, 3L, 4L, 4L, 5L, 5L, 5L, 2L, 3L, 4L, 4L, 5L, 5L, 5L, 1L, 2L, 3L, 4L, 4L, 5L, 5L, 1L, 2L, 3L, 4L, 4L, 5L, 1L, 2L, 3L, 4L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 1L, 2L, 1L)), class = "data.frame", row.names = c(NA, -46L))
test1
as.triangle(test1, dev = 'Dev_yr', origin = 'AY', value = 'Value')
`
Also, it looks like your initial input (test1) is a data.frame with the following columns: AY, Dev_yr, Value.
So when I run the code you provided this is the output:
      Dev_yr
AY      4  5  6  7  8  3  2  1
  2009  4  4  5  5  5 NA NA NA
  2010  4  4  5  5  5  3 NA NA
  2011  4  4  5  5  5  3  2 NA
  2012  4  4  5  5 NA  3  2  1
  2013  4  4  5 NA NA  3  2  1
  2014  4  4 NA NA NA  3  2  1
  2015  4 NA NA NA NA  3  2  1
  2016 NA NA NA NA NA  3  2  1
  2017 NA NA NA NA NA NA  2  1
  2018 NA NA NA NA NA NA NA  1
And I would have hoped to see the Dev_yr from 1-8 and have NA in the top left of the triangle. When the triangle isn't populating correctly, it then doesn't allow me to calculate link ratios correctly.
> linkratios <- c(attr(ata(triangle), "vwtd"), tail = 1.00)
> round(linkratios, 4)
   4-5    5-6    6-7    7-8    8-3    3-2    2-1   tail
1.0000 1.2500 1.0000 1.0000 0.6000 0.6667 0.5000 1.0000
I'm new to the package - is there another way I should be looking at this?
It is strange, this is what I see on mine:
      Dev_yr
AY      1  2  3  4  5  6  7  8
  2009 NA NA NA  4  4  5  5  5
  2010 NA NA  3  4  4  5  5  5
  2011 NA  2  3  4  4  5  5  5
  2012  1  2  3  4  4  5  5 NA
  2013  1  2  3  4  4  5 NA NA
  2014  1  2  3  4  4 NA NA NA
  2015  1  2  3  4 NA NA NA NA
  2016  1  2  3 NA NA NA NA NA
  2017  1  2 NA NA NA NA NA NA
  2018  1 NA NA NA NA NA NA NA
and, as expected, the link ratios
     1-2      2-3      3-4      4-5      5-6      6-7      7-8     tail
2.000000 1.500000 1.333333 1.000000 1.250000 1.000000 1.000000 1.000000
which come from a standard calculation of link ratios that ignores the NAs.
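For readers who want to check those numbers by hand, here is a minimal base-R sketch of a volume-weighted link ratio that skips NA cells. It rebuilds the correctly ordered triangle from this thread as a plain matrix; the names `tri` and `vwtd` are illustrative, and this is not ChainLadder's own implementation of `ata`.

```r
# Rebuild the correctly ordered example triangle as a plain matrix.
ay   <- 2009:2018
dev  <- 1:8
vals <- c(1, 2, 3, 4, 4, 5, 5, 5)          # value observed at each development age
tri  <- matrix(rep(vals, each = length(ay)), nrow = length(ay),
               dimnames = list(AY = as.character(ay), Dev_yr = as.character(dev)))

# Cells outside the observed band of calendar periods are missing.
calendar <- outer(ay, dev, "+")
tri[calendar < 2013 | calendar > 2019] <- NA

# Volume-weighted factor for each pair of adjacent columns,
# using only the rows where both ages are observed.
vwtd <- sapply(seq_len(ncol(tri) - 1), function(j) {
  both <- !is.na(tri[, j]) & !is.na(tri[, j + 1])
  sum(tri[both, j + 1]) / sum(tri[both, j])
})
names(vwtd) <- paste(colnames(tri)[-ncol(tri)], colnames(tri)[-1], sep = "-")
round(vwtd, 4)
```

The same calculation applied to columns in the order 4, 5, 6, 7, 8, 3, 2, 1 pairs unrelated ages (such as 8-3), which is why the column order matters before computing link ratios.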
Hmmm. I'm using the latest version of RStudio and the ChainLadder package, both downloaded last week. Any recommendations on what to check?
I don't know because everything should be fine.
You can "force" the order of the columns. Once you have your triangle, after the as.triangle function but before calculating the link ratios, try this:
new_triangle <- triangle[,as.character(1:8)]
and then
ata(new_triangle)
That worked to calculate the link ratios, thank you! Is there a similar way to force the order of the triangle columns?
That's exactly what you are doing.
The class triangle is just a matrix with named columns.
By doing new_triangle <- triangle[,as.character(1:8)] you are basically defining a new triangle with the "correct" column order. From now on the triangle you will be working with is called new_triangle.
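As a small illustration of indexing by column name, the sketch below uses a plain base-R matrix with made-up values (`m` and `reordered` are illustrative names, not objects from this thread). Selecting columns with a character vector returns them in the order of that vector, whatever their stored position.

```r
# A 2 x 3 matrix whose columns are stored in the order "3", "1", "2".
m <- matrix(1:6, nrow = 2,
            dimnames = list(AY = c("2017", "2018"), Dev_yr = c("3", "1", "2")))

# Indexing by name picks columns in the requested order.
reordered <- m[, as.character(1:3)]

colnames(m)          # stored order
colnames(reordered)  # requested order
```

Values travel with their names: `reordered["2017", "1"]` is the same cell as `m["2017", "1"]`.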
If you need more information on how named dimensions work have a look here: https://www.rdocumentation.org/packages/base/versions/3.6.1/topics/row%2Bcolnames
Thank you!
I realize this is closed, but I have a few comments.
First, @SuzieDunham 's original data looks like it's already in matrix format, in which case it's already in a form for calculating link ratios -- no "as.triangle" step needed. I entered the data in excel, saved as csv, used read.csv to read it into R (necessarily a data.frame), turned it into a matrix, cleaned it up, and calculated age-to-age factors using the 'ata' function:
> x <- read.csv("test.csv")
> x <- as.matrix(x)
> dimnames(x) <- list(AY=x[,1], c("", 1:8))
> x <- x[,-1]
> library(ChainLadder)
> ata(x)
AY 1-2 2-3 3-4 4-5 5-6 6-7 7-8
2009 NA NA NA 1 1.25 1 1
2010 NA NA 1.333 1 1.25 1 1
2011 NA 1.5 1.333 1 1.25 1 1
2012 2 1.5 1.333 1 1.25 1 NA
2013 2 1.5 1.333 1 1.25 NA NA
2014 2 1.5 1.333 1 NA NA NA
2015 2 1.5 1.333 NA NA NA NA
2016 2 1.5 NA NA NA NA NA
2017 2 NA NA NA NA NA NA
smpl 2 1.5 1.333 1 1.25 1 1
vwtd 2 1.5 1.333 1 1.25 1 1
>
Second, help("as.triangle") motivates that function with the observation that a Triangle obtained from a database will usually be in long format. @marcopark90 's test1 data.frame is in long format, which is why as.triangle gives anything at all. The reason it is not in the expected order (it was in previous versions) is due to ChainLadder relinquishing its dependence on Hadley's now-unsupported package, reshape2, and going with an algorithm based on stats::aggregate, which is well-known for producing results in "unintuitive" order (one of Hadley's reasons for writing reshape2 in the first place):
> as.triangle(test1,
+ dev = 'Dev_yr',
+ origin = 'AY',
+ value = 'Value')
Dev_yr
AY 4 5 6 7 8 3 2 1
2009 4 4 5 5 5 NA NA NA
2010 4 4 5 5 5 3 NA NA
2011 4 4 5 5 5 3 2 NA
2012 4 4 5 5 NA 3 2 1
2013 4 4 5 NA NA 3 2 1
2014 4 4 NA NA NA 3 2 1
2015 4 NA NA NA NA 3 2 1
2016 NA NA NA NA NA 3 2 1
2017 NA NA NA NA NA NA 2 1
2018 NA NA NA NA NA NA NA 1
Hadley dropped his support for reshape2 and moved on to another package, tidyr, with two solutions: spread and pivot_wider. Of the two, the older 'spread' produces "intuitive" results:
> library(tidyr)
> spread(test1, "Dev_yr", "Value")
AY 1 2 3 4 5 6 7 8
1 2009 NA NA NA 4 4 5 5 5
2 2010 NA NA 3 4 4 5 5 5
3 2011 NA 2 3 4 4 5 5 5
4 2012 1 2 3 4 4 5 5 NA
5 2013 1 2 3 4 4 5 NA NA
6 2014 1 2 3 4 4 NA NA NA
7 2015 1 2 3 4 NA NA NA NA
8 2016 1 2 3 NA NA NA NA NA
9 2017 1 2 NA NA NA NA NA NA
10 2018 1 NA NA NA NA NA NA NA
The newer 'pivot_wider' gives results that look like as.triangle:
> pivot_wider(test1, names_from = "Dev_yr", values_from = "Value")
# A tibble: 10 x 9
AY `4` `5` `6` `7` `8` `3` `2` `1`
<int> <int> <int> <int> <int> <int> <int> <int> <int>
1 2009 4 4 5 5 5 NA NA NA
2 2010 4 4 5 5 5 3 NA NA
3 2011 4 4 5 5 5 3 2 NA
4 2012 4 4 5 5 NA 3 2 1
5 2013 4 4 5 NA NA 3 2 1
6 2014 4 4 NA NA NA 3 2 1
7 2015 4 NA NA NA NA 3 2 1
8 2016 NA NA NA NA NA 3 2 1
9 2017 NA NA NA NA NA NA 2 1
10 2018 NA NA NA NA NA NA NA 1
So, at this point in time it seems wise to avoid re-dependence on Hadley code. @marcopark90 's solution of rearranging the columns of as.triangle's matrix into one's desired order seems like the best solution for ChainLadder.
In summary, if the triangle is already a matrix, you are good to go. If the triangle is in long data.frame format, you may have an extra step of reordering the columns. Ultimately, it would be convenient to calculate link ratios (and other ChainLadder algorithms) from a long data.frame without having to first convert it into a matrix -- but that's a different issue.
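For completeness, a long data.frame can also be pivoted into a sorted matrix with base R alone. The sketch below is one possible approach using `tapply`, not what `as.triangle` does internally. It rebuilds a data set with the same shape as `test1` (the name `long` is illustrative) and relies on `tapply` turning numeric grouping columns into factors whose levels are sorted numerically.

```r
# Long-format data with the same pattern as test1: 46 rows, ordered by AY.
long <- expand.grid(AY = 2009:2018, Dev_yr = 1:8)
keep <- long$AY + long$Dev_yr >= 2013 & long$AY + long$Dev_yr <= 2019
long <- long[keep, ]
long$Value <- c(1, 2, 3, 4, 4, 5, 5, 5)[long$Dev_yr]
long <- long[order(long$AY, long$Dev_yr), ]   # first row is AY 2009 at Dev_yr 4

# tapply builds an AY x Dev_yr matrix; combinations with no data become NA.
wide <- with(long, tapply(Value, list(AY = AY, Dev_yr = Dev_yr), sum))
wide
```

The result is a plain matrix with columns "1" to "8" in numeric order. If the development column were stored as character, the levels would sort lexically instead ("10" before "2"), so this only helps when the ages are numeric.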
Or:
within the function as.triangle.data.frame, just before returning, enter the line
matrixTriangle <- matrixTriangle[order(rownames(matrixTriangle)),
order(colnames(matrixTriangle))]
This will put the matrix into a reasonable default order.
After sleeping on it, that order-ing step won't always work with the character row- and column-names. For example, 10 comes between 1 and 2 in
> sort(as.character(1:10))
 [1] "1"  "10" "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"
As long as the origin and development period names are convertible to numeric -- which is usually the case for development "ages" but not always the case with origin labels -- as.numeric would do the trick. That condition could be tested with .allisnumeric in ChainLadder's Triangles.R:
rn <- rownames(matrixTriangle)
if (.allisnumeric(rn)) rn <- as.numeric(rn)
cn <- colnames(matrixTriangle)
if (.allisnumeric(cn)) cn <- as.numeric(cn)
matrixTriangle <- matrixTriangle[order(rn), order(cn)]
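The same idea can be tried outside the package without the non-exported `.allisnumeric`. The helper below is a hypothetical stand-in written for this note (`order_dimnames` and `numeric_order` are not ChainLadder functions): it orders labels numerically when every label converts cleanly to a number and falls back to character order otherwise.

```r
# Hypothetical helper: sort a matrix by its row and column names,
# numerically where possible, alphabetically otherwise.
order_dimnames <- function(m) {
  numeric_order <- function(labels) {
    as_num <- suppressWarnings(as.numeric(labels))
    if (anyNA(as_num)) order(labels) else order(as_num)
  }
  m[numeric_order(rownames(m)), numeric_order(colnames(m)), drop = FALSE]
}

# Made-up example with fractional and two-digit ages in scrambled order.
m <- matrix(1:8, nrow = 2,
            dimnames = list(AY = c("2018", "2017"),
                            Dev = c("10", "2", "1", "0.5")))
sorted <- order_dimnames(m)
colnames(sorted)
rownames(sorted)
```

Because the numbers are used only to compute the ordering, the original labels such as "0.5" are preserved exactly.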
-Dan
Hey Dan,
Thanks for adding some items! I'm working with some actual data in "long" format, and neither of the ordering solutions are working for me. When I try [,as.character(1:30)] or even as.numeric, I get a subscript out of bounds error. Your first ordering solution ordered the columns, but as characters rather than numeric values as you indicated would happen. For your recent solution, .allisnumeric isn't a recognized function. Any suggestions?
-Susan
Hi Suzie,
If you are typing [,as.character(1:30)] and you get the error subscript out of bounds, it means that at least one of the names "1" to "30" is not among the column names of your object, for example because it has fewer than 30 columns.
Can you please post the output of dput(your_data)?
Thank you
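One way to diagnose this kind of error without sharing the data is to compare the requested names against the actual column names. The sketch below uses a made-up matrix with half-year ages (`m`, `wanted`, and `missing_names` are illustrative names) and base R only.

```r
# Made-up matrix whose columns are labelled with half-year ages.
m <- matrix(1:6, nrow = 2,
            dimnames = list(AY = c("2017", "2018"),
                            Dev = c("0.5", "1.5", "2.5")))

wanted <- as.character(1:3)

# Names that were requested but do not exist as columns.
missing_names <- setdiff(wanted, colnames(m))
missing_names

# Indexing with any missing name fails, even though the column count matches.
result <- tryCatch(m[, wanted], error = function(e) conditionMessage(e))
result
```

Here the matrix has three columns and three names are requested, yet the lookup fails because none of "1", "2", "3" is an actual column name.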
Sorry, I should have said 30 was an example. I am unable to share my data. In this case, this is the code I'm running:
n <- length(unique(New_Loss_Combined$AY_AGE))
test <- New_Loss_Combined %>%
  filter(PD_CLM_TYPE == '8', PLCY_TYP_CD == 'P', BOOK_TYP_DESC != 'HCW')
pd_tl_triange <- as.triangle(test, dev = 'AY_AGE', origin = 'AY', value = 'TMLSS_PD_AMT')
nrow(pd_tl_triange)
ncol(pd_tl_triange)
pd_tl_triange[,as.character(1:n)]
my results:
> n
[1] 54
> nrow(pd_tl_triange)
[1] 54
> ncol(pd_tl_triange)
[1] 54
> pd_tl_triange[,as.character(1:n)]
Error in pd_tl_triange[, as.character(1:n)] : subscript out of bounds
It's very weird. Can you post the output of this?
colnames(pd_tl_triange)
> colnames(pd_tl_triange)
 [1] "34.5" "35.5" "36.5" "37.5" "38.5" "39.5" "40.5" "41.5" "42.5" "43.5" "44.5" "45.5" "46.5" "47.5" "48.5" "49.5"
[17] "50.5" "51.5" "52.5" "53.5" "33.5" "32.5" "31.5" "30.5" "29.5" "28.5" "27.5" "26.5" "25.5" "24.5" "23.5" "22.5"
[33] "21.5" "20.5" "19.5" "18.5" "17.5" "16.5" "15.5" "14.5" "13.5" "12.5" "11.5" "10.5" "9.5"  "8.5"  "7.5"  "6.5"
[49] "5.5"  "4.5"  "3.5"  "2.5"  "1.5"  "0.5"
This will solve your issue (the column names run from 0.5 to 53.5 in steps of 1):
pd_tl_triange[, as.character(seq(0.5, 53.5, by = 1))]
However, I really recommend reading about colnames in R.
https://stat.ethz.ch/R-manual/R-devel/library/base/html/colnames.html
Sorry to join the party a little late. Can you please try the development version of ChainLadder on GitHub? I recall that, following the last release on CRAN, I fixed as.triangle for a 'long' data set when the input data had missing values. See commit https://github.com/mages/ChainLadder/commit/718855836f1efc87beef006904018de26e807e6d
@SuzieDunham In case you are wondering how to do that:
library(devtools)
install_github("mages/ChainLadder")
For a non-exported package function, use the three-colon trick:
> ChainLadder:::.allisnumeric(c("12", "24"))
[1] TRUE
I was hoping you might want to fork the package into your own repo, then you could see the code for yourself, and “borrow” as you see fit. The code is in the file Triangles.R, at the very top. My “hope” is that you could suggest a change that suits you, which probably would suit others, and thereby become a ChainLadder contributor. No pressure! :)
Thanks for your assistance. Sadly nothing is working :( I'm either getting a subscript out of bounds error or an incorrectly sorted triangle, so I feel like I'm going to have to abandon trying to use this package. Was just trying to figure out a way to automate generation of a ton of triangles, but I guess I'm stuck with Excel!
Ugh, I'd all but given up but I tried one LAST thing and of course it worked... Though I'm still not thrilled that it had to be this complicated!
Final working and correctly ordered triangle:
pd_tl_triange <- as.triangle(test, dev = 'AY_AGE', origin = 'AY', value = 'TMLSS_PD_AMT')
cn <- colnames(pd_tl_triange)
pd_tl_triange <- pd_tl_triange[, as.character(sort(as.numeric(cn)))]
:) There you have it, a solution for your specific data! Nice work. I don't think it's ChainLadder's goal to handle the most general type of data, but to implement cutting edge reserving algorithms. Are there other capabilities of ChainLadder you want to take advantage of, other than calculating link ratios?