timriffe / DemoTools

Tools for the evaluation, adjustment, and standardization of demographic data
https://timriffe.github.io/DemoTools
Other
59 stars 31 forks source link

Strange closeout values from lt_single_mx #247

Closed sarahertog closed 2 years ago

sarahertog commented 3 years ago

Hi Tim! I'm getting some curious results from the lt_single_mx function.

The below two very similar mortality schedules are from Lee-Carter extrapolations (MortCast) for two adjacent time periods (USA females 2034 and 2035). The life table returned for the first looks ok, but the one for the second has implausible ex, especially at oldest ages. I looked through the code, but I haven't been able to detect why two similar schedules would yield such wildly different results. Do you know what's happening here?

mx1 <- c(4.137699e-03, 4.661475e-04, 1.894496e-04, 1.774186e-04, 1.540913e-04, 1.301364e-04, 1.132623e-04, 1.015097e-04, 9.441688e-05, 9.176039e-05, 9.274633e-05, 9.718911e-05, 1.053648e-04, 1.175363e-04, 1.341031e-04, 1.555623e-04, 1.822785e-04, 2.143500e-04, 2.514762e-04, 2.931839e-04, 3.382772e-04, 3.854442e-04, 4.333799e-04, 4.806971e-04, 5.267097e-04, 5.712928e-04, 6.149592e-04, 6.584504e-04, 7.015056e-04, 7.460926e-04, 7.921565e-04, 8.389562e-04, 8.847963e-04, 9.286664e-04, 9.709676e-04, 1.012187e-03, 1.053186e-03, 1.095366e-03, 1.140483e-03, 1.190553e-03, 1.247768e-03, 1.314626e-03, 1.393632e-03, 1.486962e-03, 1.596768e-03, 1.724411e-03, 1.870979e-03, 2.037092e-03, 2.223067e-03, 2.429087e-03, 2.655171e-03, 2.901032e-03, 3.166104e-03, 3.449478e-03, 3.749681e-03, 4.064870e-03, 4.393181e-03, 4.732840e-03, 5.082725e-03, 5.443162e-03, 5.815679e-03, 6.204534e-03, 6.614041e-03, 7.054522e-03, 7.539394e-03, 8.084013e-03, 8.705730e-03, 9.424979e-03, 1.025917e-02, 1.122027e-02, 1.231761e-02, 1.355231e-02, 1.492595e-02, 1.644069e-02, 1.811349e-02, 1.997103e-02, 2.205897e-02, 2.443255e-02, 2.715526e-02, 3.028601e-02, 3.388573e-02, 3.800512e-02, 4.269348e-02, 4.799949e-02, 5.397628e-02, 6.068719e-02, 6.819964e-02, 7.658684e-02, 8.592572e-02, 9.629673e-02, 1.077475e-01, 1.203260e-01, 1.341193e-01, 1.489562e-01, 1.646282e-01, 1.815015e-01, 1.990732e-01, 2.139401e-01, 2.310362e-01, 2.597395e-01, 3.078898e-01)

lt1 <- DemoTools::lt_single_mx(nMx = mx1, Age = 0:100, Sex = "f", OAG = TRUE)

mx2 <- c(4.037911e-03, 4.603860e-04, 1.850905e-04, 1.736494e-04, 1.508987e-04, 1.274294e-04, 1.109161e-04, 9.942305e-05, 9.250199e-05, 8.993885e-05, 9.094943e-05, 9.534979e-05, 1.034152e-04, 1.154050e-04, 1.317199e-04, 1.528649e-04, 1.792262e-04, 2.109428e-04, 2.477606e-04, 2.892609e-04, 3.342809e-04, 3.815155e-04, 4.296428e-04, 4.772308e-04, 5.235438e-04, 5.684095e-04, 6.123032e-04, 6.559399e-04, 6.989331e-04, 7.434080e-04, 7.893061e-04, 8.358813e-04, 8.813417e-04, 9.246434e-04, 9.662472e-04, 1.006642e-03, 1.046691e-03, 1.087800e-03, 1.131728e-03, 1.180499e-03, 1.236310e-03, 1.301667e-03, 1.379077e-03, 1.470712e-03, 1.578717e-03, 1.704437e-03, 1.848943e-03, 2.012841e-03, 2.196435e-03, 2.399911e-03, 2.623297e-03, 2.866325e-03, 3.128447e-03, 3.408747e-03, 3.705721e-03, 4.017438e-03, 4.341920e-03, 4.677275e-03, 5.022257e-03, 5.377158e-03, 5.743494e-03, 6.125575e-03, 6.527342e-03, 6.959537e-03, 7.435586e-03, 7.970684e-03, 8.581952e-03, 9.289587e-03, 1.011072e-02, 1.105717e-02, 1.213818e-02, 1.335487e-02, 1.470893e-02, 1.620252e-02, 1.785255e-02, 1.968544e-02, 2.174642e-02, 2.409031e-02, 2.678014e-02, 2.987459e-02, 3.343456e-02, 3.751099e-02, 4.215364e-02, 4.741181e-02, 5.333928e-02, 5.999997e-02, 6.746201e-02, 7.579953e-02, 8.509072e-02, 9.541821e-02, 1.068324e-01, 1.193850e-01, 1.331676e-01, 1.480110e-01, 1.637074e-01, 1.806291e-01, 1.982678e-01, 2.131747e-01, 2.303442e-01, 2.592926e-01, 3.075283e-01)

lt2 <- DemoTools::lt_single_mx(nMx = mx2, Age = 0:100, Sex = "f", OAG = TRUE)

mx2 - mx1 lt2$ex - lt1$ex

peterdavjohnson commented 3 years ago

Hi Sara and Tim,

It seems that the extrapolation somehow fails to converge. The attached script uses the A and B parameters from lt2 A <- 0.0356595987 B <- 0.0000000151 and we see that the mx values are all equal to 0.0344. The goodness.of.fit values are NaN, NaN, NaN.

For lt1 we get reasonable A and B values (0.00409324, 0.1076407), and the goodness.of.fit values are also NaN, NaN, NaN. The deviance looks better though at 0.0057 compared to 0.3813 for the second one.

We need to check the fitting algorithm and see why small differences cause this big difference.

pj

On Tue, Jul 6, 2021 at 3:13 PM Sara Hertog @.***> wrote:

Hi Tim! I'm getting some curious results from the lt_single_mx function.

The below two very similar mortality schedules are from Lee-Carter extrapolations (MortCast) for two adjacent time periods (USA females 2034 and 2035). The life table returned for the first looks ok, but the one for the second has implausible ex, especially at oldest ages. I looked through the code, but I haven't been able to detect why two similar schedules would yield such wildly different results. Do you know what's happening here?

mx1 <- c(4.137699e-03, 4.661475e-04, 1.894496e-04, 1.774186e-04, 1.540913e-04, 1.301364e-04, 1.132623e-04, 1.015097e-04, 9.441688e-05, 9.176039e-05, 9.274633e-05, 9.718911e-05, 1.053648e-04, 1.175363e-04, 1.341031e-04, 1.555623e-04, 1.822785e-04, 2.143500e-04, 2.514762e-04, 2.931839e-04, 3.382772e-04, 3.854442e-04, 4.333799e-04, 4.806971e-04, 5.267097e-04, 5.712928e-04, 6.149592e-04, 6.584504e-04, 7.015056e-04, 7.460926e-04, 7.921565e-04, 8.389562e-04, 8.847963e-04, 9.286664e-04, 9.709676e-04, 1.012187e-03, 1.053186e-03, 1.095366e-03, 1.140483e-03, 1.190553e-03, 1.247768e-03, 1.314626e-03, 1.393632e-03, 1.486962e-03, 1.596768e-03, 1.724411e-03, 1.870979e-03, 2.037092e-03, 2.223067e-03, 2.429087e-03, 2.655171e-03, 2.901032e-03, 3.166104e-03, 3.449478e-03, 3.749681e-03, 4.064870e-03, 4.393181e-03, 4.732840e-03, 5.082725e-03, 5.443162e-03, 5.815679e-03, 6.204534e-03, 6.614041e-03, 7.054522e-03, 7.539394e-03, 8.084013e-03, 8.705730e-03, 9.424979e-03, 1.025917e-02, 1.122027e-02, 1.231761e-02, 1.355231e-02, 1.492595e-02, 1.644069e-02, 1.811349e-02, 1.997103e-02, 2.205897e-02, 2.443255e-02, 2.715526e-02, 3.028601e-02, 3.388573e-02, 3.800512e-02, 4.269348e-02, 4.799949e-02, 5.397628e-02, 6.068719e-02, 6.819964e-02, 7.658684e-02, 8.592572e-02, 9.629673e-02, 1.077475e-01, 1.203260e-01, 1.341193e-01, 1.489562e-01, 1.646282e-01, 1.815015e-01, 1.990732e-01, 2.139401e-01, 2.310362e-01, 2.597395e-01, 3.078898e-01)

lt1 <- DemoTools::lt_single_mx(nMx = mx1, Age = 0:100, Sex = "f", OAG = TRUE)

mx2 <- c(4.037911e-03, 4.603860e-04, 1.850905e-04, 1.736494e-04, 1.508987e-04, 1.274294e-04, 1.109161e-04, 9.942305e-05, 9.250199e-05, 8.993885e-05, 9.094943e-05, 9.534979e-05, 1.034152e-04, 1.154050e-04, 1.317199e-04, 1.528649e-04, 1.792262e-04, 2.109428e-04, 2.477606e-04, 2.892609e-04, 3.342809e-04, 3.815155e-04, 4.296428e-04, 4.772308e-04, 5.235438e-04, 5.684095e-04, 6.123032e-04, 6.559399e-04, 6.989331e-04, 7.434080e-04, 7.893061e-04, 8.358813e-04, 8.813417e-04, 9.246434e-04, 9.662472e-04, 1.006642e-03, 1.046691e-03, 1.087800e-03, 1.131728e-03, 1.180499e-03, 1.236310e-03, 1.301667e-03, 1.379077e-03, 1.470712e-03, 1.578717e-03, 1.704437e-03, 1.848943e-03, 2.012841e-03, 2.196435e-03, 2.399911e-03, 2.623297e-03, 2.866325e-03, 3.128447e-03, 3.408747e-03, 3.705721e-03, 4.017438e-03, 4.341920e-03, 4.677275e-03, 5.022257e-03, 5.377158e-03, 5.743494e-03, 6.125575e-03, 6.527342e-03, 6.959537e-03, 7.435586e-03, 7.970684e-03, 8.581952e-03, 9.289587e-03, 1.011072e-02, 1.105717e-02, 1.213818e-02, 1.335487e-02, 1.470893e-02, 1.620252e-02, 1.785255e-02, 1.968544e-02, 2.174642e-02, 2.409031e-02, 2.678014e-02, 2.987459e-02, 3.343456e-02, 3.751099e-02, 4.215364e-02, 4.741181e-02, 5.333928e-02, 5.999997e-02, 6.746201e-02, 7.579953e-02, 8.509072e-02, 9.541821e-02, 1.068324e-01, 1.193850e-01, 1.331676e-01, 1.480110e-01, 1.637074e-01, 1.806291e-01, 1.982678e-01, 2.131747e-01, 2.303442e-01, 2.592926e-01, 3.075283e-01)

lt2 <- DemoTools::lt_single_mx(nMx = mx2, Age = 0:100, Sex = "f", OAG = TRUE)

mx2 - mx1 lt2$ex - lt1$ex

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/timriffe/DemoTools/issues/247, or unsubscribe https://github.com/notifications/unsubscribe-auth/APAK5QQ5ZHH2YDGXCNYXS2DTWNIV7ANCNFSM475FMICA .

peterdavjohnson commented 3 years ago

Hi all,

I think the problem might be in the call to optim (line321 of MortalityLaws_main.R). I tried changing to a different optimization method (from 'Nelder-Mead' to 'BFGS') but that didn't work for either input mx set. I don't understand all the parts of optim, but maybe other inputs would be needed if 'BFGS' is used. I also tried passing that (as method='BFGS') in the call to MortalityLaw, but that doesn't work.

My code calling MortalityLaw directly is attached.

pj

On Tue, Jul 6, 2021 at 8:23 PM Peter Johnson @.***> wrote:

Hi Sara and Tim,

It seems that the extrapolation somehow fails to converge. The attached script uses the A and B parameters from lt2 A <- 0.0356595987 B <- 0.0000000151 and we see that the mx values are all equal to 0.0344. The goodness.of.fit values are NaN, NaN, NaN.

For lt1 we get reasonable A and B values (0.00409324, 0.1076407), and the goodness.of.fit values are also NaN, NaN, NaN. The deviance looks better though at 0.0057 compared to 0.3813 for the second one.

We need to check the fitting algorithm and see why small differences cause this big difference.

pj

On Tue, Jul 6, 2021 at 3:13 PM Sara Hertog @.***> wrote:

Hi Tim! I'm getting some curious results from the lt_single_mx function.

The below two very similar mortality schedules are from Lee-Carter extrapolations (MortCast) for two adjacent time periods (USA females 2034 and 2035). The life table returned for the first looks ok, but the one for the second has implausible ex, especially at oldest ages. I looked through the code, but I haven't been able to detect why two similar schedules would yield such wildly different results. Do you know what's happening here?

mx1 <- c(4.137699e-03, 4.661475e-04, 1.894496e-04, 1.774186e-04, 1.540913e-04, 1.301364e-04, 1.132623e-04, 1.015097e-04, 9.441688e-05, 9.176039e-05, 9.274633e-05, 9.718911e-05, 1.053648e-04, 1.175363e-04, 1.341031e-04, 1.555623e-04, 1.822785e-04, 2.143500e-04, 2.514762e-04, 2.931839e-04, 3.382772e-04, 3.854442e-04, 4.333799e-04, 4.806971e-04, 5.267097e-04, 5.712928e-04, 6.149592e-04, 6.584504e-04, 7.015056e-04, 7.460926e-04, 7.921565e-04, 8.389562e-04, 8.847963e-04, 9.286664e-04, 9.709676e-04, 1.012187e-03, 1.053186e-03, 1.095366e-03, 1.140483e-03, 1.190553e-03, 1.247768e-03, 1.314626e-03, 1.393632e-03, 1.486962e-03, 1.596768e-03, 1.724411e-03, 1.870979e-03, 2.037092e-03, 2.223067e-03, 2.429087e-03, 2.655171e-03, 2.901032e-03, 3.166104e-03, 3.449478e-03, 3.749681e-03, 4.064870e-03, 4.393181e-03, 4.732840e-03, 5.082725e-03, 5.443162e-03, 5.815679e-03, 6.204534e-03, 6.614041e-03, 7.054522e-03, 7.539394e-03, 8.084013e-03, 8.705730e-03, 9.424979e-03, 1.025917e-02, 1.122027e-02, 1.231761e-02, 1.355231e-02, 1.492595e-02, 1.644069e-02, 1.811349e-02, 1.997103e-02, 2.205897e-02, 2.443255e-02, 2.715526e-02, 3.028601e-02, 3.388573e-02, 3.800512e-02, 4.269348e-02, 4.799949e-02, 5.397628e-02, 6.068719e-02, 6.819964e-02, 7.658684e-02, 8.592572e-02, 9.629673e-02, 1.077475e-01, 1.203260e-01, 1.341193e-01, 1.489562e-01, 1.646282e-01, 1.815015e-01, 1.990732e-01, 2.139401e-01, 2.310362e-01, 2.597395e-01, 3.078898e-01)

lt1 <- DemoTools::lt_single_mx(nMx = mx1, Age = 0:100, Sex = "f", OAG = TRUE)

mx2 <- c(4.037911e-03, 4.603860e-04, 1.850905e-04, 1.736494e-04, 1.508987e-04, 1.274294e-04, 1.109161e-04, 9.942305e-05, 9.250199e-05, 8.993885e-05, 9.094943e-05, 9.534979e-05, 1.034152e-04, 1.154050e-04, 1.317199e-04, 1.528649e-04, 1.792262e-04, 2.109428e-04, 2.477606e-04, 2.892609e-04, 3.342809e-04, 3.815155e-04, 4.296428e-04, 4.772308e-04, 5.235438e-04, 5.684095e-04, 6.123032e-04, 6.559399e-04, 6.989331e-04, 7.434080e-04, 7.893061e-04, 8.358813e-04, 8.813417e-04, 9.246434e-04, 9.662472e-04, 1.006642e-03, 1.046691e-03, 1.087800e-03, 1.131728e-03, 1.180499e-03, 1.236310e-03, 1.301667e-03, 1.379077e-03, 1.470712e-03, 1.578717e-03, 1.704437e-03, 1.848943e-03, 2.012841e-03, 2.196435e-03, 2.399911e-03, 2.623297e-03, 2.866325e-03, 3.128447e-03, 3.408747e-03, 3.705721e-03, 4.017438e-03, 4.341920e-03, 4.677275e-03, 5.022257e-03, 5.377158e-03, 5.743494e-03, 6.125575e-03, 6.527342e-03, 6.959537e-03, 7.435586e-03, 7.970684e-03, 8.581952e-03, 9.289587e-03, 1.011072e-02, 1.105717e-02, 1.213818e-02, 1.335487e-02, 1.470893e-02, 1.620252e-02, 1.785255e-02, 1.968544e-02, 2.174642e-02, 2.409031e-02, 2.678014e-02, 2.987459e-02, 3.343456e-02, 3.751099e-02, 4.215364e-02, 4.741181e-02, 5.333928e-02, 5.999997e-02, 6.746201e-02, 7.579953e-02, 8.509072e-02, 9.541821e-02, 1.068324e-01, 1.193850e-01, 1.331676e-01, 1.480110e-01, 1.637074e-01, 1.806291e-01, 1.982678e-01, 2.131747e-01, 2.303442e-01, 2.592926e-01, 3.075283e-01)

lt2 <- DemoTools::lt_single_mx(nMx = mx2, Age = 0:100, Sex = "f", OAG = TRUE)

mx2 - mx1 lt2$ex - lt1$ex

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/timriffe/DemoTools/issues/247, or unsubscribe https://github.com/notifications/unsubscribe-auth/APAK5QQ5ZHH2YDGXCNYXS2DTWNIV7ANCNFSM475FMICA .

peterdavjohnson commented 3 years ago

I think that part of the problem may be the default starting parameters for the Kannisto model. I tried running the second life table using the parameters from the first one and it seems to work. I first did this directly using MortalityLaw and then borrowed the parameters from that to do it also using lt_single_mx. Unfortunately, we don't have access to the parameters of the mortality law in the second case.

The default params are in MortalityLaw_models.R line 510

kannisto = c(A = 0.5, B = 0.13)

The values from the first life table are c(.00409, .10764).

The question is how to prevent the error from happening. One possible solution is to have a "better" default starting parameter pair. I know some laws have easy direct solutions for the parameters given a certain number of inputs, so if that is possible maybe they could be used to get the starting parameters.

Sara tried this and found: Indeed, adjusting the starting values for the Kannisto parameters solves the problem.

Using parS = c(A = 0.005, B = 0.13) yields successful life table closure across the full set of projected WPP mx patterns for both males and females. These are roughly the median values of the coefficients fit across this set of life tables. I will add this parS argument to each of the lt_single_mx() calls in the WPP workflow.

sarahertog commented 2 years ago

Hi Tim! Is it possible to return a warning from lt_single_mx() when the MortalityLaw extrapolation fails to converge? Currently no-convergence still returns a populated life table when OAnew == max(Age), but with erroneous nAx, nLx and ex values for the last age group. I've found that using parS = c(A = 0.005, B = 0.13) avoids non-convergence for all WPP19 estimated and projected single-year life tables, but we are concerned that non-converging closeouts could sneak in as we introduce some extreme empirical life tables.

timriffe commented 2 years ago

Hi Sara, sure thing, will work it in. Will modify the lowest level wrapper of MortalityLaws inside DemoTools to detect, warn, and offer a robust default. I think it needs to happen at the lowest level connection in DemoTools because we use extrapolation in different places in the package.

timriffe commented 2 years ago

Hi @sarahertog I've added a test patch to a new branch called extrap. The behavior is now to issue a standard R warning() if the goodness of fit stats for the requested mortality fit turn out to be NaN. We then fit using "gompertz" and parS = c(A = 0.005, B = 0.13) as a backup. It's probably good to let these warnings come to the console as they arise. If I get an OK from anyone on this then I'll bring it to the master branch and document it. So far it seems to work as one would hope for the mx1 and mx2 case you posted here, and with the abridged values that @IvanWilli posted in issue #260

sarahertog commented 2 years ago

Thanks @timriffe ! I will do some testing today on other problem cases and get back to you.

timriffe commented 2 years ago

@sarhertog, In testing with @IvanWilli this seems to be resolved now with some better default behavior. If fitting needs an override (failed convergence) then we re-do with Gompertz mortality and your tested starting values. This is now on the master branch with a minor version increment.