IALSA / ialsa-2016-amsterdam

Multi-study and multivariate evaluation of healthy life expectancy (HLE): An IALSA workshop on multistate modeling using R
GNU General Public License v2.0

Addressing missing states #14

Open andkov opened 8 years ago

andkov commented 8 years ago

We cannot rule that a person is dead only because the age is not available for a given visit. The respondent may have skipped one or more waves. Some individuals exhibit a pattern of age responses that is inconsistent with such a ruling:

> t(d[28,c(paste0("age_at_visit_",0:16),"age_death")])
                      28
age_at_visit_0  73.86995
age_at_visit_1  74.72416
age_at_visit_2  75.67967
age_at_visit_3  76.82136
age_at_visit_4  77.84257
age_at_visit_5  78.85558
age_at_visit_6  79.79740
age_at_visit_7  80.83231
age_at_visit_8  82.02053
age_at_visit_9        NA
age_at_visit_10 83.63039
age_at_visit_11 84.64613
age_at_visit_12       NA
age_at_visit_13       NA
age_at_visit_14       NA
age_at_visit_15       NA
age_at_visit_16       NA
age_death             NA
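
To find everyone with this kind of pattern (a missing age followed by a later observed age), something like the sketch below might work. It is only an illustration: `d_demo`, `age_cols`, and `has_interior_gap()` are made-up names, and the toy data uses four waves instead of the real 17.

```r
# Hypothetical wide data: one row per person, one age column per wave.
d_demo <- data.frame(
  id             = c(1, 2),
  age_at_visit_0 = c(73.9, 81.0),
  age_at_visit_1 = c(NA,   82.1),
  age_at_visit_2 = c(75.7, NA),
  age_at_visit_3 = c(NA,   NA)
)
age_cols <- paste0("age_at_visit_", 0:3)

# TRUE when at least one missing age is followed by an observed age,
# i.e. the person skipped a wave and then came back.
has_interior_gap <- function(ages) {
  observed <- !is.na(ages)
  if (!any(observed)) return(FALSE)
  last_obs <- max(which(observed))
  any(!observed[seq_len(last_obs)])
}

d_demo$interior_gap <- apply(d_demo[, age_cols], 1, has_interior_gap)
d_demo[, c("id", "interior_gap")]
```

In the toy data, person 1 skips wave 1 and returns at wave 2, so that row is flagged; person 2 only has trailing missing ages and is not.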

andkov commented 8 years ago

Definitions of the missing states:

-2

-1

andkov commented 8 years ago

Here's an example of the multi-state variable encoding that demonstrates the decisions regarding the two types of missing states (-1 and -2):

> dl %>% dplyr::filter(id %in% c(2136155))
        id msex educ smoke_bl alco_life age_death time_point age_at_visit alive mmse state
1  2136155    0   16        0         1        NA          0     73.86995     1   30     1
2  2136155    0   16        0         1        NA          1     74.72416     1   28     1
3  2136155    0   16        0         1        NA          2     75.67967     1   30     1
4  2136155    0   16        0         1        NA          3     76.82136     1   30     1
5  2136155    0   16        0         1        NA          4     77.84257     1   29     1
6  2136155    0   16        0         1        NA          5     78.85558     1   30     1
7  2136155    0   16        0         1        NA          6     79.79740     1   30     1
8  2136155    0   16        0         1        NA          7     80.83231     1   30     1
9  2136155    0   16        0         1        NA          8     82.02053     1   29     1
10 2136155    0   16        0         1        NA          9           NA     1   NA    -2
11 2136155    0   16        0         1        NA         10     83.63039     1   29     1
12 2136155    0   16        0         1        NA         11     84.64613     1   29     1
13 2136155    0   16        0         1        NA         12           NA     1   NA    -2
14 2136155    0   16        0         1        NA         13           NA     1   NA    -2
15 2136155    0   16        0         1        NA         14           NA     1   NA    -2
16 2136155    0   16        0         1        NA         15           NA     1   NA    -2
17 2136155    0   16        0         1        NA         16           NA     1   NA    -2
> dl %>% dplyr::filter(id %in% c(33027))
      id msex educ smoke_bl alco_life age_death time_point age_at_visit alive mmse state
1  33027    0   14        0         0        NA          0     81.00753     1   29     1
2  33027    0   14        0         0        NA          1     82.13552     1   NA    -1
3  33027    0   14        0         0        NA          2           NA     1   NA    -2
4  33027    0   14        0         0        NA          3           NA     1   NA    -2
5  33027    0   14        0         0        NA          4           NA     1   NA    -2
6  33027    0   14        0         0        NA          5           NA     1   NA    -2
7  33027    0   14        0         0        NA          6           NA     1   NA    -2
8  33027    0   14        0         0        NA          7           NA     1   NA    -2
9  33027    0   14        0         0        NA          8           NA     1   NA    -2
10 33027    0   14        0         0        NA          9           NA     1   NA    -2
11 33027    0   14        0         0        NA         10           NA     1   NA    -2
12 33027    0   14        0         0        NA         11           NA     1   NA    -2
13 33027    0   14        0         0        NA         12           NA     1   NA    -2
14 33027    0   14        0         0        NA         13           NA     1   NA    -2
15 33027    0   14        0         0        NA         14           NA     1   NA    -2
16 33027    0   14        0         0        NA         15           NA     1   NA    -2
17 33027    0   14        0         0        NA         16           NA     1   NA    -2
> dl %>% dplyr::filter(id %in% c(2817047))
        id msex educ smoke_bl alco_life age_death time_point age_at_visit alive mmse state
1  2817047    1   20        0       4.5  92.30664          0     89.51677     1   21     3
2  2817047    1   20        0       4.5  92.30664          1     90.57084     1   17     3
3  2817047    1   20        0       4.5  92.30664          2     91.49076     1   12     3
4  2817047    1   20        0       4.5  92.30664          3           NA     0   NA     4
5  2817047    1   20        0       4.5  92.30664          4           NA     0   NA     4
6  2817047    1   20        0       4.5  92.30664          5           NA     0   NA     4
7  2817047    1   20        0       4.5  92.30664          6           NA     0   NA     4
8  2817047    1   20        0       4.5  92.30664          7           NA     0   NA     4
9  2817047    1   20        0       4.5  92.30664          8           NA     0   NA     4
10 2817047    1   20        0       4.5  92.30664          9           NA     0   NA     4
11 2817047    1   20        0       4.5  92.30664         10           NA     0   NA     4
12 2817047    1   20        0       4.5  92.30664         11           NA     0   NA     4
13 2817047    1   20        0       4.5  92.30664         12           NA     0   NA     4
14 2817047    1   20        0       4.5  92.30664         13           NA     0   NA     4
15 2817047    1   20        0       4.5  92.30664         14           NA     0   NA     4
16 2817047    1   20        0       4.5  92.30664         15           NA     0   NA     4
17 2817047    1   20        0       4.5  92.30664         16           NA     0   NA     4

wibeasley commented 8 years ago

Regarding participant 2136155, there are a few dplyr or zoo rolling tricks that can take care of the -1 on row 10. base::cummax might be one way. Tell me if you want some help.
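
A rough sketch of the `base::cummax` idea, on toy long-format data. The names `dl_demo`, `observed_at_or_after()`, `seen_later`, `interior_gap`, and `trailing_gap` are invented for illustration, and which missing-state code each kind of gap should receive is left open here:

```r
# Hypothetical long data: one row per person per wave.
dl_demo <- data.frame(
  id           = rep(c(1, 2), each = 5),
  time_point   = rep(0:4, times = 2),
  age_at_visit = c(73.9, NA, 75.7, NA, NA,
                   81.0, 82.1, NA, NA, NA)
)

# For one person's visits (ordered by time_point): 1 where an age is
# observed at this wave or any later wave, 0 otherwise.
# Reversing, taking the running maximum, and reversing back turns
# "observed here" into "observed here or later".
observed_at_or_after <- function(age) {
  rev(cummax(rev(as.integer(!is.na(age)))))
}

dl_demo <- dl_demo[order(dl_demo$id, dl_demo$time_point), ]
dl_demo$seen_later <- ave(
  dl_demo$age_at_visit, dl_demo$id,
  FUN = observed_at_or_after
) == 1

# Missing age, but the person shows up again later: a skipped wave.
dl_demo$interior_gap <- is.na(dl_demo$age_at_visit) & dl_demo$seen_later
# Missing age and never seen again: dropout, death, or end of follow-up.
dl_demo$trailing_gap <- is.na(dl_demo$age_at_visit) & !dl_demo$seen_later
```

With the two flags in place, the skipped wave (like row 10 for 2136155) can be recoded separately from the run of missing visits at the end of follow-up. The same grouping could presumably be done with `dplyr::group_by(id)` and `dplyr::mutate()` instead of `ave()`.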