Plan and implement the replacement counterfactual for wealth quintiles

ld-archer commented 3 years ago

The best idea so far, which avoids many of the issues around the FEM causal pathway and the difficulty of predicting wealth or income, is a 'replacement counterfactual'.

This is where we divide our replenishing population into quintiles based on wealth, then 'replace' (impute) the risk behaviour and health status of one of the wealthier quintiles onto the poorest quintile, and simulate the population. We are assuming that wealth is related to all of these characteristics, so by changing them to look like the wealthier groups we are effectively intervening on wealth without having to use it explicitly as an input to our transition models. Using wealth directly is complicated by the 'causal pathway' of our model: risk behaviours -> chronic disease & disability -> economic outcomes.

This has to be a cohort simulation, as scaling it up to the whole population is much more complex.

First steps:

- [x] Divide replenishing population into quintiles based on wealth
- [x] Generate distribution of chronic disease, disability, and risk behaviours for a richer group (median or above)
- [ ] Think about IPF for re-weighting the low wealth group to values of the high wealth group
  - This can be crude at first, just get something running first and improve it later
  - Will need it to run differently for things like variable type (BMI - continuous, ADLs - ordinal, chronic diseases - binary)
- [x] If IPF is a no go, then research how to do this differently (and talk to Rob)
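
A minimal sketch of the first two steps, written in Python with only the standard library (the project's own analysis is in R, so this is illustrative rather than the actual code). The record layout, the `wealth` key, and both function names are assumptions; `wealth_quint` follows the variable name used later in this thread.

```python
from statistics import quantiles

def assign_wealth_quintiles(people, wealth_key="wealth"):
    """Add a 1-5 'wealth_quint' label (1 = poorest) to each record, in place."""
    # Four cut points split the wealth distribution into five groups.
    cuts = quantiles([p[wealth_key] for p in people], n=5, method="inclusive")
    for p in people:
        # Quintile = 1 + number of cut points strictly below this wealth value.
        p["wealth_quint"] = 1 + sum(p[wealth_key] > c for c in cuts)
    return people

def prevalence(people, var, quints):
    """Mean of a 0/1 variable among people in the given set of quintiles."""
    group = [p for p in people if p["wealth_quint"] in quints]
    return sum(p[var] for p in group) / len(group)
```

Calling `prevalence(people, "smoker", {3, 4, 5})` would then give the target value for a 'median or above' group, where `smoker` is a hypothetical binary column. Continuous (BMI) and ordinal (ADLs) variables would need a different summary than a simple mean of 0/1 flags.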

ld-archer commented 3 years ago

The first step (before doing any of the replacement) is to assess what differences we can see in outcomes for each quintile. This will help to assess how viable this line of inquiry is, and will also give us a framework for assessing our intervention in the future. We therefore need to make sure the wealth_quint variable is properly defined in the model, and that we can look at quintiles as a subgroup in the outputs.

- [x] Add wealth_quint to Vars.cpp & Vars.h
- [x] Add quintiles as a subgroup for outputs
- [x] Create new scenario csv and txt files for wealth interventions
- [x] New R notebook for visualising outputs
  - [x] Generalised functions for visualising prevalences
    - One to plot all 5 quints on 1 plot, take variable of interest as argument
    - One to plot single quint or pair of quints, take both var of interest and subgroup as argument
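
The data-preparation half of those two plotting functions could be sketched as a single helper, shown here in Python rather than the notebook's R. It assumes long-format output rows with hypothetical `year` and `wealth_quint` keys plus a 0/1 column for the variable of interest; plotting itself is left out.

```python
from collections import defaultdict

def prevalence_by_quint(rows, var, quints=(1, 2, 3, 4, 5)):
    """Return {quintile: {year: prevalence of var}} for the chosen quintiles."""
    totals = defaultdict(lambda: [0.0, 0])  # (quint, year) -> [sum, count]
    for r in rows:
        if r["wealth_quint"] in quints:
            cell = totals[(r["wealth_quint"], r["year"])]
            cell[0] += r[var]
            cell[1] += 1
    out = {q: {} for q in quints}
    for (q, year), (total, n) in sorted(totals.items()):
        out[q][year] = total / n
    return out
```

With the default `quints` this feeds the 'all 5 quints on 1 plot' case; passing `quints=(1,)` or `quints=(1, 4)` covers the single-quintile and pair cases with the same code.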

ld-archer commented 3 years ago

Visualising the baseline outputs by wealth quintile has worked out quite nicely:

[Plots: Survival, Disease, Disability]

Going to add a step here and define separate variables for any severe condition (acutely life-threatening, e.g. Cancer, Stroke) and any mild condition (e.g. Diabetes, Arthritis). This decision is based on a similar idea from this paper: Michaud & van Soest 2008, Health and wealth of elderly couples: Causality tests using dynamic panel data models. Will then visualise.
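
As a rough Python sketch of that step: the disease column names below are assumptions, and each tuple holds only the examples named above, not a complete classification. `severeCondition` matches the output name used later in this thread, while `mildCondition` is a hypothetical counterpart.

```python
# Only the examples given in the comment above; the full lists are not specified.
SEVERE = ("cancer", "stroke")
MILD = ("diabetes", "arthritis")

def add_condition_flags(person):
    """Set 0/1 flags for 'any severe' and 'any mild' condition on one record."""
    person["severeCondition"] = int(any(person.get(c, 0) for c in SEVERE))
    person["mildCondition"] = int(any(person.get(c, 0) for c in MILD))
    return person
```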

ld-archer commented 3 years ago

[Plots: Severe Conditions, Mild Conditions]

Can now start looking into IPF for copying the health and risk behaviour status of wealthier quintiles onto the lowest wealth group.

ld-archer commented 3 years ago

The first attempt at this intervention is up and running. Most of the work is in the FEM_R/Wealth_Health_Socioeconomics.Rmd file, which so far mainly replaces the risk behaviour distribution of the poorest quintile (1) with that of the median quintile (3). This will be expanded soon to produce populations that replace more than just risk behaviours (eventually health and disability status too) and to work for different quintiles. Will collect the logic in a function first and generalise it to make this easier.

A note on the method: I originally planned to use IPF, but on looking at the practicalities it didn't seem like the best option for a first pass. The main problem foreseen with IPF is that the seed population would have to be aggregated, and then individualised again after modifying the totals, which I don't think would be a trivial task. Instead I'm using Multiple Imputation for now, and plan to improve or expand on it after testing these first few iterations.
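
To make the aggregation point concrete, here is a small two-way IPF sketch in Python. It is a generic textbook version, not code from this project, and the function name and example numbers are made up for illustration. It takes a seed table of counts and rescales rows and columns in turn until the margins match the targets.

```python
def ipf(seed, row_targets, col_targets, iterations=50):
    """Iterative proportional fitting on a 2-D table given as a list of rows."""
    table = [row[:] for row in seed]  # work on a copy of the seed counts
    for _ in range(iterations):
        # Scale each row so its total matches the row target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Scale each column so its total matches the column target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
    return table

# Hypothetical seed: counts for two binary variables in the poorest quintile.
fitted = ipf([[40.0, 20.0], [25.0, 15.0]], row_targets=[70, 30], col_targets=[75, 25])
```

The result is a table of fractional cell counts rather than a list of people, which is why an extra step would be needed to turn it back into individual records for the simulation.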

ld-archer commented 3 years ago

Have created 3 different replacement scenarios:

risk_quint3: Swapped risk behaviour info from quint 3 (median) onto quint 1 (poorest)
risk_quint4: Swapped risk behaviour info from quint 4 onto quint 1
rdh_quint4: Swapped risk behaviour, disability status, and health status from quint 4 onto quint 1
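
For readers without access to the R document, the three scenarios can be thought of as one function with two arguments: the donor quintile and the set of variables to swap. The Python sketch below uses a simple random donor draw (a hot-deck style replacement), which is not the Multiple Imputation used in the notebook. The variable names in `RISK`, `DISABILITY` and `HEALTH` are assumptions.

```python
import random

RISK = ("smoker", "bmi")      # hypothetical risk behaviour columns
DISABILITY = ("adl",)         # hypothetical disability column
HEALTH = ("anydisease",)      # hypothetical health status column

# Scenario name -> (donor quintile, variables copied onto quintile 1)
SCENARIOS = {
    "risk_quint3": (3, RISK),
    "risk_quint4": (4, RISK),
    "rdh_quint4": (4, RISK + DISABILITY + HEALTH),
}

def replace_from_donor(people, donor_quint, variables, target_quint=1, seed=0):
    """Return a copy of the population with target-quintile values replaced."""
    rng = random.Random(seed)  # fixed seed so a scenario is reproducible
    donors = [p for p in people if p["wealth_quint"] == donor_quint]
    replaced = []
    for p in people:
        new = dict(p)
        if p["wealth_quint"] == target_quint:
            # Copy all variables from one donor so their joint pattern is kept.
            donor = rng.choice(donors)
            for v in variables:
                new[v] = donor[v]
        replaced.append(new)
    return replaced
```

A scenario population would then be built with `replace_from_donor(people, *SCENARIOS["rdh_quint4"])`.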

All three show nice effects in both survival and risk behaviour, to differing degrees. See the R document for the code to generate the replaced populations and visualisations of differences in prevalence of things like survival, anydisease, disability, severeCondition etc. (couldn't upload)

ld-archer / E_FEM