ld-archer / E_FEM

This is the repository for the English version of the Future Elderly Model, originally developed at the Leonard D. Schaeffer Center for Health Policy and Microsimulation.
MIT License

Generate a variable for alcohol consumption at age 50/51 #91

Closed by ld-archer 1 year ago

ld-archer commented 2 years ago

After speaking to Bryan about my idea of using the most common alcohol consumption group from a person's history, he suggested generating a variable for a person's consumption group at age 50/51, meant to represent their consumption just before joining the survey. Another benefit of including this variable is that it anchors people to a certain level, which should help to counteract the regression to the mean that we see in predictions of alcohol consumption. This variable could be used either in place of the most common group or alongside it. Testing is required to figure out which combination is best.

For people who have data from age 50/51, this is simply a matter of assigning their first value to this variable. However, for those who join the survey at a later age, we would need to do some backcasting to generate this variable. The way Bryan suggested doing this is something they have done before for BMI in the US FEM.

First, we would look at 5-10 year birth cohorts (starting with 5 year cohorts), and check their consumption level at their first observation in ELSA. Then we would look at percentiles, and assume that someone who is in a specific percentile would stay there throughout their life. For example, someone who is in the 40th percentile of consumption at age 80 would have been in the 40th percentile when aged 50. This is a massive assumption, and would need to be properly addressed when writing this up. Then we can assign a consumption level, and therefore a consumption group, based on these percentiles, which can be included in the predictive models for alcstat and many chronic diseases to provide a 'consumption history' of sorts.
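
A rough sketch of the percentile idea above, in plain Python rather than the model's own code. Everything here is an assumption for illustration: the record fields (`id`, `birth_year`, `first_age`, `first_units`), the function names, the mid-rank percentile, and in particular the choice to map late joiners onto a pooled distribution of the values actually observed at age 50/51. The issue does not say which reference distribution should be used, and the thresholds for turning a level into a consumption group are not given, so that step is left out.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def cohort_of(birth_year, width=5):
    """Start year of the birth cohort containing birth_year (hypothetical helper)."""
    return birth_year - birth_year % width


def percentile_rank(value, sorted_values):
    """Mid-rank percentile (between 0 and 1) of value within sorted_values."""
    lo = bisect_left(sorted_values, value)
    hi = bisect_right(sorted_values, value)
    return (lo + hi) / (2 * len(sorted_values))


def quantile(sorted_values, p):
    """Value found at percentile p (between 0 and 1) of sorted_values."""
    idx = min(int(p * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def backcast_age50(people, width=5):
    """Return {id: consumption at age 50/51}, observed or backcast.

    Assumption: a person keeps the same percentile for life, so their
    percentile within their birth cohort at first observation is mapped
    onto the pooled distribution observed at age 50/51.
    """
    result = {}
    late_by_cohort = defaultdict(list)

    # People first seen at 50/51 simply keep their first observed value.
    for p in people:
        if p["first_age"] in (50, 51):
            result[p["id"]] = p["first_units"]
        else:
            late_by_cohort[cohort_of(p["birth_year"], width)].append(p)

    # Pooled reference distribution of observed age 50/51 consumption.
    reference = sorted(result.values())
    if not reference:
        raise ValueError("no one observed at age 50/51 to build a reference")

    # Late joiners: rank within their cohort, then read off the reference.
    for members in late_by_cohort.values():
        cohort_values = sorted(m["first_units"] for m in members)
        for m in members:
            rank = percentile_rank(m["first_units"], cohort_values)
            result[m["id"]] = quantile(reference, rank)
    return result


people = [
    {"id": "A", "birth_year": 1952, "first_age": 50, "first_units": 10},
    {"id": "B", "birth_year": 1953, "first_age": 51, "first_units": 20},
    {"id": "C", "birth_year": 1951, "first_age": 50, "first_units": 0},
    {"id": "D", "birth_year": 1950, "first_age": 50, "first_units": 30},
    {"id": "E", "birth_year": 1930, "first_age": 72, "first_units": 2},
    {"id": "F", "birth_year": 1931, "first_age": 71, "first_units": 8},
    {"id": "G", "birth_year": 1932, "first_age": 70, "first_units": 5},
]
backcast = backcast_age50(people)
```

With this toy data, the three late joiners share the 1930 cohort, and the lightest, middle and heaviest drinkers among them are assigned low, middle and high values from the age 50/51 reference respectively. Ranking within the cohort at first observation ignores the fact that cohort members join at different ages, which is part of the same massive assumption noted above.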

Special consideration probably needs to go to people with a chronic disease, whose consumption levels we have observed to change massively after diagnosis. We therefore need to check those who already have a chronic disease at their first ELSA observation, as this method may be inaccurate for them.

ld-archer commented 2 years ago

The first implementation of this idea is in the branch 91-alcohol_consumption_fvar50. It still has lots of problems, mostly around how to handle the sick-quitters (those diagnosed with a chronic illness who quit or reduce their intake), as well as the fact that the relationships with chronic disease show that the backcast variable doesn't behave as we think it should. For example, high risk drinking at 50 is more often than not associated with a reduced probability of developing chronic disease. Bryan thinks this whole approach may not be a good idea anyway and so is thinking of an alternative. In the meantime I am going to investigate sub-categorising the abstainers into lifetime abstainers and quitters, for which I will be making a new issue and branch.

ld-archer commented 1 year ago

Moving on from the alcohol work and replacing the consumption-based variables with the frequency-based scako (#99).