opensafely / tpp-sql-notebook


mvp of workflow from variables to model #5

Closed CarolineMorton closed 4 years ago

CarolineMorton commented 4 years ago

See analysis notebook

I have imported the dummy CHESS data and taken the study population to be any patient who is COVID positive.
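As a rough illustration of that selection step (not the notebook's actual code), the filter could look something like the sketch below. The table contents and the column names `patient_id` and `covid_status` are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical stand-in for the dummy extract; column names are assumptions.
chess = pd.DataFrame(
    {
        "patient_id": [1, 2, 3, 4],
        "covid_status": ["positive", "negative", "positive", "unknown"],
    }
)

# Study population: keep only patients flagged as COVID positive.
study_pop = chess[chess["covid_status"] == "positive"].reset_index(drop=True)

print(study_pop)
```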

I have then used Python to create the variables of interest, for example cardiovascular disease and smoking, as well as age and gender. Each variable is written to its own CSV file (in the data/analysis folder), as per the usual LSHTM workflow.
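A minimal sketch of the one-variable-per-file pattern, assuming a patient-level population table and a long table of coded events. The codes, column names, and the helper `make_binary_variable` are hypothetical, and a temporary directory stands in for data/analysis.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical inputs; real codelists and column names will differ.
population = pd.DataFrame({"patient_id": [1, 2, 3]})
events = pd.DataFrame(
    {
        "patient_id": [1, 1, 2, 3],
        "code": ["CVD01", "XYZ99", "XYZ99", "CVD02"],
    }
)
cvd_codes = {"CVD01", "CVD02"}  # made-up codes for illustration only


def make_binary_variable(population, events, codes, name):
    """Return one row per patient with a 0/1 flag for having any of `codes`."""
    matched_ids = events.loc[events["code"].isin(codes), "patient_id"].unique()
    out = population[["patient_id"]].copy()
    out[name] = out["patient_id"].isin(matched_ids).astype(int)
    return out


cvd = make_binary_variable(population, events, cvd_codes, "cvd")

# Each variable gets its own CSV file (temporary directory used here).
out_dir = Path(tempfile.mkdtemp())
cvd.to_csv(out_dir / "cvd.csv", index=False)
```

Keeping one file per variable means a single definition can be re-run without regenerating the others.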

For the final dataset, I have merged the per-variable files together so that the final df has all the information, including the binary variables, and then exported it as a CSV.
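The merge could be sketched roughly as follows, assuming every per-variable table shares a `patient_id` key (an assumption; the notebook may key on something else). The small frames here stand in for the per-variable CSV files.

```python
from functools import reduce

import pandas as pd

# Hypothetical per-variable tables, as if read back from their CSV files.
age = pd.DataFrame({"patient_id": [1, 2, 3], "age": [34, 71, 58]})
cvd = pd.DataFrame({"patient_id": [1, 2, 3], "cvd": [1, 0, 1]})
smoking = pd.DataFrame({"patient_id": [1, 3], "smoker": [1, 0]})  # patient 2 missing

# Left-join each table onto the first; validate guards against duplicate keys.
final = reduce(
    lambda left, right: left.merge(
        right, on="patient_id", how="left", validate="one_to_one"
    ),
    [age, cvd, smoking],
)

# Export as CSV text; pass a file path to to_csv to write to disk instead.
csv_text = final.to_csv(index=False)
print(csv_text)
```

A left join keeps every patient in the study population, so a patient missing from one variable file shows up as a missing value rather than being dropped.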

In the Notebook/stata folder you can see a do file. This does some final data cleaning and variable labelling, and then fits a simple model.

Hope that's clear.