I had failed to account for duplicate and unknown brands in the vaccination data, as they weren't properly included in the dummy data. The updates to fix this (and a few other minor issues) are described below:
analysis/study_definition_vax.py: increase from n=1 to n=4 for covid_vax_disease - I previously thought I'd only use this to exclude unvaccinated individuals, but it should also be used to check for vaccine doses with no brand recorded
analysis/functions/utility_functions.R: no longer rounding, as these files will never be released and unrounded values are easier to check against the skim output when looking for errors
analysis/eda/plot_vax_dates.R: no need to remove individuals who have unknown or duplicate brands, as this is now done in analysis/preprocess/data_eligible_a.R. Other minor changes: only plot up to 3 doses, and aesthetic changes to the plots
Other changes: imd coding, and no longer join unvaccinated individuals to data_vax
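To illustrate why covid_vax_disease needs more than the first date, here is a minimal Python sketch of the duplicate/unknown brand check. It is not the code in analysis/preprocess/data_eligible_a.R: the function name `classify_doses`, the input layout, and the brand labels are all hypothetical, chosen only to show the logic under the assumption that each patient has a list of brand-specific dates plus a list of disease-level dates.

```python
from collections import defaultdict


def classify_doses(brand_dates, disease_dates):
    """Flag dose dates with duplicate or unknown brands for one patient.

    brand_dates: dict mapping a brand label to a list of ISO date strings
                 (hypothetical layout; None marks a missing date).
    disease_dates: list of ISO date strings from the disease-level variable
                   (the role played by covid_vax_disease).
    """
    # Collect the set of brands recorded on each date.
    brands_by_date = defaultdict(set)
    for brand, dates in brand_dates.items():
        for date in dates:
            if date is not None:
                brands_by_date[date].add(brand)

    # Duplicate: more than one brand recorded on the same date.
    duplicate = sorted(d for d, b in brands_by_date.items() if len(b) > 1)

    # Unknown: a disease-level date with no brand-specific record.
    unknown = sorted(
        d for d in disease_dates if d is not None and d not in brands_by_date
    )
    return {"duplicate_brand_dates": duplicate, "unknown_brand_dates": unknown}


example = classify_doses(
    brand_dates={"pfizer": ["2021-01-10", "2021-04-01"], "az": ["2021-04-01"]},
    disease_dates=["2021-01-10", "2021-04-01", "2021-10-15", None],
)
print(example)
```

In this example the third disease-level date has no brand, which would go unnoticed if only the first disease-level date (n=1) were extracted.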