opensafely / booster-effectiveness

To estimate the effectiveness of third COVID-19 vaccine doses compared with second dose only
MIT License
0 stars 0 forks source link

codebase review (R scripts) #1

Closed wjchulme closed 2 years ago

wjchulme commented 2 years ago

The readme describes what each script is doing. A review for the data_process.R, data_process_long.R, data_selection.R, model_seqtrialcox.R and report_seqtrialcox.R scripts would be really helpful.

The sequential trial approach as outlined in the protocol is not yet implemented here -- instead of choosing recruitment periods where sufficiently many people have been vaccinated on each day, it just runs it across the entire recruitment period (16 sep to 29 dec). Just ignore this discrepancy for now.

wjchulme commented 2 years ago

There's also the functions/ folder which has a couple of scripts in. Many of the functions within are currently commented out. Ignore these ones and I'll tidy up when I'm confident they're not needed. But please do review the ones that are in use!

milanwiedemann commented 2 years ago

just adding this checklist for me as I go through the files

Analysis

Functions

milanwiedemann commented 2 years ago

data_process.R looks good! I noticed that the fct_case_when() only works when the new value is a character (because of TRUE ~ NA_character_) but I think it doesnt matter here

https://github.com/opensafely/booster-effectiveness/blob/5a073fe7d726201d906fed12c1a57f77228bce84/lib/functions/utility.R#L3-L10

this works: mutate(mtcars, new = fct_case_when(cyl <= 5 ~ "asd", TRUE ~ NA_character_))

this returns error mutate(mtcars, new = fct_case_when(cyl <= 5 ~ 1, TRUE ~ NA_character_))

milanwiedemann commented 2 years ago

datra_processing_long.R also looks good, just checking that you want the plus in the d+ in the regex in this script ... I guess it's theoretically possible that there were more than 9 admissions, first example here:

https://github.com/opensafely/booster-effectiveness/blob/5a073fe7d726201d906fed12c1a57f77228bce84/analysis/data_process_long.R#L32

milanwiedemann commented 2 years ago

data_selection.R:

has_sex

comma

milanwiedemann commented 2 years ago

model_seqtrialcox.R:

milanwiedemann commented 2 years ago

in survival.R:

milanwiedemann commented 2 years ago

in report_seqtrialcox.R you assign rr <- rate/ref_rate but it never gets called in the function itself, maybe not needed? or missing?

milanwiedemann commented 2 years ago

in redaction.R:

milanwiedemann commented 2 years ago

in tmerge() I haven't seen this before, but maybe that's a trick I don't know: