Closed HelenCEBM closed 2 years ago
Main changes all look good. As discussed, I will alter the extraction dates for variables to be the earlier of the eligible date (based on a positive covid test) and the treated date.
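For reference, a minimal sketch of the "earlier of the two dates" rule, assuming either date can be missing for a given patient. The function name `extraction_anchor` and the handling of missing values are illustrative assumptions, not taken from the actual study code.

```python
from datetime import date


def extraction_anchor(eligible_date, treated_date):
    """Return the earlier of the two dates, ignoring missing (None) values.

    Hypothetical helper: returns None when both dates are missing.
    """
    known = [d for d in (eligible_date, treated_date) if d is not None]
    return min(known) if known else None


# A patient treated two days before their recorded positive test
# is anchored on the treatment date.
anchor = extraction_anchor(date(2022, 1, 10), date(2022, 1, 8))
```

Treating a missing date as "ignore it" rather than "drop the patient" is a design choice that would need checking against how untreated and untested patients should be handled.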
- Changed test date to positive test date
I'd extracted the date of any test and filtered on positive tests later, just to save having to extract the variable twice, as I'll use it to see how many of the treated patients without a positive test had been tested. But it's probably easier to have it as a separate variable.
- Output file table_elig_treat_redacted isn't being redacted (perhaps because the cols contain n (x%))? (Also need to ensure that additional values are redacted where just one in a row/column is <=5, so that values can't be inferred.)
- Flow chart figures also need redacting.
- I think all results should be rounded e.g. to nearest 10 to minimise issues with small number diffs from week to week.
Yes, need to add in redaction. Rounding to the nearest 10 sounds sensible too.
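One way the redaction and rounding could fit together is sketched below. It works on raw counts before they are formatted as n (x%) strings, which may sidestep the problem of redaction not matching the formatted columns. The function names, the choice to hide the next-smallest value as the secondary suppression, and rounding halves upwards are all assumptions for illustration, not the project's actual rules.

```python
import math

THRESHOLD = 5  # counts at or below this are suppressed, per the review comment
ROUND_TO = 10  # rounding base suggested in the review


def round_to_base(n, base=ROUND_TO):
    """Round a non-negative count to the nearest multiple of base (halves round up)."""
    return int(math.floor(n / base + 0.5)) * base


def redact_row(counts, threshold=THRESHOLD, base=ROUND_TO):
    """Suppress small counts (as None) and round the rest.

    If exactly one value in the row is suppressed, the smallest remaining
    value is suppressed too, so the hidden value cannot be recovered by
    subtracting the visible values from a row total.
    """
    hidden = {i for i, c in enumerate(counts) if c <= threshold}
    if len(hidden) == 1:
        visible = [i for i in range(len(counts)) if i not in hidden]
        if visible:
            hidden.add(min(visible, key=lambda i: counts[i]))
    return [None if i in hidden else round_to_base(c, base)
            for i, c in enumerate(counts)]


# Only the first value is small, so the next-smallest (12) is hidden as well.
row = redact_row([3, 12, 47, 108])
```

The same function could be applied down columns as well as across rows. Percentages would then be calculated from the redacted, rounded counts rather than from the originals.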
- Not sure we should use treated_within_10_days in eligibility/exclusion criteria? These patients were still initially eligible whether they received treatment or not... We could maybe count patients separately somewhere who appeared to have the treatment too late.
Okay, will remove and describe.
- On charts, sort legend labels by line order - some of the colours are quite hard to distinguish.
Will do.
- Need to exclude people with more than 2 different drugs (though it's not necessarily bad to count them once for coverage purposes)
Done - patients now excluded based on receiving two different drugs within 2 weeks of each other.
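A rough sketch of how that exclusion rule could be expressed, assuming each patient has a list of (drug, date) treatment records. The function name, the record layout, and treating "within 2 weeks" as 14 days inclusive are assumptions for illustration only.

```python
from datetime import date
from itertools import combinations


def has_conflicting_treatments(treatments, window_days=14):
    """Return True if two different drugs were given within window_days of each other.

    treatments: iterable of (drug_name, treatment_date) tuples for one patient.
    Repeat records of the same drug are not treated as a conflict.
    """
    for (drug_a, day_a), (drug_b, day_b) in combinations(treatments, 2):
        if drug_a != drug_b and abs((day_a - day_b).days) <= window_days:
            return True
    return False


# Placeholder drug names; two different drugs nine days apart is flagged.
flagged = has_conflicting_treatments(
    [("drug_a", date(2022, 1, 3)), ("drug_b", date(2022, 1, 12))]
)
```

Patients flagged this way could still be counted once for coverage purposes, as suggested above, by reporting them in a separate tally rather than dropping them silently.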
- Can feather format be used instead? (Hopefully this would avoid having to define dtypes for each column, as well as saving on space/processing?)
I did consider using feather but sometimes R can be funny with reading in the fields with feather, which is why I prefer to be able to define dtypes for each column. Much more of a faff, but will revisit if it becomes a pain.
- Can we shorten the clinical group names, i.e. remove "Patients with [a]"?
Yes.
- Why is ronapreve in one of the R scripts?
Will be an old script from when we were doing some investigating. I will find and put in the graveyard folder.
Main changes:
- by week means that patients who become eligible in week n (e.g. on Saturday) but aren't treated until week n+1 (e.g. on Monday) will be on separate lines, and therefore excluded when we remove duplicates. (We'll also remove people who had two positive tests in separate weeks but who were otherwise eligible.) It's also difficult to calculate the proportion of eligible patients who received treatment, as the two cohorts won't overlap neatly within each week.
- between = ["index_date", "index_date + 7 days"] to + 6 days to avoid double counting
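The double counting from the + 7 days window can be illustrated with a small sketch. It assumes that both ends of the between range are inclusive and that index dates advance by one week at a time. The helper names are hypothetical.

```python
from datetime import date, timedelta


def week_window(index_date, days_after):
    """Return the inclusive (start, end) window beginning at index_date."""
    return index_date, index_date + timedelta(days=days_after)


def windows_overlap(first, second):
    """Two inclusive windows overlap if they share at least one day."""
    return max(first[0], second[0]) <= min(first[1], second[1])


week_1 = date(2022, 1, 3)             # a Monday
week_2 = week_1 + timedelta(days=7)   # the following Monday

# With + 7 days, the second Monday falls in both windows.
overlap_7 = windows_overlap(week_window(week_1, 7), week_window(week_2, 7))

# With + 6 days, each window covers exactly Monday to Sunday.
overlap_6 = windows_overlap(week_window(week_1, 6), week_window(week_2, 6))
```

Under these assumptions `overlap_7` is True and `overlap_6` is False, so an event on the boundary day is counted in only one week when the window ends at + 6 days.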