Closed Angus-Morton closed 1 year ago
In AAA-KPIs 1_processing_for_KPI_11_13.R
`# Export very large measurements as they might be errors
# Derive measurements for the PHS screen result categories
# A measurement category is derived for definitive screen results i.e. positive,
# negative, external positive or external negative results unless the follow up
# recommendation is immediate recall ('05').
# This means a measurement category is not derived for technical fails, non
# visualisations and immediate recalls.
last_results_initial_screens <- last_results_initial_screens %>%
  mutate(isd_aaa_size = case_when(screen_result %in% c("01", "02", "05", "06") &
                                    (followup_recom != "05" |
                                       is.na(followup_recom)) ~ largest_measure)) %>%
  mutate(isd_aaa_size_group = case_when(isd_aaa_size >= 0 &
                                          isd_aaa_size <= 2.9 ~ "negative",
                                        isd_aaa_size >= 3 &
                                          isd_aaa_size <= 4.4 ~ "small",
                                        isd_aaa_size >= 4.5 &
                                          isd_aaa_size <= 5.4 ~ "medium",
                                        isd_aaa_size >= 5.5 &
                                          isd_aaa_size <= 10.5 ~ "large",
                                        isd_aaa_size >= 10.6 ~
                                          "very large error"))
# Assume these have been investigated by the checking script
last_results_initial_screens <- last_results_initial_screens %>%
  mutate(isd_aaa_size_group = recode(isd_aaa_size_group,
                                     "very large error" = "large"))`
From script 4.1_vascular_outcomes.R
`# categorize largest measurement into two bins
mutate(result_size = if_else(largest_measure >= 5.5, 1, 2)) %>%
mutate(result_outcome = as.character(result_outcome),
       outcome_type = case_when(
         result_outcome %in% c('1','2','3','4','5','6','7','8',
                               '11','12','13','15','16','20', '21') ~ 1,
         result_outcome %in% c('9','10','14','17', '18','19') ~ 2,
         is.na(result_outcome) ~ 3,
         TRUE ~ 4)) %>%
`
But I'm also not 100% convinced that this should be moved... are there any other scripts that use result_size or outcome_type?
I've added this into the extract script, so they will be produced at the beginning. Not sure about the check though... should this get sent to HB for review if "very large"? It looks from the little bit of code that it is recoded as "large"...?
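On the review question, one option is to set the "very large error" rows aside before they are recoded, so they can be sent on for checking. The sketch below is only an illustration in base R: `band_aaa_size()` and the `upi` column are hypothetical names, and the data are made up. It also bands with half-open intervals, because the original `case_when()` would return NA for a value such as 2.95 (between `<= 2.9` and `>= 3`). That only matters if measurements are ever recorded to more than one decimal place, which is an assumption here.

```r
# Hypothetical helper (not from the AAA scripts): band a measurement in cm with
# half-open intervals [lower, upper), so no value can fall between two bands.
band_aaa_size <- function(size) {
  as.character(cut(size,
                   breaks = c(0, 3, 4.5, 5.5, 10.6, Inf),
                   labels = c("negative", "small", "medium", "large",
                              "very large error"),
                   right = FALSE))
}

# Toy data standing in for last_results_initial_screens; the values are made up
# and upi is a hypothetical identifier column.
screens <- data.frame(
  upi = c("A", "B", "C", "D"),
  isd_aaa_size = c(2.95, 3.2, 11.4, NA),
  stringsAsFactors = FALSE
)
screens$isd_aaa_size_group <- band_aaa_size(screens$isd_aaa_size)

# Keep the suspicious rows for review before they are recoded
very_large_for_review <-
  screens[screens$isd_aaa_size_group %in% "very large error", ]

# Recode as the original script does (base R equivalent of recode())
screens$isd_aaa_size_group[
  screens$isd_aaa_size_group %in% "very large error"] <- "large"
```

For measurements recorded to one decimal place this should give the same groups as the original code; negative and missing sizes stay NA in both versions.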
I derive the age at screening using the phsmethods age_calculate() function:
age_at_screening = age_calculate(dob, date_screen)
A few variables in here:
cohort1 <- cohort1 %>%
  mutate(
    eligibility_period = case_when(
      between(dob, dmy("01-04-1947"), dmy("31-03-1948")) ~ "Turned 66 in year 201314",
      between(dob, dmy("01-04-1948"), dmy("31-03-1949")) ~ "Turned 66 in year 201415",
      between(dob, dmy("01-04-1949"), dmy("31-03-1950")) ~ "Turned 66 in year 201516",
      between(dob, dmy("01-04-1950"), dmy("31-03-1951")) ~ "Turned 66 in year 201617",
      between(dob, dmy("01-04-1951"), dmy("31-03-1952")) ~ "Turned 66 in year 201718",
      between(dob, dmy("01-04-1952"), dmy("31-03-1953")) ~ "Turned 66 in year 201819",
      between(dob, dmy("01-04-1953"), dmy("31-03-1954")) ~ "Turned 66 in year 201920",
      between(dob, dmy("01-04-1954"), dmy("31-03-1955")) ~ "Turned 66 in year 202021",
      between(dob, dmy("01-04-1955"), dmy("31-03-1956")) ~ "Turned 66 in year 202122",
      between(dob, dmy("01-04-1956"), dmy("31-03-1957")) ~ "Turned 66 in year 202223",
      between(dob, dmy("01-04-1957"), dmy("31-03-1958")) ~ "Turned 66 in year 202324"
    ),
    age65_onstartdate = case_when(
      hbres == "Ayrshire & Arran" & between(dob, dmy("01-06-1947"), dmy("31-05-1948")) ~ 1,
      hbres == "Borders" & between(dob, dmy("09-08-1946"), dmy("08-08-1947")) ~ 1,
      hbres == "Dumfries & Galloway" & between(dob, dmy("24-07-1947"), dmy("23-07-1948")) ~ 1,
      hbres == "Fife" & between(dob, dmy("09-01-1947"), dmy("08-01-1948")) ~ 1,
      hbres == "Forth Valley" & between(dob, dmy("18-09-1947"), dmy("17-09-1948")) ~ 1,
      hbres == "Grampian" & between(dob, dmy("03-10-1946"), dmy("02-10-1947")) ~ 1,
      hbres == "Greater Glasgow & Clyde" & between(dob, dmy("06-02-1947"), dmy("05-02-1948")) ~ 1,
      hbres == "Highland" & between(dob, dmy("29-06-1946"), dmy("28-06-1947")) ~ 1,
      hbres == "Lanarkshire" & between(dob, dmy("01-04-1947"), dmy("31-03-1948")) ~ 1,
      hbres == "Lothian" & between(dob, dmy("09-08-1946"), dmy("08-08-1947")) ~ 1,
      hbres == "Orkney" & between(dob, dmy("03-10-1946"), dmy("02-10-1947")) ~ 1,
      hbres == "Shetland" & between(dob, dmy("03-10-1946"), dmy("02-10-1947")) ~ 1,
      hbres == "Tayside" & between(dob, dmy("09-01-1947"), dmy("08-01-1948")) ~ 1,
      hbres == "Western Isles" & between(dob, dmy("29-06-1946"), dmy("28-06-1947")) ~ 1,
      TRUE ~ 0
    ),
    over65_onstartdate = case_when(
      hbres == "Ayrshire & Arran" & dob < dmy("01-06-1947") ~ 1,
      hbres == "Borders" & dob < dmy("09-08-1946") ~ 1,
      hbres == "Dumfries & Galloway" & dob < dmy("24-07-1947") ~ 1,
      hbres == "Fife" & dob < dmy("09-01-1947") ~ 1,
      hbres == "Forth Valley" & dob < dmy("18-09-1947") ~ 1,
      hbres == "Grampian" & dob < dmy("03-10-1946") ~ 1,
      hbres == "Greater Glasgow & Clyde" & dob < dmy("06-02-1947") ~ 1,
      hbres == "Highland" & dob < dmy("29-06-1946") ~ 1,
      hbres == "Lanarkshire" & dob < dmy("01-04-1947") ~ 1,
      hbres == "Lothian" & dob < dmy("09-08-1946") ~ 1,
      hbres == "Orkney" & dob < dmy("03-10-1946") ~ 1,
      hbres == "Shetland" & dob < dmy("03-10-1946") ~ 1,
      hbres == "Tayside" & dob < dmy("09-01-1947") ~ 1,
      hbres == "Western Isles" & dob < dmy("29-06-1946") ~ 1,
      TRUE ~ 0
    ),
    dob_eligibility = case_when(
      over65_onstartdate == 1 ~ "Over eligible age cohort - age 66plus on start date",
      age65_onstartdate == 1 ~ "Older cohort - age 65 on start date",
      !is.na(eligibility_period) & age65_onstartdate == 0 ~ eligibility_period
    )
  )
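The eligibility_period lookup above needs a new line adding every year. As a possible alternative, the label could be computed from dob directly. This is a rough sketch in base R, not the existing method: `turned_66_label()` and its `first_dob` / `last_dob` arguments are hypothetical names, and the defaults simply mirror the date range covered by the `case_when()` above.

```r
# Hypothetical helper: derive the "Turned 66 in year YYYYYY" label from dob.
# A financial year runs 1 April to 31 March, so anyone born in January to March
# turns 66 in the financial year that started in the previous calendar year.
turned_66_label <- function(dob,
                            first_dob = as.Date("1947-04-01"),
                            last_dob = as.Date("1958-03-31")) {
  birth_year <- as.integer(format(dob, "%Y"))
  birth_month <- as.integer(format(dob, "%m"))
  fy_start <- birth_year + 66L - as.integer(birth_month < 4L)
  label <- sprintf("Turned 66 in year %d%02d", fy_start, (fy_start + 1L) %% 100L)
  # Outside the range the original lookup covers, return NA as case_when() does
  label[is.na(dob) | dob < first_dob | dob > last_dob] <- NA_character_
  label
}

turned_66_label(as.Date(c("1947-04-01", "1948-03-31", "1958-03-31")))
```

Inside the pipeline this could be used as `mutate(eligibility_period = turned_66_label(dob))`, with `last_dob` moved on each year instead of adding a new `between()` line.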
I've come across two scripts (rows 15 and 27 in the spreadsheet) that require the financial year based on date_surgery, so it probably makes sense to calculate this as a derived variable if it isn't already.
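A minimal sketch of what that derived variable might look like, assuming a "YYYY/YY" label is wanted and that the financial year runs 1 April to 31 March. `fin_year_from_date()` and the `financial_year` column are hypothetical names, not something already in the scripts. phsmethods also appears to offer `extract_fin_year()` for this, which would be worth checking before adding a new helper.

```r
# Hypothetical helper: financial year (1 April to 31 March) as "YYYY/YY"
fin_year_from_date <- function(date) {
  yr <- as.integer(format(date, "%Y"))
  mth <- as.integer(format(date, "%m"))
  # January to March belong to the financial year that started the year before
  start <- yr - as.integer(mth < 4L)
  out <- sprintf("%d/%02d", start, (start + 1L) %% 100L)
  out[is.na(date)] <- NA_character_
  out
}

# Toy example; the dates are made up
surgery <- data.frame(date_surgery = as.Date(c("2021-03-31", "2021-04-01", NA)))
surgery$financial_year <- fin_year_from_date(surgery$date_surgery)
```

Creating this once in the extract script would let both downstream scripts read the same column rather than each deriving it separately.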
Variables moved to quarterly extracts:
Ideally all derived variables needed by any of the subsequent scripts should be created in the initial AAA_processing script.
The rationale being that this means everything in the initial download from ATOS can be checked by the checking scripts and issues can be flagged up and sent back to them quickly.
Add comments below with code snippets of derived variables from other scripts, along with the name of the script they appear in.