Open jpickavance opened 2 months ago
On second thoughts these ideas don’t work anyway. Later in the pipeline it adds the missing value code only for those rows where year_group==x. But in this case, even if we change the label in one of the ways indicated we still only get the missing value code of -4 where year_group==8.
Consider giving up on the idea of adding missingness indicators based on year group branching logic and look into a way to get the branching logic embedded in the variable description in the data dictionary instead.
Original email: I’m just pulling some descriptives together quickly and I noticed there’s an error in how the RCADS items (awb2_1_illhealth_1: awb2_1_illhealth_25) have been coded for 2023-24. Currently the code -4 has been assigned to all of those in year 9 and 10 with the label [only shown to those in year 8]. In fact, RCADS was also taken for year 10, so it should only be coded as -4 for those in year 9, perhaps with the note [not shown to those in year 9].
Dan's response: The reason for this is we only had this happen in Y10 for some vars originally so the following code worked:
year_group = case_when(grepl("year_group", branching) ~ str_extract(branching, "[\\d]+"))
It just looks for the string “year_group” in the branching field, when it finds this it takes the first number it finds and puts it aside, then later creates a category label from this saying “only shown in year group x”. In the case of the RCADS variables, branching contains:
So this code no longer works as it doesn’t find the 10.
Couple of options:
• Give up on automating this and make the changes manually – we might miss some?
• Change the label so it just says “not shown in all year groups” – probably still gives the user the missingness info they need?
• Update the regex in the str_extract function above (or rewrite entirely) so it returns all separate integers as a string, comma separated or something along those lines, then the label should just work as expected – I’m not sure I have the stomach to attempt this one but if you’d like a go… 😊
(Option 2 seems best)
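For reference, a minimal sketch of what option 3 might look like, swapping str_extract for str_extract_all so that every number is kept rather than just the first. The branching strings below are made up for illustration (the real format in the data dictionary may differ), and extract_year_groups is a hypothetical helper name, not something in the pipeline.

```r
library(stringr)

# Hypothetical branching strings; the real field contents may look different.
branching <- c(
  "[year_group] = '8' or [year_group] = '10'",
  "[year_group] = '10'",
  "[consent] = '1'"
)

# Hypothetical helper: returns all integers found in each branching string as
# one comma-separated string, or NA when the string does not mention year_group.
extract_year_groups <- function(branching) {
  # str_extract_all() returns a list with one character vector per input string
  matches <- str_extract_all(branching, "\\d+")
  years <- vapply(
    matches,
    function(m) paste(unique(m), collapse = ", "),
    character(1)
  )
  ifelse(grepl("year_group", branching, fixed = TRUE), years, NA_character_)
}

year_group <- extract_year_groups(branching)

# Label built the same way as before, now listing every year found
label <- ifelse(
  is.na(year_group),
  NA_character_,
  paste0("[only shown to those in year ", year_group, "]")
)

print(year_group)
print(label)
```

Two caveats. This picks up every integer in the string, so a branching rule that mixes year_group with another numeric condition would need a tighter pattern. And, as noted at the top of the issue, the later step that compares rows against a single year_group value would presumably also have to handle several values (for example with %in% after splitting the string) before the -4 codes come out right.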