`missing_value_check`

In the 8/14 version of the 2017-2019 Samoan data, `missing_value_check` flags some variables even though they correctly have NA=N/A in their first VALUES column.

I think it fails because the `value_meaning_table` function maps NA=N/A to a VALUE of "NA" (a character string) rather than to NA (the R missing value), while this line sets up a check for the R missing value NA:

    codes <- c(NA, unique(na.omit(non.NA.missing.codes)))

Oh, but `codes` does need the R missing value NA, because it is used to find the columns in the data that contain at least one NA via this line:

    m.cols <- DS.data %>% select_if(~any(. %in% code)) %>%
        names()

The problem is the line right after the `m.cols` line, which runs even when the code is NA:

    DD.cols <- tb %>% filter(.data$VALUE == code)

When the code is NA, this does not find any such columns because in the tb, the NA is instead the character string "NA".
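A minimal sketch of the mismatch, using base R only so it runs without dplyr. The toy `tb` and its `VARNAME`/`MEANING` columns are made up for illustration and are not the package's real table. Note that `== NA` evaluates to NA rather than TRUE, so the comparison selects nothing even before the string-vs-missing difference comes into play:

```r
# Toy stand-in for the VALUES/MEANINGS table (hypothetical column names).
tb <- data.frame(
  VARNAME = c("AGE", "SEX", "BMI"),
  VALUE   = c("NA", "NA", "-9999"),
  MEANING = c("N/A", "N/A", "missing"),
  stringsAsFactors = FALSE
)

code <- NA

# Comparing anything against NA gives NA for every row, never TRUE.
cmp <- tb$VALUE == code
print(cmp)

# subset() drops rows whose condition is NA, as dplyr::filter() also does,
# so the NA code matches no rows at all.
hits_na <- subset(tb, VALUE == code)

# Comparing against the character string "NA" does find the NA=N/A rows.
hits_str <- subset(tb, VALUE == "NA")

print(nrow(hits_na))
print(nrow(hits_str))
```

In contrast, `%in%` treats NA as a matchable value, which is why the `m.cols` line still works with the NA code.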

If I instead do

    DD.cols <- tb %>% filter(.data$VALUE == "NA")

then I do find all the columns that have a NA=N/A VALUES=MEANING mapping.

So maybe the solution here is to have two parallel sets of codes:

The original one, to be used for searching in the data:

    codes <- c(NA, unique(na.omit(non.NA.missing.codes)))

And a string one, to be used for searching the tb table of VALUES and MEANINGS:

    codes_str <- c("NA", unique(na.omit(non.NA.missing.codes)))
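Here is a rough sketch of how the two vectors could be used side by side. It is not the package's actual code: the toy `DS.data` and `tb`, the `VARNAME` column, the sample values in `non.NA.missing.codes`, and the `flagged` list are all assumptions for illustration, and base R stands in for the dplyr calls so the example runs on its own.

```r
# Hypothetical missing value codes collected from the data dictionary.
non.NA.missing.codes <- c("-9999", NA, "-9999", "-8888")

codes     <- c(NA,   unique(na.omit(non.NA.missing.codes)))  # for the data
codes_str <- c("NA", unique(na.omit(non.NA.missing.codes)))  # for tb

# Toy data set: AGE and SEX contain NA, BMI contains -9999.
DS.data <- data.frame(
  AGE = c("34", NA, "51"),
  BMI = c("22.1", "-9999", "30.4"),
  SEX = c("F", NA, "M"),
  stringsAsFactors = FALSE
)

# Toy VALUES/MEANINGS table: SEX has no NA=N/A mapping.
tb <- data.frame(
  VARNAME = c("AGE", "BMI"),
  VALUE   = c("NA", "-9999"),
  stringsAsFactors = FALSE
)

flagged <- setNames(vector("list", length(codes)), codes_str)

for (i in seq_along(codes)) {
  # Columns in the data that contain this code (%in% matches NA to NA).
  in_data <- vapply(DS.data, function(col) any(col %in% codes[i]), logical(1))
  m.cols <- names(DS.data)[in_data]

  # Variables whose dictionary entry declares this code, matched as a string.
  DD.cols <- tb$VARNAME[tb$VALUE == codes_str[i]]

  # Variables that use the code without declaring it.
  flagged[[codes_str[i]]] <- setdiff(m.cols, DD.cols)
}

print(flagged)
```

With this pairing, only variables that contain NA without an NA=N/A mapping (SEX in the toy data) should be flagged for the NA code.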

lwheinsberg / dbGaPCheckup

`missing_value_check` #10