Open smgogarten opened 6 years ago
That's a good idea, but not a change I could likely make this week. I ran into something similar yesterday, where DD variable names were misspelled. I made a copy of the DD in which I first fixed the spelling and then proceeded with checks. There is a philosophical question of how many errors we should allow while still proceeding with all checks, versus requiring some minimum consistency up front. For something like misspelled column names, we can't programmatically assume what the correct spelling should be (especially when you have similar varnames). Capitalization differences would be a little more straightforward, I agree, but the philosophical question still stands.
I was thinking more along the lines of
if (!setequal(names(ds), dd$VARNAME)) stop("Names in DD do not match DS")
So it wouldn't proceed with the checks, but the error would be easier to interpret. Not urgent, though: once I figured out the problem, I updated the file and re-ran the check.
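To make the suggestion concrete, here is a rough sketch of what a more informative version of that check might look like. The function name `check_dd_names` is hypothetical and not part of the package; it only assumes, as in the one-liner above, that `ds` is a data frame and `dd` has a `VARNAME` column. It still stops rather than guessing at corrections, but it lists which names differ and flags the ones that differ only by capitalization.

```r
# Hypothetical helper (not an existing package function).
# ds: dataset as a data frame; dd: data dictionary with a VARNAME column.
check_dd_names <- function(ds, dd) {
  ds_names <- names(ds)
  dd_names <- as.character(dd$VARNAME)
  if (setequal(ds_names, dd_names)) {
    return(invisible(TRUE))
  }

  only_ds <- setdiff(ds_names, dd_names)  # in DS but missing from DD
  only_dd <- setdiff(dd_names, ds_names)  # in DD but missing from DS
  # DS names that would match a DD name if case were ignored
  case_only <- only_ds[tolower(only_ds) %in% tolower(only_dd)]

  msg <- "Names in DD do not match DS."
  if (length(only_ds) > 0) {
    msg <- c(msg, paste0("In DS but not DD: ", paste(only_ds, collapse = ", ")))
  }
  if (length(only_dd) > 0) {
    msg <- c(msg, paste0("In DD but not DS: ", paste(only_dd, collapse = ", ")))
  }
  if (length(case_only) > 0) {
    msg <- c(msg, paste0("Differ only by capitalization: ",
                         paste(case_only, collapse = ", ")))
  }
  stop(paste(msg, collapse = "\n"), call. = FALSE)
}
```

Nothing is auto-corrected here, so the question of how much inconsistency to tolerate stays open; the sketch only makes the failure easier to read.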
Agreed - great idea for a more informative error message. Leaving this issue open to remind myself to add this (and cover it with unit tests).
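For the unit-testing part, a test along these lines might be a starting point. This is only a sketch: it assumes the tests use `testthat`, and `check_names` is a stand-in defined here so the example runs on its own, built from the one-liner proposed earlier in the thread.

```r
library(testthat)

# Stand-in for the real check; hypothetical name, body taken from the
# one-liner suggested in this thread.
check_names <- function(ds, dd) {
  if (!setequal(names(ds), dd$VARNAME)) stop("Names in DD do not match DS")
  invisible(TRUE)
}

test_that("mismatched capitalization between DS and DD is reported", {
  ds <- data.frame(SUBJECT_ID = 1:2, Age = c(30, 40))
  dd <- data.frame(VARNAME = c("SUBJECT_ID", "AGE"), stringsAsFactors = FALSE)
  expect_error(check_names(ds, dd), "Names in DD do not match DS")
})

test_that("matching names pass regardless of order", {
  ds <- data.frame(SUBJECT_ID = 1:2, Age = c(30, 40))
  dd <- data.frame(VARNAME = c("Age", "SUBJECT_ID"), stringsAsFactors = FALSE)
  expect_true(check_names(ds, dd))
})
```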
I have a pheno file where the capitalization doesn't match between the column names of DS and the VARNAME in the DD. Could there be an explicit check for matching names? Currently it gives an error