karthik / testdat

A package to run unit tests on tabular data

Check for missing values. #7

Open davharris opened 10 years ago

davharris commented 10 years ago

I think this would be a fairly simple function. By default, it would return only values that are clearly missing (e.g. NA), but it could also (optionally) warn if there were -999s or other common missing-data codes.
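
A minimal sketch of that behaviour for a single vector, to make the proposal concrete. The function name `flag_missing`, the `check_codes` switch, and the default `codes` are all hypothetical and not part of testdat; the list of codes is an assumption that would need adjusting per dataset.

```r
# Hypothetical helper, not part of testdat.
# Returns TRUE where x is NA; optionally warns about sentinel codes.
flag_missing <- function(x, check_codes = FALSE, codes = c(-999, -9999)) {
  if (check_codes) {
    # Compare as character so the check works for numeric,
    # character, and factor columns alike.
    is_code <- !is.na(x) & as.character(x) %in% as.character(codes)
    if (any(is_code)) {
      warning(sum(is_code), " value(s) match a missing-data code: ",
              paste(unique(as.character(x[is_code])), collapse = ", "))
    }
  }
  is.na(x)
}

# Default: only NA is reported, -999 passes silently.
default_flags <- flag_missing(c(1, NA, -999))
```

Calling `flag_missing(c(1, NA, -999), check_codes = TRUE)` returns the same logical vector but also raises a warning about the -999. Keeping the code check as a warning rather than a flag follows the proposal: a -999 may be a legitimate value in some columns.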

I'm happy to take this on if nobody else wants it.

Thoughts?

karthik commented 10 years ago

Sure, take it!

davharris commented 10 years ago

Cool. Prototype here: https://github.com/davharris/testdat/blob/master/R/test_NA

It currently outputs a data.frame that identifies the rows and columns with possible NAs, along with the value that triggered inclusion.

> test_NA(dat)
Now checking 2 columns...
  row col   value
1   2   1 missing
2  10   1    <NA>
3   3   2    <NA>

I can make it more robust this afternoon, or change the inputs/outputs as the group suggests.
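
For readers following along, here is one way a function could produce a table of that shape. This is a sketch under assumptions, not the linked prototype: the name `test_NA_sketch` and the `na_strings` parameter are hypothetical, and the real `test_NA` may decide what counts as missing differently.

```r
# Sketch only; the actual test_NA prototype may work differently.
# `na_strings` is an assumed parameter: strings treated as missing.
test_NA_sketch <- function(dat, na_strings = c("missing", "")) {
  message("Now checking ", ncol(dat), " columns...")
  hits <- lapply(seq_along(dat), function(j) {
    x <- dat[[j]]
    # Flag true NAs plus any value whose text matches na_strings
    flagged <- which(is.na(x) | as.character(x) %in% na_strings)
    data.frame(row = flagged,
               col = rep(j, length(flagged)),
               value = as.character(x[flagged]),
               stringsAsFactors = FALSE)
  })
  # Stack the per-column results into one data.frame
  do.call(rbind, hits)
}

dat <- data.frame(a = c("x", "missing", "y"),
                  b = c(1, 2, NA),
                  stringsAsFactors = FALSE)
result <- test_NA_sketch(dat)
```

Here `result` has one row per flagged cell, with `value` holding the text that triggered inclusion (or NA for a true NA). Reporting column positions rather than names mirrors the output above; returning names instead would be an easy change.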

karthik commented 10 years ago

Mind sending a pull request?