ProjectMOSAIC / mosaic

Project MOSAIC R package
http://mosaic-web.org/

data cleaning needed for NHANES #342

Closed nicholasjhorton closed 10 years ago

nicholasjhorton commented 10 years ago

I really like the NHANES dataset. But in trying to generate an example logistic regression for Jeff Witmer, I found the following inconsistencies:

tally(~ smoker, data=NHANES)

yes no
4161 0 26965

tally(~ diabetic, data=NHANES)

    0     1   NaN
27795  1580  1751

It would be nice if these could be addressed. I'd be happy to do it if I can get access to the original data.

rpruim commented 10 years ago

You have access:

data(NHANES)
# data cleaning goes here
save(NHANES, file="data/NHANES.rda")
# edit documentation in "R/datasets.R" to match changes
# commit, push
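
As a rough illustration of what the "data cleaning goes here" step might involve, here is a minimal sketch on a toy data frame. The data frame `toy`, the extra `"unknown"` level, and the `"no"`/`"yes"` labels are all hypothetical; the thread does not show the real NHANES encodings or the cleaning that was eventually applied. Base R `table()` is used instead of `tally()` so the snippet runs without mosaic loaded.

```r
# Hypothetical stand-in for NHANES; the real column encodings may differ.
toy <- data.frame(
  smoker   = factor(c("yes", "no", "no", "yes"),
                    levels = c("yes", "unknown", "no")),
  diabetic = c(0, 1, NaN, 0)
)

# Drop factor levels that never occur, so they stop appearing as zero counts.
toy$smoker <- droplevels(toy$smoker)

# Recode the 0/1 values as a labelled factor; NaN matches no level and becomes NA.
toy$diabetic <- factor(toy$diabetic, levels = c(0, 1), labels = c("no", "yes"))

# Inspect the cleaned columns, showing missing values only if present.
table(toy$smoker, useNA = "ifany")
table(toy$diabetic, useNA = "ifany")
```

With the real data, the same two steps would be applied to `NHANES` between `data(NHANES)` and `save(...)`, assuming the columns turn out to be encoded as in this toy example.
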
rpruim commented 10 years ago

I updated NHANES based on the Rmd file Nick sent.