Closed nicholasjhorton closed 9 years ago
It looks to me like what we have is correct (see tables below), in that it matches the tables on the NHANES site. The trick is this: if a subject answers No to "Have you smoked 100 cigarettes?", they are not asked whether they currently smoke. Most of them are likely not current smokers, but I guess it is possible that someone just started.
> tally( ~Smoke100 + SmokeNow | SurveyYr, data = NHANESraw, margins=TRUE)
, , SurveyYr = 2009_10

        SmokeNow
Smoke100    No   Yes  <NA> Total
   No        0     0  3352  3352
   Yes    1520  1346     0  2866
   <NA>      0     0  4319  4319
   Total  1520  1346  7671 10537

, , SurveyYr = 2011_12

        SmokeNow
Smoke100    No   Yes  <NA> Total
   No        0     0  3184  3184
   Yes    1259  1108     2  2369
   <NA>      0     0  4203  4203
   Total  1259  1108  7389  9756
The NHANES site has this for 2011-12:
I've added the following example:
> if (require(mosaic)) {
+   nhanes <-
+     NHANES %>%
+     mutate(
+       SmokingStatus = derivedFactor(
+         Current = SmokeNow == "Yes",
+         Former = SmokeNow == "No",
+         Never = Smoke100 == "No"
+       )
+     )
+   tally( ~SmokingStatus, data = nhanes )
+ }

Current  Former   Never    <NA>
   1466    1745    4024    2765
and updated the documentation to make the question flow clear.
Note that if Smoke100 == "No", then SmokeNow should be "No". Either this should be fixed, or the documentation should be, which currently says that SmokeNow is:
Study participant currently smokes cigarettes regularly. Reported for participants aged 20 years or older as Yes or No.
(i.e., it doesn't say that the question is asked only of lifetime smokers). I'd much prefer the former fix (recode the missing SmokeNow as "No") to the latter.
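For reference, the recode suggested here could be sketched in base R roughly as follows. This is only an illustration on a made-up toy data frame, not the package's code: the helper `recode_smoke_now` and the column `SmokeNowRecoded` are hypothetical names. Note that the result would no longer match the raw NHANES coding, where SmokeNow is simply not asked of these participants.

```r
# Toy data standing in for the NHANES columns (values are made up for illustration).
toy <- data.frame(
  Smoke100 = factor(c("No", "Yes", "Yes", NA, "No"), levels = c("No", "Yes")),
  SmokeNow = factor(c(NA, "Yes", "No", NA, NA), levels = c("No", "Yes"))
)

# Hypothetical helper: a missing SmokeNow becomes "No" only when Smoke100 is "No".
# Rows where Smoke100 itself is missing are left as NA.
recode_smoke_now <- function(df) {
  skipped <- !is.na(df$Smoke100) & df$Smoke100 == "No" & is.na(df$SmokeNow)
  df$SmokeNowRecoded <- df$SmokeNow      # keep the original column untouched
  df$SmokeNowRecoded[skipped] <- "No"
  df
}

toy <- recode_smoke_now(toy)

# Cross-tabulate to check that the No/NA cell has moved to No/No.
table(toy$Smoke100, toy$SmokeNowRecoded, useNA = "ifany")
```

Writing the result to a new column keeps the original SmokeNow available, so the skip pattern in the raw data stays visible.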