ccb-hms / nhanes-database

4 stars 4 forks source link

Do we need to add 'years' column in DEMO tables? #174

Open deepayan opened 7 months ago

deepayan commented 7 months ago

Just wondering: do we really need this?

https://github.com/ccb-hms/NHANES/blob/main/Code/R/download.R#L238-L241

The DEMO tables already have a SDDSRVYR column with the same information (in a more verbose form):

https://wwwn.cdc.gov/nchs/nhanes/2017-2018/DEMO_J.htm#SDDSRVYR

deepayan commented 7 months ago

There seems to be a larger logical flaw in this code --- there are two nested for loops:

https://github.com/ccb-hms/NHANES/blob/main/Code/R/download.R#L186

and

https://github.com/ccb-hms/NHANES/blob/main/Code/R/download.R#L197

But as far as I can tell, the second one will always loop over exactly one thing, so the loop is redundant. Maybe this is a leftover from the earlier non-manifest approach.

It seems the expectation was that currDataType would be something like "DEMO", and rowsForCurrDataType would match rows for tables like "DEMO", "DEMO_B", ..., "P_DEMO", but this is no longer true.

This is mostly harmless (just some needlessly complicated code giving the correct result), but has the effect that the "years" column is added only in DEMO, and not in DEMO_B, DEMO_C, etc.