ainilaha / RNhanes

Achieving Transparency and Reproducibility in NHANES Research using R

Update Rnhanes.Rmd #9

Open vjcitn opened 11 months ago

vjcitn commented 11 months ago

Some inline edits and a few optional FIXMEs that may take some work. I don't suggest that you merge these changes without looking at the FIXMEs and deciding how you want to proceed.

rgentlem commented 11 months ago

Thanks, Vince.

vjcitn commented 11 months ago

The guardrail concept might be implemented with automated testing ... for a variable with a known frequency distribution for raw and translated values, periodically verify that no discrepancy pops up. A translation module for gender codes is an example of something that might be fragile. If the risk is minimal, just ignore this comment.
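A minimal sketch of what such a check could look like, in base R only. Everything here is an assumption made for illustration: the codebook, the toy `raw` vector, and the helper names `translate` and `check_translation` are hypothetical and are not part of nhanesA or the Rmd under review. The check compares counts within a single file, before and after translation, so it does not depend on knowing the survey design.

```r
# Hypothetical codebook for a gender variable (codes and labels are illustrative).
codebook <- c("1" = "Male", "2" = "Female")

# Toy raw codes standing in for a column read from a raw data file.
raw <- c(1, 2, 2, 1, 2, NA, 1, 2)

# The translation step under test: map numeric codes to labels.
translate <- function(x, codebook) {
  factor(codebook[as.character(x)], levels = unname(codebook))
}

# Guardrail: the count for each raw code must equal the count for the label
# the codebook assigns to it, and the number of missing values must not change.
check_translation <- function(raw, translated, codebook) {
  raw_counts <- table(factor(raw, levels = names(codebook)), useNA = "always")
  new_counts <- table(factor(translated, levels = unname(codebook)), useNA = "always")
  identical(as.vector(raw_counts), as.vector(new_counts))
}

translated <- translate(raw, codebook)
ok <- check_translation(raw, translated, codebook)

# A deliberately broken translation (labels swapped) should fail the check.
swapped <- translate(raw, c("1" = "Female", "2" = "Male"))
caught <- !check_translation(raw, swapped, codebook)
```

Run periodically (for example from a test suite), a check like this would flag a swapped or dropped label, but only for variables whose codebook is already trusted.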

rgentlem commented 11 months ago

I think it is not so much a question of the risk, but rather a question of how you would do that. NHANES is a fairly structured sample, so you would need to know a lot about how they did it to know if you got something wrong.

Rather, the point we are trying to make is that the way they (CDC) handle the data is fraught with inconsistencies, so many that we certainly have not found them all. The latest is that in some surveys they tag with "Don't know", and in others with "Don't Know", so if you look at the factor levels they are different. I can't see any way to address that except by either exhaustive search (too expensive) or having all the data files present, which is too complex without a DB (and nhanesA is a per-file tool for accessing NHANES data). Maybe we should make that clearer somewhere.
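To make the "Don't know" / "Don't Know" case concrete, here is a small base R sketch. The two factors are toy stand-ins for the same variable from two survey cycles, and the helpers `find_case_variants` and `harmonize_levels` are hypothetical names, not nhanesA functions. It only covers files that are already loaded, so it does not remove the need to have all the data present.

```r
# Toy factors standing in for one variable as coded in two survey cycles.
cycle_a <- factor(c("Yes", "No", "Don't know"))
cycle_b <- factor(c("No", "Don't Know", "Yes"))

# Report levels that differ as written but are identical once lower-cased.
find_case_variants <- function(...) {
  lv <- unique(unlist(lapply(list(...), levels)))
  groups <- split(lv, tolower(lv))
  unname(groups[lengths(groups) > 1])
}

# Recode a factor so that every case variant maps to one canonical spelling.
# Levels with no case-insensitive match in `canonical` are left unchanged.
harmonize_levels <- function(f, canonical) {
  idx <- match(tolower(levels(f)), tolower(canonical))
  levels(f)[!is.na(idx)] <- canonical[idx[!is.na(idx)]]
  f
}

variants <- find_case_variants(cycle_a, cycle_b)

canonical <- c("Yes", "No", "Don't know")
combined <- c(as.character(harmonize_levels(cycle_a, canonical)),
              as.character(harmonize_levels(cycle_b, canonical)))
```

Here `variants` holds the one pair of conflicting spellings, and `combined` uses a single spelling throughout. Case is only one kind of inconsistency; this sketch would not catch differences in wording or punctuation.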
