UW-GAC / dbgaptools

R package to create and check standard files for dbGaP submission

improve checking of consent codes #24

Open laurieca opened 5 years ago

laurieca commented 5 years ago

The function "check_subj" has a parameter "subj_exp", which should be a data frame with two variables ("SUBJECT_ID", "CONSENT"). The code then compares the consent in this data frame with the consent variable in the dbGaP subject-consent file. I think comparing consents needs manual intervention, so I'm not sure what the best modification of "check_subj" would be.

Example: My understanding is that "subj_exp" would usually be the subjects in the freeze that we are currently working on. However, the consent code information in the master subject annotation is in a different format than the CONSENT variable in the subject file. E.g. for SAFS, consent in the subject file is 0 or 1, with the definition of 1 given in the data dictionary, while the consent in the subject annotation is "DS-DHD-IRB-PUB-MDS-RD". It does 'match' the definition in the subject file data dictionary but ….
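
To make the format problem concrete, here is a minimal base R sketch of a direct comparison. It does not call "check_subj" and is not taken from the package. The subject IDs and the data frame "subj_file" are made up for illustration; only the variable names and the SAFS consent string come from the description above.

```r
# Hypothetical expected consent, as stored in the subject annotation
# (full consent string).
subj_exp <- data.frame(
  SUBJECT_ID = c("S1", "S2", "S3"),
  CONSENT = rep("DS-DHD-IRB-PUB-MDS-RD", 3),
  stringsAsFactors = FALSE
)

# Hypothetical observed consent, as coded in the subject-consent file (0 or 1).
subj_file <- data.frame(
  SUBJECT_ID = c("S1", "S2", "S3"),
  CONSENT = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Join on subject and compare the two consent columns directly.
merged <- merge(subj_exp, subj_file, by = "SUBJECT_ID",
                suffixes = c("_exp", "_obs"))
mismatch <- merged[merged$CONSENT_exp != merged$CONSENT_obs, ]

# Every row is flagged, including S1 and S2, whose code 1 is defined
# as this consent group in the (hypothetical) data dictionary.
print(mismatch)
```

With a direct comparison like this, every subject is reported as a mismatch, so the report cannot separate real discrepancies from differences in encoding.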

smgogarten commented 5 years ago

This could be addressed by substituting the consent code for the value in the data dictionary, and then doing a table of expected value vs observed value instead of reporting mismatches.
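
A rough sketch of that idea in base R, under some assumptions: the lookup vector "dd_map" stands in for whatever the data dictionary provides and is hypothetical, as are the subject IDs and the data frame "subj_file". It also assumes the dictionary definition can be reduced to the same consent string used in the subject annotation, which may not hold for every study. This is not the package's implementation.

```r
# Hypothetical mapping from coded CONSENT value to consent string,
# as it might be derived from the data dictionary.
dd_map <- c("1" = "DS-DHD-IRB-PUB-MDS-RD")

# Hypothetical expected (annotation) and observed (subject file) consent.
subj_exp <- data.frame(
  SUBJECT_ID = c("S1", "S2", "S3"),
  CONSENT = rep("DS-DHD-IRB-PUB-MDS-RD", 3),
  stringsAsFactors = FALSE
)
subj_file <- data.frame(
  SUBJECT_ID = c("S1", "S2", "S3"),
  CONSENT = c(1, 1, 0),
  stringsAsFactors = FALSE
)

merged <- merge(subj_exp, subj_file, by = "SUBJECT_ID",
                suffixes = c("_exp", "_obs"))

# Substitute the dictionary value for each observed code; codes without
# a dictionary entry are kept as-is so they remain visible in the table.
obs_code <- as.character(merged$CONSENT_obs)
decoded <- unname(dd_map[obs_code])
merged$CONSENT_obs_decoded <- ifelse(is.na(decoded), obs_code, decoded)

# Cross-tabulate expected vs observed instead of listing mismatches.
consent_table <- table(expected = merged$CONSENT_exp,
                       observed = merged$CONSENT_obs_decoded,
                       useNA = "ifany")
print(consent_table)
```

The table leaves the judgment to the person reviewing it: cells where expected and observed agree can be skimmed, and off-diagonal cells (here, the subject coded 0) point to the cases that need a manual look.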