cancerDHC / Terminology

CCDH Terminology Workstream issues

Value-identifier pairs in GDC data dictionary (Question for GDC) #32

Open jiaola opened 3 years ago

jiaola commented 3 years ago

In the GDC data dictionary, we found some pairs of properties that appear to be value-identifier pairs. For example, in the Diagnosis model, primary_diagnosis is a string value and morphology is the corresponding ICD-O-3 identifier. Likewise, in the Aliquot model, analyte_type is a string and analyte_type_id is a one-letter code. For these fields, are the dependencies explicitly maintained by the system? Or are they implicit, relying on a third-party system (caDSR or ICD-O) as the reference for the dependency?

wwysoc2 commented 3 years ago

The value-identifier pairs are not enforced directly in the dictionary, but will be enforced using an external "Submission QC" system. This system checks for specific situations that can get past dictionary validation but would cause issues later during harmonization and data analysis.
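
For illustration only, here is a minimal sketch of the kind of pair-consistency check such a QC step might perform. It is not the actual "Submission QC" implementation. The function name `find_pair_mismatches`, the lookup table, and the sample records are all hypothetical; a real check would take its value-to-identifier mappings from the reference source rather than a hard-coded dict.

```python
# Hypothetical lookup table, used only for this example. The authoritative
# value-to-identifier pairs would come from the reference source.
ANALYTE_TYPE_TO_ID = {"DNA": "D", "RNA": "R"}


def find_pair_mismatches(records, value_field, id_field, value_to_id):
    """Return (index, value, identifier) for each record whose pair is inconsistent.

    A record is flagged when its value is not in the lookup table or when its
    identifier differs from the one the table expects. Records where both
    fields are absent are skipped.
    """
    mismatches = []
    for index, record in enumerate(records):
        value = record.get(value_field)
        identifier = record.get(id_field)
        if value is None and identifier is None:
            continue  # neither field submitted, nothing to compare
        if value not in value_to_id or value_to_id[value] != identifier:
            mismatches.append((index, value, identifier))
    return mismatches


# Made-up sample records: each one would pass a per-field check on its own,
# but the second pairs a value with the wrong identifier.
aliquots = [
    {"analyte_type": "DNA", "analyte_type_id": "D"},
    {"analyte_type": "RNA", "analyte_type_id": "D"},
]

problems = find_pair_mismatches(
    aliquots, "analyte_type", "analyte_type_id", ANALYTE_TYPE_TO_ID
)
for index, value, identifier in problems:
    print(f"record {index}: {value!r} paired with identifier {identifier!r}")
```

The point of the sketch is that each field is individually valid, so a per-property check would accept the second record; only a cross-field check catches the inconsistency.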