National-COVID-Cohort-Collaborative / Data-Ingestion-and-Harmonization

Data Ingestion and Harmonization

CMS: Medicaid - two different race values exist per person in CMS data, causing duplicates in the person table #101

Closed stephanieshong closed 7 months ago

stephanieshong commented 1 year ago

This can be resolved in two ways: drop the affected persons, since we do not know which race value to choose, or set the race value to unknown and collapse the two rows into one. Which should we use?
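To compare the two options, here is a minimal sketch in plain Python. It is not the pipeline's code: the `(person_id, race)` row shape, the `resolve_conflicts` helper, and the `UNKNOWN` placeholder are all assumptions made for illustration.

```python
from collections import defaultdict

# Placeholder only; the actual code the pipeline uses for an unknown race
# is not stated in this thread.
UNKNOWN = "unknown"


def resolve_conflicts(rows, strategy):
    """Collapse (person_id, race) rows to one race per person.

    strategy="drop":    persons with conflicting race values are left out.
    strategy="unknown": persons with conflicting race values get UNKNOWN.
    """
    if strategy not in ("drop", "unknown"):
        raise ValueError(f"unsupported strategy: {strategy!r}")

    # Gather the distinct race values seen for each person.
    races = defaultdict(set)
    for person_id, race in rows:
        races[person_id].add(race)

    resolved = {}
    for person_id, values in races.items():
        if len(values) == 1:
            # No conflict: keep the single value.
            resolved[person_id] = next(iter(values))
        elif strategy == "unknown":
            resolved[person_id] = UNKNOWN
        # strategy == "drop": the person is omitted entirely.
    return resolved


rows = [("p1", "white"), ("p1", "asian"), ("p2", "black")]
dropped = resolve_conflicts(rows, "drop")
collapsed = resolve_conflicts(rows, "unknown")
```

Either way each person ends up with at most one row, so the duplicate in the person table goes away. Dropping loses the person entirely, while collapsing keeps the person but loses the race information.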

stephanieshong commented 1 year ago

medicaid 1%

stephanieshong commented 7 months ago

If multiple conflicting demographic values are found for a given person, use the latest value to resolve the duplicate error.
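The latest-value rule could look roughly like the sketch below. It assumes each source row carries a date that can be used for ordering; the `record_date` field and the `pick_latest` helper are hypothetical names, not taken from the actual pipeline.

```python
from datetime import date


def pick_latest(rows):
    """Return {person_id: race}, taking race from each person's latest row.

    rows is an iterable of (person_id, race, record_date) tuples.
    If two rows share the same latest date, the first one seen is kept.
    """
    latest = {}
    for person_id, race, record_date in rows:
        current = latest.get(person_id)
        # Replace the stored row only when this one is strictly newer.
        if current is None or record_date > current[1]:
            latest[person_id] = (race, record_date)
    return {person_id: race for person_id, (race, _) in latest.items()}


rows = [
    ("p1", "white", date(2019, 3, 1)),
    ("p1", "asian", date(2020, 6, 15)),
    ("p2", "black", date(2018, 1, 10)),
]
result = pick_latest(rows)
```

Rows with the same latest date but different values would still need a tie-break rule. This sketch keeps the first such row it encounters.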

stephanieshong commented 7 months ago

Added pre-steps prior to step 4 to pick the latest values. Merged and built.