UNMCCC / CRIIS_Source_Extracts_for_OMOP

Cancer Clinical Data integration - Set of SQL extractions from disparate health system sources targeting the OMOP data model
MIT License

Velos Observation extract has duplicates #177

Open dahealy opened 2 years ago

dahealy commented 2 years ago

From Kevin: One small issue with the Velos Observation extract: there are some duplicates, i.e. rows with the same SOURCE_PK that appear multiple times (76 rows). In reviewing the data, I think a join to the provider table is causing the problem, because in every case the only thing that differs between the duplicate rows is the value of PROVIDER_ID. My recommended fix would be to append '-' || PROVIDER_ID to the end of the SOURCE_PK value. Your call, though.
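For anyone wanting to reproduce the check, here is a minimal sketch of how the duplicates could be detected and how the suggested suffix would make the keys unique again. It runs against an in-memory SQLite database rather than the real Velos source; the table name observation_extract and the sample rows are assumptions for illustration, and only SOURCE_PK and PROVIDER_ID come from the report above.

```python
import sqlite3

# In-memory stand-in for the extract output; "observation_extract" is a
# hypothetical table name and the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation_extract (SOURCE_PK TEXT, PROVIDER_ID INTEGER);
INSERT INTO observation_extract VALUES
  ('A-1', 2292),
  ('A-1', 472),
  ('B-7', 2292);
""")

# Find SOURCE_PK values that occur more than once, and count how many
# distinct PROVIDER_ID values are behind each duplicate.
dupes = conn.execute("""
SELECT SOURCE_PK,
       COUNT(*) AS n_rows,
       COUNT(DISTINCT PROVIDER_ID) AS n_providers
FROM observation_extract
GROUP BY SOURCE_PK
HAVING COUNT(*) > 1
ORDER BY SOURCE_PK
""").fetchall()
print(dupes)

# The suggested workaround: append '-' || PROVIDER_ID to the key.
suffixed = [row[0] for row in conn.execute("""
SELECT SOURCE_PK || '-' || PROVIDER_ID
FROM observation_extract
ORDER BY 1
""")]
print(suffixed)
```

When n_rows equals n_providers for every duplicated key, that is consistent with the provider join being the only source of the extra rows. Note that the suffix makes the keys unique but keeps all of the duplicate rows in the extract.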

dahealy commented 2 years ago

Per Kevin: Duplicates were being returned for a unique combination of study ID, person code (MRN), and patient study status ID because multiple PIs were being returned.

Cause: MINIVELOS.DM_PATIENT_STATUSES holds only a PI name, not an ID, so we must link on PI name to get the PI ID from the eVelos User table. However, a PI has multiple entries in the User table, based on the site at which the study is being conducted. Example: Dr McGuire has user ID 2292 for care site 50 (UNM), but ID 472 for care site 58 (Veterans Administration).

Solution: added the condition usr.FK_siteID = 50 to the join between MINIVELOS.DM_PATIENT_STATUSES and MINIVELOS.ER_USER (usr).

Data ready for upload 4/21/22.
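The effect of the site condition can be sketched on toy data as follows. This is not the actual extract query: it uses an in-memory SQLite database, drops the MINIVELOS schema prefix, and the column names PK_PATSTUDYSTAT, PI_NAME, PK_USER and USR_NAME are hypothetical placeholders. Only the table names, FK_SITEID, the site value 50, and the two user IDs come from the comment above, and the use of an inner join is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Column names other than FK_SITEID are hypothetical placeholders.
CREATE TABLE DM_PATIENT_STATUSES (PK_PATSTUDYSTAT INTEGER, PI_NAME TEXT);
CREATE TABLE ER_USER (PK_USER INTEGER, USR_NAME TEXT, FK_SITEID INTEGER);

INSERT INTO DM_PATIENT_STATUSES VALUES (1, 'Dr McGuire');
-- The same PI has one user row per site.
INSERT INTO ER_USER VALUES (2292, 'Dr McGuire', 50), (472, 'Dr McGuire', 58);
""")

# Joining on PI name alone yields one row per user entry for that PI.
before = conn.execute("""
SELECT ps.PK_PATSTUDYSTAT, usr.PK_USER
FROM DM_PATIENT_STATUSES ps
JOIN ER_USER usr
  ON usr.USR_NAME = ps.PI_NAME
ORDER BY usr.PK_USER
""").fetchall()

# Restricting the join to site 50 leaves a single user row per PI.
after = conn.execute("""
SELECT ps.PK_PATSTUDYSTAT, usr.PK_USER
FROM DM_PATIENT_STATUSES ps
JOIN ER_USER usr
  ON usr.USR_NAME = ps.PI_NAME
 AND usr.FK_SITEID = 50
ORDER BY usr.PK_USER
""").fetchall()

print(before)
print(after)
```

One design point worth checking against the real data: with an inner join, a patient status whose PI has no user entry at site 50 would drop out of the result entirely, whereas a left join with the same condition would keep the row with a null PI ID.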