EHDEN / ETL-UK-Biobank

ETL UK-Biobank
https://ehden.github.io/ETL-UK-Biobank/

Duplicate visit_occurrence_ids for HES tables #363

Open MaximMoinat opened 2 years ago

MaximMoinat commented 2 years ago

Within a single `dsource` the visit_occurrence_id is unique, but not across `dsource` values. The id is created by concatenating '3', `eid` and `spell_index`. The combination of `eid` and `spell_index` is not unique within the HESIN table, but we do deduplicate. However, we also use `dsource` in this query (see below). Apparently the `spell_index` is reused for different `dsource` values (HES, PEDW, SMR). We might want to add the `dsource` to the lookup key.

https://github.com/EHDEN/ETL-UK-Biobank/blob/2da7023a3b200bb174d54577f606f707f7c21b2a/src/main/python/transformation/hesin_to_visit_occurrence.py#L20
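
To illustrate the collision, here is a minimal sketch, not the actual ETL code. The rows are made-up values, and `current_key`, `key_with_dsource` and `find_collisions` are hypothetical helpers. The sketch only assumes what is described above: the id is '3' followed by `eid` and `spell_index`, and `spell_index` can repeat across `dsource` values.

```python
from collections import defaultdict

# Hypothetical HESIN rows as (eid, spell_index, dsource); values are made up.
rows = [
    ("1000001", "0", "HES"),
    ("1000001", "0", "PEDW"),  # same eid and spell_index, different dsource
    ("1000001", "1", "HES"),
    ("1000001", "1", "HES"),   # exact duplicate, removed by deduplication
]


def current_key(eid, spell_index, dsource):
    # Key as described in the issue: '3' + eid + spell_index (dsource ignored).
    return f"3{eid}{spell_index}"


def key_with_dsource(eid, spell_index, dsource):
    # Hypothetical alternative: make dsource part of the lookup key.
    return (dsource, eid, spell_index)


def find_collisions(rows, make_key):
    """Return keys that map to more than one distinct row."""
    seen = defaultdict(set)
    for row in set(rows):  # drop identical rows first
        seen[make_key(*row)].add(row)
    return {key: sorted(group) for key, group in seen.items() if len(group) > 1}


collisions_current = find_collisions(rows, current_key)
collisions_fixed = find_collisions(rows, key_with_dsource)

print(collisions_current)  # the HES and PEDW rows share one key
print(collisions_fixed)    # no collisions once dsource is in the key
```

How `dsource` would be encoded into the final numeric visit_occurrence_id is left open here; the sketch only shows that including it in the key separates the colliding rows.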