MIT-LCP / mimic-omop

Mapping the MIMIC-III database to the OMOP schema
MIT License

Duplicate concepts for prescriptions when NDC is null or zero #42

Open alistairewj opened 6 years ago

alistairewj commented 6 years ago

The ETL is currently duplicating rows because some labels appear more than once in one of the concept mappings (gcpt_prescriptions_ndcisnullzero_to_concept). This query lists them:

select
  c3.*, c.numobs
from gcpt_prescriptions_ndcisnullzero_to_concept c3
inner join
(
  select label, count(*) as numobs
  from gcpt_prescriptions_ndcisnullzero_to_concept
  group by label
  having count(*) > 1
) c
  on c3.label = c.label
order by c3.label, c3.concept_id;

gives:

| label | concept_id | concept_name | mimic_id | numobs |
| --- | --- | --- | --- | --- |
| Advair Diskus 100/50MCG | 40170634 | fluticasone / salmeterol Dry Powder Inhaler [Advair] | 2001045339 | 2 |
| Advair Diskus 100/50MCG | 40171027 | Fluticasone propionate 0.1 MG/ACTUAT / salmeterol 0.05 MG/ACTUAT [Advair] | 2001045340 | 2 |
| desvenlafaxine 100 mg | 1593106 | desvenlafaxine succinate 100 MG | 2001046244 | 2 |
| desvenlafaxine 100 mg | 19129663 | Desvenlafaxine 100 MG | 2001046243 | 2 |
| Meperidine HCl 100MG/2ML AMP | 1102527 | Meperidine | 2001045607 | 2 |
| Meperidine HCl 100MG/2ML AMP | 40164998 | Meperidine Hydrochloride 100 MG/ML | 2001045608 | 2 |

The offending rows are lines 308-309, lines 575-576, and lines 1211-1212.
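
For anyone wondering why a repeated label in the mapping duplicates ETL output, here is a minimal sketch of the fan-out. It assumes the ETL joins prescriptions to the mapping on the drug label, which I have not confirmed against the ETL script. The CTE named rx and its columns are hypothetical stand-ins, and the mapping CTE just copies the two Advair rows from the result above.

```sql
-- Sketch only. Both CTEs are toy stand-ins, not the real tables:
--   mapping mimics gcpt_prescriptions_ndcisnullzero_to_concept (two rows copied from above)
--   rx is a hypothetical one-row prescriptions extract
with mapping (label, concept_id) as (
  values
    ('Advair Diskus 100/50MCG', 40170634),
    ('Advair Diskus 100/50MCG', 40171027)
),
rx (row_id, drug) as (
  values (1, 'Advair Diskus 100/50MCG')
)
select
  rx.row_id, rx.drug, m.concept_id
from rx
left join mapping m
  on rx.drug = m.label  -- assumed join key
order by m.concept_id;
-- The single rx row comes back twice, once per matching concept_id.
```

Keeping exactly one mapping row per label would bring this back to one output row per prescription.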

@aparrot89 can you recommend which concept we should keep for each of these labels?