OHDSI / CommonDataModel

Definition and DDLs for the OMOP Common Data Model (CDM)
https://ohdsi.github.io/CommonDataModel

Proposal: add target_domain_id to source_to_concept_map #385

Closed sdebruyn closed 3 years ago

sdebruyn commented 3 years ago

During the ETL process you use source_to_concept_map to map source codes to concepts, but most *_concept_id fields only accept concepts from a single domain.

If you want to filter the concepts in source_to_concept_map by domain_id in v6.0.0, you have to join to the concept table to get the domain information.

This join is expensive and can be avoided by adding the domain_id column to source_to_concept_map as well.
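
For reference, the join being discussed can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not the official DDL: the tables carry only the columns needed for the join, the rows are made up, and `mappings_for_domain` is a hypothetical helper name.

```python
import sqlite3

# Assumption: trimmed-down versions of the two CDM tables, with only the
# columns needed here. The real DDL has more columns and constraints.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id   INTEGER PRIMARY KEY,
    concept_name TEXT NOT NULL,
    domain_id    TEXT NOT NULL
);
CREATE TABLE source_to_concept_map (
    source_code          TEXT NOT NULL,
    source_vocabulary_id TEXT NOT NULL,
    target_concept_id    INTEGER NOT NULL REFERENCES concept (concept_id)
);
""")

# Made-up example rows, not real vocabulary content.
conn.executemany("INSERT INTO concept VALUES (?, ?, ?)", [
    (1001, "Example condition", "Condition"),
    (1002, "Example drug", "Drug"),
])
conn.executemany("INSERT INTO source_to_concept_map VALUES (?, ?, ?)", [
    ("A1", "local_codes", 1001),
    ("B2", "local_codes", 1002),
])


def mappings_for_domain(conn, domain_id):
    """Return (source_code, target_concept_id) pairs whose target concept
    belongs to the given domain, resolved by joining to concept."""
    return conn.execute(
        """
        SELECT stcm.source_code, stcm.target_concept_id
        FROM source_to_concept_map AS stcm
        JOIN concept AS c ON c.concept_id = stcm.target_concept_id
        WHERE c.domain_id = ?
        ORDER BY stcm.source_code
        """,
        (domain_id,),
    ).fetchall()


condition_mappings = mappings_for_domain(conn, "Condition")
print(condition_mappings)  # [('A1', 1001)]
```

The join key is concept.concept_id, which is the primary key of the concept table, so the lookup is an indexed one in this sketch.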

cgreich commented 3 years ago

@sdebruyn:

Normalized data models work that way: you store things much more efficiently and, more importantly, you avoid integrity problems (such as the domain_id in SOURCE_TO_CONCEPT_MAP becoming inconsistent with the one in CONCEPT), but you have to do a join. The OMOP CDM is highly normalized, with few exceptions. For a reasonably dimensioned database system this is no problem; such joins are not expensive at all.
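
The integrity problem mentioned above can be illustrated with a small sketch, again using Python's sqlite3 module. The `target_domain_id` column is the hypothetical column from this proposal and does not exist in the CDM; the tables are trimmed down and the rows are made up. Once the domain of a concept changes in CONCEPT, the copied value is silently stale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    domain_id  TEXT NOT NULL
);
CREATE TABLE source_to_concept_map (
    source_code       TEXT NOT NULL,
    target_concept_id INTEGER NOT NULL REFERENCES concept (concept_id),
    target_domain_id  TEXT NOT NULL  -- hypothetical denormalized copy
);
INSERT INTO concept VALUES (1001, 'Observation');
INSERT INTO source_to_concept_map VALUES ('A1', 1001, 'Observation');
""")

# Simulate a change to the concept's domain. Nothing forces the copied
# column in source_to_concept_map to follow.
conn.execute(
    "UPDATE concept SET domain_id = 'Measurement' WHERE concept_id = 1001"
)

# Detecting the stale copies requires the very join the extra column
# was meant to avoid.
stale = conn.execute(
    """
    SELECT stcm.source_code, stcm.target_domain_id, c.domain_id
    FROM source_to_concept_map AS stcm
    JOIN concept AS c ON c.concept_id = stcm.target_concept_id
    WHERE stcm.target_domain_id <> c.domain_id
    """
).fetchall()
print(stale)  # [('A1', 'Observation', 'Measurement')]
```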

BTW: SOURCE_TO_CONCEPT_MAP is becoming obsolete anyway. You should use CONCEPT_RELATIONSHIP instead. Eventually, though, mapping will be taken over by the wide mapping table.