OHDSI / CommonDataModel

Definition and DDLs for the OMOP Common Data Model (CDM)
https://ohdsi.github.io/CommonDataModel

How to align data standards at an international level and reflect them as standard vocabulary in CDM: SNOMED CT as an example #481

Closed lzhuuk closed 2 years ago

lzhuuk commented 2 years ago

The further we get into the OMOP work, the more we find a disconnect between what OMOP considers the "standard" vocabulary and what other countries consider their "data standard", and vice versa. If OMOP wants to grow its international influence and participation, it seems necessary to expand the "standard vocabulary" and its subset of acceptable concepts by reviewing each CDM data field. From the UK's perspective, SNOMED CT is our NHS data standard. You would probably want the NHS in your customer base because of its size and influence, so aligning with SNOMED CT is essential to remove any potential hindrance. I will be logging issues with specific CDM data fields in the forum, e.g. admitted_from_concept, discharged_to_source, Race, Gender, but I am raising this here as a matter of principle for discussion.

cgreich commented 2 years ago

@lzhuuk:

Not sure I follow. SNOMED is used for most of Conditions and Procedures. Where do you have a disconnect?

lzhuuk commented 2 years ago

Hi Christian,

The issue is twofold.

  1. Standard concept terminology system;

NHS England has mandated SNOMED CT as the data standard in the UK. In fact, within the EHR system, we should aim for "one language", not multiple, and SNOMED CT has been chosen as that language. This covers both clinical and non-clinical data fields. I am still working my way down the list, but so far the disconnects are as follows. Non-clinical fields, for example:

  1. gender_concept (we call it Biological sex in the UK)
  2. race_concept (we call Ethnicity in the UK)
  3. admitted_from
  4. discharged_to

Clinical fields, for example drugs: RxNorm is not used in the UK. Instead, we use dm+d (a "subset" of SNOMED CT) or FDB.

  2. Acceptable non-standard concepts: are they complete? For example (using SNOMED CT to explain), the admitted_from field has 516 acceptable SNOMED CT concepts in the Visit domain: 390 valid and 126 invalid. They are descendants of 276339004 |Environment (environment)|. However, not all concepts in that hierarchy are acceptable. What is the rationale behind limiting it to those, and what criteria does the team use? Some concepts that are very commonly used within the NHS are not there. From an implementation perspective, would it be better to allow the entire 276339004 |Environment (environment)| hierarchy to be acceptable?
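
To make the subset question concrete, here is a minimal sketch of how one might compare a full hierarchy against an accepted subset. It assumes concept_ancestor-style and concept-style rows held in memory. All concept IDs, names, and domain assignments below are hypothetical toy values (only "Corrosive environment" is taken from this thread), and the rule "Visit domain plus standard flag" is an assumption for illustration, not the criteria the vocabulary team actually uses.

```python
# Toy stand-ins for OMOP vocabulary tables. Every concept_id here is hypothetical.
ENVIRONMENT = 1  # plays the role of 276339004 |Environment (environment)|

# concept_ancestor-style rows: (ancestor_concept_id, descendant_concept_id)
concept_ancestor = [
    (1, 1), (1, 2), (1, 3), (1, 4),
    (2, 2), (3, 3), (4, 4),
]

# concept-style rows: concept_id -> (concept_name, domain_id, standard_concept)
# Domain and standard flags are made up for the example.
concept = {
    1: ("Environment", "Observation", None),
    2: ("Hospital", "Visit", "S"),
    3: ("Nursing home", "Visit", "S"),
    4: ("Corrosive environment", "Observation", None),
}


def descendants(root):
    """Return all descendant concept_ids of root (including root itself)."""
    return {d for a, d in concept_ancestor if a == root}


def split_by_acceptance(root, domain="Visit"):
    """Split the hierarchy under root into accepted and excluded concept names.

    'Accepted' here means: in the given domain and flagged standard ('S').
    That rule is an assumption made for this sketch.
    """
    accepted, excluded = [], []
    for cid in sorted(descendants(root)):
        name, domain_id, standard = concept[cid]
        if domain_id == domain and standard == "S":
            accepted.append(name)
        else:
            excluded.append(name)
    return accepted, excluded


accepted, excluded = split_by_acceptance(ENVIRONMENT)
print("accepted:", accepted)
print("excluded:", excluded)
```

Running the same comparison against a real vocabulary download would list exactly which commonly used NHS concepts fall into the excluded group, which could make the follow-up forum issues easier to discuss.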

cgreich commented 2 years ago

Ah, ok. Here is the situation:

Overall, we need to have ONE standard for all. Can't have a standard for the UK, one for the US, one for China. Which is why we are supporting people with mapping relationships, so they can get their data standardized without losing content.

Are there any SNOMED places you are missing?
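
As a rough illustration of the mapping relationships mentioned above, the sketch below looks up the standard concept for a source concept through "Maps to" rows, in the style of the CONCEPT_RELATIONSHIP table. The source concept IDs (1001, 1002) and the rows themselves are hypothetical; 8507 and 8532 are the OMOP standard gender concepts for MALE and FEMALE, and 0 is used for "no matching concept", following the usual OMOP convention.

```python
# concept_relationship-style rows: (concept_id_1, relationship_id, concept_id_2)
# Source concept_ids 1001 and 1002 are hypothetical placeholders for local codes.
concept_relationship = [
    (1001, "Maps to", 8507),  # local "male" code   -> OMOP standard MALE
    (1002, "Maps to", 8532),  # local "female" code -> OMOP standard FEMALE
    (1001, "Is a", 1000),     # non-mapping relationship, ignored below
]


def to_standard(source_concept_id):
    """Return the standard concept_ids a source concept maps to.

    Falls back to [0] (no matching concept) when no 'Maps to' row exists.
    """
    targets = [
        c2
        for c1, rel, c2 in concept_relationship
        if c1 == source_concept_id and rel == "Maps to"
    ]
    return targets or [0]


for source in (1001, 1002, 9999):
    print(source, "->", to_standard(source))
```

The original source value is kept alongside the standard concept in the CDM's source fields, which is how content is retained while the analysis layer stays on one standard.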

lzhuuk commented 2 years ago

Before we tell you about any other places where we are missing concepts, I would rather resolve the issues one by one. To address your comments:

  1. "Overall, we need to have ONE standard for all. Can't have a standard for the UK, one for the US, one for China. Which is why we are supporting people with mapping relationships." If we agree that an international standard is required, that is good. My understanding is that SNOMED CT is that standard, which is why the UK is endorsing it, and I see no reason why OHDSI shouldn't. Mapping relationships are only there for situations where we can't agree on an international standard, so if we sort out the first, the second will resolve itself.

Gender/sex: debate is good, and manual mapping is the approach we adopted, but fundamentally services shouldn't have to do it manually if we can endorse an international standard.

Admitted/discharged: RE: "what does it mean to be referred to a hospital from, say, a [Corrosive environment]? That is why we are limiting the available standards." What you are talking about and the issue I raised are very different. Information specificity is not what I am referring to here. My point is that I don't think the criteria you use to define that subset help NHS organisations submit their data. I would expect NHS content to be evaluated as well before you decide not to "consider" it at all. It is equally unjustifiable if we have to manually map to one of your "acceptable" concepts and the mapping process leads to a total loss of the original meaning. Surely we would want to retain the meaning of the data entry as much as we can, with quality and specificity reviewed centrally. We don't feel the limited subset strikes that balance, so we are taking the time to raise this issue, because it is not going to affect just one organisation.

I am sorry, but all in all I don't think the issues we raised are being resolved.

cgreich commented 2 years ago

@lzhuuk:

I know what you are saying. There is a standard, it is used by the NHS, and you would prefer to have the same standard in OMOP. I got it. And there is nothing wrong with it.

But: It isn't the standard now. Changing the standard is VERY costly, because all other data holders will have to change it as well. To justify such a thing you need to produce a very good reason showing the current standard doesn't work for the use cases, and yours does.

This is not the case here. To capture sex you need a few concepts. Whether they are created by OMOP or by SNOMED doesn't matter. You can convert one into the other (even though you feel it is inconvenient). To capture discharge after hospitalization, it is the same thing. There is no loss of meaning, just a different representation from what you guys have.