OHDSI / Vocabulary-v5.0

Build process for the OHDSI Standardized Vocabularies. Currently not available as independent release.
The Unlicense

Non-Standard Codes with No Standard Mappings #260

Open noahgengel opened 4 years ago

noahgengel commented 4 years ago

I recently delved into my data and found that there are many non-standard codes that do not have mappings to standard codes. I identified these codes in my dataset and attached said codes (including the number of rows affected by the lack of a standard concept) to this issue. The codes are separated according to their corresponding OMOP table.

The codes may not have standard concepts for the following reasons:

  1. No equivalent standard concept exists at this time
  2. Such a standard concept exists, but the non-standard-to-standard linkage was never established
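For readers who want to reproduce this kind of check, here is a minimal sketch of how non-standard concepts without a valid 'Maps to' relationship might be found. It assumes the OMOP vocabulary tables `concept` and `concept_relationship` with their usual columns; the in-memory SQLite database, the function names, and the toy concept IDs are hypothetical and exist only for illustration.

```python
import sqlite3


def find_unmapped_nonstandard(conn):
    """Return (concept_id, concept_name) for non-standard concepts
    that have no valid 'Maps to' row in concept_relationship."""
    sql = """
        SELECT c.concept_id, c.concept_name
        FROM concept AS c
        LEFT JOIN concept_relationship AS cr
               ON cr.concept_id_1 = c.concept_id
              AND cr.relationship_id = 'Maps to'
              AND cr.invalid_reason IS NULL
        WHERE (c.standard_concept IS NULL OR c.standard_concept <> 'S')
          AND cr.concept_id_2 IS NULL
        ORDER BY c.concept_id
    """
    return conn.execute(sql).fetchall()


def build_toy_vocabulary():
    """Build a tiny in-memory vocabulary (hypothetical IDs, not real concepts)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept (
            concept_id INTEGER PRIMARY KEY,
            concept_name TEXT,
            standard_concept TEXT
        );
        CREATE TABLE concept_relationship (
            concept_id_1 INTEGER,
            concept_id_2 INTEGER,
            relationship_id TEXT,
            invalid_reason TEXT
        );
    """)
    conn.executemany("INSERT INTO concept VALUES (?, ?, ?)", [
        (1, "toy standard concept", "S"),
        (2, "toy non-standard, mapped", None),
        (3, "toy non-standard, unmapped", None),
    ])
    conn.executemany("INSERT INTO concept_relationship VALUES (?, ?, ?, ?)", [
        (1, 1, "Maps to", None),  # standard concepts map to themselves
        (2, 1, "Maps to", None),
    ])
    return conn


if __name__ == "__main__":
    for concept_id, name in find_unmapped_nonstandard(build_toy_vocabulary()):
        print(concept_id, name)
```

A query like this only finds concepts that lack a mapping. Deciding whether a given concept falls under reason 1 or reason 2 still requires clinical review.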

I also attached files that categorize the issue type (1 or 2, above) and propose mappings (for issue type 2). The issue category and proposed mapping for each concept were determined by a clinician at Columbia University Medical Center. I recognize that the OHDSI team has its own clinicians who will determine appropriate mappings and decide whether a new concept needs to be created, but I figured I would attach the files in case they are of any use.

NOTE: The dataset I am using pulls data from ~35 different submissions. The column that reads 'num_sites_w_source_concept' indicates how many of these submissions used a particular non-standard concept. If the 'num_sites_w_source_concept' is 1, creating a new standard concept would be less imperative.
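As a rough illustration of how the per-concept counts described above could be derived, the sketch below tallies affected rows and distinct submissions for each unmapped source concept. The input shape `(site_id, source_concept_id)`, the function name, and the `num_rows` key are assumptions made for this example; only `num_sites_w_source_concept` comes from the attached files.

```python
from collections import defaultdict


def summarize_unmapped(rows):
    """Summarize unmapped source concepts.

    rows: iterable of (site_id, source_concept_id) pairs, one per record
    whose standard concept could not be resolved (hypothetical input shape).
    """
    sites = defaultdict(set)
    row_counts = defaultdict(int)
    for site_id, source_concept_id in rows:
        sites[source_concept_id].add(site_id)
        row_counts[source_concept_id] += 1
    return {
        concept_id: {
            "num_rows": row_counts[concept_id],
            "num_sites_w_source_concept": len(sites[concept_id]),
        }
        for concept_id in row_counts
    }


if __name__ == "__main__":
    # Toy records: site labels and concept IDs are placeholders.
    toy_rows = [("A", 101), ("A", 101), ("B", 101), ("A", 202)]
    for concept_id, stats in summarize_unmapped(toy_rows).items():
        print(concept_id, stats)
```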

Please let me know if I can further clarify this issue or provide any additional information.

condition_no_dest.xlsx condition_proposed_mappings.xlsx drug_no_dest.xlsx measurement_no_dest.xlsx measurement_proposed_mappings.xlsx procedure_no_dest.xlsx procedure_proposed_mappings.xlsx

p-talapova commented 4 years ago

Dear @noahgengel,

Thank you for reporting! The Vocabulary team has started working on the issue, and I will let you know about a possible solution once everything becomes clear.

p-talapova commented 4 years ago

@noahgengel, while analyzing the concepts you sent, we noticed some issues.
Please take a look at the attached file. There you will see the following sheets:

FYI - Invalid concept issues .xlsx

noahgengel commented 4 years ago

@p-talapova

Thank you for the quick response. As a follow-up:

drug_source_concept_id | drug_source_value | drug_concept_id
32355 | KETOROLAC 30 MG/ML 2ML SDV INJ | 0
32355 | ketorolac | 0
32355 | ketorolac 30 mg/mL 2 mL inj | 0
32355 | KETOROLAC 30 MG/ML 1ML SDV INJ | 0
32355 | ketorolac 30 mg/mL 1 mL inj | 0

As the example above shows, the '32355' from the source system represents something different from the concept 32355 normally found in OMOP. I will correct this so that the source_concept_id is consistent with the source_value.

This is similar to the issue with 'incorrect source_concept_ids.' Again, there appear to be discrepancies between the source_value and the source_concept_id. I am going to determine the scope of this problem.
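One way to start scoping such discrepancies is a coarse screen that compares each record's source value with the name of the concept its source_concept_id points to. The sketch below is only a heuristic under stated assumptions: the function name, the token-overlap rule, and the placeholder concept IDs and names are all hypothetical, and a real check would look names up in the vocabulary's `concept` table.

```python
def flag_mismatched_source_ids(records, concept_names):
    """Flag records whose source value shares no word with the concept name.

    records: iterable of (source_concept_id, source_value) pairs.
    concept_names: dict mapping concept_id -> concept_name (hypothetical lookup).
    """
    flagged = []
    for source_concept_id, source_value in records:
        name = concept_names.get(source_concept_id)
        if name is None:
            flagged.append((source_concept_id, source_value, "id not in vocabulary"))
            continue
        # Case-insensitive whitespace tokenization; deliberately crude.
        name_tokens = set(name.lower().split())
        value_tokens = set(source_value.lower().split())
        if not name_tokens & value_tokens:
            flagged.append(
                (source_concept_id, source_value, "no shared token with concept name")
            )
    return flagged


if __name__ == "__main__":
    # Placeholder IDs and names, not real OMOP concepts.
    toy_names = {
        900001: "placeholder unrelated concept",
        900002: "ketorolac 30 mg/ml injection",
    }
    toy_records = [
        (900001, "KETOROLAC 30 MG/ML 2ML SDV INJ"),
        (900002, "ketorolac"),
        (900003, "ketorolac"),
    ]
    for row in flag_mismatched_source_ids(toy_records, toy_names):
        print(row)
```

Token overlap will miss synonyms and abbreviations, so flagged rows are candidates for manual review rather than confirmed errors.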

As a point of clarification: were you able to find any concept_ids that were useful/warranted the creation of a new standard concept?

margshin commented 3 years ago

Hi all,

I am following up regarding issue type 2 above. We have found that, with the new vocabulary, deprecated type concepts are not mapped to standard ones. A few examples:

  1. Radiology report with concept_id = 44814641 is now a non-standard concept: https://athena.ohdsi.org/search-terms/terms/44814641

     There is now another concept, EHR Radiology report, that is standard with concept_id = 32841: https://athena.ohdsi.org/search-terms/terms/32841 but I do not see a mapping between the two concepts.

  2. Similarly, Pathology report: https://athena.ohdsi.org/search-terms/terms/44814642 should be mapped to EHR Pathology report: https://athena.ohdsi.org/search-terms/terms/32835

Aside from the Note type concept IDs, many of the non-standard concepts are not mapped:

  1. 32512 (non-standard) to 32516 (standard) for Contributory cause of death
  2. 32885 (non-standard) to 32519 (standard) for US Social Security Death Master File record
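The proposed type-concept mappings mentioned in this comment can be written down as explicit pairs and compared against whatever 'Maps to' pairs a vocabulary release contains. This is a minimal sketch: the concept IDs are the ones quoted above, while the function name and the `existing` set in the usage example are hypothetical and do not reflect the actual state of any release.

```python
# (non-standard concept_id, proposed standard concept_id, label) as quoted in this thread.
PROPOSED_TYPE_MAPPINGS = [
    (44814641, 32841, "Radiology report -> EHR Radiology report"),
    (44814642, 32835, "Pathology report -> EHR Pathology report"),
    (32512, 32516, "Contributory cause of death"),
    (32885, 32519, "US Social Security Death Master File record"),
]


def missing_mappings(proposed, existing_maps_to):
    """Return the proposed (source, target) pairs absent from existing_maps_to.

    existing_maps_to: set of (concept_id_1, concept_id_2) pairs taken from
    valid 'Maps to' rows of concept_relationship.
    """
    return [
        (source_id, target_id)
        for source_id, target_id, _label in proposed
        if (source_id, target_id) not in existing_maps_to
    ]


if __name__ == "__main__":
    # Hypothetical: pretend only the first mapping already exists.
    existing = {(44814641, 32841)}
    print(missing_mappings(PROPOSED_TYPE_MAPPINGS, existing))
```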

Please let me know if you need any further information. Thank you! Margaret

cukarthik commented 3 years ago

Just following up to see when this might be addressed. @p-talapova @dimshitc

TinyRickC137 commented 1 month ago

5 years into the issue - time to solve it. Kudos to @m-khitrun for investigating this.

@noahgengel - we rechecked your files. Most of the concepts are now either mapped or belong to vocabularies that are not currently supported (like CIEL, which will soon be supported again with the help of Andrew Kanter :) ). In addition, some codes are corrupted and cannot be identified in the vocabularies.

@cukarthik @margshin How big is the problem with types for you? Are you interested in finding all of the unmapped types and mapping them via Community contribution?