OHDSI / Vocabulary-v5.0

Build process for the OHDSI Standardized Vocabularies. Currently not available as independent release.
The Unlicense
206 stars 73 forks source link

Possible duplicate Brand Names in RxNorm and RxNorm Extension #950

Open thisismexp opened 5 months ago

thisismexp commented 5 months ago

Describe the problem in content or desired feature
There are valid Brand Names in the RxNorm and RxNorm Extension vocabulary that differ only in their capitalization or only by having an extra space in them. This issue seems to be more prevalent in RxNorm Extension, but affects both vocabularies.

How to find it

WITH names as (SELECT DISTINCT concept_name
               FROM CONCEPT
               WHERE vocabulary_id IN ('RxNorm', 'RxNorm Extension')
                 AND concept_class_id = 'Brand Name'
                 AND invalid_reason = '')
SELECT *
FROM names
GROUP BY REPLACE(lower(concept_name), ' ', '')
HAVING count(*) > 1;

Expected adjustments
Investigate, deprecate one of the duplicates and adjust all related concepts.

Additional context - Examples concept_id concept_name vocabulary_id concept_class_id
44022153 Vitaplex RxNorm Extension Brand Name
44022295 Vita Plex RxNorm Extension Brand Name

or

concept_id concept_name vocabulary_id concept_class_id
19013606 AT 10 RxNorm Brand Name
21017917 AT10 RxNorm Extension Brand Name
TinyRickC137 commented 5 months ago

Thank you for this catch. They are duplicates, we are focusing on other domains at the moment, but the Drug overhaul is coming soon. Please stay tuned, we post our planned activities here: Roadmap.

thisismexp commented 5 months ago

@TinyRickC137 thanks for the fast reply, I discovered some more issues (mainly with RxNorm/RxE). Do you mind if I just go ahead and open them?

TinyRickC137 commented 5 months ago

@thisismexp Could you please go through the issues first? I believe most of them exist already. You can use tags.

thisismexp commented 5 months ago

Sure, I do my best to not post duplicates.