OHDSI / Usagi

Usagi is an application to help create mappings between coding systems and the Vocabulary standard concepts.
http://ohdsi.github.io/Usagi/

Gap between Concepts and Searchable Terms #61

Closed Butler925 closed 4 years ago

Butler925 commented 4 years ago

Hello,

First of all, thank you for all your hard work on USAGI. It is an incredibly helpful tool!

I am working on a project mapping drug names to the OMOP CDM using USAGI and noticed a few common drug names were not being mapped correctly. For example, when searching for the string "Paxil," the top match is "Paxillus" which is an Observation in SNOMED. When restricting by RxNorm and RxNorm Extension vocabularies, the top match is "Paxil Pill" (score = 0.79) despite there being an RxNorm concept available in the CDM with an exact string match (concept_id = 19011044).

Additionally, in the USAGI interface, Brand Name was not an option in the 'filter by concept class' drop-down. Finally, 'Show Index Statistics' lists 5.38m concepts but only 4.52m searchable terms, and that gap may be the underlying cause of this problem.

The vocabulary was downloaded from ATHENA on 10/2/2019 with all available vocabularies selected. I uploaded these files to a SQL database and am able to find the Paxil concept described above (RxNorm; id = 19011044), but am not able to see it or match to it in USAGI. Do you have any idea why this might be happening?

Any feedback is greatly appreciated! And again, thank you for all your hard work!

schuemie commented 4 years ago

Hi @Butler925! The goal of Usagi is to map codes to Standard Concepts, the concepts that are allowed to be used in the Common Data Model. So only terms that map to Standard Concepts are included in the Usagi search index.

The example you provide (concept 19011044) is a non-standard concept. It is a brand name, and it is not something you should map to. Instead, you should map to the generic name Paroxetine. Do you have the generic names of the drug codes you are mapping?

Butler925 commented 4 years ago

Hi @schuemie, thanks for your quick response! Unfortunately, the dataset we have only contains brand names. We were hoping to map to the Brand Name concepts and then use the "Brand name of" relationship_id within the OMOP CDM to identify the key ingredient. Please let me know if you know of a better way to accomplish this!

cgreich commented 4 years ago

@Butler925

That would be a good improvement to the tool. But right now you'd have to do it by hand in Athena or in the database: find your brand name and use the relationship to figure out which ingredient(s) it relates to. It's often one-to-one, but every multi-ingredient drug has a single brand name covering several ingredients. Good luck, I know this really is sifting through bad jargon.
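For anyone doing this in the database, here is a minimal sketch of that lookup in Python with an in-memory SQLite database. It assumes the standard OMOP `concept` and `concept_relationship` tables, cut down here to a few columns, and the "Brand name of" relationship mentioned earlier in the thread. Only concept 19011044 (Paxil) comes from this thread. Every other concept_id and the name "ExampleCombo" are made-up placeholders, and `ingredients_for_brand` is a hypothetical helper, not part of Usagi.

```python
import sqlite3


def build_toy_vocab():
    """Create a tiny in-memory stand-in for the OMOP vocabulary tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE concept (
            concept_id       INTEGER PRIMARY KEY,
            concept_name     TEXT,
            vocabulary_id    TEXT,
            concept_class_id TEXT,
            standard_concept TEXT
        );
        CREATE TABLE concept_relationship (
            concept_id_1    INTEGER,
            concept_id_2    INTEGER,
            relationship_id TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?, ?, ?, ?)",
        [
            # Brand name from the thread (non-standard concept).
            (19011044, "Paxil", "RxNorm", "Brand Name", None),
            # Placeholder ids below: NOT real OMOP concept_ids.
            (900001, "paroxetine", "RxNorm", "Ingredient", "S"),
            (900010, "ExampleCombo", "RxNorm", "Brand Name", None),
            (900011, "ingredient A", "RxNorm", "Ingredient", "S"),
            (900012, "ingredient B", "RxNorm", "Ingredient", "S"),
        ],
    )
    conn.executemany(
        "INSERT INTO concept_relationship VALUES (?, ?, ?)",
        [
            (19011044, 900001, "Brand name of"),
            (900010, 900011, "Brand name of"),
            (900010, 900012, "Brand name of"),
        ],
    )
    return conn


def ingredients_for_brand(conn, brand_concept_id):
    """Return (concept_id, concept_name) of standard ingredients for a brand."""
    return conn.execute(
        """
        SELECT ing.concept_id, ing.concept_name
        FROM concept_relationship cr
        JOIN concept ing ON ing.concept_id = cr.concept_id_2
        WHERE cr.concept_id_1 = ?
          AND cr.relationship_id = 'Brand name of'
          AND ing.concept_class_id = 'Ingredient'
          AND ing.standard_concept = 'S'
        ORDER BY ing.concept_id
        """,
        (brand_concept_id,),
    ).fetchall()


conn = build_toy_vocab()
print(ingredients_for_brand(conn, 19011044))  # single-ingredient brand
print(ingredients_for_brand(conn, 900010))    # multi-ingredient brand
```

The second call returns two rows, which is the one-brand-to-many-ingredients case. Against a real ATHENA download you would likely also want to filter on `invalid_reason`, which this toy schema leaves out.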

Butler925 commented 4 years ago

Hi @cgreich, thank you for the suggestion! We ended up building a local SQL database of the OMOP files and have been using string matching to automate at least some of this process. Thank you both for your quick responses!
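The thread does not show the matching code itself, so the following is only an illustrative sketch of one way such string matching could look: an exact match on brand names after normalizing case and whitespace, run against a cut-down `concept` table in SQLite. The helper names `normalize` and `match_brand_names` and the placeholder row 900020 are assumptions made for the example. Only concept 19011044 (Paxil) is taken from the discussion above.

```python
import sqlite3


def normalize(name):
    """Lower-case and collapse whitespace so trivial differences still match."""
    return " ".join(name.lower().split())


def match_brand_names(conn, source_names):
    """Map each source string to the Brand Name concept_ids with the same normalized name."""
    lookup = {}
    rows = conn.execute(
        "SELECT concept_id, concept_name FROM concept "
        "WHERE concept_class_id = 'Brand Name'"
    )
    for concept_id, concept_name in rows:
        lookup.setdefault(normalize(concept_name), []).append(concept_id)
    return {name: lookup.get(normalize(name), []) for name in source_names}


# Toy data: a cut-down concept table, not the full OMOP schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept ("
    "concept_id INTEGER PRIMARY KEY, concept_name TEXT, concept_class_id TEXT)"
)
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?)",
    [
        (19011044, "Paxil", "Brand Name"),          # from the thread
        (900020, "Paxil Pill", "Branded Drug Form"),  # placeholder id and class
    ],
)

matches = match_brand_names(conn, ["  PAXIL ", "Paxillus"])
print(matches)
```

An empty list marks a source string that still needs manual review. Exact matching avoids near-miss hits like "Paxillus", at the cost of missing misspellings.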

asivura commented 3 years ago

@cgreich @schuemie From my point of view this looks like a bug. Even if you remove the "Standard Concepts" filter, you don't see Brand Names in the search results, although other non-standard concepts do appear there.

We use Usagi to annotate information extracted from medical notes. Sometimes the brand name is all we have there.