Open Sigfried opened 1 year ago
I think this is a very specific problem, and it would take too much time to make it a near-term priority for the TermHub UI.
This does look mostly like a SQL problem. Maybe I'm not understanding the problem fully, but couldn't the algorithm be as simple as this?
concept_relationship does not have any row for <mono_drug_concept_id>, RxNorm has ing, <target_ingredient_concept_id>
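To make the check above concrete, here is a minimal sketch using an in-memory SQLite table. It assumes the OMOP `concept_relationship` columns `concept_id_1`, `relationship_id`, and `concept_id_2`, and it assumes that `RxNorm has ing` rows run from the drug concept to the ingredient. That direction, and whether the relationship is populated for every drug concept class, would need to be verified against the real vocabulary. All concept IDs below are made up, and `drugs_linked_to_ingredient` is a hypothetical helper.

```python
import sqlite3


def drugs_linked_to_ingredient(conn, candidate_ids, target_ingredient_id):
    """Return the candidate drug IDs that have an 'RxNorm has ing' row
    pointing at the target ingredient (direction is an assumption)."""
    if not candidate_ids:
        return set()
    placeholders = ",".join("?" for _ in candidate_ids)
    sql = f"""
        SELECT DISTINCT concept_id_1
        FROM concept_relationship
        WHERE relationship_id = 'RxNorm has ing'
          AND concept_id_2 = ?
          AND concept_id_1 IN ({placeholders})
    """
    rows = conn.execute(sql, [target_ingredient_id, *candidate_ids]).fetchall()
    return {row[0] for row in rows}


# Toy data with hypothetical IDs: 1 = target ingredient, 2 = metformin,
# 101 = combo drug (target + metformin), 102 = metformin monotherapy.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept_relationship ("
    "concept_id_1 INTEGER, relationship_id TEXT, concept_id_2 INTEGER)"
)
conn.executemany(
    "INSERT INTO concept_relationship VALUES (?, ?, ?)",
    [
        (101, "RxNorm has ing", 1),
        (101, "RxNorm has ing", 2),
        (102, "RxNorm has ing", 2),
    ],
)

candidates = [101, 102]
linked = drugs_linked_to_ingredient(conn, candidates, target_ingredient_id=1)
# Drugs with no row for the target ingredient are the ones to drop.
excluded = set(candidates) - linked
```

With this toy data the combo drug is kept and the monotherapy is dropped. Whether the same holds on real data depends on the assumptions above.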
The problem described here has come up repeatedly over the past few months. One very costly solution is proposed in my OHDSI symposium submission from a couple of months ago, but that goes beyond single concept sets, so TermHub can't implement it without greater changes elsewhere. Your solution would be great if we had a way of doing i, ii, and iii, but we don't. That's what the issue is about.
Was working with @stephanieshong on concept set 387143023 / Sulfonylureas (v4) which is a drug class including a bunch of ingredients that appear in combo drugs. Unfortunately (I don't know why) just including the class concept 21600749: Sulfonylureas would miss a bunch of appropriate concepts, so Stephanie included 19 other ancestor concepts as well:
This added an additional 1,079 appropriate concepts, though only three of the ancestors (glyburide, glipizide, glimepiride) produced descendants (20) that had patient counts. But adding these 19 concepts and their descendants also brought in a number of concepts that were metformin monotherapies and not Sulfonylureas. So she then excluded a number of concepts and their descendants, which in turn excluded combo drugs that were Sulfonylureas.
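The include/exclude interaction described above can be reproduced on a toy hierarchy. This is a simplified sketch, not TermHub's implementation: it assumes that a concept set expands to the included concepts plus their descendants, minus the excluded concepts plus their descendants. The hierarchy, the concept names, and the `descendants` and `expand` helpers are all hypothetical.

```python
# Hypothetical parent -> children hierarchy. The combo drug descends from
# both ingredients, which is what causes the problem.
CHILDREN = {
    "broad_ancestor": ["glyburide", "metformin"],
    "glyburide": ["glyburide 5 MG tablet", "glyburide / metformin tablet"],
    "metformin": ["metformin 500 MG tablet", "glyburide / metformin tablet"],
}


def descendants(concept, children=CHILDREN):
    """Collect all descendants of a concept with an iterative walk."""
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def expand(included, excluded):
    """Included concepts and descendants, minus excluded ones and theirs."""
    inc = set()
    for concept in included:
        inc |= {concept} | descendants(concept)
    exc = set()
    for concept in excluded:
        exc |= {concept} | descendants(concept)
    return inc - exc


result = expand(included=["broad_ancestor"], excluded=["metformin"])
```

Here `result` keeps the glyburide tablet but loses the glyburide / metformin combo, because the combo is also a descendant of the excluded metformin concept.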
Ideally, we would like to be able to:
Given some examples:
we asked: Can we write a query that would give us 1 and 2 but not 3? We tried to find ancestors that would help us get the ones we wanted and exclude the ones we didn't, but that was not easy. We still don't know if it's possible for this example, let alone more generally.
As an approach to excluding monotherapies of non-desired ingredients (in this case only metformin), we tried getting rid of any drug whose name contained 'metformin' but did not contain a slash. This worked satisfactorily, and it turned out that only a few drugs needed to be excluded.
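The name-based filter just described can be sketched in a few lines. The function name `is_unwanted_monotherapy` and the sample drug names are made up for illustration, and the approach relies on the assumption that combination drug names always separate ingredients with a slash.

```python
def is_unwanted_monotherapy(concept_name, ingredient="metformin"):
    """True if the name mentions the ingredient and has no slash,
    i.e. it looks like a single-ingredient drug we want to drop."""
    name = concept_name.lower()
    return ingredient.lower() in name and "/" not in name


# Illustrative names only, not taken from the actual concept set.
names = [
    "metformin hydrochloride 500 MG Oral Tablet",
    "glipizide 5 MG / metformin hydrochloride 500 MG Oral Tablet",
    "glimepiride 2 MG Oral Tablet",
]
kept = [n for n in names if not is_unwanted_monotherapy(n)]
```

This keeps the combo and the plain sulfonylurea and drops the metformin-only drug. It would misfire on any name that uses a slash for something other than separating ingredients, so it is a stopgap and not a general solution.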
Now we want to address two problems:
TermHub's concept hierarchy was a big help, but we still had to do most of this work in SQL. Ideas? @hlehmann17? @DaveraGabriel? Others?