Open ericaVoss opened 8 years ago
Additionally, of the 64 missing drugs, 46 were in our original DRUG_UNIVERSE, that is 5% of the original ingredients.
Reviewed and the 64 ingredients do give us a bump, but not a huge bump. Here are about the number of records we could pick up based on the old copy of LAERTES:
WITH CTE_MISSING_ROWS AS (
SELECT e.*,
c.CONCEPT_ID AS INGREDIENT_CONCEPT_ID, c.CONCEPT_NAME AS INGREDIENT_CONCEPT_NAME
FROM DRUG_HOI_EVIDENCE e
JOIN DRUG_HOI_RELATIONSHIP r
ON r.ID = e.DRUG_HOI_RELATIONSHIP
/*ROLL DOWN TO INGREDIENTS*/
JOIN CONCEPT_ANCESTOR ca
ON ca.DESCENDANT_CONCEPT_ID = r.DRUG
AND ca.ANCESTOR_CONCEPT_ID IN (
SELECT DISTINCT CONCEPT_ID
FROM CONCEPT
WHERE VOCABULARY_ID = 'RxNorm'
AND CONCEPT_CLASS_ID = 'Ingredient'
AND INVALID_REASON IS NULL
)
JOIN CONCEPT c
ON c.CONCEPT_ID = ca.ANCESTOR_CONCEPT_ID
WHERE c.CONCEPT_ID IN (
/*Potential Missing Ingredients*/
1186087,1119119,981774,757688,1397141,904525,1336825,43526465,1337620,1118084,912263,1315411,42874220,45775372,1304643,1756831,1358436,40222444,46287397,43526424,989482,1703069,1301125,757352,1151789,19011440,1526475,19017742,1304850,1149380,46276250,937368,1516976,1502905,722424,713196,40239216,1736887,40170911,967562,987366,46275968,735843,1314577,42900448,1110942,1304919,734354,19080982,1337107,40238930,1314273,40241331,733523,952004,44785086,44785094,1336926,19011093,1387104,904501,40161532,1308842,780442
)
)
SELECT '01-ROW_COUNTS' AS STAT,
EVIDENCE_TYPE AS STAT_TYPE,
SUPPORTS,
COUNT(*) AS STAT_VALUE
FROM CTE_MISSING_ROWS
GROUP BY EVIDENCE_TYPE, SUPPORTS
UNION
SELECT '01-ROW_COUNTS' AS STAT,
'Total' AS STAT_TYPE,
NULL AS SUPPORTS,
COUNT(*) AS STAT_VALUE
FROM CTE_MISSING_ROWS
ORDER BY 1,2,3 DESC,4;
So, wait. Where did you get these 64 from? Are they just random from what you were looking at?
@cgreich - I used my local copy of MedLine and basically compared if I used the Vocab vs. the lookup you provided me and I get these 64 ingredients.
Not that this would work anywhere other than my environment:
with drug_of_ade as (
/*Look up drugs per our MEDLINE AVILLACH query*/
select meshheading.pmid, meshheading.descriptorname, meshheading.descriptorname_ui
from medcit_meshheadinglist_meshheading meshheading inner join medcit_meshheadinglist_meshheading_qualifiername qualifier
on meshheading.pmid = qualifier.pmid and meshheading.medcit_meshheadinglist_meshheading_order = qualifier.medcit_meshheadinglist_meshheading_order
where qualifier.value = 'adverse effects'
),
CTE_MESH_RXNORM AS (
/*USING VOCAB*/
SELECT A.CONCEPT_CODE source_concept_code, A.CONCEPT_NAME source_concept_name, B.CONCEPT_CODE concept_code, B.CONCEPT_NAME concept_name, B.CONCEPT_ID concept_id, B.CONCEPT_CLASS_ID concept_class_id
FROM RNDUSRDHIT06.[VOCABULARY_V5.0_20160311].dbo.CONCEPT A, RNDUSRDHIT06.[VOCABULARY_V5.0_20160311].dbo.CONCEPT B, RNDUSRDHIT06.[VOCABULARY_V5.0_20160311].dbo.CONCEPT_RELATIONSHIP CR
WHERE A.VOCABULARY_ID = 'MeSH' AND
CR.CONCEPT_ID_2 = A.CONCEPT_ID AND
B.CONCEPT_ID = CR.CONCEPT_ID_1 AND
B.VOCABULARY_ID = 'RxNorm' AND
CR.RELATIONSHIP_ID = 'Mapped from'
),
CTE_NEW_MESH_RXNORM AS (
/*USING VOCAB PATCH*/
SELECT *
FROM SCRATCH.dbo.EV_new_mesh
),
CTE_COMPARE AS (
SELECT DISTINCT d.descriptorname, d.descriptorname_ui,
v.concept_code, v.concept_name, v.concept_id, v.concept_class_id,
nv.concept_code NEW_CONCEPT_CODE, nv.concept_name NEW_CONCEPT_NAME, nv.concept_id AS NEW_CONCEPT_ID, nv.concept_class_id AS NEW_CONCEPT_CLASS_ID
FROM drug_of_ade d
LEFT OUTER JOIN CTE_MESH_RXNORM v
ON v.source_concept_code = d.descriptorname_ui
LEFT OUTER JOIN CTE_NEW_MESH_RXNORM nv
ON nv.source_concept_code = d.descriptorname_ui
)
SELECT *
FROM CTE_COMPARE
WHERE CONCEPT_CODE IS NULL AND NEW_CONCEPT_CODE IS NOT NULL /*places where the VOCAB isn't making a map but the PATCH is*/
Can you give me the list of all missing descriptorname_uis (including the pacifiers)? So I can figure out how we could catch them, maybe.
I need to follow the format of the previous file. Will need to update this at a later date.
Related to #85.
One place #85 exists is in the 20160311 Vocabulary. Based on how some of the MeSH tagging has changed in PubMed we are missing some of the MeSH --> RxNorm maps.
I have an updated copy of the MeSH-->RxNorm map that will be coming out in the next release of the OMOP Vocabulary and I compared it to the 20160311 Vocab version. When trying to map to descriptorname_uis in the "adverse effects" portion of our Medline SQL, we were able to map the following:
**
Here are the 64 missing ingredients we would get from the new mapping, definitely a few good ones in here:
abatacept,adalimumab,adapalene,aripiprazole,bevacizumab,bimatoprost,bortezomib,canagliflozin,capecitabine,celecoxib,certolizumab pegol,cetuximab,cobicistat,dabigatran,darbepoetin alfa,darunavir,dasatinib,denosumab,diflubenzuron,dimethyl fumarate,Dutasteride,emtricitabine,Epoetin Alfa,Eszopiclone,Etanercept,everolimus,ezetimibe,febuxostat,Filgrastim,fluticasone,Gliotoxin,infliximab,insulin detemir,Insulin Glargine,Interferon beta-1a,interferon beta-1b,Linagliptin,linezolid,liraglutide,loteprednol etabonate,lubiprostone,Malondialdehyde,natalizumab,nebivolol,olive oil,omalizumab,pemetrexed,pregabalin,ranibizumab,ranolazine,Rilpivirine,rituximab,rivaroxaban,rivastigmine,sevelamer,simeprevir,sofosbuvir,tadalafil,Tenofovir,trastuzumab,travoprost,ustekinumab,valsartan,varenicline
Not as good of a pick up as I would like, but I don't think we'd get all descriptorname_uis, some of them are completely random. Here are a few of my favorites:
Acacia,Acids,X-Rays,Wine,Walkers,Video Games,Shoes,Riot Control Agents, Chemical,Pacifiers
For this version of LAERTES I see two solutions to this:
\ I'm running this off my local copy of MEDLINE so the numbers might not be identical to another copy.