RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)

Consider Removing Retracted Papers from RTX-KG2 #386

Open ecwood opened 3 months ago

ecwood commented 3 months ago

Retracted papers currently appear in the publications field of edges in RTX-KG2pre (checked against KG2.9.0). While these are just two examples (identified manually by looking up major scientific retractions and searching for them in RTX-KG2), it is likely that there are many others in RTX-KG2. Further, while these results may not seem significant, it is difficult to know the full extent of the damage done by retracted papers in RTX-KG2 without a significant investment of effort.

match (n)-[e]->(m) where "PMID:9500320" in e.publications return n.name, n.id, e.predicate, m.name, m.id limit 50
n.name n.id e.predicate m.name m.id
"Hemoglobin" "UMLS:C0019046" "biolink:located_in" "child" "UMLS:C0008059"
"CD79A" "NCBIGene:973" "biolink:located_in" "child" "UMLS:C0008059"
"(non-specific) colitis" "UMLS:C1321275" "biolink:occurs_in" "child" "UMLS:C0008059"
"Lymphoid hyperplasia" "UMLS:C0333997" "biolink:occurs_in" "child" "UMLS:C0008059"
"Measles" "UMLS:C0025007" "biolink:occurs_in" "Parent" "UMLS:C0030551"
"Pseudolymphoma" "UMLS:C0221269" "biolink:located_in" "ileum" "UMLS:C0020885"
"barium follow through" "UMLS:C0412113" "biolink:subclass_of" "Diagnostic radiologic examination" "UMLS:C0043299"
"Chronic inflammation" "UMLS:C0021376" "biolink:located_in" "Colon structure (body structure)" "UMLS:C0009368"
"CD79A" "NCBIGene:973" "biolink:located_in" "Serum" "UMLS:C0229671"
"Hemoglobin" "UMLS:C0019046" "biolink:related_to" "CD79A" "NCBIGene:973"
"Hemoglobin" "UMLS:C0019046" "biolink:related_to" "LOC102723407" "NCBIGene:102723407"

The Retraction Watch database is now publicly available for download. We should consider including it as a source so that edges citing retracted papers can be removed. (See https://www.crossref.org/blog/news-crossref-and-retraction-watch/ for details on the download.)
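As a rough illustration of what such a filtering step might look like, here is a minimal Python sketch. It is not existing RTX-KG2 build code. The CSV column name `OriginalPaperPubMedID`, the `0` placeholder for a missing PMID, the helper names `load_retracted_pmids` and `strip_retracted`, and the edge dictionary layout (beyond the `publications` field queried above) are all assumptions that would need to be checked against the actual Retraction Watch download and the KG2 edge format.

```python
import csv
import io


def load_retracted_pmids(csv_file, pmid_column="OriginalPaperPubMedID"):
    """Collect CURIE-style PMIDs (e.g. "PMID:9500320") from a retractions CSV.

    The column name is an assumption about the Retraction Watch export.
    """
    retracted = set()
    for row in csv.DictReader(csv_file):
        raw = (row.get(pmid_column) or "").strip()
        # Skip blanks and the "0" placeholder assumed for a missing PMID.
        if raw.isdigit() and int(raw) > 0:
            retracted.add(f"PMID:{int(raw)}")
    return retracted


def strip_retracted(edges, retracted, drop_if_any=False):
    """Yield edges with retracted publications removed.

    drop_if_any=True  : drop every edge that cites at least one retracted paper.
    drop_if_any=False : drop an edge only if all of its publications were
                        retracted; otherwise keep it with the remaining ones.
    Edges that never had publications are passed through unchanged.
    """
    for edge in edges:
        pubs = edge.get("publications", [])
        kept = [p for p in pubs if p not in retracted]
        if len(kept) == len(pubs):
            yield edge  # nothing retracted on this edge
        elif drop_if_any or not kept:
            continue  # edge relied on retracted evidence
        else:
            yield {**edge, "publications": kept}


# Tiny in-memory stand-in for the downloaded CSV (hypothetical rows).
sample_csv = io.StringIO(
    "Record ID,OriginalPaperPubMedID\n"
    "1,9500320\n"
    "2,0\n"
)
retracted_pmids = load_retracted_pmids(sample_csv)

# Hypothetical edges; only the "publications" key is taken from the issue.
sample_edges = [
    {"id": "edge-a", "publications": ["PMID:9500320"]},
    {"id": "edge-b", "publications": ["PMID:9500320", "PMID:1"]},
    {"id": "edge-c", "publications": []},
]
cleaned_edges = list(strip_retracted(sample_edges, retracted_pmids))
```

Whether an edge with mixed support (some retracted, some not) should be dropped outright or merely lose the retracted citation is a policy decision; the `drop_if_any` flag is included only so that both options can be compared.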