sorgerlab / indra

INDRA (Integrated Network and Dynamical Reasoning Assembler) is an automated model assembly system interfacing with NLP systems and databases to collect knowledge, and through a process of assembly, produce causal graphs and dynamical models.
http://indra.bio
BSD 2-Clause "Simplified" License

Add article retraction resource, functionality for filtering on article type tags #1427

Closed by kkaris 7 months ago

kkaris commented 8 months ago

The XML publication entries of PubMed's records at https://ftp.ncbi.nlm.nih.gov/pubmed/ contain retraction records.

Retraction Entries

There are two ways to collect retractions:

The first way is to use the article type tags, called PublicationType. Entries of retracted articles carry the type "Retracted Publication" (example from PMID 19717156):

<PublicationTypeList>
  <PublicationType UI="D016428">Journal Article</PublicationType>
  <PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
  <PublicationType UI="D016441">Retracted Publication</PublicationType>
</PublicationTypeList>
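
As a rough illustration of this first check, the sketch below parses a fragment like the one above with Python's standard `xml.etree.ElementTree` and looks for the UI `D016441`. The function name `is_retracted` and the constant name are made up for this example and are not part of INDRA.

```python
import xml.etree.ElementTree as ET

# UI of the "Retracted Publication" type, copied from the example entry above.
RETRACTED_PUBLICATION_UI = "D016441"


def is_retracted(article: ET.Element) -> bool:
    """Return True if any PublicationType below `article` has the retracted UI."""
    return any(
        pub_type.get("UI") == RETRACTED_PUBLICATION_UI
        for pub_type in article.iter("PublicationType")
    )


example = ET.fromstring(
    "<PublicationTypeList>"
    '<PublicationType UI="D016428">Journal Article</PublicationType>'
    '<PublicationType UI="D016441">Retracted Publication</PublicationType>'
    "</PublicationTypeList>"
)
print(is_retracted(example))  # True
```

Matching on the UI attribute rather than on the element text is a design choice of this sketch; it avoids depending on the exact wording of the label.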

The second way to detect retractions is by looking at the CommentsCorrectionsList (example from PMID 19717156) and seeing if there is a correction with RefType=RetractionIn:

<CommentsCorrectionsList>
  <CommentsCorrections RefType="RetractionIn">
    <RefSource>Atherosclerosis. 2016 Mar;246:385</RefSource>
    <PMID Version="1">27326436</PMID>
  </CommentsCorrections>
</CommentsCorrectionsList>
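
The second check could be sketched along the following lines, again with `xml.etree.ElementTree`. The helper name `linked_pmids` is hypothetical; it returns the PMIDs referenced by every CommentsCorrections element with a given RefType.

```python
import xml.etree.ElementTree as ET


def linked_pmids(article: ET.Element, ref_type: str) -> list[str]:
    """Collect the PMIDs of CommentsCorrections entries with the given RefType."""
    pmids = []
    for correction in article.iter("CommentsCorrections"):
        if correction.get("RefType") == ref_type:
            pmid = correction.findtext("PMID")
            if pmid:  # some entries may lack a PMID element
                pmids.append(pmid.strip())
    return pmids


example = ET.fromstring(
    "<CommentsCorrectionsList>"
    '<CommentsCorrections RefType="RetractionIn">'
    "<RefSource>Atherosclerosis. 2016 Mar;246:385</RefSource>"
    '<PMID Version="1">27326436</PMID>'
    "</CommentsCorrections>"
    "</CommentsCorrectionsList>"
)
print(linked_pmids(example, "RetractionIn"))  # ['27326436']
```

A non-empty result for `"RetractionIn"` marks the entry as retracted, and the returned PMIDs point at the retraction notices.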

One pitfall to keep in mind is that retractions themselves also have PMIDs assigned to them. A 'retraction publication' can be recognized by again checking the CommentsCorrectionsList, but this time checking for RetractionOf (example from PMID 19718576):

<CommentsCorrectionsList>
  <CommentsCorrections RefType="RetractionOf">
    <RefSource>Matsui T, Suzuki S, Ujikawa K, Usui T, Gotoh S, Sugamata M, Abe S. J Med Eng Technol. 2009;33(6):481-7</RefSource>
    <PMID Version="1">19484686</PMID>
  </CommentsCorrections>
</CommentsCorrectionsList>

There are cases where the retraction notice itself contains text that could be picked up as evidence during reading, so keeping track of the PMIDs of the retraction notices is also useful.
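
A minimal sketch of keeping both sets apart is shown below. It assumes the usual nesting of the PubMed files (PubmedArticleSet, PubmedArticle, MedlineCitation) in a heavily trimmed sample; the function name `collect_retraction_pmids` is hypothetical. Note that the entry's own PMID is read from the direct child of MedlineCitation, because CommentsCorrections elements contain PMID elements as well.

```python
import xml.etree.ElementTree as ET


def collect_retraction_pmids(root: ET.Element) -> tuple[set[str], set[str]]:
    """Return (PMIDs of retracted articles, PMIDs of retraction notices)."""
    retracted, notices = set(), set()
    for article in root.iter("PubmedArticle"):
        # Direct child only: nested CommentsCorrections also hold PMID elements.
        pmid = article.findtext("MedlineCitation/PMID")
        if pmid is None:
            continue
        ref_types = {
            cc.get("RefType") for cc in article.iter("CommentsCorrections")
        }
        if "RetractionIn" in ref_types:
            retracted.add(pmid)
        if "RetractionOf" in ref_types:
            notices.add(pmid)
    return retracted, notices


# Trimmed-down sample built from the two example entries in this issue.
sample = ET.fromstring(
    "<PubmedArticleSet>"
    "<PubmedArticle><MedlineCitation>"
    '<PMID Version="1">19717156</PMID>'
    "<CommentsCorrectionsList>"
    '<CommentsCorrections RefType="RetractionIn">'
    '<PMID Version="1">27326436</PMID>'
    "</CommentsCorrections>"
    "</CommentsCorrectionsList>"
    "</MedlineCitation></PubmedArticle>"
    "<PubmedArticle><MedlineCitation>"
    '<PMID Version="1">19718576</PMID>'
    "<CommentsCorrectionsList>"
    '<CommentsCorrections RefType="RetractionOf">'
    '<PMID Version="1">19484686</PMID>'
    "</CommentsCorrections>"
    "</CommentsCorrectionsList>"
    "</MedlineCitation></PubmedArticle>"
    "</PubmedArticleSet>"
)
retracted, notices = collect_retraction_pmids(sample)
print(sorted(retracted), sorted(notices))  # ['19717156'] ['19718576']
```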

Use of retracted PMID and article type tags

Assuming the full list of retracted PMIDs is small enough, it could be stored as a resource file in INDRA and used to filter out evidence. Functionality to filter on specific article type tags can also be added if this information is extracted as well.
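
One possible shape for such a filter is sketched below. Everything in it is an assumption made for illustration: the resource is taken to be a plain-text list with one PMID per line, and `Evidence` is a small stand-in dataclass, not INDRA's own Evidence class.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Evidence:
    """Hypothetical stand-in for an evidence record; not INDRA's Evidence."""
    pmid: Optional[str]
    text: str
    publication_types: set = field(default_factory=set)


def load_retracted_pmids(lines: Iterable[str]) -> set[str]:
    """Read one PMID per line, skipping blank lines (assumed file format)."""
    return {line.strip() for line in lines if line.strip()}


def filter_evidence(
    evidence: Iterable[Evidence],
    retracted_pmids: set[str],
    excluded_types: frozenset = frozenset(),
) -> list[Evidence]:
    """Drop evidence from retracted PMIDs or with an excluded article type."""
    return [
        ev
        for ev in evidence
        if ev.pmid not in retracted_pmids
        and not (ev.publication_types & excluded_types)
    ]


# In practice the lines would come from an open resource file.
retracted_pmids = load_retracted_pmids(["19717156\n", "\n", "19484686\n"])
evidence = [
    Evidence("19717156", "sentence from a retracted article"),
    Evidence("12345678", "sentence from another article", {"Journal Article"}),
    Evidence(None, "sentence without a PMID"),
]
kept = filter_evidence(evidence, retracted_pmids)
print([ev.text for ev in kept])
```

Passing `excluded_types=frozenset({"Retracted Publication"})` would apply the article type filter instead of, or in addition to, the PMID list.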

kkaris commented 7 months ago

Closed in #1428