JULIELab / gepi

GePI (GEne - Protein Interactions) is a web portal for quick and convenient access to gene - protein interaction mentions automatically extracted from the biomedical literature, i.e. PubMed and PubMed Central (Open Access Subset).
GNU General Public License v3.0

FIG1 gene occurrence #69

Open SchSascha opened 7 years ago

SchSascha commented 7 years ago

False positives occur, e.g., in PMC3930414:

Fig 1 Forest plot of T2DM associated with KCNQ1 rs2237892 CT gene polymorphism under a recessive genetic model (TT versus CC+CT).

PMC4552551:

These genes, with the exception of KISS1R and NPTX2, were also upregulated in non-tumor samples from RCC-affected kidneys (Fig 1).

PMC4583406:

Salubrinal should substitute for LIF if the LIF-dependent increase in P-eIF2α (Fig 1) results from a LIF-dependent decrease in the CReP eIF2α phosphatase (Fig 2A–2D).

SchSascha commented 7 years ago

Personal log, Post-Doc SchSascha, supplemental: Fig1 or FIG1 seems to be OK, or at least gives a higher probability that a true gene is identified. In contrast, Fig 1, Fig-1, Fig. 1, Fig 1a, Fig 1b, etc. are all bad.
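One way to act on this observation would be a post-filter on the recognised mention text. The following is only a minimal sketch, not GePI's actual code: `FIGURE_REF` and `is_figure_reference` are hypothetical names, and it assumes the gene tagger exposes the surface string of each mention. Spellings with a separator before the number (space, hyphen, or period) are treated as figure references, while `Fig1` and `FIG1` are let through.

```python
import re

# Hypothetical pattern: "Fig"/"Figs"/"Figure", then a period or at least one
# space/hyphen, then a number with an optional panel letter (e.g. "1a", "2A").
# "Fig1" / "FIG1" have no separator and therefore do NOT match.
FIGURE_REF = re.compile(r"^fig(?:ure|s)?(?:\.\s*|[\s-]+)\d+[a-z]?$", re.IGNORECASE)


def is_figure_reference(mention: str) -> bool:
    """Return True if the mention text looks like a figure reference."""
    return FIGURE_REF.match(mention.strip()) is not None


if __name__ == "__main__":
    for text in ["Fig 1", "Fig-1", "Fig. 1", "Fig 1a", "Fig1", "FIG1", "KCNQ1"]:
        print(f"{text!r}: {is_figure_reference(text)}")
```

Mentions flagged this way could be dropped or down-weighted before event extraction; whether the tagger really produces spans like `Fig 1` as a single mention would need to be checked first.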

SchSascha commented 7 years ago

Another issue with this one: the "root" gene name is IL4I1. GePI finds hits like this one (PMID 26673964):

The immunosuppressive phenylalanine oxidase interleukin 4-induced gene 1 (IL4I1), primarily produced by antigen-presenting cells, inhibits T-cell proliferation and promotes the generation of Foxp3(+) regulatory T cells in vitro.

The genes identified as connected by an event are IL4I1 and Interleukin 4. The latter is obviously wrong. I bet "induced" is erroneously recognised as an event rather than as part of the name. I fear this is a systematic issue that may recur with other gene names whose long form includes event terms.

SchSascha commented 6 years ago

I have similar thoughts on this as on #97: it seems to me that an "easy" fix would be to prefer the exact match when there are several possibilities. interleukin 4-induced gene 1 should be identified as one gene, not only parts of it. The same goes for spatial or anything similar: if something is simply an adjective, the "rule" should be whether the whole gene name, e.g. spatial learning 1, is present.
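To make the proposed rule concrete, here is a rough sketch of a longest-match preference over overlapping gene mentions. It is an illustration under assumptions, not the pipeline's real data model: `Mention` and `prefer_longest` are hypothetical names, and mentions are assumed to be available as character offsets with a gene label.

```python
from typing import List, NamedTuple


class Mention(NamedTuple):
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    gene: str   # gene label assigned by the tagger (hypothetical field)


def prefer_longest(mentions: List[Mention]) -> List[Mention]:
    """Keep the longest mention among overlapping candidates."""
    # Longest spans first; ties are broken by start offset for determinism.
    ordered = sorted(mentions, key=lambda m: (-(m.end - m.start), m.start))
    kept: List[Mention] = []
    for m in ordered:
        # Keep m only if it does not overlap any longer mention already kept.
        if all(m.end <= k.start or m.start >= k.end for k in kept):
            kept.append(m)
    return sorted(kept, key=lambda m: m.start)


if __name__ == "__main__":
    text = "interleukin 4-induced gene 1 (IL4I1) inhibits T-cell proliferation"
    full, partial, abbrev = "interleukin 4-induced gene 1", "interleukin 4", "IL4I1"
    candidates = [
        Mention(text.index(partial), text.index(partial) + len(partial), "IL4"),
        Mention(text.index(full), text.index(full) + len(full), "IL4I1"),
        Mention(text.index(abbrev), text.index(abbrev) + len(abbrev), "IL4I1"),
    ]
    for m in prefer_longest(candidates):
        print(text[m.start:m.end], "->", m.gene)
```

With this rule the partial hit on interleukin 4 is discarded because it lies inside the longer full name. The same span check could presumably be used to ignore event triggers such as "induced" when they fall inside a kept gene mention, but that part is not shown here.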