opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Text mining evidence from Hoffmann et al. Cell paper missing for COVID-19 #1202

Closed AsierGonzalez closed 3 years ago

AsierGonzalez commented 4 years ago

The Hoffmann et al. Cell paper was one of the first publications about SARS-CoV-2 to mention human proteins associated with the infection. The abstract contains sentences that should be picked up by EPMC's text mining algorithm, for instance:

Here, we demonstrate that SARS-CoV-2 uses the SARS-CoV receptor ACE2 for entry and the serine protease TMPRSS2 for S protein priming

Here, ACE2 and TMPRSS2 should be tagged as targets and SARS-CoV-2 as the disease (it is a synonym of COVID-19 - MONDO_0100096). It would even be understandable, although a false positive, if SARS-CoV were identified and annotated as Severe acute respiratory syndrome - EFO_0000694. Neither happens, however. Instead, these four target-disease associations are extracted from this paper:

Target    Disease
ACE2      infection - EFO_0005741
ACE2      MERS - MONDO_0100116
APN       MERS - MONDO_0100116
DPP4      MERS - MONDO_0100116

@saha-shyamasree has been looking into it. She couldn't find an obvious explanation, but she suspects that something is wrong in the target annotation. It is important to fix this because we may be missing many more similar associations.
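To make the expected behaviour concrete, here is a minimal sketch of dictionary-based co-occurrence tagging applied to the sentence quoted above. It is not EPMC's algorithm: the synonym dictionaries, the `tag` and `cooccurrences` helpers and the whole-token matching rule are assumptions made purely for illustration.

```python
import re
from itertools import product

# Hypothetical, hand-written synonym dictionaries (NOT EPMC's real resources).
TARGET_SYNONYMS = {
    "ACE2": "ACE2",
    "TMPRSS2": "TMPRSS2",
}
DISEASE_SYNONYMS = {
    "SARS-CoV-2": "MONDO_0100096",  # COVID-19
    "SARS-CoV": "EFO_0000694",      # Severe acute respiratory syndrome
}


def tag(sentence, synonyms):
    """Return the identifiers whose synonym occurs as a whole token."""
    found = set()
    for term, identifier in synonyms.items():
        # The lookarounds stop "SARS-CoV" from matching inside "SARS-CoV-2".
        pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
        if re.search(pattern, sentence):
            found.add(identifier)
    return found


def cooccurrences(sentence):
    """Pair every tagged target with every tagged disease in the sentence."""
    targets = sorted(tag(sentence, TARGET_SYNONYMS))
    diseases = sorted(tag(sentence, DISEASE_SYNONYMS))
    return sorted(product(targets, diseases))


SENTENCE = (
    "Here, we demonstrate that SARS-CoV-2 uses the SARS-CoV receptor ACE2 "
    "for entry and the serine protease TMPRSS2 for S protein priming"
)

pairs = cooccurrences(SENTENCE)
for target, disease in pairs:
    print(target, disease)
```

Under these assumptions the sentence yields ACE2 and TMPRSS2 paired with both MONDO_0100096 and EFO_0000694, which is the outcome described above as expected (including the SARS-CoV false positive), and none of those pairs appear in the table.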

AsierGonzalez commented 4 years ago

There is still no explanation for this. @saha-shyamasree will continue working on it but it's low priority.

AsierGonzalez commented 4 years ago

To be checked again the next time EPMC submits an evidence file.
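One possible way to run that check is sketched below. It assumes the evidence file is JSON Lines and that each record carries `literature`, `targetFromSourceId` and `diseaseFromSourceMappedId` fields; those field names, the `associations_for_pmid` helper, the placeholder PMID and the toy records are all assumptions for illustration and should be adapted to the actual file.

```python
import io
import json


def associations_for_pmid(lines, pmid):
    """Collect the (target, disease) pairs whose evidence cites the given PMID.

    `lines` is any iterable of JSON Lines strings, e.g. an open file handle.
    The field names used here are assumed, not taken from the issue.
    """
    pairs = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if pmid in record.get("literature", []):
            pairs.add(
                (record["targetFromSourceId"], record["diseaseFromSourceMappedId"])
            )
    return sorted(pairs)


# Toy records with a placeholder PMID; not real evidence data.
PLACEHOLDER_PMID = "00000001"
SAMPLE = io.StringIO(
    "\n".join(
        json.dumps(record)
        for record in [
            {"literature": [PLACEHOLDER_PMID], "targetFromSourceId": "ACE2",
             "diseaseFromSourceMappedId": "EFO_0005741"},
            {"literature": ["00000002"], "targetFromSourceId": "TMPRSS2",
             "diseaseFromSourceMappedId": "MONDO_0100096"},
        ]
    )
)

found = associations_for_pmid(SAMPLE, PLACEHOLDER_PMID)
expected = {("ACE2", "MONDO_0100096"), ("TMPRSS2", "MONDO_0100096")}
missing = sorted(expected - set(found))
print("found:", found)
print("missing:", missing)
```

With a real file, `SAMPLE` would be replaced by the opened evidence file and the placeholder by the paper's PMID; an empty `missing` list would suggest the associations are now being extracted.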