Knowledge-Graph-Hub / kg-covid-19

An instance of KG Hub to produce a knowledge graph for COVID-19 response.
https://github.com/Knowledge-Graph-Hub/kg-covid-19/wiki
BSD 3-Clause "New" or "Revised" License

interactions disappeared between 20201001 and 20201101 nt files #376

Open realmarcin opened 3 years ago

realmarcin commented 3 years ago

Describe the bug

A triple for interacts_with between ACE2 and GLP1R is present in the 20201001 release but not 20201101.

To Reproduce

This triple: P43220 interacts_with Q9BYF1

Is present in the .nt file from 20201001: https://kg-hub.berkeleybop.io/kg-covid-19/20201001/kg-covid-19.nt.gz

but not in the 20201101 one: https://kg-hub.berkeleybop.io/kg-covid-19/20201101/kg-covid-19.nt.gz

Note that this triple is also not present in the 20201001 .tsv (see here https://github.com/Knowledge-Graph-Hub/kg-covid-19/issues/375).
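
The presence check described above can be repeated against a locally downloaded copy of each release. The following is a minimal sketch, not the project's own tooling: the function names `lines_mentioning` and `scan_nt_gz` are hypothetical, and it assumes that both UniProt accessions appear as substrings of the IRIs on the same N-Triples line.

```python
import gzip
from typing import Iterable, Iterator, List


def lines_mentioning(lines: Iterable[str], *needles: str) -> Iterator[str]:
    """Yield each line (without its trailing newline) that contains every needle."""
    for line in lines:
        if all(needle in line for needle in needles):
            yield line.rstrip("\n")


def scan_nt_gz(path: str, *needles: str) -> List[str]:
    """Stream a gzipped N-Triples file and collect the lines containing all needles."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return list(lines_mentioning(handle, *needles))


# Hypothetical usage, assuming the two releases were saved under these local names:
# old_hits = scan_nt_gz("kg-covid-19-20201001.nt.gz", "P43220", "Q9BYF1")
# new_hits = scan_nt_gz("kg-covid-19-20201101.nt.gz", "P43220", "Q9BYF1")
# print(len(old_hits), len(new_hits))
```

If the edge is stored in reified form, the subject and object may sit on separate lines, so a same-line match like this one can miss it. In that case, search for each accession on its own and join the results on the statement node.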

Here are all the relevant triples from the 20201101 .nt file:

. . . . "STRING" . "biolink:Association"^^ . "157.0"^^ . "0.0"^^ . "0.0"^^ . "0.0"^^ . "0.0"^^ . "0.0"^^ . "0.0"^^ . "0.0"^^ . "0.0"^^ . "0.0"^^ . "0.0"^^ . "0.0"^^ . "108.0"^^ . "94.0"^^ .

Expected behavior

Unclear why this interaction disappeared later.

Version

20201001 vs 20201101

Additional context

related to https://github.com/Knowledge-Graph-Hub/kg-covid-19/issues/375

kliegr commented 3 years ago

Is it possible to track the provenance of this triple when it was still present in the .nt file?

The "metadata"/reification triples retrieved for P43220 interacts_with Q9BYF1 are listed in the issue, but none of those seems to actually point to a publication from which this triple was presumably extracted.

Could the disappearance of the triple be related to a low extraction confidence? If so, are values such as ex:textmining = "108.0" or textmining_transferred = "94.0" of any significance?

Is the semantics of predicates (textmining_transferred, textmining, combined_score) documented somewhere?
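
One way to read these numbers, offered as an assumption rather than as documented behaviour of this knowledge graph: if they are STRING channel scores on a 0 to 1000 scale, then STRING's published combination scheme (strip a prior of 0.041 from each channel, multiply the complements, add the prior back) can be sketched as below. The function names are hypothetical, and the homology correction that STRING applies to some channels is omitted because the corresponding values here are 0.0.

```python
from typing import Iterable

PRIOR = 0.041  # assumed prior probability, as used in STRING's score combination


def remove_prior(score: float, prior: float = PRIOR) -> float:
    """Rescale a 0-1 channel score so that the prior maps to zero."""
    score = max(score, prior)
    return (score - prior) / (1.0 - prior)


def combine_channels(channel_scores: Iterable[float], prior: float = PRIOR) -> int:
    """Combine per-channel scores (0-1000 scale) into one score on the same scale."""
    not_linked = 1.0
    for raw in channel_scores:
        # Probability that this channel alone does NOT support the interaction.
        not_linked *= 1.0 - remove_prior(raw / 1000.0, prior)
    combined = (1.0 - not_linked) * (1.0 - prior) + prior
    return int(combined * 1000)


# textmining = 108.0 and textmining_transferred = 94.0; all other channels are 0.0
combined = combine_channels([108.0, 94.0])
print(combined)  # 157
```

Under these assumptions the result is 157, which agrees with the "157.0" literal listed in the issue. That suggests the edge rests only on the two text-mining channels and carries a low combined score, although it does not by itself explain why the edge is absent from the later release.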