NCATS-Tangerine / beacon-aggregator

A web service that operates over the Beacon network to provide a single software interface over all the Beacons

Add an endpoint to aggregate all statements by publication #91

Open cmungall opened 5 years ago

cmungall commented 5 years ago

This would be a bit of work, as it would require extensions to the main KB API and changes to existing implementations, but it would be a really cool feature: being able to see, for any publication in PubMed, all annotations arising from it.

@newgene @kevinxin90 is this something you have considered for bt explorer?

RichardBruskiewich commented 5 years ago

My superficial consideration of this proposal is that it is simply the inverse of the flow from statements to evidence (the back edge, as it were), so the required linkages ought to be accessible in some beacons, such as the Semantic Medline one (what about Monarch?). @lhannest and I could check the feasibility of such an addition to some of the beacons.

lhannest commented 5 years ago

Would you want both statements that are supported by a given publication and concepts that are mentioned by a given publication? But that sounds feasible to me.

cmungall commented 5 years ago

More the former, but both make sense.

On Thu, Feb 14, 2019 at 3:25 PM Lance Hannestad wrote:

Would you want both statements that are supported by a given publication and concepts that are mentioned by a given publication? But that sounds feasible to me.


lhannest commented 5 years ago

SMPDB doesn't seem to correlate publications with statements. At most it mentions PMIDs in the descriptions of pathways, for example: "Cardiolipin (CL) is an important component of the inner mitochondrial membrane where it constitutes about 20% of the total lipid composition. ... BTHS patients seem to lack acyl specificity and consequently, many potential cardiolipin species can exist (PMID: 16226238)."

Rhea's SPARQL endpoint has 23,856 reactions that have PubMed citations, and we pull many statements from each reaction.

In https://translator.ncats.io/monarch/neo4j/ (which we should set up a beacon for) there are 21,943 nodes that have publications, and 28,046,168 edges that have publications. http://scigraph.ncats.io/ has 8,850,665 edges that have publications. http://steveneo4j.saramsey.org/ has 6,937,688 edges that have publications. The SemMedDB Neo4j instance also has edges with publications, but the query to get the count isn't returning a result.

BioThings Explorer only supports going from a chemical to a publication and not the other way around.

So it looks like this will only be supported by Rhea and by the beacons that wrap Neo4j instances. But I will see to developing this. Each beacon will have an endpoint like /statements/published?pmid=12324&pmid=12563, which will return all the statements (in the same format as the /statements endpoint) that are related to PMID:12324 and PMID:12563. KBA will have the same endpoint; when called, it will pass the query on to the beacons and aggregate the results.
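
To make the proposed flow concrete, here is a rough Python sketch of how the aggregator side might build the /statements/published query and merge the per-beacon results. It is only an illustration under assumptions, not the KBA implementation: the function names, the injected `fetch` callable, the `id` field on each statement, and the added `beacon` field are all hypothetical, and the beacon base URLs in the usage note are placeholders.

```python
from urllib.parse import urlencode


def published_statements_url(base_url, pmids):
    """Build a /statements/published URL with one repeated pmid parameter per PMID."""
    query = urlencode([("pmid", str(pmid)) for pmid in pmids])
    return f"{base_url.rstrip('/')}/statements/published?{query}"


def aggregate_published_statements(beacon_urls, pmids, fetch):
    """Pass the same query on to every beacon and merge what comes back.

    `fetch` is a hypothetical callable taking a URL and returning a list of
    statement dicts; injecting it keeps the sketch independent of any HTTP
    library. Each statement is assumed to carry an "id" field.
    """
    merged = []
    seen = set()
    errors = {}
    for base_url in beacon_urls:
        url = published_statements_url(base_url, pmids)
        try:
            statements = fetch(url)
        except Exception as exc:  # one failing beacon should not sink the whole query
            errors[base_url] = str(exc)
            continue
        for statement in statements:
            key = (base_url, statement.get("id"))
            if key in seen:
                continue  # drop duplicates returned by the same beacon
            seen.add(key)
            # Record which beacon the statement came from (assumed field name).
            merged.append({**statement, "beacon": base_url})
    return merged, errors
```

For example, `published_statements_url("http://beacon.example", [12324, 12563])` yields a URL ending in `/statements/published?pmid=12324&pmid=12563`, matching the form above. Beacons that cannot answer the query (such as ones without publication links) would simply show up in `errors` or return an empty list.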