Sage-Bionetworks / prov-service

lightweight implementation of the Synapse Activity services, based on the PROV spec

Implement relevant PHCCP example queries through API #27

Closed (jaeddy closed this 5 years ago)

jaeddy commented 5 years ago

Here's the start of a list of requests/queries that we want to support from the portal (will likely grow).

These should be possible with the current configuration of the provenance graph... Just need to figure out the graph queries.

Nominally, all results (at least for requests that return a graph for visualization) should be a list of tuples with elements source, relationship, target — to make JSON export more consistent.
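A minimal sketch of that response shape, assuming each query row arrives as a mapping with the keys source, relationship, and target. The sample rows and the helper name rows_to_triples are hypothetical and not taken from the service:

```python
import json


def rows_to_triples(rows):
    """Flatten query rows into (source, relationship, target) tuples."""
    return [(row["source"], row["relationship"], row["target"]) for row in rows]


# Hypothetical rows, shaped like what a graph query might hand back.
rows = [
    {
        "source": {"id": "activity_1", "label": "Activity"},
        "relationship": {"type": "WASASSOCIATEDWITH"},
        "target": {"id": "user_1", "label": "Agent"},
    },
    {
        "source": {"id": "activity_1", "label": "Activity"},
        "relationship": {"type": "USED"},
        "target": {"id": "entity_1", "label": "Entity"},
    },
]

triples = rows_to_triples(rows)
# Tuples serialize as JSON arrays, so every element of the payload
# has the same three-slot layout regardless of which query produced it.
payload = json.dumps(triples)
```

Keeping every graph-returning request on the same three-element layout means the client only needs one parser for visualization data.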

Return all activities in the graph

Return subgraph stemming from a given user

Return subgraph upstream of an entity

Return subgraph downstream of an entity

jaeddy commented 5 years ago

I think this works for the user subgraph query:

MATCH (s)-[r]-(t)
WHERE exists((s)-[:WASASSOCIATEDWITH]-(:Agent {name: <user>}))
RETURN s as source, r as relationship, t as target
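One way the API could fill in the <user> placeholder is with a Cypher query parameter instead of string interpolation. This is only a sketch: the function name user_subgraph_request is hypothetical, and how the query and parameters are passed to the database depends on the driver in use.

```python
# Same query as above, with the <user> placeholder replaced by the
# Cypher parameter $user so the value is never spliced into the text.
USER_SUBGRAPH_QUERY = (
    "MATCH (s)-[r]-(t)\n"
    "WHERE exists((s)-[:WASASSOCIATEDWITH]-(:Agent {name: $user}))\n"
    "RETURN s AS source, r AS relationship, t AS target"
)


def user_subgraph_request(user):
    """Return the query text and parameter map for a user subgraph."""
    if not isinstance(user, str) or not user:
        raise ValueError("user must be a non-empty string")
    return USER_SUBGRAPH_QUERY, {"user": user}


query, params = user_subgraph_request("User_1")
```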

For example, testing with 'User_1':

image

jaeddy commented 5 years ago

After looking at some more examples, I'm thinking that limit is not a particularly useful parameter (it constrains the number of 'rows' returned, which isn't as meaningful to the client).

For requests that expect a (sub)graph response connected to a particular entity, depth might be a more applicable parameter. This should theoretically be enabled via variable length relationships (e.g., (m)-[:RELATED_TO*1..10]->(n))... but I haven't quite worked out the details yet.
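A rough sketch of how a depth parameter might map onto a variable length pattern. Several things here are assumptions rather than anything settled in this thread: nodes are matched on a hypothetical id property, edges are assumed to point from a node to what it depends on (so "upstream" follows outgoing edges), and MAX_DEPTH is an arbitrary cap. As far as I know, Cypher does not accept a query parameter for the hop bound, so the depth is validated as an integer and written into the query text.

```python
MAX_DEPTH = 10  # arbitrary cap, chosen only for this sketch


def entity_subgraph_query(direction, depth):
    """Build a Cypher query for the subgraph within `depth` hops of an entity."""
    if direction not in ("upstream", "downstream"):
        raise ValueError("direction must be 'upstream' or 'downstream'")
    if isinstance(depth, bool) or not isinstance(depth, int):
        raise ValueError("depth must be an integer")
    if not 1 <= depth <= MAX_DEPTH:
        raise ValueError("depth must be between 1 and %d" % MAX_DEPTH)

    hops = "[*1..%d]" % depth
    if direction == "upstream":
        # Assumed edge direction: entity -> thing it was derived from.
        pattern = "(e {id: $entity_id})-" + hops + "->()"
    else:
        pattern = "(e {id: $entity_id})<-" + hops + "-()"

    # Unwind each matched path into its individual relationships so the
    # result keeps the source / relationship / target row layout.
    return (
        "MATCH path = " + pattern + "\n"
        "UNWIND relationships(path) AS r\n"
        "RETURN DISTINCT startNode(r) AS source, r AS relationship, "
        "endNode(r) AS target"
    )


upstream_query = entity_subgraph_query("upstream", 3)
downstream_query = entity_subgraph_query("downstream", 2)
```

The entity id itself still goes in as the $entity_id parameter; only the validated hop count is interpolated.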

Pawel-Madej commented 5 years ago

@jaeddy I've prepared and documented some sample queries. Please take a look here: https://docs.google.com/document/d/1BQsiET0teeVFZrxRKuFgo-0q9OCbYYIa74qnc1muFPs/edit#heading=h.tpcof15hwzml and let me know whether this is what you're looking for.

Pawel-Madej commented 5 years ago

I've found an explanation of path lengths in Cypher queries here: https://graphaware.com/graphaware/2015/05/19/neo4j-cypher-variable-length-relationships-by-example.html

jaeddy commented 5 years ago

Thanks, @Pawel-Madej! I added some comments and notes in the doc above. I've marked queries with '*' for what I think are the high priority features we want to support for the pilot release ('' is more of a medium priority).

I played around with some other queries to illustrate example subgraphs, and I've included those in the doc. I had better luck (more predictable results) working with collect() and UNWIND functions vs. the variable length relationships — but I'm guessing there will still be an important use for the latter.

I'm also putting together some wire frames to illustrate where and how I'm envisioning that these provenance queries will be used in the portal. I'll share that soon.

jaeddy commented 5 years ago

Here are the wireframes: https://www.figma.com/file/y3aCJBaxf9oP8EKOtj58aj/PHCCP-Provenance?node-id=0%3A1

This GIF gives a sense for how a user might navigate through the portal:

[GIF: Views - PHCCP Provenance]

jaeddy commented 5 years ago

Started introducing some features in the subgraph-api-queries branch (you can check out the diff in #29 to see what's new). Please feel free to add to or improve on what I've implemented!