VirtualFlyBrain / VFB2

Virtual Fly Brain Documentation Site
https://virtualflybrain.org
MIT License

Optimise new TermInfo Queries #122

Open dosumis opened 6 years ago

dosumis commented 6 years ago

Step 1: Gross type + attributes. THIS WILL REQUIRE A GENERIC ENTITY NEO:LABEL FOR INDEXING short_form.
Step 2: Relationships
  a: For Classes
  b: For Individuals
  c: Separate module for expression
Step 3: Images
  a: For Classes
  b: For Individuals

dosumis commented 6 years ago

This works well for (anatomical) Classes and Individuals.

MATCH (a:Anatomy { short_form: 'FBbt_01000595'})
WITH a OPTIONAL MATCH (parent:Class)<-[r]-(a)
WITH collect({ object: parent.label, rel: r.label}) as rels, a
OPTIONAL MATCH (a)-[rp]->(pub:pub)
RETURN  a.short_form, a.label, a.description, rels, collect ({ miniref: pub.miniref, typ: rp.typ, synonym_scope: rp.scope, synonym: rp.synonym}) as pub_syn

Could add a more general typing for indexing.

Question: Do we expect this to be completely generic, e.g. potentially covering any node we might wish to expose to browsing? If so, we should probably add an Entity label to every node (perhaps except properties?). This will presumably come at some cost to performance. Truly generic TermInfo queries also make it hard to tailor content to node types, so we might want to revisit this at some point.
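To make the trade-off concrete, here is a minimal Python sketch of how the same TermInfo query could be anchored on either a specific label or the proposed generic one. Cypher does not allow node labels to be passed as query parameters, so the label is interpolated into the text after checking it against a whitelist, while `short_form` stays a real parameter. The function name, the whitelist and the `Entity` label are assumptions for illustration, not existing VFB code.

```python
# Labels the query may be anchored on. 'Entity' is the generic label
# proposed in this thread; it is an assumption that it exists in the graph.
ALLOWED_LABELS = {"Anatomy", "Class", "Individual", "Entity"}


def build_term_info_query(label):
    """Return the TermInfo query text anchored on the given node label.

    The label is interpolated (labels cannot be Cypher parameters), so it
    is validated first; short_form is left as the $short_form parameter.
    """
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unsupported label: {label!r}")
    return (
        f"MATCH (a:{label} {{ short_form: $short_form }}) "
        "WITH a OPTIONAL MATCH (parent:Class)<-[r]-(a) "
        "WITH collect({ object: parent.label, rel: r.label }) AS rels, a "
        "OPTIONAL MATCH (a)-[rp]->(pub:pub) "
        "RETURN a.short_form, a.label, a.description, rels, "
        "collect({ miniref: pub.miniref, typ: rp.typ, "
        "synonym_scope: rp.scope, synonym: rp.synonym }) AS pub_syn"
    )
```

The returned text would be run with a parameter map such as `{"short_form": "FBbt_01000595"}`. Switching between a tailored and a generic lookup then only changes the `label` argument.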

dosumis commented 6 years ago

Rolling in image lookup may be a bit too slow:

MATCH (a:Class { short_form: 'FBbt_00007392'})
WITH a OPTIONAL MATCH (parent:Class)<-[r]-(a)
WITH collect({ object: parent.label, rel: r.label}) as rels, a
OPTIONAL MATCH (a)-[rp]->(pub:pub)
WITH a, rels, collect({ miniref: pub.miniref, typ: rp.typ, synonym_scope: rp.scope, synonym: rp.synonym}) as pub_syn
OPTIONAL MATCH (a)<-[:SUBCLASSOF|INSTANCEOF*]-(i:Individual)<-[:depicts]-(:Individual)-[irw:in_register_with]-(template:Individual)
RETURN a.short_form, a.label, a.description, rels, pub_syn,
collect({ image_sf: i.short_form, image_label: i.label, template: template.label, folder: irw.folder}) as images

dosumis commented 6 years ago

TBA - dbxref (linkout) clause.

OPTIONAL MATCH (a)-[r:hasDbXref]-(b) WITH COLLECT ({ label: b.label, base: b.base_iri, acc: r.accession, iri: b.iri}) AS links

dosumis commented 6 years ago

TBD: How should this work with expression pattern nodes? In addition to all of the default fields above (including images) these record information about the location of expression using an extra (anonymous) individual in between class and anatomy. The individual in these cases should not be displayed. It instead provides a node for linking various pieces of information like stage, publication & assay.

(ep:Class { label: 'expression pattern of X'})-[:overlaps|has_part]->(i:Individual)-[:INSTANCEOF]->(:Anatomy),
(i)-[:exists_during]->(stage),

I think this should probably be an additional query that is only run for expression patterns. How much work would it be to set this up?

dosumis commented 6 years ago

Following discussion with Nico about query optimisation:

  1. Separate queries for Classes, anatomical individuals & datasets (edit XMI to support this)
  2. New query pulls back a limited number of images for display in carousel (Nico to post updated, optimised query).
  3. We will standardise the return schema across different queries so that processing code can (potentially) be unified. For example, every query that returns images will use the same schema for the returned dict, something like this: { image_sf: i.short_form, image_label: i.label, template: template.label, folder: irw.folder}
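As a rough illustration of points 2 and 3, the sketch below shows how processing code might reduce the `images` list from any of these queries to the shared schema and cap it for the carousel. The function name and the `limit` argument are hypothetical, and the cap here is applied client-side only for illustration (the optimised query mentioned above may well limit results in Cypher instead).

```python
# Keys of the shared image schema proposed in this thread.
IMAGE_KEYS = ("image_sf", "image_label", "template", "folder")


def standardise_images(images, limit=None):
    """Normalise raw image maps to the shared schema, optionally capped.

    `images` is the list collected by a query. When an OPTIONAL MATCH
    finds nothing, collect() still yields one map whose values are all
    null, so entries without an image short_form are skipped.
    """
    records = []
    for raw in images:
        if limit is not None and len(records) >= limit:
            break
        if raw.get("image_sf") is None:
            continue
        # Keep only the agreed keys; missing ones become None.
        records.append({key: raw.get(key) for key in IMAGE_KEYS})
    return records
```

With a helper like this, the same downstream code can consume results from the Class, Individual and dataset queries, as long as each query names its image fields according to the shared schema.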