VirtualFlyBrain / VFB_2

A repository for specs and development of version 2 of VFB.

SOLR autocomplete spec #2

Open Robbie1977 opened 9 years ago

Robbie1977 commented 9 years ago

Specify how autocomplete should function, along with any related issues.

Robbie1977 commented 9 years ago

Currently we can't specify the type of anatomy, as we don't have fully reasoned documents. For example, how would we know that this term is part of the adult brain:

{
        "logical_description": [
          "has_synaptic_terminal_in some fan-shaped body"
        ],
        "type": "class",
        "uri": "http://purl.obolibrary.org/obo/FBbt_00003657",
        "label": "fan-shaped neuron",
        "label_suggest": [
          "fan-shaped neuron",
          "F neuron",
          "fan shaped neuron"
        ],
        "description": [
          "Neuron with a large arborization field that forms a quasi-horizontal strata within the fan-shaped body that fills it in both the transverse and longitudinal directions. They mostly extend caudally from the anterior margin of the fan-shaped body. This pattern makes some of them look like a fan, although the strata in most cases do not show a separation into 8 (or 16) segments. Some subtypes are specific to a single fan-shaped body layer."
        ],
        "has_children": true,
        "is_defining_ontology": false,
        "is_root": false,
        "in_subset_annotation": [
          "cur"
        ],
        "id_annotation": [
          "FBbt:00003657"
        ],
        "database_cross_reference_annotation": [
          "VFB:FBbt_00003657"
        ],
        "has_obo_namespace_annotation": [
          "fly_anatomy.ontology"
        ],
        "id": "fbbt:http://purl.obolibrary.org/obo/FBbt_00003657",
        "short_form": [
          "FBbt_00003657",
          "FBbt:00003657"
        ],
        "ontology_name": "fbbt",
        "ontology_uri": "http://purl.obolibrary.org/obo/fbbt/fbbt-simple.owl",
        "synonym": [
          "F neuron",
          "fan shaped neuron"
        ],
        "_version_": 1499368735513772000
}

dosumis commented 9 years ago

There are a relatively small number of types and regions we want to flag for autocomplete or for page construction. I've been assuming we could do this by adding indexed fields to SOLR. These could be populated from OWL queries, as in the current WebQueryUtils.java.

To get a list of all nervous system terms: "overlaps some 'nervous system'"
To get a list of all adult brain terms: "overlaps some 'adult brain'"
etc.

For populating pages, we need to know whether a class is a 'neuron', synaptic neuropil, 'neuron projection bundle' or 'neuroblast lineage clone' (again see)

All of these except 'synaptic neuropil' are named classes (and I'll put in a term request for this one). It may be possible to get these classifications directly from SOLR:

@simonjupp does OLS store complete transitive closure of class graph on each term?

simonjupp commented 9 years ago

Hi David, Yes, we have a document in SOLR for each ontology term. Each term includes the direct parent/child URI, along with the transitive closure of these in a separate field. Simon
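A minimal sketch of how the transitive-closure field described here could be used to classify a term for page construction. The field name `ancestors` and the example.org IRIs are placeholders, since the thread gives neither the actual OLS field name nor the IRIs of the type classes; only the fan-shaped neuron URI comes from the document above.

```python
# Hypothetical field name: the thread does not say what OLS calls the closure field.
ANCESTORS_FIELD = "ancestors"

# Placeholder IRIs for illustration only, not real FBbt identifiers.
TYPE_ROOTS = {
    "neuron": "http://example.org/NEURON",
    "neuron projection bundle": "http://example.org/NEURON_PROJECTION_BUNDLE",
}


def classify(doc, type_roots=TYPE_ROOTS):
    """Return the labels of the type classes found in the doc's ancestor closure."""
    ancestors = set(doc.get(ANCESTORS_FIELD, []))
    return sorted(label for label, iri in type_roots.items() if iri in ancestors)


# A trimmed document; the ancestor list is invented for the example.
doc = {
    "uri": "http://purl.obolibrary.org/obo/FBbt_00003657",
    "label": "fan-shaped neuron",
    ANCESTORS_FIELD: ["http://example.org/NEURON"],
}

print(classify(doc))
```

Note that a subclass closure like this would only cover the named-class checks; a query such as "overlaps some 'adult brain'" still depends on a reasoner.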

(In reply to David Osumi-Sutherland's comment of Mon, Apr 27, 2015, 10:28 PM, above.)

dosumis commented 9 years ago

Yes, we have a document in SOLR for each ontology term. Each term includes the direct parent/child URI, along with the transitive closure of these in a separate field.

@Robbie1977: You could base the new indexed fields for restricting autocomplete on these. Perhaps just go with "overlaps some 'nervous system'" for now. We could leave open whether to offer more specific autocomplete options by region, as in VFB1. With a sufficiently well-tuned autocomplete that gives the user better feedback about the nature of hits, these may not be necessary.
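As a rough sketch of what restricting autocomplete on such a field might look like, the request could pair a match on `label_suggest` (a field that appears in the documents above) with a filter query on the new flag. The flag name `overlaps_nervous_system` is an assumption, as is the use of a trailing wildcard; if `label_suggest` is already analyzed for prefix matching, the wildcard would not be needed.

```python
from urllib.parse import urlencode


def autocomplete_params(prefix, flag_field="overlaps_nervous_system", rows=10):
    """Build Solr select parameters for a flag-restricted autocomplete lookup.

    flag_field is a hypothetical boolean field populated at index time.
    The prefix is not escaped here; real user input would need Solr
    query-syntax escaping first.
    """
    return {
        "q": f"label_suggest:{prefix}*",  # match on the suggest field
        "fq": f"{flag_field}:true",       # restrict hits to flagged terms
        "rows": rows,
        "wt": "json",
    }


params = autocomplete_params("fan")
print(urlencode(params))
```

Using `fq` rather than adding the restriction to `q` keeps the flag out of relevance scoring and lets Solr cache the filter between keystrokes.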

Robbie1977 commented 9 years ago

I believe the issue is that the OWL doesn't explicitly state the relationships (does OLS provide any reasoning?). As far as I can ascertain, the parent/child information is not currently available:

{
  "responseHeader":{
    "status":0,
    "QTime":29,
    "params":{
      "q":"id:*00003679",
      "indent":"true",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "logical_description":["part_of some central body",
          "has_synaptic_terminals_of some dopaminergic PPL1 neuron"],
        "type":"class",
        "uri":"http://purl.obolibrary.org/obo/FBbt_00003679",
        "label":"fan-shaped body",
        "label_suggest":["fan-shaped body",
          "fan shaped body",
          "fb",
          "FB"],
        "description":["The largest synaptic neuropil domain of the adult central complex, covered by an extensive glial sheath. It is located posterior to the ellipsoid body. The fan-shaped body is a regular structure of 6-8 horizontal layers and 16 vertical slices (sometimes called staves, columns or segments), 8 per hemisphere numbered from medial to lateral - arranged in 4 closely associated pairs. Its inferior part is much narrower than its superior part, forming a fan shape. Fibers emerging from the basal-most area of the fan project to the noduli. Rostro-caudally, it can be divided into 4 shells on the basis of the extent and position of small field arborizations."],
        "has_children":false,
        "is_defining_ontology":true,
        "is_root":false,
        "in_subset_annotation":["cur",
          "BrainName"],
        "id_annotation":["FBbt:00003679"],
        "database_cross_reference_annotation":["VFB:FBbt_00003679"],
        "has_obo_namespace_annotation":["fly_anatomy.ontology"],
        "comment_annotation":["Each of the two neighboring slices (1-2, 3-4, 5-6, 7-8) are associated more closely because they receive small-field columnar neurons generated by the same neuroblasts, forming four groups on each side of the midline, from lateral to medial: segment pair W, X, Y and Z (Boyan and Williams et al., 2011; Ito and Awasaki, 2008). Six to eight layers have been identified, depending on the staining that is used (Hanesch et al., 1989; Young and Armstrong, 2010; Kahsai and Winther, 2011)."],
        "id":"fbbt:http://purl.obolibrary.org/obo/FBbt_00003679",
        "short_form":["FBbt_00003679",
          "FBbt:00003679"],
        "ontology_name":"fbbt",
        "ontology_uri":"http://purl.obolibrary.org/obo/fbbt/fbbt-relaxed.owl",
        "synonym":["fan shaped body",
          "fb",
          "FB"],
        "_version_":1499264061024501762}]
  }}
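To illustrate the point: the only relational information in this document is the set of asserted `logical_description` strings. A small sketch, assuming every entry has the form `<relation> some <class label>` as in the two documents shown in this thread (the helper name is hypothetical):

```python
def parse_restriction(text):
    """Split '<relation> some <filler>' into (relation, filler).

    Returns None when the string does not contain ' some '.
    """
    relation, sep, filler = text.partition(" some ")
    if not sep:
        return None
    return relation.strip(), filler.strip()


# Values copied from the fan-shaped body document above.
logical_description = [
    "part_of some central body",
    "has_synaptic_terminals_of some dopaminergic PPL1 neuron",
]

restrictions = [parse_restriction(s) for s in logical_description]
print(restrictions)
```

Nothing in the parsed result mentions 'adult brain'. Getting there means following part_of from 'central body' upwards, which is exactly the reasoning step the document lacks.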
dosumis commented 9 years ago

We can use a direct OWL query at setup to populate an indexed field in SOLR. For example, this neuron class would be returned by the queries "overlaps some 'nervous system'" and "overlaps some 'adult brain'".
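A sketch of the setup step, assuming the OWL query results arrive as lists of IRIs per flag. It builds Solr atomic-update documents (the `{"set": ...}` modifier) that could be posted to the update handler. The flag field names are hypothetical; the `id` format follows the documents shown above.

```python
import json


def flag_updates(query_results, id_prefix="fbbt:"):
    """Turn {flag_field: [iri, ...]} into a list of Solr atomic-update docs."""
    flags = {}
    for field, iris in query_results.items():
        for iri in iris:
            # One update doc per term, accumulating every flag that applies.
            flags.setdefault(iri, {})[field] = {"set": True}
    return [{"id": id_prefix + iri, **fields} for iri, fields in sorted(flags.items())]


# Hypothetical field names; the IRI is the fan-shaped neuron from this thread.
results = {
    "overlaps_nervous_system": ["http://purl.obolibrary.org/obo/FBbt_00003657"],
    "overlaps_adult_brain": ["http://purl.obolibrary.org/obo/FBbt_00003657"],
}

updates = flag_updates(results)
print(json.dumps(updates, indent=2))
```

Atomic updates only work if the other fields in the schema are stored (or have docValues); otherwise the flags would have to be added when the documents are first built.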

dosumis commented 9 years ago

After further discussion:

Robbie1977 commented 9 years ago

I'm having difficulty envisaging any examples that overlap 'adult brain' without also being part_of 'adult brain'?

dosumis commented 9 years ago

Sensory neurons have parts that are part of sensory structures outside the brain. Some neurons project from the central brain through the cervical connective into the TAG. The converse is true by definition, though: RO:overlaps encompasses part_of. See the Bioinformatics paper on the relations for details. This means that if we redundantly instantiate the axiom "overlaps some 'adult brain'" on classes to which it directly applies, many classes will have both direct overlaps and part_of relationships to 'adult brain', which would probably be confusing for users.
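The relationship between the two relations can be shown with a toy model, reading "x overlaps y" as "x and y share at least one part" with part_of treated as reflexive. The structures and part names below are illustrative only, not taken from the ontology.

```python
# Toy model: each structure maps to the set of its parts, itself included
# (part_of is reflexive). All entries are illustrative.
PARTS = {
    "adult brain": {"adult brain", "fan-shaped body", "arbor in brain"},
    "fan-shaped body": {"fan-shaped body"},
    "projection neuron": {"projection neuron", "arbor in brain", "axon in TAG"},
}


def part_of(x, y):
    """x is part of y if x is among y's parts."""
    return x in PARTS[y]


def overlaps(x, y):
    """x overlaps y if they have at least one part in common."""
    return bool(PARTS[x] & PARTS[y])


# Anything that is part_of the brain also overlaps it ...
print(part_of("fan-shaped body", "adult brain"), overlaps("fan-shaped body", "adult brain"))
# ... but a neuron projecting out of the brain overlaps it without being part of it.
print(part_of("projection neuron", "adult brain"), overlaps("projection neuron", "adult brain"))
```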

Might be useful to have a semantic spec overview doc that covers cases like this, but probably not productive to discuss further here.