ome / design

OME Design proposals
http://ome.github.io/design/

ROI: Folders - Use case - Sample roi hierarchy #10

Open joshmoore opened 8 years ago

joshmoore commented 8 years ago

In the vein of trying to provide concrete use cases, I thought I'd capture something on github that I knew we'd eventually want. A subset of http://www.ebi.ac.uk/cmpo/CMPO_0000309 might work:

[Screenshot, 2016-01-26 09:40:14]

phenotype
  cell process phenotype
    absence of cell process
    arrested process
    cell adhesion phenotype
    cell component localisation
    cell component movement
      actin-mediated cell contraction
      cell migration phenotype
        decreased duration of cell migration process
        decreased rate of cell migration process
        impaired cell migration
        increased duration of cell migration process
        increased rate of cell migration process
        increased substrate-dependent cell migration
        increased substrate-dependent cell migration, cell extension
    cell cycle phenotype
      abnormal cell cycle
      cell cycle phase phenotype
      centriole replication
      cytokinesis phenotype
      mitotic process phenotype
    cell death phenotype
    cell division phenotype
    cell growth phenotype
      abnormal cell growth
      cell growth arrested
      decreased monopolar cell elongation
      increased monopolar cell elongation
  cellular component phenotype
    cell component morphology
      cell component shape
      cell component size
      cell component structure
      cilium morphology
      golgi morphology
      nuclear morphology
    phenotypes by organelle
      cilium phenotype
      cytoskeletal phenotype
      actin cytoskeleton
      actin filament phenotype
      ...

joshmoore commented 8 years ago

Contracting this down to something more screenshot-able, perhaps:

CMPO
  Cell Process
    Absense
    Adhesion
    Cycle
    Movement
  Cell Component
    Organelles
      Cilium
      Cytoskeleton

mtbc commented 8 years ago

$ bin/omero hql 'SELECT parent.name, folder.name FROM Folder AS folder LEFT OUTER JOIN folder.parentFolder AS parent WHERE folder.id >= 710 AND folder.id < 720'
Using session 319a93a4-504f-4bf6-8ffc-5a4e35c7effc (user-2@localhost:4064). Idle timeout: 10 min. Current group: private-1
 # | Col1           | Col2           
---+----------------+----------------
 0 | Cell Process   | Movement       
 1 | None           | CMPO           
 2 | CMPO           | Cell Component 
 3 | Cell Component | Organelles     
 4 | Organelles     | Cilium         
 5 | Organelles     | Cytoskeleton   
 6 | CMPO           | Cell Process   
 7 | Cell Process   | Absense        
 8 | Cell Process   | Movement       
 9 | Cell Process   | Adhesion       
(10 rows)

These folders come from simple-taxonomy.ome.tiff.
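
A minimal sketch of how rows like these could be turned back into the indented tree, assuming the (parent, child) pairs have been copied by hand into a Python list and that folder names are unique (a real script would key on folder ids instead). `build_tree` and `render` are hypothetical helper names, not part of OMERO:

```python
from collections import defaultdict

# (parent name, folder name) pairs as printed by the query above;
# None marks the root folder.
rows = [
    ("Cell Process", "Movement"),
    (None, "CMPO"),
    ("CMPO", "Cell Component"),
    ("Cell Component", "Organelles"),
    ("Organelles", "Cilium"),
    ("Organelles", "Cytoskeleton"),
    ("CMPO", "Cell Process"),
    ("Cell Process", "Absense"),
    ("Cell Process", "Movement"),
    ("Cell Process", "Adhesion"),
]

def build_tree(pairs):
    """Map each parent name to the set of its child names."""
    children = defaultdict(set)
    for parent, child in pairs:
        children[parent].add(child)  # identical rows collapse in the set
    return children

def render(children, parent=None, depth=0):
    """Return the tree as indented lines, children sorted by name."""
    lines = []
    for name in sorted(children.get(parent, ())):
        lines.append("  " * depth + name)
        lines.extend(render(children, name, depth + 1))
    return lines

tree_lines = render(build_tree(rows))
print("\n".join(tree_lines))
```

Because every folder has at most one `parentFolder`, the pairs always describe a tree, so a plain recursive walk is enough.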

joshmoore commented 8 years ago

The attached file was produced by downloading the ontology from https://www.ebi.ac.uk/ols/beta/ontologies/cmpo and running it through:

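import ontospy

# Load the downloaded CMPO OWL file into an ontospy graph.
g = ontospy.Graph("cmpo.owl")
# Print the class hierarchy as a tree, showing class labels.
g.printClassTree(labels=True)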

cmpo.txt. Note: more than 100 of the approx. 550 terms appear in multiple locations!
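
A rough way to count such repeated terms, assuming the tree text holds one term label per line after its leading indentation (the real ontospy output may carry extra decoration that would need stripping first). `repeated_terms` is a hypothetical helper, and the sample text is made up for illustration rather than taken from cmpo.txt:

```python
from collections import Counter

def repeated_terms(tree_text):
    """Return {label: count} for labels that occur on more than one line."""
    labels = [line.strip() for line in tree_text.splitlines() if line.strip()]
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n > 1}

# Made-up tree in which "term C" sits under two different parents.
sample = "root\n  term A\n    term C\n  term B\n    term C\n"

duplicates = repeated_terms(sample)
print(duplicates)
```

To run this against the attached file, `sample` could be replaced with the contents of cmpo.txt read from disk.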

mtbc commented 8 years ago

I'm happy to help with any mechanical transformations to be done to cmpo.txt. Is CMPO unusual in not being a strict tree?

joshmoore commented 8 years ago

I don't know. @eleanorwilliams may know more, but this may require our processing more of the OLS ontologies to get some statistics.

eleanorwilliams commented 8 years ago

Also, an ROI/cell could be annotated with more than one term. For example, the phenotype 'abnormal microtubule cytoskeleton morphology during mitotic interphase' is annotated with both of these:
abnormal PATO_0000460
microtubule cytoskeleton morphology during mitotic interphase CMPO_0000369
E.g. http://idr-demo.openmicroscopy.org/webclient/?show=well-590809

eleanorwilliams commented 8 years ago

@mtbc @joshmoore. By 'not being a strict tree' are you referring to the fact that a term can be in more than one branch? This seems quite common in ontologies because there are different relationships expressed. E.g., look at 'leaf vascular tissue' in the Experimental Factor Ontology: it's part of a leaf, but it is also a vascular tissue and a leaf component. http://www.ebi.ac.uk/ols/beta/ontologies/efo/terms/graph?iri=http://www.ebi.ac.uk/efo/EFO_0001037&&ed=http://www.ebi.ac.uk/efo/EFO_0001037&&.

CMPO_0000424 that Josh mentioned in the email (telophase arrested phenotype) http://www.ebi.ac.uk/ols/beta/ontologies/cmpo/terms/graph?iri=http://www.ebi.ac.uk/cmpo/CMPO_0000424&&ed=http://www.ebi.ac.uk/cmpo/CMPO_0000424&& is another example.

mtbc commented 8 years ago

Interesting: so Folders as currently constituted may not be appropriate for encoding realistic ontologies.

eleanorwilliams commented 8 years ago

Ontologies are Directed Acyclic Graphs (DAGs) rather than trees, and there can be multiple inheritance, but you could just pick one path through. E.g. compare this view of 'metaphase arrested phenotype' http://www.ebi.ac.uk/ols/beta/ontologies/cmpo/terms?iri=http%3A%2F%2Fwww.ebi.ac.uk%2Fcmpo%2FCMPO_0000305 and this view http://www.ebi.ac.uk/cmpo/CMPO_0000305 In the latter, they just picked (at random) one parent.
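
A minimal sketch of "just pick one path through", assuming the DAG is available as a mapping from each term to its list of parents. The term names and the `pick_one_parent` and `path_to_root` helpers are hypothetical, and the choice here is the alphabetically first parent rather than a random one, so that the result is reproducible:

```python
def pick_one_parent(parents_of):
    """Collapse a DAG into a tree by keeping a single parent per term."""
    return {
        term: (sorted(parents)[0] if parents else None)
        for term, parents in parents_of.items()
    }

def path_to_root(term, parent_of):
    """Follow the single remaining parent links up to the root."""
    path = [term]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return list(reversed(path))

# Hypothetical DAG: "metaphase arrested" has two parents.
dag = {
    "root": [],
    "process": ["root"],
    "arrested": ["process"],
    "cell cycle": ["process"],
    "metaphase arrested": ["arrested", "cell cycle"],
}

tree = pick_one_parent(dag)
print(path_to_root("metaphase arrested", tree))
```

The path through "cell cycle" is lost in the resulting tree, which is exactly the information a simplified view drops.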

mtbc commented 8 years ago

Would we need a workaround to support analysis subtasks like "is this ROI of this kind?" and "give me all the ROIs of this kind" where the path from the ROI to the kind (e.g., "mitotic process phenotype") is one we omitted? Perhaps we can expect analyses to use their own full copy of the ontology rather than querying OMERO's simplified tree? (@pwalczysko will have already found that importing general DAG ontologies into OMERO automatically gives you a tree.) For ontology workflows it may suffice if OMERO knows which folders the ROIs are in without knowing the full folder structure.
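
To make that last idea concrete, here is a sketch under stated assumptions: OMERO supplies only a mapping from ROI id to the term of the folder it sits in, and the analysis side holds its own full DAG as a term-to-parents mapping. All names, ids, and terms below are hypothetical:

```python
def ancestors(term, parents_of):
    """Return every term reachable by following parent links in the DAG."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def rois_of_kind(kind, roi_terms, parents_of):
    """ROI ids whose term is `kind` or has `kind` among its ancestors."""
    return sorted(
        roi for roi, term in roi_terms.items()
        if term == kind or kind in ancestors(term, parents_of)
    )

# Full DAG kept by the analysis, not by OMERO (hypothetical terms).
parents_of = {
    "metaphase arrested": ["arrested", "mitotic"],
    "arrested": ["process"],
    "mitotic": ["process"],
    "process": [],
}
# What OMERO would supply: the folder (term) each ROI is in.
roi_terms = {101: "metaphase arrested", 102: "arrested"}

print(rois_of_kind("mitotic", roi_terms, parents_of))
```

ROI 101 is found under "mitotic" even if the simplified folder tree only kept the "arrested" path.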

joshmoore commented 8 years ago

In discussion with @sbesson and @chris-allan today, I think we convinced ourselves of @mtbc's last statement: the goal of this structure is to provide OMERO with a tree mechanism and not a DAG. Ontology-friends can use that structure to the extent possible, but it will not be a perfect fit. Some terms will need to be mapped to multiple Folder objects. A field (config of type Map<String, String>) might suffice to track such irregularities.
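
One way this could look, purely as a sketch: the `Folder` class below is a stand-in written for illustration and is not the OMERO model object, the `"ontology.term"` config key is an invented convention, and the two parent folders chosen for CMPO_0000424 are assumptions rather than its actual parents in CMPO.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Folder:
    """Illustrative stand-in for a tree node with a string-to-string config map."""
    name: str
    parent: Optional["Folder"] = None
    config: Dict[str, str] = field(default_factory=dict)

def folders_for_term(term_id: str, label: str, parents: List[Folder]) -> List[Folder]:
    """Create one Folder per parent so a multi-parent term fits in a tree."""
    return [Folder(label, parent, {"ontology.term": term_id}) for parent in parents]

def same_term(folders: List[Folder], term_id: str) -> List[Folder]:
    """Find all Folder copies that stand for the same ontology term."""
    return [f for f in folders if f.config.get("ontology.term") == term_id]

# Hypothetical parents for the multi-parent term mentioned earlier in the thread.
arrested = Folder("arrested process")
mitotic = Folder("mitotic process phenotype")
copies = folders_for_term(
    "CMPO_0000424", "telophase arrested phenotype", [arrested, mitotic]
)

print([f.parent.name for f in same_term(copies, "CMPO_0000424")])
```

The tree stays a tree, and the shared config value lets a client gather the ROIs from every copy of the term.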