Open joshmoore opened 8 years ago
Contracting this down to something more screenshot-able, perhaps:
- CMPO
  - Cell Process
    - Absense
    - Adhesion
    - Cycle
    - Movement
  - Cell Component
    - Organelles
      - Cilium
      - Cytoskeleton
$ bin/omero hql 'SELECT parent.name, folder.name FROM Folder AS folder LEFT OUTER JOIN folder.parentFolder AS parent WHERE folder.id >= 710 AND folder.id < 720'
Using session 319a93a4-504f-4bf6-8ffc-5a4e35c7effc (user-2@localhost:4064). Idle timeout: 10 min. Current group: private-1
# | Col1 | Col2
---+----------------+----------------
0 | Cell Process | Movement
1 | None | CMPO
2 | CMPO | Cell Component
3 | Cell Component | Organelles
4 | Organelles | Cilium
5 | Organelles | Cytoskeleton
6 | CMPO | Cell Process
7 | Cell Process | Absense
8 | Cell Process | Movement
9 | Cell Process | Adhesion
(10 rows)
from simple-taxonomy.ome.tiff.
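As a rough illustration of how the parent/child rows above map back onto a folder tree, here is a minimal Python sketch. The pairs are copied by hand from the query output; the helper names `build_children` and `render` are made up for this example and are not part of OMERO. Because the sketch keys folders by name rather than by id, the repeated `Cell Process | Movement` row collapses into a single entry.

```python
from collections import defaultdict

# (parent name, folder name) pairs copied from the query output above;
# None marks a folder without a parent.
rows = [
    ("Cell Process", "Movement"),
    (None, "CMPO"),
    ("CMPO", "Cell Component"),
    ("Cell Component", "Organelles"),
    ("Organelles", "Cilium"),
    ("Organelles", "Cytoskeleton"),
    ("CMPO", "Cell Process"),
    ("Cell Process", "Absense"),
    ("Cell Process", "Movement"),
    ("Cell Process", "Adhesion"),
]


def build_children(pairs):
    """Group folder names under their parent name, skipping repeated pairs."""
    children = defaultdict(list)
    for parent, child in pairs:
        if child not in children[parent]:
            children[parent].append(child)
    return children


def render(children, node=None, depth=0):
    """Return the tree as indented lines, visiting children alphabetically."""
    lines = []
    for child in sorted(children.get(node, [])):
        lines.append("  " * depth + child)
        lines.extend(render(children, child, depth + 1))
    return lines


tree_lines = render(build_children(rows))
print("\n".join(tree_lines))
```

A real check would want to key on folder ids instead of names, since two folders can share a name.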
Attached file was produced by running the downloaded https://www.ebi.ac.uk/ols/beta/ontologies/cmpo through:
import ontospy
g = ontospy.Graph("cmpo.owl")
g.printClassTree(labels=True)
cmpo.txt -- Note: 100+ terms from the approx. 550 are in multiple locations!
I'm happy to help with any mechanical transformations to be done to cmpo.txt. Is CMPO unusual in not being a strict tree?
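One mechanical transformation that might help with the "multiple locations" count: a small Python sketch that lists labels appearing more than once in an indented class-tree dump. The layout of cmpo.txt is an assumption here (one label per line, indentation showing depth), and the sample text is a made-up placement, not the real CMPO hierarchy.

```python
from collections import Counter

# Made-up sample in the assumed layout of an indented class-tree dump.
sample = """\
phenotype
  cell process phenotype
    telophase arrested phenotype
  mitotic process phenotype
    telophase arrested phenotype
"""


def terms_in_multiple_locations(text):
    """Map each label that occurs on more than one line to its occurrence count."""
    counts = Counter(line.strip() for line in text.splitlines() if line.strip())
    return {term: n for term, n in counts.items() if n > 1}


print(terms_in_multiple_locations(sample))
```

Running the same function over the text of other OLS ontologies would give the kind of statistics needed to say whether CMPO is unusual.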
I don't know. @eleanorwilliams may know more, but this may require our processing more of the OLS ontologies to get some statistics.
Also, an ROI/cell could be annotated with more than one term, e.g. the phenotype 'abnormal microtubule cytoskeleton morphology during mitotic interphase' is annotated with both of these:
abnormal PATO_0000460
microtubule cytoskeleton morphology during mitotic interphase CMPO_0000369
E.g.
http://idr-demo.openmicroscopy.org/webclient/?show=well-590809
@mtbc @joshmoore. By 'not being a strict tree' are you referring to the fact that a term can be in more than one branch? This seems quite common in ontologies because there are different relationships expressed. E.g. look at 'leaf vascular tissue' in the Experimental Factor Ontology: it's a part of a leaf, but it is also a vascular tissue and a leaf component. http://www.ebi.ac.uk/ols/beta/ontologies/efo/terms/graph?iri=http://www.ebi.ac.uk/efo/EFO_0001037&&ed=http://www.ebi.ac.uk/efo/EFO_0001037&&.
CMPO_0000424 that Josh mentioned in the email (telophase arrested phenotype) http://www.ebi.ac.uk/ols/beta/ontologies/cmpo/terms/graph?iri=http://www.ebi.ac.uk/cmpo/CMPO_0000424&&ed=http://www.ebi.ac.uk/cmpo/CMPO_0000424&& is another example.
Interesting: so Folders as currently constituted may not be appropriate for encoding realistic ontologies.
Ontologies are Directed Acyclic Graphs (DAGs) rather than trees, and there can be multiple inheritance, but you could just pick one path through. E.g. compare this view of 'metaphase arrested phenotype' http://www.ebi.ac.uk/ols/beta/ontologies/cmpo/terms?iri=http%3A%2F%2Fwww.ebi.ac.uk%2Fcmpo%2FCMPO_0000305 and this view http://www.ebi.ac.uk/cmpo/CMPO_0000305 In the latter, they just picked (at random) one parent.
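To make "just pick one path through" concrete, here is a minimal Python sketch that reduces a DAG to a tree by keeping a single parent per term. The parent lists are hypothetical and only illustrate the shape of the problem; the rule used (alphabetically first parent) is an arbitrary but reproducible stand-in for picking at random.

```python
# Hypothetical term -> parents mapping; not the real CMPO hierarchy.
dag_parents = {
    "metaphase arrested phenotype": [
        "mitotic process phenotype",
        "cell cycle arrest phenotype",
    ],
    "cell cycle arrest phenotype": ["cell process phenotype"],
    "mitotic process phenotype": ["cell process phenotype"],
    "cell process phenotype": [],
}


def pick_one_parent(parents_by_term):
    """Keep the alphabetically first parent of each term; roots map to None."""
    return {
        term: (min(parents) if parents else None)
        for term, parents in parents_by_term.items()
    }


def dropped_edges(parents_by_term, tree_parent):
    """List the (term, parent) edges of the DAG that the tree no longer has."""
    return sorted(
        (term, parent)
        for term, parents in parents_by_term.items()
        for parent in parents
        if parent != tree_parent[term]
    )


tree_parent = pick_one_parent(dag_parents)
print(tree_parent["metaphase arrested phenotype"])
print(dropped_edges(dag_parents, tree_parent))
```

Reporting the dropped edges makes it explicit which relationships a tree view has lost.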
Would we need a workaround to support analysis subtasks like "is this ROI of this kind?" and "give me all the ROIs of this kind" where the path from the ROI to the kind (e.g., "mitotic process phenotype") is one we omitted? Perhaps we can expect analysis to use their own full copy of the ontology rather than querying OMERO's simplified tree? (@pwalczysko will have already found that importing general DAG ontologies into OMERO automatically gives you a tree.) For ontology workflows it may suffice if OMERO knows which folders the ROIs are in without knowing the full folder structure.
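A small Python sketch of that concern, under stated assumptions: the ROI ids, the term placement, and the helper names are all hypothetical, and nothing here is an OMERO API. It runs the same "give me all the ROIs of this kind" query once against a full copy of the ontology and once against a simplified tree in which one parent link was omitted.

```python
# Hypothetical full ontology: term -> list of parents.
full_dag = {
    "metaphase arrested phenotype": [
        "cell cycle arrest phenotype",
        "mitotic process phenotype",
    ],
    "cell cycle arrest phenotype": ["cell process phenotype"],
    "mitotic process phenotype": ["cell process phenotype"],
    "cell process phenotype": [],
}

# The same terms after keeping only one parent each (a tree).
simplified_tree = dict(full_dag)
simplified_tree["metaphase arrested phenotype"] = ["cell cycle arrest phenotype"]

# Hypothetical ROI id -> the term (folder) the ROI was placed in.
roi_terms = {
    101: "metaphase arrested phenotype",
    102: "mitotic process phenotype",
}


def ancestors(term, parents_by_term):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents_by_term.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents_by_term.get(current, []))
    return seen


def rois_of_kind(kind, roi_to_term, parents_by_term):
    """Return ROI ids whose term is the given kind or one of its descendants."""
    return sorted(
        roi
        for roi, term in roi_to_term.items()
        if term == kind or kind in ancestors(term, parents_by_term)
    )


print(rois_of_kind("mitotic process phenotype", roi_terms, full_dag))
print(rois_of_kind("mitotic process phenotype", roi_terms, simplified_tree))
```

In this sketch the only input needed from OMERO is `roi_terms`, which matches the suggestion that knowing which folders the ROIs are in may be enough.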
In discussion with @sbesson and @chris-allan today, I think we convinced ourselves of @mtbc's last statement: the goal of this structure is to provide OMERO with a tree mechanism and not a DAG. Ontology-friends can use that structure to the extent possible, but it will not be a perfect fit. Some terms will need to be mapped to multiple Folder objects. A field (config of type Map<String, String>) might suffice to track such irregularities.
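To illustrate what tracking such irregularities could look like, here is a hedged Python sketch. The `Folder` class below is a stand-in written for this example, not the OMERO model object, and the `ontology.term` key is a hypothetical convention; the folder placement is also illustrative only. One term is represented by two folders, and the string map is what ties them back together.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Optional

TERM_KEY = "ontology.term"  # hypothetical config key


@dataclass
class Folder:
    """Stand-in for a tree node with a string-to-string config map."""
    name: str
    parent: Optional["Folder"] = None
    config: Dict[str, str] = field(default_factory=dict)

    def path(self) -> str:
        """Return the names from the root down to this folder."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " / ".join(reversed(parts))


root = Folder("CMPO")
arrest = Folder("cell cycle arrest phenotype", root)
mitotic = Folder("mitotic process phenotype", root)
# The same ontology term appears as two separate folders in the tree.
copy_a = Folder("metaphase arrested phenotype", arrest, {TERM_KEY: "CMPO_0000305"})
copy_b = Folder("metaphase arrested phenotype", mitotic, {TERM_KEY: "CMPO_0000305"})


def folders_by_term(folders):
    """Group folder paths by the term id stored in their config map."""
    grouped = defaultdict(list)
    for folder in folders:
        term = folder.config.get(TERM_KEY)
        if term is not None:
            grouped[term].append(folder.path())
    return {term: sorted(paths) for term, paths in grouped.items()}


all_folders = [root, arrest, mitotic, copy_a, copy_b]
duplicated = {
    term: paths
    for term, paths in folders_by_term(all_folders).items()
    if len(paths) > 1
}
print(duplicated)
```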
In the vein of trying to provide concrete use cases, I thought I'd capture something on GitHub that I knew we'd eventually want. A subset of http://www.ebi.ac.uk/cmpo/CMPO_0000309 might work: