geneontology / noctua

Graph-based modeling environment for biology, including prototype editor and services
http://noctua.geneontology.org/
BSD 3-Clause "New" or "Revised" License

Add documentation on OWL modeling #185

Closed cmungall closed 9 years ago

cmungall commented 9 years ago

cc @dosumis @hdietze @simonjupp

We need some docs that explain the OWL modeling in Noctua. In particular, we need something aimed at non-experts that clearly distinguishes between what the ontology says and what a model says (the fact that we use some shared visual paradigms, like boxes connected by connectors, is likely to lead to more confusion). We would also like something aimed at the OWL expert community, so that they can effectively consume Noctua models and help build more efficient reasoners, etc. At the same time we need to target bioinformaticians, who are unlikely to be OWL experts, and provide them with effective ways of using models in analyses.

TBD: where would this live? README-owldev in the noctua repo? I'd rather have things collated in one place (we have some old material here we need to migrate http://geneontology.org//experimental/lego/owl/ )

Sketch of docs:

OWL Modeling

The native form of a Noctua model is OWL. A Noctua model consists of ABox axioms (i.e. axioms about individuals). This is in contrast to a traditional ontology, which consists of TBox axioms (i.e. class axioms). We use the term 'LEGO model' when we are talking about an ABox with members that instantiate GO molecular function classes (i.e. an activity flow diagram). More generally, we use 'Noctua model' when we make minimal assumptions about the ontologies used.

General modeling paradigm (informal)

The general paradigm can be summarized as: create an individual for anything, and 'define' that individual by its connections. The individuals generally do not have properties such as labels attached. Individuals are generated by the tool and are assumed to be 'identity-less' and unique to the model (the exception being some of the supporting provenance-type individuals, e.g. an instance of a publication).

To state that gene product P has some unspecified activity whilst localized to the nucleus, we would create:

:001 rdf:type P
:002 rdf:type MF:root
:003 rdf:type CC:nucleus

:002 RO:enabled_by :001
:002 BFO:occurs_in :003

Note that we are modeling specific gene products like 'Shh protein' as classes, so :001 above is an instance of the class P.

See other lego docs for full details on relations.
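To make the 'define an individual by its connections' idea concrete, here is a minimal sketch in plain Python. It assumes nothing about Noctua's actual code: the model is just a set of (subject, predicate, object) tuples holding the CURIEs from the example above as opaque strings, and `types_of` and `describe` are hypothetical helper names made up for this illustration.

```python
# Sketch only: a Noctua-style ABox held as plain (subject, predicate, object)
# triples. CURIEs are opaque strings; no OWL library is involved.

TRIPLES = {
    (":001", "rdf:type", "P"),           # an instance of gene product class P
    (":002", "rdf:type", "MF:root"),     # an unspecified molecular function
    (":003", "rdf:type", "CC:nucleus"),  # an instance of nucleus
    (":002", "RO:enabled_by", ":001"),
    (":002", "BFO:occurs_in", ":003"),
}


def types_of(individual, triples):
    """Return the sorted classes that an individual instantiates."""
    return sorted(o for s, p, o in triples if s == individual and p == "rdf:type")


def describe(individual, triples):
    """'Define' an individual by its own types and the types of what it links to."""
    links = sorted(
        (p, tuple(types_of(o, triples)))
        for s, p, o in triples
        if s == individual and p != "rdf:type"
    )
    return {"types": types_of(individual, triples), "links": links}


# Hypothetical usage: the activity :002 has no label of its own, so it is
# described purely by its class and its connections.
activity = describe(":002", TRIPLES)
print(activity)
```

The individual identifiers (:001, :002, :003) never appear in the description, which matches the assumption that they are 'identity-less' and only meaningful within the model.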

Evidence and provenance

All evidence is stored on a per-axiom basis. We create an axiom annotation that uses an AnnotationProperty (WILL CHANGE) to connect the axiom to the evidence instance IRI (the value has to be the IRI rather than the individual itself, because an OWL annotation value is an IRI, a literal or an anonymous individual). The evidence instance IRI should identify an individual that instantiates an ECO class. Other OPEs hang off this individual: publication and supporting object (these may be literals, but this will change; TODO add ticket here).

Provenance can be at the level of axiom, individual or ontology. The APs dc:date and dc:contributor are added automatically, so you should see a lot of these on new models.
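As a rough illustration of the per-axiom pattern, the sketch below renders one annotated axiom in OWL functional syntax using plain Python string formatting. Everything named here is an assumption for illustration: `ex:evidence` stands in for the annotation property that is flagged as subject to change, `ECO:placeholder` stands in for whichever ECO class applies, `:e001` is a made-up evidence individual, and the dc:date and dc:contributor values are dummy literals.

```python
# Sketch only: all names below are placeholders, not Noctua's real vocabulary.
EVIDENCE_AP = "ex:evidence"    # hypothetical stand-in for the evidence property
ECO_CLASS = "ECO:placeholder"  # hypothetical stand-in for an ECO class
EVIDENCE_IRI = ":e001"         # hypothetical evidence individual


def annotated_assertion(prop, source, target, annotations):
    """Render an ObjectPropertyAssertion carrying axiom annotations."""
    annots = " ".join(f"Annotation({ap} {value})" for ap, value in annotations)
    return f"ObjectPropertyAssertion({annots} {prop} {source} {target})"


def evidence_typing(evidence_iri, eco_class):
    """Render the ClassAssertion that types the evidence individual."""
    return f"ClassAssertion({eco_class} {evidence_iri})"


# The evidence annotation points at the IRI of the evidence individual;
# the dc: annotations are the automatically added provenance.
axiom = annotated_assertion(
    "RO:enabled_by",
    ":002",
    ":001",
    [
        (EVIDENCE_AP, EVIDENCE_IRI),
        ("dc:contributor", '"contributor-placeholder"'),
        ("dc:date", '"date-placeholder"'),
    ],
)
typing = evidence_typing(EVIDENCE_IRI, ECO_CLASS)

print(typing)
print(axiom)
```

The same annotation pairs could equally be attached at the individual or ontology level; only the axiom-level case is shown.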

Availability

Currently stored here: https://github.com/geneontology/noctua-models

Any existing set of GO associations can be converted, albeit in a 'degenerate' disconnected form. This can still be useful for the purposes of uniform tooling and programmatic access:

owltools go.owl --gaf my.gaf --gaf-lego-indivduals -o my-lego.owl

Down in the weeds background details

kltm commented 9 years ago

geneontology/noctua-models might be a good place to start the documentation.

One thing that I think I'd like to see, and seemed like would be really useful from the meeting feedback, would be to have:

It might be most effective as a slide deck. I think this would provide the answers to a lot of the confusion. We've always been using "graphs", just very poor ones; this is both an evolution and a Big Deal. There also needs to be more emphasis that the graph frontend is just a frontend, and a lot of work will be done without it (although it's best to show this with the form frontends).