ontodev / robot

ROBOT is an OBO Tool
http://robot.obolibrary.org
BSD 3-Clause "New" or "Revised" License

Add commands for importing and exporting mappings in standard formats (SSSOM) #312

Closed (cmungall closed this 3 years ago)

cmungall commented 6 years ago

(updating the original content of this ticket to be more informative)

Many ontologies maintain and release mappings, either to external resources or to other ontologies.

These are bundled alternately as:

I discuss some of this here: https://douroucouli.wordpress.com/2019/05/27/never-mind-the-logix-taming-the-semantic-anarchy-of-mappings-in-ontologie/

It would be good to support this in ROBOT releases.

Internally, the mappings may be managed using any of the above mechanisms. Groups have different workflows. E.g. in GO we pull in some xrefs from upstream (e.g. interpro2go is managed by the interpro group). In some cases we want to see the mappings in the edit version of the ontology via an import.

Sometimes mappings are maintained as xrefs with additional ad-hoc semantics on the axiom annotations.

I think what we need is a kind of universal mapping converter orchestrated by two new operations, import-mappings and export-mappings.

Import would bring mappings in from any source format and add them to the ontology as any of the non-TSV targets. E.g.

robot import-mappings --source-type TSV --target-type xref -i interpro2go.tsv -o foo.owl

There would be special purpose options for some targets. E.g.

robot import-mappings --source-type TSV --target-type skos --predicate skos:exactMatch \
    -i foo2bar.tsv -o foo.owl

or

robot import-mappings --source-type TSV --target-type skos --predicate-column 3 \
    -i foo2bar.tsv -o foo.owl

Export would write mappings out from the internal ontology representation (any of the three). E.g.

robot export-mappings -i foo.owl --target-type TSV --source-type xref --source-prefix BAR -o foo2bar.tsv

To simplify, we may omit the -o option from import-mappings and require the user to chain commands.

Due to the variety of approaches there would probably need to be a variety of options:

For translation to OWL logical axioms, equivalentClasses is the default, but other options are possible. See the blog post for uberon mappings. This was originally specced out for obo format translation, but we actually want to make this generally applicable outside obo format:

So for example we could have the following which would be specified when working with translation to OWL logical axioms

We will also want options for annotating exported axioms.

This is a good time to standardize annotations on mapping axioms. I suggest:

For TSVs we will likely want a standard format. We should make this pandas/dataframes friendly, with the first line being a comment header. I suggest as columns:
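A minimal sketch of reading such a TSV, assuming the header is a single `#`-prefixed first line of tab-separated column names. The function name `read_mapping_tsv` and the column names used here (`subject_id`, `predicate`, `object_id`) are placeholders for illustration only, not the columns proposed in this ticket.

```python
import csv
import io


def read_mapping_tsv(handle):
    """Read a mapping TSV whose first line is a '#'-prefixed header.

    Assumption: the header line holds the tab-separated column names.
    """
    first = handle.readline().rstrip("\n")
    if not first.startswith("#"):
        raise ValueError("expected a '#'-prefixed header line")
    columns = first.lstrip("#").strip().split("\t")
    # The remaining lines are ordinary tab-separated records.
    reader = csv.DictReader(handle, fieldnames=columns, delimiter="\t")
    return [dict(row) for row in reader]


# Illustrative data only; the identifiers are made up.
example = io.StringIO(
    "#subject_id\tpredicate\tobject_id\n"
    "FOO:1\tskos:exactMatch\tBAR:9\n"
    "FOO:2\tskos:closeMatch\tBAR:7\n"
)
rows = read_mapping_tsv(example)
```

Each row comes back as a plain dict keyed by column name, which is easy to hand to a dataframe library afterwards.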

Implementation

It may be possible to implement this using a mixture of

In fact arguably there is no need for ROBOT commands at all.

But I think this complicates things for people using ROBOT, and could lead to more complex Makefiles and complex configurations with logic distributed over SPARQL queries and templates. This is especially true when going from the xref/CURIE-literal world to the SKOS/OWL/URI world.

I feel ROBOT should take one for the team and internalize the complexity and provide a useful facade that is easy to use and can be flexible yet encourage good practice in distribution. I would do this using fairly procedural code but am open to other ways. I have written all this code elsewhere and don't mind taking this one on, but would like feedback on specs.

goodb commented 5 years ago

I think this is a really good idea. Mappings are indeed fundamental and are mostly done ad hoc. Consolidating this into ROBOT makes good sense. You might want to change the title of the ticket to 'add a standard command and formats for importing and exporting mappings'. I'm not sure if it's too OBO-specific, but adding in a capability of following 'replaced by' edges when an input mapping file uses obsoleted terms would be handy. Also, a report that indicates e.g. encounters with obsoleted or missing terms would be useful.
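A rough sketch of the 'replaced by' idea, under stated assumptions: term IDs are plain strings, and the ontology has already been reduced to a set of live terms, a set of obsolete terms, and a dict of replaced-by edges. The names `resolve_term` and `check_mappings` and the status labels are hypothetical; this is not a description of how any existing ROBOT command behaves.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


def resolve_term(
    term_id: str,
    known: Set[str],
    obsolete: Set[str],
    replaced_by: Dict[str, str],
) -> Tuple[Optional[str], str]:
    """Follow replaced-by edges from an obsolete term to a live one."""
    seen: Set[str] = set()
    current = term_id
    while current in obsolete:
        if current in seen:
            return None, "cycle"  # replaced-by chain loops back on itself
        seen.add(current)
        if current not in replaced_by:
            return None, "obsolete_no_replacement"
        current = replaced_by[current]
    if current not in known:
        return None, "missing"
    return current, ("ok" if current == term_id else "replaced")


def check_mappings(
    term_ids: Iterable[str],
    known: Set[str],
    obsolete: Set[str],
    replaced_by: Dict[str, str],
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Resolve each term and collect a report of everything that was not 'ok'."""
    resolved: Dict[str, str] = {}
    report: List[Tuple[str, str]] = []
    for term_id in term_ids:
        new_id, status = resolve_term(term_id, known, obsolete, replaced_by)
        if new_id is not None:
            resolved[term_id] = new_id
        if status != "ok":
            report.append((term_id, status))
    return resolved, report
```

The report list is what a user would inspect after an import: it records replaced, obsolete-without-replacement, and missing terms separately, so each case can be handled on its own terms.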

cmungall commented 5 years ago

"adding in a capability of following 'replaced by' edges when an input mapping file uses obsoleted terms would be handy"

We actually have this command:

http://robot.obolibrary.org/repair

But it's not well documented (e.g. behavior with CURIE literals is undefined).

And we should also have good xref checks on

http://robot.obolibrary.org/report

E.g. ensuring CURIE literals are registered (this may require putting in the prefixes as axioms using the SHACL vocabulary).

As we build out these commands we should check they compose with repair (and report) in coherent and intuitive ways. A common workflow will be:

cmungall commented 5 years ago

I am sketching out the idea I have for the implementation. Alternatives welcome. I like doing this to ensure overall coherency and consistency of style of the codebase.

Object Model

class Mapping:
  OWLAnnotation[] annotations            // axiom annotations on the mapping object
  OWLObject subjectEntity                // typically a class in the main ontology
  Optional[OWLProperty] mappingProperty  // typically an annotation property (AP), but we don't want to constrain
  MappingTarget target                   // the xref / external URI

class MappingTarget:
  // note that one of the two should be set
  Optional[String] curieLiteral  // the value used when serializing as xref
  Optional[IRI] iri              // the value used when serializing as SKOS or OWL axioms

This would be a basic DAO - no logic, no methods other than accessors.

This OM would be the currency for different input/output adapters: Xref, SKOS, OWLLogical, Record (TSV). Each adapter would have read/render methods. These would take in options. For example, when writing or reading SKOS we could choose a default predicate. We could make special Option classes, but it may be easier to just pass through strings, as this is how options work in ROBOT.
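To make the adapter idea concrete, here is a small sketch of the render side only, written in Python for brevity rather than against the OWL API. It assumes a simplified mapping record with string fields, options passed as a dict of strings, and output as OWL functional-syntax axioms. The names `render`, `RENDERERS` and the option keys `predicate` and `axiom` are hypothetical, not existing ROBOT options.

```python
from typing import Callable, Dict, NamedTuple, Optional

OIO_XREF = "http://www.geneontology.org/formats/oboInOwl#hasDbXref"
SKOS_EXACT = "http://www.w3.org/2004/02/skos/core#exactMatch"


class Mapping(NamedTuple):
    """Simplified stand-in for the object model (strings instead of OWL objects)."""
    subject_iri: str
    target_curie: str
    target_iri: str
    predicate_iri: Optional[str] = None


def render_xref(m: Mapping, options: Dict[str, str]) -> str:
    # xrefs carry the CURIE as a string literal
    return f'AnnotationAssertion(<{OIO_XREF}> <{m.subject_iri}> "{m.target_curie}")'


def render_skos(m: Mapping, options: Dict[str, str]) -> str:
    # per-mapping predicate wins, then the option, then an assumed default
    predicate = m.predicate_iri or options.get("predicate", SKOS_EXACT)
    return f"AnnotationAssertion(<{predicate}> <{m.subject_iri}> <{m.target_iri}>)"


def render_owl(m: Mapping, options: Dict[str, str]) -> str:
    # EquivalentClasses is the default logical axiom type
    axiom = options.get("axiom", "EquivalentClasses")
    if axiom not in ("EquivalentClasses", "SubClassOf"):
        raise ValueError(f"unsupported axiom type: {axiom}")
    return f"{axiom}(<{m.subject_iri}> <{m.target_iri}>)"


RENDERERS: Dict[str, Callable[[Mapping, Dict[str, str]], str]] = {
    "xref": render_xref,
    "skos": render_skos,
    "owl": render_owl,
}


def render(m: Mapping, target_type: str, options: Optional[Dict[str, str]] = None) -> str:
    """Dispatch one mapping to the renderer for the requested target type."""
    return RENDERERS[target_type](m, options or {})
```

Passing options as plain strings keeps the renderers close to how command-line options arrive; a read side would mirror this with one parser per source type producing the same record.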

There would be a utility class that can do things like fill in curieLiteral from IRI and vice versa. This can be a static class/method to keep James happy :-)
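A minimal sketch of such a utility, again in Python and with plain strings. It assumes a prefix map from CURIE prefix to IRI namespace is available; the `PREFIXES` dict and the function names `curie_to_iri`, `iri_to_curie` and `fill_target` are illustrative, not part of ROBOT.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Illustrative prefix map; a real one would come from the ontology or a registry.
PREFIXES: Dict[str, str] = {
    "GO": "http://purl.obolibrary.org/obo/GO_",
    "UBERON": "http://purl.obolibrary.org/obo/UBERON_",
}


@dataclass(frozen=True)
class MappingTarget:
    curie_literal: Optional[str] = None  # used when serializing as xref
    iri: Optional[str] = None            # used when serializing as SKOS or OWL axioms


def curie_to_iri(curie: str, prefixes: Dict[str, str]) -> Optional[str]:
    """Expand PREFIX:local to a full IRI, or None if the prefix is unknown."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        return None
    return prefixes[prefix] + local


def iri_to_curie(iri: str, prefixes: Dict[str, str]) -> Optional[str]:
    """Contract an IRI to a CURIE, preferring the longest matching namespace."""
    best: Optional[str] = None
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            if best is None or len(namespace) > len(prefixes[best]):
                best = prefix
    if best is None:
        return None
    return f"{best}:{iri[len(prefixes[best]):]}"


def fill_target(target: MappingTarget, prefixes: Dict[str, str]) -> MappingTarget:
    """Return a copy with whichever of curie_literal / iri was missing filled in."""
    curie, iri = target.curie_literal, target.iri
    if iri is None and curie is not None:
        iri = curie_to_iri(curie, prefixes)
    elif curie is None and iri is not None:
        curie = iri_to_curie(iri, prefixes)
    return MappingTarget(curie_literal=curie, iri=iri)
```

Unknown prefixes return None rather than raising, so a caller can decide whether an unresolvable CURIE literal is an error or just something to report.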

matentzn commented 4 years ago

After 2 months of deliberation, we finally have a first working draft of the Simple Standard for Sharing Ontology Mappings (SSSOM). I am talking to various groups to get more feedback, and the draft is definitely VERY rough around the edges, but the foundation is there.

matentzn commented 4 years ago

The SSSOM spec is now updated with examples and clarifications.

cmungall commented 3 years ago

Note we now have python code that does most of what is specced in this ticket: https://sssom-py.readthedocs.io/en/latest/

There is still value in having this in ROBOT, but I'm also happy to close this.

jamesaoverton commented 3 years ago

Ok, let's close this, at least for now.

dr-shorthair commented 8 months ago

@cmungall @jamesaoverton sssom-py does not appear to exist (any more) on readthedocs.io. Was it folded into something else?

matentzn commented 8 months ago

SSSOM Java is a ROBOT extension that can be used for basic processing of SSSOM files in Java.

SSSOM toolkit, also known as sssom-py, is a Python toolkit for basic processing of SSSOM files, e.g. parsing and converting.

Does this help?