ontodev / robot

ROBOT is an OBO Tool
http://robot.obolibrary.org
BSD 3-Clause "New" or "Revised" License

Automated fixes for ROBOT report #705

Open jamesaoverton opened 4 years ago

jamesaoverton commented 4 years ago

@zhengj2007 asks whether we could automatically fix certain errors and warnings from ROBOT report. This is especially relevant when trying to make an ontology conform to the OBO Dashboard.

The main worry is doing too much, or unexpected things, and breaking the source ontology. "To err is human. To make a million errors a second you need a computer."

Report uses SPARQL to detect errors. I would group the rules into three categories:

  1. can be fixed automatically, e.g. trim annotation whitespace
  2. (semi-automatic) interactive fixes, e.g. missing license
  3. can't be automatically fixed, e.g. invalid xref

For (1) we could use SPARQL UPDATE. For (2) we need some custom code.
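As a rough illustration of category (1), the whitespace fix amounts to rewriting every annotation literal that has leading or trailing whitespace. The sketch below models that in plain Python over `(subject, property, value)` tuples. The tuple layout and the `trim_annotations` helper are assumptions made for illustration, not ROBOT code; a real fix would more likely be a SPARQL UPDATE run against the ontology.

```python
from typing import List, Tuple

# Hypothetical, simplified stand-in for an annotation assertion:
# (subject IRI, annotation property, literal value)
Annotation = Tuple[str, str, str]


def trim_annotations(annotations: List[Annotation]) -> Tuple[List[Annotation], int]:
    """Return the annotations with surrounding whitespace stripped,
    plus the number of values that were actually changed."""
    fixed = []
    changed = 0
    for subject, prop, value in annotations:
        trimmed = value.strip()
        if trimmed != value:
            changed += 1
        fixed.append((subject, prop, trimmed))
    return fixed, changed


example = [
    ("ex:0001", "rdfs:label", " heart "),
    ("ex:0002", "rdfs:label", "lung"),
]
fixed, changed = trim_annotations(example)
```

Returning the change count alongside the result is one way to address the worry about doing too much: the user can see how many values a fix touched before accepting it.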

Alternatively, we could use an external tool, but we might want some ROBOT methods anyway...

matentzn commented 4 years ago

(1) would be nice! Another thing I constantly need to fix is "rogue declarations", i.e. declaration axioms for entities that are not otherwise used in the edit file (if imports are ignored). These cause problems for ROBOT (unlabelled entity errors).
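To make the idea of a rogue declaration concrete, here is a minimal sketch. It assumes each non-declaration axiom in the edit file is represented only by its signature (the set of entity IRIs it mentions); the `rogue_declarations` helper and the `ex:` IRIs are hypothetical and not part of ROBOT.

```python
from typing import Iterable, Set


def rogue_declarations(declared: Set[str], axioms: Iterable[Set[str]]) -> Set[str]:
    """Return declared entities that no non-declaration axiom refers to."""
    used: Set[str] = set()
    for signature in axioms:
        used |= signature
    return declared - used


declared = {"ex:A", "ex:B", "ex:C"}
axioms = [{"ex:A", "ex:B"}]  # e.g. the signature of "A SubClassOf B"
rogue = rogue_declarations(declared, axioms)
```

In this example `ex:C` is declared but never used, so it is the only entity reported.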

cmungall commented 4 years ago

Note that robot can already repair certain problems encountered in ontologies. So far, this is limited to updating axioms pointing to deprecated classes with their replacement class (indicated using term replaced by).

http://robot.obolibrary.org/repair
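The replacement logic described above can be sketched roughly as follows. This is a simplified model, not the actual `repair` implementation: axioms are reduced to `(subject, relation, object)` tuples, only the object position is rewritten, and the `repair_axioms` helper and `replaced_by` mapping are hypothetical names.

```python
from typing import Dict, List, Tuple

# Simplified stand-in for an OWL axiom: (subject, relation, object)
Axiom = Tuple[str, str, str]


def repair_axioms(axioms: List[Axiom], replaced_by: Dict[str, str]) -> List[Axiom]:
    """Re-point axioms that reference a deprecated class at its replacement.

    `replaced_by` maps a deprecated class to the class named by its
    "term replaced by" annotation. Classes without an entry are left alone.
    """
    return [(s, rel, replaced_by.get(o, o)) for s, rel, o in axioms]


axioms = [
    ("ex:B", "subClassOf", "ex:Old"),
    ("ex:C", "subClassOf", "ex:D"),
]
replaced_by = {"ex:Old": "ex:New"}
repaired = repair_axioms(axioms, replaced_by)
```

Only the axiom pointing at `ex:Old` changes; the other is passed through untouched.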

The intent was always to include more repairs (see the discussion in #324).

I think it might be useful and easy to support a subset of 1. E.g., as you suggest, we could add a few standard .ru files to the robot repo plus a lightweight wrapper to run these via repair.

But what is the priority here?

The current ability to repair logical axioms has high value because obsoletions/merges are frequent and can involve 1000s of downstream axiom changes. This would be really hard to do without a command.

But if someone needs to do a one-off lexical fix (one-off because future violations can be nipped in the bud), I can give you a Perl one-liner that will fix this for a number of OWL formats. This may be a better use of resources.

I think 2-3 are lower priority. Didn't use ODK and you don't have a license? I'll do a PR for you if you really can't do this yourself (likely because you are using OBO-Edit; all Protégé users should know how to do this, or you shouldn't be making ontologies).

I like the idea of working towards a really rich set of checks with interactive repair, but this belongs in an IDE, not the command line. Maybe we can work with other groups: AFAICR there are lots of people outside OBO working on similar things, so maybe we should engage with them first?

e.g. https://protegewiki.stanford.edu/wiki/OntoDebug