ontodev / robot

ROBOT is an OBO Tool
http://robot.obolibrary.org
BSD 3-Clause "New" or "Revised" License
260 stars 74 forks source link

New parameter --axioms "axiom annotations" for remove and filter #886

Closed matentzn closed 5 months ago

matentzn commented 3 years ago

This could give us a simple way to remove certain axiom annotations while preserving others.

matentzn commented 2 years ago

A use case if, for example, if we have a axiom annotation such as:

A subClassOf B [editors_note: "We are not sure this is in the right place", rdfs:seeAlso https://github.com/ontodev/robot/issues/886]

When running a release, we may want to remove all the editors_note annotations from all our axioms, but not remove them from, say, classes like:

A editors_note "this class should be used only after a few shots of Tequila"
joeflack4 commented 2 years ago

+1!

matentzn commented 2 years ago

@hkir-dev cannot assign this to you, because you are not on the list of contributors.. @jamesaoverton is it possible to give @hkir-dev enough access to ROBOT so I can tag him on issues?

jamesaoverton commented 2 years ago

Ok, I invited @hkir-dev to @ontodev/robot-team

matentzn commented 2 years ago

Awesome thank you! @hkir-dev you can assign yourself once you have accepted!

shawntanzk commented 2 years ago

spec:

--axioms axiom_annotations -> remove all the axiom annotations (starting point) However, design might be better to have a new parameter like --drop_axioms_annotations - might be an idea Not sure if we should go into specific annotation etc. -> needs rethinking

hkir-dev commented 2 years ago

Since we want to cherry-pick the annotation property to remove, having a new parameter --drop_axioms_annotation made more sense. Created the related PR #1023

jamesaoverton commented 2 years ago

(Sorry, I just got back from a week off.)

I have come to regret the design of remove/filter, because nobody seems to understand it -- including myself, sometimes. When I sit down to use it (infrequently), I need to skim the docs again, and try some experiments before I hit on the right combination of options. A better design would be more "obvious" or "intuitive". In my defence, it's a tricky design problem.

Despite my regrets, we aren't changing the design of remove/filter. If somebody comes up with a better design, we'll make a new command (or commands).

This --drop-axiom-annotations in #1023 breaks the design, because it has no connection to any of the other parts of remove/filter.

A new "axiom annotations" option for --axioms, as originally proposed, would fit with the design. Then the base file examples from #1023 would work in two steps:

robot remove --input template.owl \
  --base-iri http://example.com/ \
  --axioms external \
  remove \
  --term IAO:0000117 \
  --term IAO:0000119 \
  --axioms axiom-annotations \
  --output results/template-drop-axiom-remove.owl
jamesaoverton commented 2 years ago

@matentzn provided this example:

:A :hasDbXref NICO:1 [:hasDbXref PMID:123]
:A IAO:0000115 “def” [:hasDbXref PMID:124, :source :X]

How will I remove the axiom annotation in the second (:hasDbXref PMID:124, xrefs on definitions):

  1. without removing first (remove only xrefs on definitions) and
  2. without removing any other axiom annotations (such as :source :X)
jamesaoverton commented 2 years ago

SPARQL can handle the basic case above, but I can see how it would be tricky to handle the general case:

At this point, the use case seems very similar to #988, where I was also uncomfortable with the proposed changes to remove/filter. @psiotwo created a new axioms command to handle that case.

psiotwo commented 2 years ago

In https://github.com/ontodev/robot/issues/988 even a more fine-grained scenario is offered - to remove axioms not only based on the property, but also based on the property value (e.g. owl:deprecated=true). It seems to me this PR cannot be used to support such scenarios.

IMO, SPARQL is rarely a good match for "ontology-level" OWL manipulations - in many cases attempts to "ravish" it to handle OWL basically means to implement bits and pieces of RDF serialisation of OWL 2 "cca imitating" what has already been implemented in OWLAPI and furthermore leaking the complexity of OWL->RDF translation (as also presented by @jamesaoverton above).

Based on what I have known about ROBOT so far (not much yet :-) ), I got the feeling that either 1) a dedicated command for axiom processing or 2) extending the semantics of selectors to support axioms would be more systematic solutions. However, I understand problems of these options, as already discussed in https://github.com/ontodev/robot/issues/988.

matentzn commented 2 years ago

Before we go back into the technical details, let me just re-iterate that my use case is the following:

  1. We have a number of axiom annotations used at curation time. These help curators find their way around, inform CI processes etc and documentation.
  2. We do not want to show these to our users. We use axiom annotations like "marked for revision", "excluded from qc", "conformsTo pattern" etc that make no sense for our users, and we want to selectively delete these.
  3. We currently use SPARQL to get rid of these, but this has caused a fragmentation across repos, duplication of code (the sparql queries) and general ugliness (as @psiotwo points out above, SPARQL is not exactly the right tool for axiom level manipulation).

So, we want to selectively delete axiom annotations.

Why did I propose to implement this in the remove (filter) commands? --term/-T + --select allows me to select the terms for which I want to remove the annotations, --axioms allows to me narrow my attention to a specific kind of axioms.

Both things I still want to be able to do for the above use case (delete axiom annotations from subclassOf axioms in the "disease" branch of Mondo).

Ok, we have now selected the correct axioms for the correct entities, the way I always do with remove/filter. Now what is next? Axiom annotations are not axioms. Overloading --axioms Does not make sense, because --axioms is conceptually "disjunctive" (rather than conjunctive). So saying: --axioms Declaration SubClassOf means: select all Declaration axioms and all SubClassOf axioms. If you add some new parameter like "AxiomAnnotationAssertionAxiom" to this, you will select all of these following the same logic, not just the ones on "SubClassOf" axioms (--axioms AxiomAnnotationAssertionAxiom SubClassOf will select all "AxiomAnnotationAssertionAxiom", and all SubClassOf axioms, which is not what we want - we want conjunctive logic here). So --axiom is definitely the wrong place for the parameter.

Alright brings us to the last question: does --drop-annotation-assertion break the design of remove? I am not so sure. --select expands the term set, --axioms contracts the axioms set. So the commands already shift from one paradigm (term) to another (axioms). My belief is that --axiom-annotation simple shifts the attention to yet another third (and last(!)) aspect of the ontology: the axiom annotations. So while "drop" is maybe a bit ideosyncratic, changing it to --axiom-annotation is well within the original design.

Now what about the use case of selecting axiom annotations based on their values. We could do something like this

--axiom-annotation oboInOwl:hasDbXref=IAO:* --axiom-annotation owl:deprecated=true^^xsd:boolean --axiom-annotation oboInOwl:source

To mean:

Target all xrefs whose values start with "IAO:", and all owl:deprecated axioms annotations with value true, and all oboInOwl:source axioms..

It seems ok to me to do this, albeit a bit more complicated than what is proposed atm. What do you think?

psiotwo commented 2 years ago

Great analysis @matentzn, thanks.

I would be happy with the solution you propose as it also seems to partially solve https://github.com/ontodev/robot/issues/988. ;-)

Yet, a user-perspective teasing note : if term selectors - as currently implemented - are to be kept in ROBOT, then having an analogous parameter for axioms might make the command more clear to me. Basically, having --select classes seems to me cca analogous to --axioms subclass (both selecting some OWL object type), so --select "owl:deprecated='true'^^xsd:boolean" would be analogous to --axioms "owl:deprecated='true'^^xsd:boolean" (both selecting something deprecated).

Also, re: this conjunction/disjunction issue - isn't this solved for --select already? Having a single --select with multiple parameters should be a disjunction and multiple --selects to be a conjunction (never tried, just trying to interpret - maybe wrongly - the docs)?

matentzn commented 2 years ago

having an analogous parameter for axioms might make the command more clear to me.

It makes sense, but I would like to discuss this on another issue if possible as this does not relate directly to the axiom annotations.

conjunction/disjunction issue - isn't this solved for --select already

Yeah didn't think of that, you may be right - but it is definitely not solved for --axioms!

jamesaoverton commented 2 years ago

Talking @matentzn: I wanted to connect these use cases, but they are different. @matentzn wants to remove the axiom annotation, leaving the axiom intact. @psiotwo wants to remove the axiom itself (which of course includes any annotations on it).

matentzn commented 2 years ago

Thank you @jamesaoverton for helping to understand the two entirely different use cases here.. I did not at all realise until now we were talking about two different things.

This is a bit of a problem now design wise.

I still think that using the --axiom-annotation design I propose above is best suited for covering my use case:

EG:

robot remove --term MONDO:123 --select "self descendants" --axioms "SubClassOf Declaration" --axiom-annotation oboInOwl:hasDbXref=IAO:* --axiom-annotation owl:deprecated=true^^xsd:boolean --axiom-annotation oboInOwl:source

Selects MONDO:123 and all its descendants, then selects all related SubClassOf and Declaration axioms, then refocusses on the axiom annotations themselves (i.e. deleting the axiom annotations, not the axioms).

What you want, selecting axioms based on annotations should be solved, as you also say @psiotwo, by an --axiom selector. This would be an option, reusing much of the design of the above (careful about the disjunctive interpretation of --axioms):

robot remove --term MONDO:123 --select "self descendants" --axioms "SubClassOf Declaration A$oboInOwl:hasDbXref=IAO:* A$owl:deprecated=true^^xsd:boolean"

I don't know whether it is possible, but a possibility for the conjunctive interpretation is:

robot remove --term MONDO:123 --select "self descendants" --axioms "SubClassOf Declaration" --axioms "A$oboInOwl:hasDbXref=IAO:* A$owl:deprecated=true^^xsd:boolean"

The A$ is not necessary, but would make implementation easier as you don't have to first check that the parameter is not a OWLAxiomType or a named axiom selector like logical, but obviously we can drop it, that's not the point.

Here is my proposal:

  1. Lets agree that --axiom-annotations is NOT suitable for use case 2, so we can use this for use case 1
  2. We implement use case 1 for now so I get it off my list
  3. We agree to keep the path open for use case 2, and discuss the details of its modelling in a new issue
psiotwo commented 2 years ago

Thanks both for explanation (Before, I didn't realise we are speaking about two different things ...).

matentzn commented 1 year ago

@hkir-dev lets up the priority on this issue! Maybe add it to the agenda on Monday to at least tip-toe towards a solution!

matentzn commented 1 year ago

@jamesaoverton can you review my proposal for use case 1 (the removal of axiom annotations, comments 1 and 2) once more, and see if you can agree with me that it does not break the current design of filter/remove but indeed, builds exactly on top of it? We can deal with the second use case in a second round of changes. I know it is cognitively a bit demanding to sign off on the proposal, but I do try to be careful not to break your designs, and I honestly think this one is not :D

matentzn commented 10 months ago

@jamesaoverton I keep hitting a case in my work which is almost, but not 100%, addressed by this PR.

We are trying to use axiom annotations more and more for provenance, like:

AnnotationAssertion(Annotation(<http://www.geneontology.org/formats/oboInOwl#hasDbXref> <https://orcid.org/0000-0002-6548-5200>) Annotation(<http://www.geneontology.org/formats/oboInOwl#hasSynonymType> <http://purl.obolibrary.org/obo/hp#layperson>) <http://www.geneontology.org/formats/oboInOwl#hasExactSynonym> <http://purl.obolibrary.org/obo/HP_0000763> "Damage to nerves")

This axiom states that the synonym "damage to nerves" was provided by https://orcid.org/0000-0002-6548-5200 and is of type "layperson".

For this command here to do absolutely everything that is needed to handle all my (admittedly specialised) workflows, I would need to be able to do things like:

  1. Delete all layperson synonyms
  2. Remove all evidence provided by orcids (for example, because I want to update this information from another source)

Right now, we can do this (according the the PR here):

--drop-axiom-annotations oboInOwl:hasDbXref

which gets rid of all hasDbXref annotations in a target set. But what I need to do to really do everything I want is something like this, using the pattern language defined here http://robot.obolibrary.org/remove#pattern-subset-selectors:

--drop-axiom-annotations "oboInOwl:hasDbXref=~.*orcid.*"

to remove all xrefs annotations whose values contain the string "orcid" or:

--drop-axiom-annotations "oboInOwl:hasSynonymType=<http://purl.obolibrary.org/obo/hp#layperson>"

to remove all synonym type declarations that have exactly the value "layperson" (the IRI).

We should support the entire set of pattern subset selectors given here: http://robot.obolibrary.org/remove#pattern-subset-selectors and reuse the respective code.

This is also related to @psiotwo comment above (https://github.com/ontodev/robot/issues/886#issuecomment-1177741400).

COMMENT:

This comment has been edited as per @jamesaoverton comment below.

jamesaoverton commented 9 months ago

@matentzn Yes, if you're going to implement this, I think it would be good to include it in ROBOT.

I think that using regexes like --select would be more consistent than using wildcard/globs in this case.

I don't love the @. I'd prefer = and =~ to be consistent with our --select patterns: http://robot.obolibrary.org/remove#pattern-subset-selectors