Open jclerman opened 1 year ago
Surprising! What happened when you tried without the ofn intermediary?
Hi @matentzn. When I just did:
robot convert -i my-original-ontology.owl -o my-attempted-canonicalized-ontology.owl
I found that annotation-values were not sorted in the output. After round-tripping through ofn, I got a stable result (including sorting of those values).
Here's a fragment of a diff of my-attempted-canonicalized-ontology.owl against what I get after round-tripping:
*** 104805,104816 ****
</owl:Axiom>
<owl:Axiom>
<owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/UBERON_0010703"/>
<owl:annotatedProperty rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasNarrowSynonym"/>
<owl:annotatedTarget>wing zeugopod skeleton</owl:annotatedTarget>
- <oboInOwl:hasDbXref>OBOL:automatic</oboInOwl:hasDbXref>
<oboInOwl:hasDbXref>NCBITaxon:8782</oboInOwl:hasDbXref>
<oboInOwl:hasSynonymType rdf:resource="http://purl.obolibrary.org/obo/uberon/core#SENSU"/>
</owl:Axiom>
<owl:Axiom>
<owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/UBERON_0010703"/>
<owl:annotatedProperty rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasRelatedSynonym"/>
--- 104805,104816 ----
</owl:Axiom>
<owl:Axiom>
<owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/UBERON_0010703"/>
<owl:annotatedProperty rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasNarrowSynonym"/>
<owl:annotatedTarget>wing zeugopod skeleton</owl:annotatedTarget>
<oboInOwl:hasDbXref>NCBITaxon:8782</oboInOwl:hasDbXref>
+ <oboInOwl:hasDbXref>OBOL:automatic</oboInOwl:hasDbXref>
<oboInOwl:hasSynonymType rdf:resource="http://purl.obolibrary.org/obo/uberon/core#SENSU"/>
</owl:Axiom>
<owl:Axiom>
<owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/UBERON_0010703"/>
<owl:annotatedProperty rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasRelatedSynonym"/>
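A quick way to confirm that two serializations differ only in line order, as in the fragment above, is to compare their sorted lines. This is a rough sketch: `same_lines_any_order` is a hypothetical helper name, and the check is coarse, since it ignores which axiom a line belongs to and would be thrown off by whitespace differences.

```shell
# Hypothetical helper: succeeds (exit 0) if both files contain exactly the
# same lines, regardless of the order in which they appear.
same_lines_any_order() {
  local tmp status
  tmp="$(mktemp)" || return 2
  # Sort both files bytewise so the comparison does not depend on the locale.
  LC_ALL=C sort "$1" > "$tmp"
  LC_ALL=C sort "$2" | cmp -s "$tmp" -
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

For example, `same_lines_any_order my-attempted-canonicalized-ontology.owl round-tripped.owl && echo "only ordering differs"`, where round-tripped.owl is a placeholder name for the file produced by the ofn round-trip.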
Very important for us to know; thank you for taking the time to report this. Apart from us knowing about it, is there anything you think should be done here in terms of a fix? It seems we basically have to live with this (short of someone working on the OWX parser in the OWL API itself).
It doesn't seem like there is much that ROBOT could do (I can imagine internal workarounds, like setting up the ROBOT code to internally do an ofn round-trip when asked to do a no-op conversion from/to the same format, but I'm not sure that's a good idea).
My only real suggestion would be to perhaps update the 1.9.2 release notes to tell people that they might need to do an ofn round-trip to achieve canonicalization. That would help users avoid getting bitten by this issue.
Thanks @jclerman! I added a mention of this issue to the 1.9.2 release notes. Once #1088 and #1089 are resolved and everything is updated, we'll make a bigger push to get everyone to update, and we'll keep this in mind.
I am experimenting with this behaviour by using the robot template command to generate a .owl file from a .tsv file. Unfortunately, the workaround using .ofn does not work. :(
@CarMoreno What doesn't work? I think the suggestion in this thread is to use robot template to create an .ofn file, then robot convert to .owl (RDF/XML).
@jamesaoverton That's exactly what I am doing. I generated the dummy.ofn file from the template, and then generated dummy.owl from the dummy.ofn created previously:
robot template --template dummy_template.csv --output dummy.ofn
robot convert --input dummy.ofn --output dummy.owl
The axioms remain unsorted.
I have been exploring this somewhat with ROBOT 1.9.4 and Protege 5.6.2 (starting files were built with ROBOT 1.8.3 and Protege 5.5.0). Based on my exploration, any file needs two convert operations to reach a stable serialization, but it doesn't matter what format the file is converted to. For example, to make doid.owl stable, either of the following works, and both end up with the same result:
robot convert -i doid.owl -o doid1.owl
robot convert -i doid1.owl -o doid.owl
OR
robot convert -i doid.owl -o doid.ofn
robot convert -i doid.ofn -o doid.owl
Stabilizing the doid-edit.owl file, which is actually in OWL functional syntax, also requires two filetype-agnostic converts. Protege has similar behavior. The first edit and save results in sorting by language tag, then alphabetical (same as first convert) and the second edit, if made after closing and re-opening the file, gets the final sort ordering.
For some reason the first convert operation sorts by presence/absence of language tag before sorting strings alphabetically, while the second sorts alphabetically first and language tag second.
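Rather than hard-coding two passes, the convert can be repeated until the output stops changing. The following is only a sketch under assumptions: `stabilize` and its pass limit are hypothetical names, `robot` is assumed to be on the PATH, and the file extension is assumed to match the format you want written, since ROBOT chooses the output format from the extension. For a file like doid-edit.owl that actually holds functional syntax, the output name would need adjusting.

```shell
# Hypothetical helper: re-run `robot convert` on a file until one more pass
# leaves it byte-identical, giving up after a maximum number of passes.
stabilize() {
  local file max_passes ext tmpdir pass
  file="$1"
  max_passes="${2:-3}"   # 3 allows two rewrites plus one confirming pass
  ext="${file##*.}"
  tmpdir="$(mktemp -d)" || return 1
  pass=0
  while [ "$pass" -lt "$max_passes" ]; do
    if ! robot convert -i "$file" -o "$tmpdir/out.$ext"; then
      rm -rf "$tmpdir"
      return 1
    fi
    # Identical output means the serialization has reached a fixed point.
    if cmp -s "$file" "$tmpdir/out.$ext"; then
      rm -rf "$tmpdir"
      echo "stable after $pass rewrite(s)"
      return 0
    fi
    mv "$tmpdir/out.$ext" "$file"
    pass=$((pass + 1))
  done
  rm -rf "$tmpdir"
  echo "still changing after $max_passes pass(es)" >&2
  return 1
}
```

Calling `stabilize doid.owl` overwrites the file in place, so it is safest to run it on a file that is under version control.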
I have only tested using ROBOT template to add axioms to an existing file, e.g. robot template -i doid-edit.ofn --template template.tsv --merge-before -o doid-edit.ofn, and to me it appears that the added axioms are sorted correctly. As long as the file serialization is already stable, nothing needs to be done; if it isn't, one ROBOT convert is needed.
@allenbaron super useful analysis, thank you!
Just noting that after stabilizing the serialization of an .ofn file, if I run a robot query --update command, the ordering of lines in the output file (from .ofn to .ofn, in my case) changes, making it similar to the result of a single robot convert as I described above. I have to run another non-chained robot convert to get the ordering back.
Full command to maintain stable ordering (and prefixes):
robot --add-prefixes build/doid-edit_prefixes.json \
query -i src/ontology/doid-edit.owl \
--update ../../DO_dev/sparql/update/DO-def_format_gene.ru \
-o tmp.ofn && \
robot convert -i tmp.ofn -o tmp2.ofn && \
mv tmp2.ofn src/ontology/doid-edit.owl && \
rm tmp.ofn
Serialization is stable when I run chained reason & annotate going from .ofn to .owl (i.e. another robot convert on the resulting .owl file has no effect).
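The "another convert has no effect" check can be scripted, for instance as a guard in a build or CI step. This is a minimal sketch, assuming `robot` is on the PATH and that the file extension matches the serialization format; `is_stable` is a hypothetical helper name, not a ROBOT command.

```shell
# Hypothetical helper: exit 0 if one more `robot convert` leaves the file
# byte-identical, 1 if the output differs, 2 if the conversion itself fails.
is_stable() {
  local file ext tmpdir status
  file="$1"
  ext="${file##*.}"
  tmpdir="$(mktemp -d)" || return 2
  if robot convert -i "$file" -o "$tmpdir/check.$ext"; then
    cmp -s "$file" "$tmpdir/check.$ext"
    status=$?
  else
    status=2
  fi
  rm -rf "$tmpdir"
  return "$status"
}
```

Unlike an in-place re-convert, this leaves the original file untouched, so `is_stable doid.owl || echo "serialization not stable"` is safe to run at any point.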
The recommendation in the release notes for robot 1.9.2 suggests to:
In my experience, that wasn't quite enough: complete canonicalization of my ontology didn't happen without round-tripping through OWL functional format. Without doing that, some lines in the XML output were re-ordered when I round-tripped.
What worked for me (other variants might work too; haven't tested):