INCATools / ontology-development-kit

Bootstrap an OBO Library ontology
http://incatools.github.io/ontology-development-kit/
BSD 3-Clause "New" or "Revised" License

Mirroring updates local imports even when files are identical #863

Closed allysonlister closed 1 year ago

allysonlister commented 1 year ago

Observed Behaviour

Imports of OBO Foundry ontologies (within a new release procedure for an application ontology I build) are mirroring & downloading with every make command, rather than only when the source import has changed.

In this example, I am importing PO, OBI, and NCIt via the filter module type within import_group (see yaml below). Each time I run refresh-xxx it mirrors the file anew, even if the source ontology has not been updated. This means that big files such as NCIt (which don't have a gzipped or slim version) take ages...

However, weirdly the custom EDAM import that pulls from https://edamontology.org/EDAM.owl seems to be correctly noticing that EDAM hasn't changed, and doesn't pull a new copy.

Background

I'm using the ODK to build an application ontology that uses mainly OBO Foundry ontologies, but also EDAM. This will never be submitted to the Foundry officially, but I use the ODK for the release of my Foundry ontologies, so I wanted to upgrade the build process for my application ontology so that everything builds using the same system. You can find the old build process (which uses Ontofox) here: https://github.com/FAIRsharing/subject-ontology

There have been some stability issues with Ontofox recently, so I thought it was a good time to convert to ODK to match my other release workflows. I am currently working in a target/ ODK subfolder that isn't checked into the repository, so I have added the appropriate config here. Please let me know if you need any other details.

yaml file

Here is my SRAO-odk.yaml

id: SRAO
title: Subject Resource Application Ontology
github_org: allysonlister
repo: SRAO
license: CC-BY 4.0
import_group:
  mirror_max_time_download: 400
  products:
    - id: obi
      module_type: filter
    - id: po
      module_type: filter
    - id: edam
      mirror_from: https://edamontology.org/EDAM.owl
      module_type: custom
    - id: ncit
      module_type: filter
uribase: http://www.fairsharing.org/ontology/subject
robot_java_args: '-Xmx16G'

P.S. As ODK creates files as root in Ubuntu, I did try chowning everything to my normal user, in case it was a permissions issue, but this doesn't seem to be the case.

output of sh run.sh make

Every time I run this command, I get the same re-mirroring happening as shown below.

$ sh run.sh make 
SRAO.Makefile:10: warning: overriding recipe for target 'imports/edam_import.owl'
Makefile:335: warning: ignoring old recipe for target 'imports/edam_import.owl'
echo "ODK Makefile version: v1.4 (this is the version of the ODK with which this Makefile was generated, \
        not the version of the ODK you are running)" &&\
echo "ROBOT version (ODK): " && robot --catalog catalog-v001.xml --version
ODK Makefile version: v1.4 (this is the version of the ODK with which this Makefile was generated,         not the version of the ODK you are running)
ROBOT version (ODK): 
ROBOT version 1.9.3
robot --catalog catalog-v001.xml reason --input tmp/SRAO-preprocess.owl --reasoner ELK --equivalent-classes-allowed asserted-only \
    --exclude-tautologies structural --output test.owl && rm test.owl
robot --catalog catalog-v001.xml verify  --catalog catalog-v001.xml -i tmp/SRAO-preprocess.owl --queries ../sparql/owldef-self-reference-violation.sparql ../sparql/iri-range-violation.sparql ../sparql/label-with-iri-violation.sparql ../sparql/multiple-replaced_by-violation.sparql -O reports
PASS Rule ../sparql/owldef-self-reference-violation.sparql: 0 violation(s)
PASS Rule ../sparql/iri-range-violation.sparql: 0 violation(s)
PASS Rule ../sparql/label-with-iri-violation.sparql: 0 violation(s)
PASS Rule ../sparql/multiple-replaced_by-violation.sparql: 0 violation(s)
if [ true  = true ] && [ true  = true ]; then curl -L http://purl.obolibrary.org/obo/obi.owl --create-dirs -o mirror/obi.owl --retry 4 --max-time 400 &&\
    robot --catalog catalog-v001.xml convert -i mirror/obi.owl -o mirror-obi.tmp.owl &&\
    mv mirror-obi.tmp.owl tmp/mirror-obi.owl; fi
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   339  100   339    0     0   1022      0 --:--:-- --:--:-- --:--:--  1024
100 9129k  100 9129k    0     0  3174k      0  0:00:02  0:00:02 --:--:-- 4970k
if [ true  = true ] && [ true  = true ] && [ -f tmp/mirror-obi.owl ]; then if cmp -s tmp/mirror-obi.owl mirror/obi.owl ; then echo "Mirror identical, ignoring."; else echo "Mirrors different, updating." &&\
    cp tmp/mirror-obi.owl mirror/obi.owl; fi; fi
Mirrors different, updating.
if [ true  = true ]; then robot --catalog catalog-v001.xml query -i mirror/obi.owl --update ../sparql/preprocess-module.ru \
    extract -T imports/obi_terms_combined.txt --copy-ontology-annotations true --force true --method BOT \
    remove --base-iri http://purl.obolibrary.org/obo/OBI --axioms external --preserve-structure false --trim false \
    remove  --term rdfs:label  --term IAO:0000115 -T imports/obi_terms_combined.txt --select complement \
    query --update ../sparql/inject-subset-declaration.ru --update ../sparql/inject-synonymtype-declaration.ru --update ../sparql/postprocess-module.ru \
    annotate --ontology-iri http://www.fairsharing.org/ontology/subject/SRAO/imports/obi_import.owl annotate -V http://www.fairsharing.org/ontology/subject/SRAO/releases/2023-05-09/imports/obi_import.owl --annotation owl:versionInfo 2023-05-09 convert -f ofn --output imports/obi_import.owl.tmp.owl && mv imports/obi_import.owl.tmp.owl imports/obi_import.owl; fi
if [ true  = true ] && [ true  = true ]; then curl -L http://purl.obolibrary.org/obo/po.owl --create-dirs -o mirror/po.owl --retry 4 --max-time 400 &&\
    robot --catalog catalog-v001.xml convert -i mirror/po.owl -o mirror-po.tmp.owl &&\
    mv mirror-po.tmp.owl tmp/mirror-po.owl; fi
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   341  100   341    0     0   1424      0 --:--:-- --:--:-- --:--:--  1426
100 6802k  100 6802k    0     0  3105k      0  0:00:02  0:00:02 --:--:-- 4948k
if [ true  = true ] && [ true  = true ] && [ -f tmp/mirror-po.owl ]; then if cmp -s tmp/mirror-po.owl mirror/po.owl ; then echo "Mirror identical, ignoring."; else echo "Mirrors different, updating." &&\
    cp tmp/mirror-po.owl mirror/po.owl; fi; fi
Mirrors different, updating.
if [ true  = true ]; then robot --catalog catalog-v001.xml query -i mirror/po.owl --update ../sparql/preprocess-module.ru \
    extract -T imports/po_terms_combined.txt --copy-ontology-annotations true --force true --method BOT \
    remove --base-iri http://purl.obolibrary.org/obo/PO --axioms external --preserve-structure false --trim false \
    remove  --term rdfs:label  --term IAO:0000115 -T imports/po_terms_combined.txt --select complement \
    query --update ../sparql/inject-subset-declaration.ru --update ../sparql/inject-synonymtype-declaration.ru --update ../sparql/postprocess-module.ru \
    annotate --ontology-iri http://www.fairsharing.org/ontology/subject/SRAO/imports/po_import.owl annotate -V http://www.fairsharing.org/ontology/subject/SRAO/releases/2023-05-09/imports/po_import.owl --annotation owl:versionInfo 2023-05-09 convert -f ofn --output imports/po_import.owl.tmp.owl && mv imports/po_import.owl.tmp.owl imports/po_import.owl; fi
if [ true  = true ] && [ true  = true ]; then robot --catalog catalog-v001.xml convert -I https://edamontology.org/EDAM.owl -o mirror-edam.tmp.owl &&\
    mv mirror-edam.tmp.owl tmp/mirror-edam.owl; fi
if [ true  = true ] && [ true  = true ] && [ -f tmp/mirror-edam.owl ]; then if cmp -s tmp/mirror-edam.owl mirror/edam.owl ; then echo "Mirror identical, ignoring."; else echo "Mirrors different, updating." &&\
    cp tmp/mirror-edam.owl mirror/edam.owl; fi; fi
Mirror identical, ignoring.
if [ true  = true ] && [ true  = true ]; then curl -L http://purl.obolibrary.org/obo/ncit.owl --create-dirs -o mirror/ncit.owl --retry 4 --max-time 400 &&\
    robot --catalog catalog-v001.xml convert -i mirror/ncit.owl -o mirror-ncit.tmp.owl &&\
    mv mirror-ncit.tmp.owl tmp/mirror-ncit.owl; fi
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   357  100   357    0     0   1484      0 --:--:-- --:--:-- --:--:--  1481
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  874M  100  874M    0     0  4998k      0  0:02:59  0:02:59 --:--:-- 5047k
if [ true  = true ] && [ true  = true ] && [ -f tmp/mirror-ncit.owl ]; then if cmp -s tmp/mirror-ncit.owl mirror/ncit.owl ; then echo "Mirror identical, ignoring."; else echo "Mirrors different, updating." &&\
    cp tmp/mirror-ncit.owl mirror/ncit.owl; fi; fi
Mirrors different, updating.
if [ true  = true ]; then robot --catalog catalog-v001.xml query -i mirror/ncit.owl --update ../sparql/preprocess-module.ru \
    extract -T imports/ncit_terms_combined.txt --copy-ontology-annotations true --force true --method BOT \
    remove --base-iri http://purl.obolibrary.org/obo/NCIT --axioms external --preserve-structure false --trim false \
    remove  --term rdfs:label  --term IAO:0000115 -T imports/ncit_terms_combined.txt --select complement \
    query --update ../sparql/inject-subset-declaration.ru --update ../sparql/inject-synonymtype-declaration.ru --update ../sparql/postprocess-module.ru \
    annotate --ontology-iri http://www.fairsharing.org/ontology/subject/SRAO/imports/ncit_import.owl annotate -V http://www.fairsharing.org/ontology/subject/SRAO/releases/2023-05-09/imports/ncit_import.owl --annotation owl:versionInfo 2023-05-09 convert -f ofn --output imports/ncit_import.owl.tmp.owl && mv imports/ncit_import.owl.tmp.owl imports/ncit_import.owl; fi
robot --catalog catalog-v001.xml merge --input tmp/SRAO-preprocess.owl  \
    reason --reasoner ELK --equivalent-classes-allowed asserted-only --exclude-tautologies structural \
    relax \
    reduce -r ELK \
     annotate --ontology-iri http://www.fairsharing.org/ontology/subject/SRAO/SRAO-full.owl annotate -V http://www.fairsharing.org/ontology/subject/SRAO/releases/2023-05-09/SRAO-full.owl --annotation owl:versionInfo 2023-05-09 --output SRAO-full.owl.tmp.owl && mv SRAO-full.owl.tmp.owl SRAO-full.owl
robot --catalog catalog-v001.xml annotate --input SRAO-full.owl --ontology-iri http://www.fairsharing.org/ontology/subject/SRAO.owl annotate -V http://www.fairsharing.org/ontology/subject/SRAO/releases/2023-05-09/SRAO.owl --annotation owl:versionInfo 2023-05-09 \
    convert -o SRAO.owl.tmp.owl && mv SRAO.owl.tmp.owl SRAO.owl
robot --catalog catalog-v001.xml merge -i SRAO.owl convert -f ofn -o tmp/validate.ofn
robot --catalog catalog-v001.xml validate-profile --profile DL -i tmp/validate.ofn -o reports/validate_profile_owl2dl_SRAO.owl.txt || { cat reports/validate_profile_owl2dl_SRAO.owl.txt && exit 1; }
echo "Finished running all tests successfully."
Finished running all tests successfully.
robot --catalog catalog-v001.xml convert --input SRAO.owl --check false -f obo  -o SRAO.obo.tmp.obo && grep -v ^owl-axioms SRAO.obo.tmp.obo > SRAO.obo && rm SRAO.obo.tmp.obo
robot --catalog catalog-v001.xml convert --input SRAO-full.owl --check false -f obo  -o SRAO-full.obo.tmp.obo && grep -v ^owl-axioms SRAO-full.obo.tmp.obo > SRAO-full.obo && rm SRAO-full.obo.tmp.obo
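
One possible reading of the log above (an interpretation, not something confirmed in this thread): for obi, po and ncit, curl writes the raw download straight to mirror/<id>.owl before the cmp step, so the comparison is between the freshly converted tmp/mirror-<id>.owl and an unconverted download, which would differ on every run. For edam, robot convert -I writes directly to tmp/, leaving mirror/edam.owl untouched until the comparison. The sketch below reproduces both flows with local files only. The convert function is a hypothetical stand-in for robot convert (it just upper-cases the file), and the directory layout is made up for illustration.

```shell
#!/bin/sh
# Local simulation of the two mirror flows visible in the log (no network needed).
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/mirror" "$work/tmp"
printf 'ontology v1\n' > "$work/upstream.owl"   # stand-in for the remote file

# Hypothetical stand-in for `robot convert`: any step that re-serialises
# the file so that its bytes differ from the raw download.
convert() { tr 'a-z' 'A-Z' < "$1" > "$2"; }

# Same compare-then-copy step as in the log.
compare() {
  if cmp -s "$work/tmp/mirror-$1.owl" "$work/mirror/$1.owl" 2>/dev/null; then
    echo "Mirror identical, ignoring."
  else
    echo "Mirrors different, updating."
    cp "$work/tmp/mirror-$1.owl" "$work/mirror/$1.owl"
  fi
}

# Flow seen for obi/po/ncit: the download lands directly in mirror/,
# overwriting the previously converted copy before the comparison.
flow_overwrite() {
  cp "$work/upstream.owl" "$work/mirror/$1.owl"
  convert "$work/mirror/$1.owl" "$work/tmp/mirror-$1.owl"
  compare "$1"
}

# Flow seen for edam: the source is converted straight into tmp/,
# so mirror/ still holds the previously converted copy.
flow_direct() {
  convert "$work/upstream.owl" "$work/tmp/mirror-$1.owl"
  compare "$1"
}

flow_overwrite obi   # first run
flow_overwrite obi   # second run, upstream unchanged
flow_direct edam     # first run
flow_direct edam     # second run, upstream unchanged
```

In this simulation the obi-style flow reports "Mirrors different, updating." on both runs even though the upstream file never changes, while the edam-style flow reports a difference only on its first run and "Mirror identical, ignoring." on the second, which matches the behaviour in the log.
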
matentzn commented 1 year ago

I think I found what is wrong: https://github.com/INCATools/ontology-development-kit/pull/865

My bad.

allysonlister commented 1 year ago

No need to apologise - you're fantastic!! Thanks for looking into this :)