INCATools / ontology-development-kit

Bootstrap an OBO Library ontology
http://incatools.github.io/ontology-development-kit/
BSD 3-Clause "New" or "Revised" License
223 stars 54 forks source link

Allow refreshing the mirrors under IMP=false. #973

Closed gouttegd closed 9 months ago

gouttegd commented 9 months ago

This PR overhauls the mirroring pipeline to make it independent of the IMP variable, so that refreshing the mirrors is controlled by the MIR variable only (except for “large” imports, for which the IMP_LARGE variable also comes into play).

This is done using Make conditionals rather than shell conditionals. This makes the generated Makefile much easier to read, as the entire mirroring pipeline is simply skipped (with a single conditional at the top) under MIR=false.

closes #946

gouttegd commented 9 months ago

@matentzn Summarily tested on FBbt.

The Jinja2 template is borderline unreadable, so to understand better what the PR does, here is an example of the mirroring section of the generated Makefile:

# ----------------------------------------
# Mirroring upstream ontologies
# ----------------------------------------

IMP=true # Global parameter to bypass import generation
MIR=true # Global parameter to bypass mirror generation
IMP_LARGE=true # Global parameter to bypass handling of large imports

ifeq ($(strip $(MIR)),true)

## ONTOLOGY: ro
.PHONY: mirror-ro
.PRECIOUS: $(MIRRORDIR)/ro.owl
mirror-ro: | $(TMPDIR)
    curl -L $(OBOBASE)/ro.owl --create-dirs -o $(TMPDIR)/ro-download.owl --retry 4 --max-time 200 && \
    $(ROBOT) remove -i $(TMPDIR)/ro-download.owl --base-iri http://purl.obolibrary.org/obo/BFO_ --base-iri http://purl.obolibrary.org/obo/RO_ --axioms external --preserve-structure false --trim false -o $(TMPDIR)/$@.owl

## ONTOLOGY: bfo
.PHONY: mirror-bfo
.PRECIOUS: $(MIRRORDIR)/bfo.owl
mirror-bfo: | $(TMPDIR)
    curl -L $(OBOBASE)/bfo.owl --create-dirs -o $(TMPDIR)/bfo-download.owl --retry 4 --max-time 200 && \
    $(ROBOT) convert -i $(TMPDIR)/bfo-download.owl -o $(TMPDIR)/$@.owl

## ONTOLOGY: pato
.PHONY: mirror-pato
.PRECIOUS: $(MIRRORDIR)/pato.owl
ifeq ($(strip $(IMP_LARGE)),true)
mirror-pato: | $(TMPDIR)
    curl -L $(OBOBASE)/pato.owl --create-dirs -o $(TMPDIR)/pato-download.owl --retry 4 --max-time 200 && \
    $(ROBOT) remove -i $(TMPDIR)/pato-download.owl --base-iri $(OBOBASE)/PATO --axioms external --preserve-structure false --trim false -o $(TMPDIR)/$@.owl
else
mirror-pato:
    @echo "Not refreshing pato because refreshing large imports is disabled (IMP_LARGE=$(IMP_LARGE))."
endif

ALL_MIRRORS = $(patsubst %, $(MIRRORDIR)/%.owl, $(IMPORTS))
MERGE_MIRRORS = true

ifeq ($(strip $(MERGE_MIRRORS)),true)
$(MIRRORDIR)/merged.owl: $(ALL_MIRRORS)
    $(ROBOT) merge $(patsubst %, -i %, $^)  remove --axioms equivalent --preserve-structure false -o $@
.PRECIOUS: $(MIRRORDIR)/merged.owl
endif

$(MIRRORDIR)/%.owl: mirror-% | $(MIRRORDIR)
    if [ -f $(TMPDIR)/mirror-$*.owl ]; then if cmp -s $(TMPDIR)/mirror-$*.owl $@ ; then echo "Mirror identical, ignoring."; else echo "Mirrors different, updating." &&\
        cp $(TMPDIR)/mirror-$*.owl $@; fi; fi

else # MIR=false
$(MIRRORDIR)/%.owl:
    @echo "Not refreshing $@ because the mirrorring pipeline is disabled (MIR=$(MIR))."
endif

to compare with what the same section generated by ODK 1.4.3:

# ----------------------------------------
# Mirroring upstream ontologies
# ----------------------------------------

IMP=true # Global parameter to bypass import generation
MIR=true # Global parameter to bypass mirror generation
IMP_LARGE=true # Global parameter to bypass handling of large imports

## ONTOLOGY: ro
.PHONY: mirror-ro
.PRECIOUS: $(MIRRORDIR)/ro.owl
mirror-ro: | $(TMPDIR)
    if [ $(MIR) = true ] && [ $(IMP) = true ]; then curl -L $(OBOBASE)/ro.owl --create-dirs -o $(MIRRORDIR)/ro.owl --retry 4 --max-time 200 &&\
        $(ROBOT) convert -i $(MIRRORDIR)/ro.owl -o $@.tmp.owl && \
        $(ROBOT) remove -i $@.tmp.owl --base-iri http://purl.obolibrary.org/obo/BFO_ --base-iri http://purl.obolibrary.org/obo/RO_ --axioms external --preserve-structure false --trim false -o $@.tmp.owl &&\
        mv $@.tmp.owl $(TMPDIR)/$@.owl; fi

## ONTOLOGY: bfo
.PHONY: mirror-bfo
.PRECIOUS: $(MIRRORDIR)/bfo.owl
mirror-bfo: | $(TMPDIR)
    if [ $(MIR) = true ] && [ $(IMP) = true ]; then curl -L $(OBOBASE)/bfo.owl --create-dirs -o $(MIRRORDIR)/bfo.owl --retry 4 --max-time 200 &&\
        $(ROBOT) convert -i $(MIRRORDIR)/bfo.owl -o $@.tmp.owl &&\
        mv $@.tmp.owl $(TMPDIR)/$@.owl; fi

## ONTOLOGY: pato
.PHONY: mirror-pato
.PRECIOUS: $(MIRRORDIR)/pato.owl
mirror-pato: | $(TMPDIR)
    if [ $(MIR) = true ] && [ $(IMP) = true ] && [ $(IMP_LARGE) = true ]; then curl -L $(OBOBASE)/pato.owl --create-dirs -o $(MIRRORDIR)/pato.owl --retry 4 --max-time 200 &&\
        $(ROBOT) convert -i $(MIRRORDIR)/pato.owl -o $@.tmp.owl && \
        $(ROBOT) remove -i $@.tmp.owl --base-iri $(URIBASE)/PATO --axioms external --preserve-structure false --trim false -o $@.tmp.owl &&\
        mv $@.tmp.owl $(TMPDIR)/$@.owl; fi

ALL_MIRRORS = $(patsubst %, $(MIRRORDIR)/%.owl, $(IMPORTS))
MERGE_MIRRORS = true

$(MIRRORDIR)/merged.owl: $(ALL_MIRRORS)
    if [ $(IMP) = true ] && [ $(MERGE_MIRRORS) = true ]; then $(ROBOT) merge $(patsubst %, -i %, $^) -o $@; fi
.PRECIOUS: $(MIRRORDIR)/merged.owl

$(MIRRORDIR)/%.owl: mirror-% | $(MIRRORDIR)
    if [ $(IMP) = true ] && [ $(MIR) = true ] && [ -f $(TMPDIR)/mirror-$*.owl ]; then if cmp -s $(TMPDIR)/mirror-$*.owl $@ ; then echo "Mirror identical, ignoring."; else echo "Mirrors different, updating." &&\
        cp $(TMPDIR)/mirror-$*.owl $@; fi; fi
gouttegd commented 9 months ago

I may note that there is a tiny risk here that a custom pipeline depends on the mirror-% goal, but that would be in any case bad design.

There is one occurrence that I know of in the Uberon custom Makefile: we re-define the mirror/ro.owl rule to add “OBO shorthands” to the mirror – whether adding such shorthands is something that we should really do is a good question in itself, but the important point here is that the re-definition does depend on mirror-ro:

# We add OBO shorthands to the RO mirror before merging it with the other mirrors
# FIXME: https://github.com/obophenotype/uberon/issues/3016
mirror/ro.owl: mirror-ro | $(MIRRORDIR)
        if [ $(MIR) = true ] && [ $(IMP) = true ]; then $(OWLTOOLS) $(TMPDIR)/mirror-ro.owl --add-obo-shorthand-to-properties -o $@ ; fi

After this PR is merged and when Uberon will be next updated with the latest ODK workflows, that rule above will need updating as well to adhere to the new principle of using Make conditionals rather than shell conditionals. I can take care of that of course.

gouttegd commented 9 months ago

you had already merged it

What the… I don’t recall at all having merged it to dev. I might have been more tired than I thought at the end of December. ^^'

gouttegd commented 9 months ago

Not sure a changelog entry is warranted. This is purely an internal change. It may affect custom pipelines that re-defines or re-uses standard workflow targets, but you could say that for basically all internal changes. Users should be aware that the moment they start doing this kind of things in their $(ONT).Makefile, there is always a risk that something would break at the next update.

matentzn commented 9 months ago

Ok!