INCATools / ontology-development-kit

Bootstrap an OBO Library ontology
http://incatools.github.io/ontology-development-kit/
BSD 3-Clause "New" or "Revised" License

Release file - types and descriptions #175

Closed: dosumis closed this issue 4 years ago

dosumis commented 5 years ago

There has been a gradual change in release file content that, unless I’ve not been paying attention (entirely possible), has happened without proper discussion - and certainly without announcement. I particularly want to avoid causing problems for downstream users who have embedded assumptions and expectations about these files.

  1. When did we agree that fu.obo/owl should merge in all imports? The OWL files at least used to reference imports. I'm not sure whether the OBO files included them, but it’s clear from the diffs on the ODK update to CL that cl.obo did not. The new one seems to pull in most of BFO.

  2. Back when we made releases with OORT, we had a number of ontology file versions that served important use cases (mostly for the OBO world). These have been dropped with the move to ROBOT.

Many ontologies also produced a fu-basic.obo version. This was like fu-simple.obo, but guaranteed to be a DAG with no double-labelled edges - as all OBO files once were. This was typically achieved by using an OWLTOOLS filter to remove particular relationship types - typically removing the less useful partner of a reciprocal pair - e.g. removing has_part because this is not used in graph-based grouping whereas its reciprocal, part_of, is.

I think we should at least be producing -simple and -basic versions for everything and documenting the content and usage of all release files on the main GitHub readme. It might be worth having -basic be a version designed for naive graph-based grouping without any need to pay attention to edge type. This could be configured for each ontology by asking for a list of grouping relations - making clear that this should exclude any reciprocal pairs that potentially lead to cycles. Maintaining DAGness potentially requires ongoing work, so it should be the subject of CI tests.
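To make the -basic idea concrete, here is a minimal Python sketch. It assumes the relationships are already available as (subject, relation, object) triples; the `EX:` term IDs, the `GROUPING_RELATIONS` set and both helper names are hypothetical illustrations, not part of ODK, ROBOT or OWLTools.

```python
from collections import defaultdict

# Hypothetical per-ontology configuration: relations considered safe for grouping.
# BFO:0000050 is part_of; its reciprocal has_part (BFO:0000051) is left out.
GROUPING_RELATIONS = {"is_a", "BFO:0000050"}


def basic_edges(edges, allowed=GROUPING_RELATIONS):
    """Keep only the edges whose relation is on the grouping list."""
    return [(s, r, o) for (s, r, o) in edges if r in allowed]


def ancestors(term, edges):
    """Naive grouping: walk upward along every edge, ignoring its type."""
    parents = defaultdict(set)
    for s, _relation, o in edges:
        parents[s].add(o)
    seen, stack = set(), [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Made-up example edges, including a reciprocal part_of / has_part pair.
edges = [
    ("EX:nucleus", "BFO:0000050", "EX:cell"),
    ("EX:cell", "BFO:0000051", "EX:nucleus"),
    ("EX:cell", "is_a", "EX:anatomical_entity"),
]
```

With the unfiltered edges, `ancestors("EX:nucleus", edges)` contains `EX:nucleus` itself because of the has_part edge back from the cell; after `basic_edges(edges)` the same call returns only the cell and the anatomical entity, which is the behaviour a type-blind consumer would expect.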

I think some quick solutions to this are urgent given the current fast and otherwise welcome progress in implementing ODK across the community.

CC @cmungall @balhoff @matentzn

matentzn commented 5 years ago

After a discussion with David, I propose the following:

It is fine IMHO to drop the special case of "retain import statements" -> We just purge that possibility from existence (i.e. we just always merge imports in).

@dosumis can you check I captured the definition of "simple" correctly?

@cmungall To save you some ink: I agree with you that I would rather have reliable behaviour for hp.owl and hp.obo, with them always being the full case. But we need to weigh our desires against the expectations of some important users such as FlyBase or the HP user community. We know that we (as technically versed people) can now choose freely; in fact, we will more often than not choose foo-base over foo.owl in any case. So perhaps it's a show of love to let the community keep their desired primary artefacts.

dosumis commented 5 years ago

Looks good. Worth explicitly mentioning reduce?

matentzn commented 5 years ago

just added that

dosumis commented 5 years ago

One more artefact:

-basic: A version of -simple containing only relationships using relations on a configurable whitelist (default = BFO:0000050 (?)). See above for explanation. It would be good to add a CI check for whether this is acyclic (although I have no idea how to do this!)
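One possible shape for such an acyclicity check, as a rough sketch only: it assumes the -basic relationships have already been extracted as (subject, object) pairs by some other step, and the function name `is_acyclic` is hypothetical. It uses Kahn's algorithm (repeatedly removing nodes with no incoming edges); this is a generic graph technique, not something ODK or ROBOT is known to provide.

```python
from collections import defaultdict, deque


def is_acyclic(edges):
    """Return True if the directed (subject, object) edges form a DAG."""
    children = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for s, o in set(edges):  # set() drops duplicate edges
        nodes.update((s, o))
        children[s].add(o)
        indegree[o] += 1

    # Start from nodes that nothing points to and peel the graph away.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Any node never reached sits on (or downstream of) a cycle.
    return visited == len(nodes)
```

A CI job could call this on the edges of the -basic file and fail the build when it returns `False`, for example via `sys.exit(1)`.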

matentzn commented 5 years ago

Added it to the list

balhoff commented 5 years ago

@dosumis @matentzn I agree; I'm not sure we ever officially agreed to merge imports. In any case I support that move. This will probably require some announcements to downstream consumers. I think it would be a good idea to convert import statements to ontology annotations ("imports_content_from" or something). Maybe this should be a ROBOT command or option.

matentzn commented 4 years ago

Closed in favour of #194