Closed balhoff closed 5 years ago
@alexsign I got a suggestion to check with you about this change. Do you make any use of http://purl.obolibrary.org/obo/go/extensions/go-plus.owl? If so, would it make any difference if this and its imported files were merged into a single file?
I believe he uses the JSON. Note that in the JSON, all ontologies are combined into one file, but there are different graph objects. There may be assumptions about GO belonging to its own graph.
It should be possible to have a different policy for the JSON, although it's cleaner if the same rules apply to everything.
@balhoff @cmungall I believe we use both. I'll investigate further and let you know.
Thanks!
@balhoff sorry for the delayed reply. The EBI datacenter had a major incident last weekend, so we are just getting our internal services back. From what I can see right now, @cmungall is right. Since late 2017 we have been using http://purl.obolibrary.org/obo/go/snapshot/extensions/go-plus.json instead of the OWL file.
Here is the list of files we are currently using:
http://purl.obolibrary.org/obo/go/snapshot/extensions/go-plus.json
http://purl.obolibrary.org/obo/go/snapshot/extensions/gorel.obo
http://purl.obolibrary.org/obo/go/snapshot/extensions/go-upper.obo
http://purl.obolibrary.org/obo/go/snapshot/imports/go-taxon-groupings.obo
https://s3.amazonaws.com/go-public/metadata/db-xrefs.json
https://s3.amazonaws.com/go-public/metadata/eco-usage-constraints.json
Please let me know if any of them are subject to major changes.
@alexsign thanks, no problem! As @cmungall pointed out, the JSON file currently contains multiple graphs, each representing an ontology: go-plus itself and each of the ontologies it imports. The change to that file would be that all the axioms would be in one graph. For example, since go-plus imports some terms from CHEBI, some relationships between CHEBI terms would be merged into the GO graph. These are currently in a separate graph in the same JSON file.
We could certainly keep the JSON file the way it is, since it already nicely packages everything into one file. But I'm curious whether you think it would make any difference to you if everything were in one graph.
By the way, all the information from http://purl.obolibrary.org/obo/go/snapshot/imports/go-taxon-groupings.obo should be included inside the go-plus.json as the graph with id http://purl.obolibrary.org/obo/go/imports/go-taxon-groupings.owl.
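To make the two layouts concrete, here is a minimal Python sketch of a consumer that behaves the same whether the axioms sit in one merged graph or in one graph per ontology. It assumes the usual OBO Graphs JSON layout (a top-level `graphs` list whose entries carry `id`, `nodes`, and `edges`). The sample document, its example.org identifiers, and the function names `graph_by_id` and `all_edges` are made up for illustration and are not part of any GO tooling.

```python
import json

# Hand-made stand-in for a multi-graph JSON file (illustrative content only;
# the example.org terms are hypothetical).
SAMPLE = json.loads("""
{"graphs": [
  {"id": "http://example.org/main.owl",
   "nodes": [{"id": "http://example.org/A", "lbl": "term A"}],
   "edges": [{"sub": "http://example.org/A", "pred": "is_a",
              "obj": "http://example.org/B"}]},
  {"id": "http://example.org/imports/other.owl",
   "nodes": [{"id": "http://example.org/X", "lbl": "term X"}],
   "edges": [{"sub": "http://example.org/X", "pred": "is_a",
              "obj": "http://example.org/Y"}]}
]}
""")


def graph_by_id(doc, graph_id):
    """Return the graph with the given id, or None if it is absent."""
    for graph in doc.get("graphs", []):
        if graph.get("id") == graph_id:
            return graph
    return None


def all_edges(doc):
    """Collect edges from every graph in the document.

    The result does not depend on how the edges are split across graphs,
    so it is unaffected by merging everything into a single graph.
    """
    edges = []
    for graph in doc.get("graphs", []):
        edges.extend(graph.get("edges", []))
    return edges
```

A loader that calls something like `graph_by_id` to find "the GO graph" depends on the per-ontology layout, whereas one that works like `all_edges` would not notice the merge.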
@balhoff I looked through the procedure that extracts data from the JSON file. It might need some changes, but I should be able to adjust it. If the decision is made to combine all ontologies into one graph in the JSON file, can we get a copy for testing before it replaces the one we are using now?
Thanks @alexsign. We will let you know ahead of time if we decide to do that.
I'm thinking this is more urgent: now that the ontology PURL points to the official release, and there is a separate PURL for snapshots, if someone loads a go-plus snapshot into Protege, they end up loading the imports from the release rather than the snapshot. We could fix this with a fairly complicated system of setting different ontology IRIs for import modules depending on whether it is a snapshot or a release, but it would be MUCH easier to just merge the imports as described in this ticket.
This caught me yesterday. Even apart from loading snapshot/official from the PURLs, I opened an ontology in a directory with an edited catalogue.xml file and was mistakenly looking at the ontology with a local, out-of-date version of an import. I think merged imports for releases is a really good idea. (It would be nice to also have a way for advanced users to get the non-merged files for development purposes, but there is no reason that needs to be the default way of getting the ontology.)
Sounds like a good idea. Is this something we should announce to go-friends or in the 'announcements' repo? I am not sure of the best way to reach interested people.
@pgaudet I think we should announce it. Should that be before it happens in snapshot, or instead wait until a snapshot is available so that folks can immediately take a look at the snapshot?
Don't we want to give people a bit of time to adjust their parsers and loading scripts? I propose announcing it in advance and giving a date (at least approximate) for when it will happen.
@pgaudet @cmungall how does this sound?
We plan to make a change to versions of the ontology, such as "go-plus.owl", that import external files. In an upcoming release, the external imports will be merged into the ontology, rather than referenced via an 'owl:import'. These external imports include content extracted from other ontologies, such as Uberon and ChEBI, which is needed for full classification of the GO. By merging the external content and GO content into a single file, we can ensure that the version of the external content used with a given release is exactly the version tested with that release.
This change will not affect the primary GO ontology files, such as go.obo, go.owl, and go-basic.obo, which are already standalone files.
For flexible OWL integration of GO axioms with different versions of external ontologies, we also provide 'go-base.owl', which references external terms but does not import any content from external ontologies.
This change to standalone, merged releases for 'go-plus' (and undocumented internal GO files 'go-gaf' and 'go-lego') will first take place as a snapshot release, no earlier than April 22, 2019.
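For anyone who wants to check whether a given OWL file is standalone in the sense described above, here is a minimal Python sketch that lists the `owl:imports` declarations in a file. It assumes the file is serialized as RDF/XML with each import written as an `owl:imports` element carrying an `rdf:resource` attribute. The function name `declared_imports` and the two sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical ontology header that still references an import.
WITH_IMPORT = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/demo.owl">
    <owl:imports rdf:resource="http://example.org/imports/other.owl"/>
  </owl:Ontology>
</rdf:RDF>
"""

# Hypothetical header of a merged, standalone file: no owl:imports.
STANDALONE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/demo.owl"/>
</rdf:RDF>
"""


def declared_imports(rdfxml_text):
    """Return the IRIs named by owl:imports elements in an RDF/XML document."""
    root = ET.fromstring(rdfxml_text)
    return [
        element.get(f"{{{RDF}}}resource")
        for element in root.iter(f"{{{OWL}}}imports")
    ]
```

An empty list from `declared_imports` would suggest that nothing further needs to be resolved when the file is loaded. For a real release file, `ET.parse(path).getroot()` could replace `ET.fromstring`, though large ontologies may call for a streaming parser.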
Looks good
@alexsign FYI this (imports merged into go-plus) has been implemented and will appear in the next successful snapshot release.
@balhoff Thanks for letting me know
@balhoff in trying to figure out what is going on with https://github.com/geneontology/pathways2GO/issues/88 I've been looking at go-plus and am a little confused. There are a lot of classes in there from other ontologies (CL, CHEBI, BFO, CARO, ENVO, MOD, NBO, OBI, NCBITaxon, PATO, PR, SO) that are not logically defined and do not have any label or text definition. E.g. CARO_0001001 neuron projection bundle, CHEBI:22868 bile salt, etc.
As the issue here discusses, GO-Plus does not import any of these ontologies so the product ends up being incomplete. I think? What am I missing?
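One way to survey which classes lack labels is to count unlabeled nodes per ontology prefix in the JSON version of the file. The Python sketch below assumes the OBO Graphs JSON layout, where each node has an `id` and an optional `lbl` key, and assumes OBO-style IRIs ending in `PREFIX_localid`. The function name `unlabeled_by_prefix` and the sample data are hypothetical.

```python
from collections import Counter

# Hand-made sample with hypothetical prefixes AAA and BBB.
SAMPLE = {
    "graphs": [
        {
            "id": "http://example.org/demo.owl",
            "nodes": [
                {"id": "http://example.org/obo/AAA_0000001", "lbl": "labeled term"},
                {"id": "http://example.org/obo/AAA_0000002"},
                {"id": "http://example.org/obo/BBB_0000001"},
                {"id": "http://example.org/obo/BBB_0000002", "lbl": ""},
            ],
            "edges": [],
        }
    ]
}


def unlabeled_by_prefix(doc):
    """Count nodes with a missing or empty label, grouped by ID prefix."""
    counts = Counter()
    for graph in doc.get("graphs", []):
        for node in graph.get("nodes", []):
            if not node.get("lbl"):
                # Take the final path segment, then the part before "_".
                local_name = node["id"].rsplit("/", 1)[-1]
                prefix = local_name.split("_", 1)[0]
                counts[prefix] += 1
    return counts
```

Running this over the real file would show how the unlabeled classes are distributed across the external ontologies.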
Is there documentation anywhere on what go-plus is used for downstream within the GO infrastructure? It's an important part of go-lego, of course. Where else is it used?
In case it's relevant, "bile salt" is a family of chemicals whose members get a lot of Reactome annotations, but we always refer to them by their individual identifiers and this grouping term CHEBI:22868 is not even an instance in our central database. And to the extent that we talk about development of the nervous system we do it without ever referring to CARO_0001001 neuron projection bundle or any other CARO term.
In fact CARO, BFO, ENVO, NBO, OBI, PATO, and PR are not reference ontologies that we ever refer to for any purpose. Terms from MOD, NCBITaxon, and SO are used in Reactome as are terms from all of the ontologies listed here.
@deustp01 many of these seem to appear in logical definitions that appear in go-plus, for example, go-plus includes a logical definition for the Uberon term 'bile' that includes 'subclass of (has part some 'bile salt')'
I think we should change the release process of go-plus.owl to merge import modules, so that the downloaded file is completely standalone. Currently, while a given ontology release has a version IRI, and we are putting effort into making previous version IRIs resolve to these past versions, when you load one of these older files (e.g. in Protege), you end up loading all the current versions of the import modules. Clearly someone would prefer to get the exact ontology content from that previous version.
I think users are best served by having a prepackaged complete file, so that they don't need to use software that resolves an import chain. If they want to use just GO content and not axioms coming from imports, we are already publishing the standalone go-base.owl for this purpose.
Some further rationale for standalone OWL file and "base" files is in this Google doc: https://docs.google.com/document/d/1eCo5C3aZ9kjhBu98-24c2FHZIHEeFV7I2TJ4Vd6qw6k/edit#heading=h.u7or56kflnm2
In discussion of this with @kltm, @goodb, and @dougli1sqrd, it sounds like making all released files "standalone" will solve some difficult issues in the GO pipeline as well.
Are there any users that would notice such a change and be inconvenienced in any way?