geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Output most specific taxon constraints for every term in go-computed-taxon-constraints.owl #28760

Closed balhoff closed 1 month ago

balhoff commented 1 month ago

The current computed taxon constraints require navigation up the class hierarchy to get the full set of constraints applying to a term. In cases of multiple applicable constraints, some complex reasoning is still required in order to know which are the most specific constraints for a term. To simplify usage by consumers, we will compute the applicable taxon constraints for every GO term and directly assert them, so no traversal or reasoning is required.

@alexsign when this is incorporated, QuickGO can simply show the direct taxon constraints for each term rather than showing any from ancestor terms.
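
A rough sketch of the idea described above, not the actual implementation used in the GO build: collect the constraints asserted on a term and all of its ancestors, then drop the ones that are implied by a more informative constraint. The term IDs, the tiny taxonomy, and all function names below are hypothetical, and the sketch ignores taxon disjointness, which the real reasoning also has to take into account.

```python
# Toy GO-like class hierarchy: term -> direct parents (hypothetical IDs).
TERM_PARENTS = {
    "GO:A": [],
    "GO:B": ["GO:A"],
    "GO:C": ["GO:B"],
}

# Toy taxonomy: taxon -> direct parent (illustrative subset only).
TAXON_PARENT = {
    "Eukaryota": None,
    "Metazoa": "Eukaryota",
    "Mammalia": "Metazoa",
}

# Constraints asserted directly on each term, as (relation, taxon) pairs.
ASSERTED = {
    "GO:A": {("only_in_taxon", "Eukaryota")},
    "GO:B": {("only_in_taxon", "Metazoa")},
    "GO:C": {("never_in_taxon", "Mammalia")},
}


def taxon_ancestors(taxon):
    """Return the set of strict ancestors of a taxon."""
    ancestors = set()
    parent = TAXON_PARENT.get(taxon)
    while parent is not None:
        ancestors.add(parent)
        parent = TAXON_PARENT.get(parent)
    return ancestors


def inherited(term):
    """Collect constraints asserted on a term or on any of its ancestors."""
    found = set(ASSERTED.get(term, ()))
    for parent in TERM_PARENTS.get(term, ()):
        found |= inherited(parent)
    return found


def most_specific(constraints):
    """Drop constraints that are implied by another constraint in the set."""
    only = {t for rel, t in constraints if rel == "only_in_taxon"}
    never = {t for rel, t in constraints if rel == "never_in_taxon"}
    # "only in X" is implied by "only in Y" when X is an ancestor of Y.
    only_kept = {t for t in only
                 if not any(t in taxon_ancestors(o) for o in only)}
    # "never in X" is implied by "never in Y" when Y is an ancestor of X.
    never_kept = {t for t in never if not (taxon_ancestors(t) & never)}
    return ({("only_in_taxon", t) for t in only_kept}
            | {("never_in_taxon", t) for t in never_kept})


def direct_constraints(term):
    """Constraints a consumer could read off a term without any traversal."""
    return most_specific(inherited(term))


for term in TERM_PARENTS:
    print(term, sorted(direct_constraints(term)))
```

In this toy data, GO:C asserts only a never_in_taxon constraint itself, but after precomputation it also carries the only_in_taxon Metazoa constraint inherited from GO:B, while the weaker Eukaryota constraint from GO:A is dropped.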

alexsign commented 1 month ago

@balhoff Hi Jim, I can switch pipelines to use the go-computed-taxon-constraints.obo file. A couple of questions before I do this.

  1. Right now go-plus.json has taxon constraints on 5052 GO terms, whereas go-computed-taxon-constraints.obo has only 5003. I would expect many more because of the precomputed term constraints. Am I missing something?
  2. Because we are using RO more and more, it looks like the ro_import.obo file will be very helpful. Is it something you use as well? Right now it's missing never_in_taxon, but I guess it can be updated.

balhoff commented 1 month ago

@alexsign I haven't merged the change yet, since I wanted to make sure you were ready. It will be around 20,000 terms after the change. You will either be able to get these from the import file, or else just continue using go-plus, since they will be merged in there.

I can try to add never_in_taxon to ro_import. Not sure why it wouldn't be there already. However, with the new computed taxon constraints you shouldn't need to use any property chains from RO anymore.

alexsign commented 1 month ago

@balhoff Thanks Jim, once the data is in go-plus, I can switch off my own propagation quite fast. Please go ahead with the merge when you are ready.
As for ro_import, I was just looking for a good place to use as an RO dictionary for translations, if needed in the future.

balhoff commented 1 month ago

Thanks @alexsign, I merged the PR, so the new data should be appearing soon.

alexsign commented 1 month ago

@balhoff The precomputed taxon constraints are now processed by GOA and available in QuickGO.

balhoff commented 1 month ago

That's great @alexsign! Looks good. I think you can change one thing; the taxon constraints table no longer needs to say "Ancestor GO ID | Ancestor GO Term Name", since these are directly on the term.

alexsign commented 1 month ago

@balhoff I'm not sure which changes to go-plus.json made all taxon constraints disappear from the GOA database. I'm looking into it now. What I found is that the following section is no longer in go-plus.json:

    {
      "id" : "http://purl.obolibrary.org/obo/RO_0002161",
      "meta" : {
        "definition" : {
          "val" : "S never_in_taxon T iff: S SubClassOf in_taxon only not T.",
          "xrefs" : [ ]
        }
      },
      "type" : "PROPERTY",
      "lbl" : "never_in_taxon"
    }

It looks like it was replaced with:

    {
      "id" : "http://purl.obolibrary.org/obo/RO_0002161",
      "lbl" : "never in taxon",
      "type" : "PROPERTY",
      "meta" : {
        "definition" : {
          "val" : "x never in taxon T if and only if T is a class, and x does not instantiate the class expression \"in taxon some T\". Note that this is a shortcut relation, and should be used as a hasValue restriction in OWL."
        }
      }
    }

balhoff commented 1 month ago

@alexsign I did update the JSON output to use ROBOT instead of owltools (ROBOT uses a newer version of the obographs JSON library). It seems like the only differences there are changes to the text of the label and the definition, and removal of the empty xrefs list. Are you also having trouble with http://purl.obolibrary.org/obo/RO_0002162?
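
To make the comparison above concrete, here is a small sketch that diffs the two RO_0002161 nodes quoted earlier in the thread and lists the fields that differ. The helper names (`flatten`, `changed_fields`, `find_property`) are hypothetical. The thread does not say what actually broke in the GOA pipeline, so this only illustrates the differences balhoff describes: the `id` is unchanged, while the label, the definition text, and the empty `xrefs` list are not.

```python
RO_NEVER_IN_TAXON = "http://purl.obolibrary.org/obo/RO_0002161"

# Node as quoted from the older go-plus.json (owltools output).
OLD_NODE = {
    "id": RO_NEVER_IN_TAXON,
    "meta": {
        "definition": {
            "val": "S never_in_taxon T iff: S SubClassOf in_taxon only not T.",
            "xrefs": [],
        }
    },
    "type": "PROPERTY",
    "lbl": "never_in_taxon",
}

# Node as quoted from the newer go-plus.json (ROBOT output).
NEW_NODE = {
    "id": RO_NEVER_IN_TAXON,
    "lbl": "never in taxon",
    "type": "PROPERTY",
    "meta": {
        "definition": {
            "val": "x never in taxon T if and only if T is a class, and x "
                   "does not instantiate the class expression "
                   "\"in taxon some T\". Note that this is a shortcut "
                   "relation, and should be used as a hasValue restriction "
                   "in OWL."
        }
    },
}


def flatten(node, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} so fields compare easily."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


def changed_fields(old, new):
    """Return the sorted field paths whose values differ or are missing."""
    a, b = flatten(old), flatten(new)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


def find_property(nodes, iri):
    """Look a node up by its IRI, which is the same in both outputs."""
    return next((n for n in nodes if n.get("id") == iri), None)


print(changed_fields(OLD_NODE, NEW_NODE))
print(find_property([NEW_NODE], RO_NEVER_IN_TAXON)["lbl"])
```

Since the IRI is the one field that is identical in both versions, a lookup keyed on `id` rather than on `lbl` would be unaffected by this kind of change.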

kltm commented 4 weeks ago

@balhoff Small question: I was expecting to see RO:0002161 (never_in_taxon) in the GOlr neighborhood_graph_json field, but it is not showing up (https://github.com/geneontology/amigo/issues/721#issuecomment-2364623725). Is there something we need to do with the build or inclusion to get this to propagate?

balhoff commented 3 weeks ago

@kltm I answered at https://github.com/geneontology/amigo/issues/721#issuecomment-2367046543.