Closed by kltm 1 year ago
This will be implemented in the GAF 2.1 parser, and the rule message will be placed in the report for gorule-0000059. Although gorule-0000001 is normally used for messages raised during parsing, we will report under gorule-0000059 so that people know we have upgraded their relations/qualifiers. This will be a warning.
You will be doing a repair, won't you? Then it's not a warning?
Noting here that we'll want to look at gaferencer as well once the rules are finalized. This is not an issue in the current flow of the pipeline, but it's good to keep things aligned. @balhoff
Noting that this will not technically be "necessary" until March, but it must be implemented and tested before then.
So just to clarify the above rule for CC: If the GO Term is subClassOf GO:0110165 "cellular anatomical entity" then we use relation "located in". If the GO Term is subClassOf GO:0032991 "protein-containing complex" then we use "part of".
In Ontobee, it looks like GO:0019012 "virion", GO:0044217 "other organism part", GO:0044423 "virion part" are also direct children of Cellular Component. What should the relation be if it's not in the above two subClassOf closures?
@dougli1sqrd I believe a better way of formulating the instruction is:
CC protein-containing complexes and children: part_of
Everything else: located_in
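For illustration only, the "is-complex / is-not-complex" formulation could be sketched as below. This is not the parser's actual code: the function name `default_cc_relation` and the hand-written `TOY_ANCESTORS` closure are assumptions, and a real implementation would get the subClassOf closure from the ontology. Direct annotations to the root are not given special treatment here.

```python
# Sketch of the default CC relation choice. All names here are hypothetical.
PROTEIN_CONTAINING_COMPLEX = "GO:0032991"

# Hand-written, truncated reflexive subClassOf closures for a few CC terms.
# Assumption: a real implementation would compute these from the ontology.
TOY_ANCESTORS = {
    "GO:0032991": {"GO:0032991", "GO:0005575"},                # protein-containing complex
    "GO:0000502": {"GO:0000502", "GO:0032991", "GO:0005575"},  # proteasome complex
    "GO:0005634": {"GO:0005634", "GO:0110165", "GO:0005575"},  # nucleus
    "GO:0019012": {"GO:0019012", "GO:0005575"},                # virion, as listed above
}


def default_cc_relation(term_id, ancestors):
    """Return 'part_of' for complexes and their children, else 'located_in'."""
    closure = ancestors.get(term_id, {term_id})
    if PROTEIN_CONTAINING_COMPLEX in closure:
        return "part_of"
    # Everything else, including terms outside both closures, falls through.
    return "located_in"


for term in ("GO:0000502", "GO:0005634", "GO:0019012"):
    print(term, default_cc_relation(term, TOY_ANCESTORS))
```

Testing only for membership in the complex closure means terms such as "virion" need no extra branch: they simply take the "everything else" default.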
Yeah, that @kltm "is-complex/is-not-complex" logic is pretty much what I do for the PAINT GAF 2.2 files.
That makes sense
It looks like with the GAF 2.2 switchover, most sources now fail our sanity checks due to severe output line reduction. Somebody might want to look a little more deeply into that, but I'm assuming that this is an expression of this ticket? https://github.com/geneontology/pipeline/issues/212
I need to make a separate GAF 2.1 error file to test this, and remove the tests for gorule-0000059 from the GAF 2.2 test file
We also need rules for direct annotations to the root
This is currently working correctly, as tested with the CGD file, which comes in as GAF 2.1 and is correctly fixed in the GO products; see http://release.geneontology.org/2023-07-27/annotations/cgd.gaf.gz
So no more action needed, since that format is not much used anymore.
Proposed action as GORULE:0000059: upgrade 2.1 GAF files with GAF 2.2 default formula
If an annotation line does not have a gp2term qualifier, the above formula should be used to add the information.
This applies specifically to GAF 2.1 files; GAF 2.2 files without a gp2term relation will still be treated as an error.
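As a rough sketch of the proposed repair on the qualifier column, assuming the default relation for the term has already been chosen: the helper name `upgrade_qualifier` and the `GP2TERM_RELATIONS` set are hypothetical, and the set is only an illustrative subset of the GAF 2.2 relations. The sketch relies on the qualifier column being pipe-separated and keeps an existing `NOT`.

```python
# Sketch of the GAF 2.1 -> 2.2 qualifier repair. All names are hypothetical.
# Assumption: an illustrative subset of gp2term relations, not the full list.
GP2TERM_RELATIONS = {
    "enables", "contributes_to", "involved_in",
    "located_in", "part_of", "is_active_in", "colocalizes_with",
}


def upgrade_qualifier(qualifier, default_relation, relations=GP2TERM_RELATIONS):
    """Return (new_qualifier, repaired) for one GAF 2.1 qualifier value.

    If the value already carries a gp2term relation it is left untouched;
    otherwise the default relation is appended after any existing parts.
    """
    parts = [p for p in qualifier.split("|") if p]
    if any(p in relations for p in parts):
        return qualifier, False
    parts.append(default_relation)
    return "|".join(parts), True


# Each repaired line would be reported as a warning under gorule-0000059.
for value in ("", "NOT", "colocalizes_with"):
    new_value, repaired = upgrade_qualifier(value, "located_in")
    print(repr(value), "->", repr(new_value), "repaired" if repaired else "unchanged")
```

Returning a `repaired` flag alongside the new value lets the caller decide how to report the change, so the repair logic stays separate from the rule report.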