geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

'O-glycan processing' (GO:0016266) child terms #27105

Open sjm41 opened 7 months ago

sjm41 commented 7 months ago

All of the leaf terms shown below are labeled 'do not annotate' (though 3 of them do have 1-4 annotations). Should they be obsoleted, or is the 'do not annotate' label an error?

|__O-glycan processing
           |__O-glycan processing, core 1 (2 annotations)
           |__O-glycan processing, core 2 (4 annotations)
           |__O-glycan processing, core 3 (1 annotation)
           |__O-glycan processing, core 4 (0 annotations)
           |__O-glycan processing, core 5 (0 annotations)
           |__O-glycan processing, core 6 (0 annotations)
           |__O-glycan processing, core 7 (0 annotations)
           |__O-glycan processing, core 8 (0 annotations)

sjm41 commented 1 day ago

Update: The terms I reported back in Feb that had 0 annotations have been obsoleted because they "represent a molecular function" (see #28482). But 3 child terms remain, now tagged with "subset: gocheck_obsoletion_candidate":

|__O-glycan processing
           |__O-glycan processing, core 1 (9882 annotations (all TreeGrafter), 0 EXP)
           |__O-glycan processing, core 2 (86 annotations, 2 EXP)
           |__O-glycan processing, core 3 (1 annotation = NAS)

@pgaudet - Should those remaining 3 child terms also be obsoleted?

pgaudet commented 1 day ago

Hi Steven,

I thought they should, but it seems they are 'common' structures, see https://www.ncbi.nlm.nih.gov/books/NBK453030/ (and so is core 4). What do you think?

sjm41 commented 1 day ago

Happy to keep them, but then the "subset: gocheck_obsoletion_candidate" flag should be removed.

I see MetaCyc defines two pathways:
mucin core 1 and core 2 O-glycosylation (PWY-7433)
mucin core 3 and core 4 O-glycosylation (PWY-7435)

That separation seems to fit with the book reference you sent.

That all suggests that GO might want to aim for something like this:

|__O-glycan processing
           |__O-glycan processing, core 1 and core 2 [MetaCyc:PWY-7433, NBK453030]
           |__O-glycan processing, core 3 and core 4 [MetaCyc:PWY-7435, NBK453030]

pgaudet commented 23 hours ago

I kept the 3 terms (although it seems we could also restore core 4, and others, but they are rarer, so let's wait?)

I also renamed the terms to be more consistent with what is found in the literature:
GO:0016267 O-glycan processing, core 1 >> core 1 O-glycan biosynthetic process
GO:0016268 O-glycan processing, core 2 >> core 2 O-glycan biosynthetic process
GO:0016269 O-glycan processing, core 3 >> core 3 O-glycan biosynthetic process

I also wonder whether we should merge GO:0016266 O-glycan processing into its parent GO:0006493 protein O-linked glycosylation. I don't see a justification for having both terms.

Thanks, Pascale

sjm41 commented 22 hours ago

How about keeping GO:0016266 O-glycan processing but renaming it to "core O-glycan biosynthetic process"?

sjm41 commented 22 hours ago

Should also remove MetaCyc:PWY-7433 as xref on the parent GO:0016266 since it's better as broadMatch on the core 1 and core 2 children.

pgaudet commented 20 hours ago

Thanks! Done.
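
For anyone who wants to double-check which terms still carry the "subset: gocheck_obsoletion_candidate" tag discussed above, here is a minimal sketch that scans OBO-format text for `[Term]` stanzas with a given subset. The `SAMPLE_OBO` text and the `terms_in_subset` helper are hypothetical and for illustration only; the stanzas are not copied from the real ontology file, and which term is flagged in the sample is an assumption.

```python
import re

# Illustrative OBO text only; NOT copied from the real GO edit file.
# The subset tag on GO:0016267 is an assumption made for this example.
SAMPLE_OBO = """\
[Term]
id: GO:0016266
name: O-glycan processing

[Term]
id: GO:0016267
name: core 1 O-glycan biosynthetic process
subset: gocheck_obsoletion_candidate
is_a: GO:0016266 ! O-glycan processing

[Term]
id: GO:0016268
name: core 2 O-glycan biosynthetic process
is_a: GO:0016266 ! O-glycan processing
"""


def terms_in_subset(obo_text, subset):
    """Return (id, name) pairs for [Term] stanzas tagged with `subset`."""
    hits = []
    # OBO stanzas are separated by blank lines.
    for stanza in re.split(r"\n\s*\n", obo_text):
        lines = [ln.strip() for ln in stanza.strip().splitlines()]
        if not lines or lines[0] != "[Term]":
            continue
        tags = {}
        for ln in lines[1:]:
            if ln.startswith("!") or ":" not in ln:
                continue
            key, _, value = ln.partition(":")
            # Drop any trailing " ! comment" and surrounding whitespace.
            value = value.split(" ! ")[0].strip()
            tags.setdefault(key.strip(), []).append(value)
        if subset in tags.get("subset", []):
            hits.append((tags.get("id", [""])[0], tags.get("name", [""])[0]))
    return hits


if __name__ == "__main__":
    for term_id, name in terms_in_subset(SAMPLE_OBO, "gocheck_obsoletion_candidate"):
        print(term_id, name)
```

The same helper could be pointed at the text of a locally downloaded OBO file instead of `SAMPLE_OBO`; an empty result for the three kept terms would suggest the flag is gone.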