geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

NTR - metabolism: new terms for proteoglycan metabolic process branch #29294

Open rozaru opened 4 days ago

rozaru commented 4 days ago

The current proteoglycan metabolic process branch doesn't cover all the pathways for proteoglycan biosynthesis: heparan sulfate (HS), chondroitin sulfate (CS), dermatan sulfate (DS) and keratan sulfate (KS). (Images are at the bottom of the ticket.) To fill the gaps, new terms would need to be created.

Related to Review of glycosaminoglycan metabolic process (GO:0030203) and proteoglycan metabolic process (GO:0006029) branches ticket #28977

Below is the current branch. For clarity I removed the terms related to glycosaminoglycan metabolic process (this is another issue), the terms for the cell wall proteoglycan metabolic branch, and the regulatory terms.

|__proteoglycan metabolic process

|__chondroitin sulfate proteoglycan metabolic process
|   |__chondroitin sulfate proteoglycan biosynthetic process
|       |__chondroitin sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process

|__dermatan sulfate proteoglycan metabolic process
|   |__dermatan sulfate proteoglycan biosynthetic process
|       |__dermatan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process

|__heparan sulfate proteoglycan metabolic process
|   |__heparan sulfate proteoglycan biosynthetic process, enzymatic modification

|__proteoglycan biosynthetic process
|   |__chondroitin sulfate proteoglycan biosynthetic process
|   |   |__chondroitin sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process
|   |__dermatan sulfate proteoglycan biosynthetic process
|   |   |__dermatan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process
|   |__heparan sulfate proteoglycan biosynthetic process
|       |__heparan sulfate proteoglycan biosynthetic process, enzymatic modification
|       |__heparan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process

|__proteoglycan catabolic process
    |__heparan sulfate proteoglycan catabolic process

Issues:

  1. Heparan sulfate biosynthesis has 2 GO terms:
     one for the chain elongation: heparan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process
     one for the post-elongation modification (epimerisation, sulfation, etc.): heparan sulfate proteoglycan biosynthetic process, enzymatic modification
     CS and DS only have the XX proteoglycan biosynthetic process, polysaccharide chain biosynthetic process term. Could an XX proteoglycan biosynthetic process, enzymatic modification term be created for CS, DS and KS?

  2. CS, DS and HS synthesis starts with the formation of a common linker (4 reactions). MetaCyc, KEGG and Reactome all have a separate module for the linker synthesis. Is it worth having a GO term for this (e.g. heparan sulfate proteoglycan biosynthetic process, linker formation)? Currently the genes involved in the linker synthesis are annotated with XX proteoglycan biosynthetic process, polysaccharide chain biosynthetic process.

  3. If points 1 and 2 are too specific, the other option would be to have only the XX proteoglycan biosynthetic process term, with all the genes involved in linker synthesis, elongation and post-elongation modification annotated to it.

  4. The relationships between the child terms of heparan sulfate proteoglycan metabolic process are not quite right. FIXED

  5. There is no GO term for keratan sulfate proteoglycan metabolic process. Should one be added?

  6. The GO terms for DS and CS proteoglycan catabolic process are missing. Should they be added?

If all the suggestions are implemented the branch will look like this (NEW is for the new terms and RRR for the terms for which the relationship has been updated):

|__proteoglycan metabolic process

|__chondroitin sulfate proteoglycan metabolic process
|   |__chondroitin sulfate proteoglycan biosynthetic process
|       |__chondroitin sulfate proteoglycan biosynthetic process, linker formation  NEW
|       |__chondroitin sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process
|       |__chondroitin sulfate proteoglycan biosynthetic process, enzymatic modification    NEW
|   |__chondroitin sulfate proteoglycan catabolic process NEW

|__dermatan sulfate proteoglycan metabolic process
|   |__dermatan sulfate proteoglycan biosynthetic process
|       |__dermatan sulfate biosynthetic process
|       |__dermatan sulfate proteoglycan biosynthetic process, linker formation     NEW
|       |__dermatan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process
|       |__dermatan sulfate proteoglycan biosynthetic process, enzymatic modification   NEW
|   |__dermatan sulfate proteoglycan catabolic process    NEW

|__heparan sulfate proteoglycan metabolic process
|   |__heparan sulfate proteoglycan biosynthetic process RRR
|       |__heparan sulfate proteoglycan biosynthetic process, linker formation  NEW RRR
|       |__heparan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process RRR
|       |__heparan sulfate proteoglycan biosynthetic process, enzymatic modification RRR
|   |__heparan sulfate proteoglycan catabolic process RRR

|__keratan sulfate proteoglycan metabolic process NEW
|   |__keratan sulfate proteoglycan biosynthetic process NEW
|       |__keratan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process NEW
|       |__keratan sulfate proteoglycan biosynthetic process, enzymatic modification NEW
|   |__keratan sulfate proteoglycan catabolic process     NEW   

|__proteoglycan biosynthetic process
|   |__chondroitin sulfate proteoglycan biosynthetic process
|       |__chondroitin sulfate proteoglycan biosynthetic process, linker formation  NEW
|       |__chondroitin sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process
|       |__chondroitin sulfate proteoglycan biosynthetic process, enzymatic modification    NEW
|   |__dermatan sulfate proteoglycan biosynthetic process
|       |__dermatan sulfate proteoglycan biosynthetic process, linker formation     NEW
|       |__dermatan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process
|       |__dermatan sulfate proteoglycan biosynthetic process, enzymatic modification   NEW
|   |__heparan sulfate proteoglycan biosynthetic process
|       |__heparan sulfate proteoglycan biosynthetic process, linker formation  NEW
|       |__heparan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process
|       |__heparan sulfate proteoglycan biosynthetic process, enzymatic modification
|   |__keratan sulfate proteoglycan biosynthetic process NEW
|       |__keratan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process NEW
|       |__keratan sulfate proteoglycan biosynthetic process, enzymatic modification NEW

|__proteoglycan catabolic process
|   |__heparan sulfate proteoglycan catabolic process
|   |__chondroitin sulfate proteoglycan catabolic process NEW
|   |__dermatan sulfate proteoglycan catabolic process    NEW
|   |__keratan sulfate proteoglycan catabolic process     NEW   

There will be 12 new terms to create. If this is OK I will provide the usual information for their creation.
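The count of 12 can be checked mechanically, since several NEW terms appear under more than one parent in the tree. The following is a minimal sketch, not part of the original ticket: it takes the rows flagged NEW in the proposed branch above, strips the tree markers and the NEW/RRR flags, and counts distinct term names. The helper name `new_terms` is hypothetical.

```python
import re

# Rows flagged NEW in the proposed branch, copied as-is (repeats included).
PROPOSED_NEW_ROWS = """
|       |__chondroitin sulfate proteoglycan biosynthetic process, linker formation  NEW
|       |__chondroitin sulfate proteoglycan biosynthetic process, enzymatic modification    NEW
|   |__chondroitin sulfate proteoglycan catabolic process NEW
|       |__dermatan sulfate proteoglycan biosynthetic process, linker formation     NEW
|       |__dermatan sulfate proteoglycan biosynthetic process, enzymatic modification   NEW
|   |__dermatan sulfate proteoglycan catabolic process    NEW
|       |__heparan sulfate proteoglycan biosynthetic process, linker formation  NEW RRR
|__keratan sulfate proteoglycan metabolic process NEW
|   |__keratan sulfate proteoglycan biosynthetic process NEW
|       |__keratan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process NEW
|       |__keratan sulfate proteoglycan biosynthetic process, enzymatic modification NEW
|   |__keratan sulfate proteoglycan catabolic process     NEW
|       |__chondroitin sulfate proteoglycan biosynthetic process, linker formation  NEW
|       |__chondroitin sulfate proteoglycan biosynthetic process, enzymatic modification    NEW
|       |__dermatan sulfate proteoglycan biosynthetic process, linker formation     NEW
|       |__dermatan sulfate proteoglycan biosynthetic process, enzymatic modification   NEW
|       |__heparan sulfate proteoglycan biosynthetic process, linker formation  NEW
|   |__keratan sulfate proteoglycan biosynthetic process NEW
|       |__keratan sulfate proteoglycan biosynthetic process, polysaccharide chain biosynthetic process NEW
|       |__keratan sulfate proteoglycan biosynthetic process, enzymatic modification NEW
|   |__chondroitin sulfate proteoglycan catabolic process NEW
|   |__dermatan sulfate proteoglycan catabolic process    NEW
|   |__keratan sulfate proteoglycan catabolic process     NEW
"""

FLAGS = {"NEW", "RRR"}


def new_terms(tree_text):
    """Return the set of distinct term names whose row carries the NEW flag."""
    terms = set()
    for line in tree_text.splitlines():
        match = re.match(r"^[|\s]*\|__(.*)$", line)
        if not match:
            continue  # blank line or not a tree row
        tokens = match.group(1).split()
        flags = set()
        # Peel trailing flags (NEW, RRR) off the end of the row.
        while tokens and tokens[-1] in FLAGS:
            flags.add(tokens.pop())
        if "NEW" in flags:
            terms.add(" ".join(tokens))  # normalises internal whitespace
    return terms


for name in sorted(new_terms(PROPOSED_NEW_ROWS)):
    print(name)
print(len(new_terms(PROPOSED_NEW_ROWS)), "distinct NEW terms")
```

The 23 flagged rows collapse to 12 distinct names: three each for CS and DS, one for HS, and five for KS.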

[Image: Glycosaminoglycan biosynthesis (images_large_10 1177_00220345231224228-fig1)]

[Image: Keratan sulfate biosynthesis]

sjm41 commented 2 days ago

  4. The relationships between the child terms of heparan sulfate proteoglycan metabolic process are not quite right.

This was because the 'child' terms had a logical definition (LD) but the intended 'parent' term lacked one. I've just added this to 'heparan sulfate proteoglycan metabolic process':

intersection_of: GO:0008152 ! metabolic process
intersection_of: has_primary_input_or_output CHEBI:24499 ! heparan sulfate proteoglycan

which should fix this particular issue.
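To illustrate why a missing logical definition on the parent blocks the inferred relationship, here is a toy sketch. It is a heavy simplification of what an OWL reasoner does, not the GO build pipeline: each LD is reduced to a (genus, chemical) pair, a single relation is assumed, and the LDs shown for the child terms are assumptions made for illustration. All function names are hypothetical.

```python
# Asserted is_a edges among the genus classes (labels only, toy data).
GENUS_IS_A = {
    "biosynthetic process": "metabolic process",
    "catabolic process": "metabolic process",
}


def is_subclass_or_equal(cls, ancestor, is_a):
    """Walk the single-parent is_a chain upwards from cls."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = is_a.get(cls)
    return False


def inferred_parents(term, logical_defs):
    """Terms whose LD subsumes the LD of `term` (same chemical, broader genus)."""
    if term not in logical_defs:
        return []
    genus, chemical = logical_defs[term]
    parents = []
    for other, (other_genus, other_chemical) in logical_defs.items():
        if other == term:
            continue
        if chemical == other_chemical and is_subclass_or_equal(
            genus, other_genus, GENUS_IS_A
        ):
            parents.append(other)
    return parents


HSPG = "heparan sulfate proteoglycan"
CHILD = "heparan sulfate proteoglycan catabolic process"
PARENT = "heparan sulfate proteoglycan metabolic process"

# Before the fix: only the child terms carry an LD (assumed shapes).
before = {
    "heparan sulfate proteoglycan biosynthetic process": ("biosynthetic process", HSPG),
    CHILD: ("catabolic process", HSPG),
}
# After the fix: the parent gets its own LD as well.
after = dict(before)
after[PARENT] = ("metabolic process", HSPG)

print(inferred_parents(CHILD, before))  # nothing can be inferred
print(inferred_parents(CHILD, after))   # the parent is now found
```

Without an LD on the parent there is nothing for the child's definition to be compared against, so the relationship has to be asserted by hand or is simply missing.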

sjm41 commented 2 days ago

I also added the missing LDs for these two terms:

sjm41 commented 2 days ago

  2. CS, DS and HS synthesis starts with the formation of a common linker (4 reactions). MetaCyc, KEGG and Reactome all have a separate module for the linker synthesis. Is it worth to have a GO term for this (i.e. heparan sulfate proteoglycan biosynthetic process, linker formation)? Currently the genes involved in the linker synthesis are annotated with XX proteoglycan biosynthetic process, chain biosynthetic process

Yes, it makes sense to have this process as a BP in GO. I just created the following term, based mainly on the MetaCyc pathway. (I'm not sure that KEGG 'Modules' are valid xrefs in GO?) Since this is a common preliminary step to make CS, DS and HS proteoglycans, I think we just want this single BP term with 'part of' relations to the other BPs. Hopefully I've done this correctly, @pgaudet ?

+[Term]
+id: GO:0120532
+name: glycosaminoglycan-protein linkage region biosynthetic process
+namespace: biological_process
+def: "The formation of a tetrasaccharide linker sequence (xylose-galactose-galactose-glucuronate) on specific serine residues of a core protein, on to which dermatan sulfate, chondroitin sulfate, heparan sulfate or heparin glycosaminoglycans may be assembled to synthesise the corresponding proteoglycan." [MetaCyc:PWY-6557]
+synonym: "Glycosaminoglycan biosynthesis, linkage tetrasaccharide" EXACT []
+synonym: "glycosaminoglycan-protein linkage region biosynthesis" EXACT [MetaCyc:PWY-6557]
+xref: MetaCyc:PWY-6557
+is_a: GO:0030166 ! proteoglycan biosynthetic process
+relationship: part_of GO:0015012 ! heparan sulfate proteoglycan biosynthetic process
+relationship: part_of GO:0050650 ! chondroitin sulfate proteoglycan biosynthetic process
+relationship: part_of GO:0050651 ! dermatan sulfate proteoglycan biosynthetic process
+property_value: term_tracker_item "https://github.com/geneontology/go-ontology/issues/29294" xsd:anyURI
+created_by: sjm
+creation_date: 2024-11-27T13:13:42Z
+
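As a quick way to check the 'part of' relations in a stanza like this one, the sketch below reads an abridged copy of it (the def, synonym, xref and property lines are left out) and lists the is_a parent and part_of targets. This is a hand-rolled illustration with hypothetical helper names, not a full OBO parser; in particular, the naive split on " ! " would not be safe for quoted def strings.

```python
from collections import defaultdict

# Abridged copy of the new term stanza, without the leading diff '+' markers.
STANZA = """\
[Term]
id: GO:0120532
name: glycosaminoglycan-protein linkage region biosynthetic process
namespace: biological_process
is_a: GO:0030166 ! proteoglycan biosynthetic process
relationship: part_of GO:0015012 ! heparan sulfate proteoglycan biosynthetic process
relationship: part_of GO:0050650 ! chondroitin sulfate proteoglycan biosynthetic process
relationship: part_of GO:0050651 ! dermatan sulfate proteoglycan biosynthetic process
"""


def parse_stanza(text):
    """Collect tag -> list of values, dropping trailing '! label' comments."""
    tags = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(": ")
        value = value.split(" ! ")[0].strip()
        tags[tag].append(value)
    return tags


def part_of_targets(tags):
    """IDs of the terms this term is declared to be part_of."""
    return [
        value.split()[1]
        for value in tags["relationship"]
        if value.startswith("part_of ")
    ]


tags = parse_stanza(STANZA)
print(tags["id"][0], "is_a", tags["is_a"][0])
for target in part_of_targets(tags):
    print("  part_of", target)
```

The output shows one is_a parent and three part_of targets, matching the intent of a single linker term shared by the HS, CS and DS biosynthetic processes.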

deustp01 commented 2 days ago

Yes. Fits the weeds discussion of module size and sensible boundaries. Here, at the level of physiological specificity we're annotating, a newly synthesized linker can have any of three distinct fates so one module to make the linker and three to use it in diverse ways sounds exactly right. (Beating dead horses can be satisfying.)