geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Obsoletion request: homoserine biosynthetic process GO:0009090 #28911

Closed Antonialock closed 1 month ago

Antonialock commented 2 months ago

Please provide as much information as you can:

The reason for obsoletion is that this represents an intermediate (branch point)

threonine biosynthesis + de novo methionine biosynthesis

Homoserine is a toxic intermediate in the pathway that converts aspartate to threonine OR methionine; homoserine sits at the branch point of the pathway and can either be phosphorylated which leads it down the threonine biosynthesis route or acetylated which leads it towards methionine.

I don't think "homoserine biosynthesis" represents a GO biological programme, it's just an intermediate in the pathway. Genes that are part of this pathway should either be annotated to 1. de novo methionine biosynthesis or 2. to threonine biosynthesis, or 3. to both these terms if upstream of homoserine.

i.e. in the diagram below: keep the blue and the purple branches of the pathway, but remove the red.

[image: pathway diagram]

https://link.springer.com/article/10.1007/s00726-014-1873-1
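To make the proposed annotation rule concrete, here is a minimal sketch. It assumes a heavily simplified version of the pathway described above: the intermediate names and the enzyme labels downstream of homoserine are illustrative, not taken from GO or from any annotation file. Each enzyme gets the end-product terms that are reachable from the product of its step, so anything upstream of homoserine ends up with both.

```python
from collections import defaultdict

# Simplified, illustrative pathway: (enzyme label, substrate, product).
# hom3/hom2/hom6 are the shared upstream steps named in this thread;
# the remaining labels are generic placeholders for the two branches.
STEPS = [
    ("hom3", "aspartate", "aspartyl phosphate"),
    ("hom2", "aspartyl phosphate", "aspartate semialdehyde"),
    ("hom6", "aspartate semialdehyde", "homoserine"),
    ("homoserine kinase", "homoserine", "O-phosphohomoserine"),
    ("threonine synthase", "O-phosphohomoserine", "threonine"),
    ("homoserine O-acetyltransferase", "homoserine", "O-acetylhomoserine"),
    ("downstream methionine steps", "O-acetylhomoserine", "methionine"),
]

# End products and the term each would be annotated to under the proposal.
END_PRODUCT_TERMS = {
    "threonine": "threonine biosynthesis",
    "methionine": "de novo methionine biosynthesis",
}


def build_graph(steps):
    """Map each substrate to the set of products made from it."""
    graph = defaultdict(set)
    for _, substrate, product in steps:
        graph[substrate].add(product)
    return graph


def reachable(graph, start):
    """Return every compound downstream of `start` (excluding `start`)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def proposed_annotations(enzyme, steps=STEPS):
    """Terms for `enzyme`: every end product reachable from its product."""
    graph = build_graph(steps)
    terms = set()
    for name, _, product in steps:
        if name == enzyme:
            downstream = reachable(graph, product) | {product}
            terms |= {END_PRODUCT_TERMS[c] for c in downstream
                      if c in END_PRODUCT_TERMS}
    return terms
```

Under this sketch, `proposed_annotations("hom6")` returns both terms, while `proposed_annotations("homoserine kinase")` returns only threonine biosynthesis.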


Checklist for ontology editor

Check term usage and metadata in Protégé

Check annotations

Notification

raymond91125 commented 2 months ago

I don't think "homoserine biosynthesis" represents a GO biological programme, it's just an intermediate in the pathway. Genes that are part of this pathway should either be annotated to 1. de novo methionine biosynthesis or 2. to threonine biosynthesis, or 3. to both these terms if upstream of homoserine.

Is it desirable to always make 2 separate annotations for Hom[2/3/6]p? Following that logic, isn't it reasonable to also include isoleucine biosynthesis as the 3rd?

ValWood commented 2 months ago

As I build pathways I am making a list of processes that appear to be spurious.

GO:0071269 L-homocysteine biosynthetic process
GO:0009092 homoserine metabolic process

are on my list

ValWood commented 2 months ago

Although I am not sure how this fits with the existing discussions about representing branch points. Maybe that rule only applies to more "linear" pathways. It is interesting that @Antonialock and I came to the same conclusion. It would be useful to wait to see what @rozaru thinks about this one when she gets to it.

I don't think we would include isoleucine biosynthesis because this is clearly an end point.

deustp01 commented 2 months ago

One approach would be to consult a comprehensive textbook of biochemistry like Devlin and use the pathway boundaries defined there to identify / individuate GO process boundaries and identify the defining inputs and outputs. This at least represents a biochemistry community consensus. To the extent that these boundaries are arbitrary, we can at least be consistently arbitrary.

raymond91125 commented 2 months ago

I cannot find homoserine or the synthesis of threonine in "Textbook of Biochemistry with Clinical Correlation (4th ed.)". Devlin does not appear to describe essential AA synthesis. Are there some other standard textbooks where I should be looking for these defined pathways? I'd argue against references that are not digitized, archived, and readily accessible on the internet, though. Should we include this issue in a more general pathway start/end discussion? @pgaudet Thanks.

Antonialock commented 2 months ago

Is it desirable to always make 2 separate annotations for Hom[2/3/6]p? Following that logic, isn't it reasonable to also include isoleucine biosynthesis as the 3rd?

@raymond91125 I think that would be redundant since isoleucine is made from threonine; it "follows on" from threonine, and is not made from a separate branch in the pathway. Threonine seems important enough to be its own module (being an essential building block and all), even though it’s arbitrary for sure where things start and end.

In papers, isoleucine is “always” shown as a dashed line in the pathway; I interpret that as it being thought of as a downstream module. Similarly, I wouldn’t annotate the citric acid cycle to threonine biosynthesis even though it provides the oxaloacetate to make the aspartate to make threonine.

I cannot find homoserine or the synthesis of threonine in "Textbook of Biochemistry with Clinical Correlation (4th ed.)".

I guess it is not in a clinical textbook since it is a microbe (and plant?) specific pathway. This is how Stryer represents it (p. 1003 in Biochemistry). Note how they use separate arrows for threonine and methionine biosynthesis even though the start of those two pathways (hom2/3/6) is the same.

[image: IMG_1633]

https://biokamikazi.wordpress.com/wp-content/uploads/2013/10/biochemistry-stryer-5th-ed.pdf

(also, might be important to be aware of; lysine is made by a different pathway in fungi, it's bacteria that make it from aspartate, so there are differences between kingdoms. The specific steps in methionine biosynthesis also differ within fungi, I don't know if this is important for GO annotation)

raymond91125 commented 2 months ago

@Antonialock Thanks for providing additional information. Personally I prefer minimizing pathways overlapping with one another. But the most important thing is to try to be consistent within GO.

sjm41 commented 2 months ago

Not sure this helps, but GO:0009090 is xreffed to MetaCyc:HOMOSERSYN-PWY:

[image: Screenshot 2024-09-16 at 17 03 02]

pgaudet commented 2 months ago

Proposal from ontology call @raymond91125 @edwong57 @sjm41 : keep homoserine biosynthetic process because it is a branch point

See also discussion here: https://github.com/geneontology/go-ontology/issues/22542#issuecomment-1168893682

We need to make a (final) decision and document it.

Actions:

ValWood commented 2 months ago

That makes sense based on the previous discussions. So, in this case, I guess we should not annotate hom6 (yeast) to methionine or threonine biosynthesis?

ValWood commented 2 months ago

Or hom2 (https://www.yeastgenome.org/locus/S000002565#go)? Was that the outcome of the discussion?

pgaudet commented 2 months ago

What's the start and end of each pathway? If methionine and threonine biosynthesis start with homoserine, then you shouldn't annotate hom6 to these.

How do we want the methionine and threonine biosynthesis to ALSO include homoserine biosynthesis? Is the idea to align with MetaCyc?

ValWood commented 2 months ago

That is the question. I want to be clear that I am doing it correctly. I'm still a bit unsure. My understanding is that we align with MetaCyc. This would require updates to the existing PANTHER annotation for hom6 and hom2.

pgaudet commented 2 months ago

Let's confirm this on next Monday's ontology call.

Antonialock commented 2 months ago

Isn't it a bit odd to have a standalone pathway (homoserine biosynthesis) that only produces an "intermediate"? (wikipedia: "Homoserine is an intermediate in the biosynthesis of three essential amino acids....").

There must be many metabolic pathways with multiple branchpoints....it seems it could become very arbitrary?

deustp01 commented 2 months ago

I cannot find homoserine or the synthesis of threonine in "Textbook of Biochemistry with Clinical Correlation (4th ed.)". Devlin does not appear to describe essential AA synthesis. Are there some other standard textbooks where I should be looking for these defined pathways?

Sorry, that's my species-ist bias showing. Indeed, Devlin and other human/clinical-centric texts are useless for synthesis of essential amino acids, so the later suggestions to use MetaCyc as a resource for pathway definition sound much better.

ValWood commented 2 months ago

This is true. I can see that we might like the upstream pathway annotated to the individual amino acid biosynthesis (or at least protein-amino acid biosynthesis, if this is the only destination). However, we recently discussed ergosterol/dolichol-linked oligosaccharide/ubiquinone in the context of isoprenoid biosynthesis, and there the decision was to annotate up to the branch point. Genes involved in isoprenoid biosynthesis would therefore not be annotated to ergosterol/dolichol-linked oligosaccharide/ubiquinone biosynthesis, as the pathways are usually described as beginning at the first committed step.

So, if we were to treat these pathways differently, we would need a rule for why and when. But if we follow MetaCyc it wouldn't be arbitrary.

raymond91125 commented 2 months ago

MetaCyc has superpathways and subpathways, e.g. https://www.biocyc.org/pathway?orgid=META&id=THRESYN-PWY. The superpathway of L-threonine biosynthesis has 3 subpathways: L-homoserine biosynthesis, L-threonine biosynthesis, and L-aspartate biosynthesis.
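As a rough illustration of that superpathway/subpathway nesting, here is a small sketch. The `Pathway` class and `leaf_pathways` helper are hypothetical names made up for this comment, not part of MetaCyc or any GO tooling. Only the two MetaCyc IDs already mentioned in this thread are filled in; the others are left as `None` rather than guessed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pathway:
    """A pathway that may be composed of subpathways (hypothetical model)."""
    name: str
    metacyc_id: Optional[str] = None  # None = ID not stated in this thread
    subpathways: List["Pathway"] = field(default_factory=list)


def leaf_pathways(pathway):
    """Return the base pathways, i.e. those with no subpathways of their own."""
    if not pathway.subpathways:
        return [pathway]
    leaves = []
    for sub in pathway.subpathways:
        leaves.extend(leaf_pathways(sub))
    return leaves


# Subpathways listed in the order given above, not in pathway order.
threonine_superpathway = Pathway(
    name="superpathway of L-threonine biosynthesis",
    metacyc_id="THRESYN-PWY",
    subpathways=[
        # ID taken from the GO:0009090 xref mentioned earlier in this thread.
        Pathway("L-homoserine biosynthesis", "HOMOSERSYN-PWY"),
        Pathway("L-threonine biosynthesis"),
        Pathway("L-aspartate biosynthesis"),
    ],
)

for leaf in leaf_pathways(threonine_superpathway):
    print(leaf.name, leaf.metacyc_id)
```

If GO terms map to the leaves of such a structure, each base pathway keeps its own term and the superpathway becomes an optional grouping on top.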

pgaudet commented 2 months ago

@cmungall on ontology call:

I think the issue is we don’t define the start point of threonine biosynthesis

pgaudet commented 2 months ago

TO DO: Define starts and ends. For example, 'L-threonine biosynthetic process':
has primary input L-homoserine
has primary output L-threonine

Maybe later: make superpathways?
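A minimal sketch of what explicit starts and ends would imply for annotation, assuming a simplified linear route from aspartate to threonine. The compound names on the route, the `ProcessBoundary` class, and the choice of L-aspartate as the input of homoserine biosynthesis are all assumptions for illustration; only the threonine input and output come from the TO DO above.

```python
from dataclasses import dataclass

# Simplified linear route to threonine (compound names are illustrative).
THREONINE_ROUTE = [
    "L-aspartate",
    "L-aspartyl-4-phosphate",
    "L-aspartate 4-semialdehyde",
    "L-homoserine",
    "O-phospho-L-homoserine",
    "L-threonine",
]


@dataclass(frozen=True)
class ProcessBoundary:
    """A process delimited by its primary input and primary output."""
    label: str
    primary_input: str
    primary_output: str


def step_in_process(step, process, route=THREONINE_ROUTE):
    """True if the (substrate, product) step lies within the process boundary."""
    start = route.index(process.primary_input)
    end = route.index(process.primary_output)
    substrate, product = route.index(step[0]), route.index(step[1])
    return start <= substrate and product <= end


threonine = ProcessBoundary(
    "L-threonine biosynthetic process", "L-homoserine", "L-threonine")
# Assumption: homoserine biosynthesis is taken to start from L-aspartate.
homoserine = ProcessBoundary(
    "homoserine biosynthetic process", "L-aspartate", "L-homoserine")

# The step attributed to hom6 in this thread: semialdehyde -> homoserine.
hom6_step = ("L-aspartate 4-semialdehyde", "L-homoserine")
print(step_in_process(hom6_step, threonine))   # outside the boundary
print(step_in_process(hom6_step, homoserine))  # inside the boundary
```

With these boundaries the hom6 step falls inside homoserine biosynthesis and outside threonine biosynthesis, which matches the earlier point that hom6 should not be annotated to a pathway that starts with homoserine.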

raymond91125 commented 2 months ago

TO DO: Define starts and ends. For example, 'L-threonine biosynthetic process':
has primary input L-homoserine
has primary output L-threonine

Does this mean that we should also change the term label and textual definition? Perhaps:

threonine biosynthetic process from homoserine

The chemical reactions and pathways starting from homoserine, resulting in the formation of threonine (2-amino-3-hydroxybutyric acid), a polar, uncharged, essential amino acid found in peptide linkage in proteins.

pgaudet commented 2 months ago

Hi @raymond91125

I would suggest that we are a lot more precise about the fact that this represents a pathway and not any pathway:

biosynthetic process is defined as

A cellular process consisting of the biochemical pathways by which a living organism synthesizes chemical substances. This typically represents the energy-requiring part of metabolism in which simpler substances are transformed into more complex ones.

So, to build on that definition:

The biochemical pathways resulting in the synthesis of L-threonine, starting with L-homoserine.

or

The biochemical pathways resulting in the synthesis of L-threonine, starting with L-homoserine being converted into xx.

or

The biochemical pathways resulting in the synthesis of L-threonine from L-homoserine.

  1. I don't know if 'pathway' should be singular or plural.
  2. I am not sure about 'starts with', since 'starts with' refers to a process (an occurrent), not to a chemical (a continuant). (I think?) https://www.ebi.ac.uk/ols4/ontologies/ro/properties/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FRO_0002224

Example of usage: every insulin receptor signaling pathway starts with the binding of a ligand to the insulin receptor.

Thanks, Pascale

raymond91125 commented 2 months ago

@pgaudet I think we should not keep the term label as L-threonine biosynthetic process, since it follows a design pattern that does not include has_input. And I think we should make another design pattern that standardizes similar terms. Is there a process of proposing new design patterns?

pgaudet commented 2 months ago

Is there a process of proposing new design patterns?

You can just open a GH ticket, and use the label 'design pattern'.

But are we talking about 'L-threonine biosynthetic process'? In this case, the DP exists: https://github.com/geneontology/go-ontology/blob/master/src/design_patterns/biosynthetic_process.yaml

Also of course, we can modify it if needed.

I've added this to Monday's ontology call.

Thanks, Pascale

pgaudet commented 1 month ago

Nothing more to be done right now, this meets current design patterns.