geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

NTR - metabolism: [proteinogenic amino acid biosynthesis] #23268

Closed danielhhaft closed 9 months ago

danielhhaft commented 2 years ago

Please provide as much information as you can:

Biosynthesis of any amino acid that is incorporated into protein naturally by ribosomal translation of mRNA, and that has a specific codon for translation from mRNA to protein.

Biological processes for the biosynthesis of L-Ala, L-Cys, L-Asp, L-Glu, L-Phe, Gly, L-His, L-Ile, L-Lys, L-Leu, L-Met, L-Asn, L-Pro, L-Gln, L-Arg, L-Ser, L-Thr, L-Val, L-Trp, L-Tyr. May also include biosynthesis of selenocysteine, pyrrolysine, and N-formylmethionine. Includes processes in which biosynthesis is completed after a tRNA is charged (e.g. Ser-to-Sec conversion on the selenocysteine tRNA).

This term shall be used to distinguish its processes from "non-proteinogenic amino acid biosynthesis", as used in natural product biosynthesis, peptidoglycan biosynthesis (e.g. D-ala), metabolism (e.g. ornithine), etc.
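A minimal sketch of the membership test this definition implies, assuming amino acids are identified by the short names used in the list above. The set names and the function `is_proteinogenic` are hypothetical, and treating selenocysteine, pyrrolysine, and N-formylmethionine as members is optional, following the "may also include" wording.

```python
# The 20 amino acids named in the request (Gly is achiral, so it has no L- prefix).
STANDARD = frozenset({
    "L-Ala", "L-Cys", "L-Asp", "L-Glu", "L-Phe", "Gly", "L-His", "L-Ile",
    "L-Lys", "L-Leu", "L-Met", "L-Asn", "L-Pro", "L-Gln", "L-Arg", "L-Ser",
    "L-Thr", "L-Val", "L-Trp", "L-Tyr",
})

# Listed in the request as "may also include"; membership is an assumption here.
EXTENDED = frozenset({"selenocysteine", "pyrrolysine", "N-formylmethionine"})


def is_proteinogenic(name: str, include_extended: bool = True) -> bool:
    """Return True if `name` is in the proteinogenic set as sketched above."""
    if name in STANDARD:
        return True
    return include_extended and name in EXTENDED


# Anything outside the set falls under the proposed companion term.
for candidate in ("L-Ala", "D-Ala", "ornithine", "selenocysteine"):
    label = "proteinogenic" if is_proteinogenic(candidate) else "non-proteinogenic"
    print(f"{candidate}: {label}")
```

Under this sketch, D-Ala and ornithine fall outside the set, which matches the examples given for the non-proteinogenic companion term.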


pgaudet commented 1 year ago

See https://biocyc.org/META/NEW-IMAGE?type=ECOCYC-CLASS&object=IND-AMINO-ACID-SYN for a nice classification

tberardini commented 1 year ago

Would it be more accurate to say that these amino acids have been found in proteins but may also function on their own and not as a component of a protein? (think glutamate as a neurotransmitter)

edwong57 commented 11 months ago

@pgaudet and I were discussing this and #23269 and thought we could use ChEBI to group these amino acids. I was going to make a ChEBI ticket, but then found that they already have a proteinogenic amino acid term - https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI%3A83813 that these L-a.a. are under. As well as a non-proteinogenic amino acid term (https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:83820). Would we have to do anything other than make sure that the appropriate ChEBI term is added to the biosynthesis process?

pgaudet commented 11 months ago

Hi @edwong57, I think this works! Please create the new term, using 'proteinogenic amino acid' in the logical definition, and see that all amino acids get classified correctly.

OK?

Thanks, Pascale

edwong57 commented 11 months ago

@pgaudet, I created the term. [image]

When I went to look, only a portion of the metabolic processes were inferred children. [image]

GO:0006523: Alanine Biosynthetic process doesn't get classified under the new term. I think it is because the primary output is 'alanine zwitterion', which is not classified as a proteinogenic amino acid in ChEBI. Would the solution be to change all the inputs to the form that is classified as proteinogenic in ChEBI, to request that ChEBI classify these as proteinogenic, or to create a new term specific to the L-/proteinogenic form?
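A rough sketch of why the inference would fail, assuming classification simply follows is_a links from the chemical named as primary output. The hierarchy below is a toy stand-in, not the actual ChEBI graph; "other chemical class" is a placeholder and the function names are hypothetical.

```python
# Toy is_a links (child -> parents). NOT the real ChEBI hierarchy.
IS_A = {
    "L-alanine": {"proteinogenic amino acid"},
    "alanine zwitterion": {"other chemical class"},  # placeholder parent
    "proteinogenic amino acid": {"amino acid"},
}


def ancestors(term: str) -> set[str]:
    """Collect every class reachable from `term` by following is_a links."""
    found: set[str] = set()
    stack = list(IS_A.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(IS_A.get(current, ()))
    return found


def inferred_under(primary_output: str, grouping_class: str) -> bool:
    """A process is grouped only if its output is subsumed by the grouping class."""
    return primary_output == grouping_class or grouping_class in ancestors(primary_output)


print(inferred_under("L-alanine", "proteinogenic amino acid"))
print(inferred_under("alanine zwitterion", "proteinogenic amino acid"))
```

In this toy graph the zwitterion form has no path to the grouping class, so a process defined on it is not picked up. Either of the first two options in the question (changing the form used in the definition, or having the form reclassified upstream) would add the missing path.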

deustp01 commented 11 months ago

GO:0006523: Alanine Biosynthetic process doesn't get classified under the new term. I think it is because the primary output is 'alanine zwitterion', which is not classified as a proteinogenic amino acid in ChEBI.

@cmungall and here we are again hacking our way around gaps in the ChEBI ontology to try to represent biochemistry accurately. (Just saying ...)

edwong57 commented 11 months ago

Should we discuss at next editors' call?

danielhhaft commented 11 months ago

It looks like there is a problem here, for which the solution is allowing something like "aspartate biosynthetic process" to have two different parents. If it's a child of "aspartate family metabolic process", but so are terms for biosynthesis of some non-proteinogenic amino acids, then obviously "aspartate family metabolic process" cannot be a child of "proteinogenic amino acid biosynthesis". It has to aim one higher.
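The two-parent arrangement can be sketched as a small DAG. The term names below follow the wording in this thread rather than exact GO labels, "homoserine biosynthetic process" is used only as an illustrative non-proteinogenic sibling, and the root term and helper function are assumptions.

```python
from collections import deque

# child -> list of parents; an illustrative fragment, not the real GO graph.
PARENTS = {
    "aspartate biosynthetic process": [
        "aspartate family metabolic process",
        "proteinogenic amino acid biosynthesis",
    ],
    "homoserine biosynthetic process": [
        "aspartate family metabolic process",
        "non-proteinogenic amino acid biosynthesis",
    ],
    "aspartate family metabolic process": ["amino acid metabolic process"],
    "proteinogenic amino acid biosynthesis": ["amino acid metabolic process"],
    "non-proteinogenic amino acid biosynthesis": ["amino acid metabolic process"],
}


def ancestors(term: str) -> set[str]:
    """Breadth-first walk up the DAG, returning all ancestors of `term`."""
    seen: set[str] = set()
    queue = deque(PARENTS.get(term, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(PARENTS.get(current, []))
    return seen


for term in ("aspartate biosynthetic process", "homoserine biosynthetic process"):
    print(term, "->", sorted(ancestors(term)))
```

Both leaf terms keep the family parent, but only the aspartate term reaches "proteinogenic amino acid biosynthesis". If the family term itself were placed under the proteinogenic term, the homoserine term would inherit it as well, which is the misclassification described above.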

"proteinogenic amino acid biosynthesis" will be incredibly useful because it belongs to core metabolism. It's informative to know that the cell is actively making new amino acids, not just biding time, for example.

"non-proteinogenic amino acid biosynthesis", however, is the more exciting one. It covers multiple processes of secondary metabolism biosynthesis.

For me, this is a much more important distinction than "making amino acids, some for core metabolism, some for secondary metabolism, that use a few of the same building blocks as aspartate"

I suggest making the GO process term for biosynthesis of each of the 22 amino acids a direct child of "proteinogenic amino acid biosynthesis", since that is already exactly the definition of the term, "makes one of these 22 amino acids".

And then the other can be "makes any amino acid that is NOT one of the list of 22".

I will probably never assign "proteinogenic amino acid biosynthesis" directly, because instead I will apply the more specific child term for the one or more amino acids themselves. But the other, "non-proteinogenic", I've been waiting years for, because there are probably dozens at least that don't yet have their individual GO terms.

deustp01 commented 11 months ago

I suggest making the GO process term for biosynthesis of each of the 22 amino acids a direct child of "proteinogenic amino acid biosynthesis", since that is already exactly the definition of the term, "makes one of these 22 amino acids".

And how does that work when some freshly synthesized proteinogenic glutamate is directed away from protein synthesis to fill a synaptic vesicle in a neuron? A better solution might be to recognize that "proteinogenic" is a valid description of a role, like "toxin" or "drug", and is not a kind of function or process.

danielhhaft commented 11 months ago

"... how does that work when some freshly synthesized proteinogenic glutamate is directed away from protein synthesis to fill a synaptic vesicle in a neuron?"

I think it works fine, because glutamate is still glutamate. Obviously cells can use any amino acid for some purpose other than ribosomal synthesis. What cells can't do is use the ribosome to put ornithine into a polypeptide chain, or D-glutamate. Based on the literature I've read over the past 25 years, researchers hearing that an enzyme is involved in the biosynthesis of an amino acid will want to know, "do you mean 'amino acid' as one of those 20 [but actually 22] used in ribosomal synthesis, or do you mean one of the weird ones that simply cannot be used purposefully by tRNA ligases and then ribosomes?"

I'm very happy to agree that any of the 20 common amino acids may have an alternate use. Including by non-ribosomal peptide synthetases and various other biosynthesis. But they belong to an exclusive club, and giving researchers one GO term to rule them all, one GO term to bind them, it's overdue. It enables clearer thought than knowing which amino acids are in the pyruvate family. Which apparently are L-alanine and D-alanine, one of which is proteinogenic and one of which isn't.

An analogy would be a GO term for putting money into a retirement account. Yes, the money could be pulled out for a different purpose, such as dealing with a medical emergency. But still, the financial process is "putting money into a retirement account", whatever happens to the money later.

Proteinogenic amino acids is an account. Non-proteinogenic amino acids (please turn that one on now! I will use that one extensively) is a different account. I would very much like researchers into both core metabolism and natural product biosynthesis to be able to do better accounting here.

edwong57 commented 10 months ago

D-amino acid biosynthetic process, D-amino acid catabolic process and D-amino acid metabolic process already exist.

danielhhaft commented 10 months ago

I imagine

D-ornithine biosynthesis will go under D-amino acid biosynthetic process and L-ornithine biosynthesis will go under L-amino acid biosynthetic process

but that does not address the request I made, which I based on the idea that there will be users for whom "amino acid biosynthesis" is too broad, unable to distinguish the ones that ribosomes and tRNAs and dieticians care about from the thousands of other chemicals that technically qualify.

I see nearly every user wanting a good way to split up the amino acids into "used to build proteins" vs. "can't be used for that, must be used for something else."

It's mostly the companion to this one that I care about, "non-proteinogenic amino acid biosynthesis". Making any amino acid for which there is no tRNA that gets charged when the charging is correct and complete, where that tRNA is actually used by ribosomes.

vs. the process of synthesizing any amino acid for which there isn't any tRNA for the ribosome to use as an adapter for the purpose of putting the amino acid into a protein.

I agree it is important to specify D- or L- if that's what is being made.

But that is no replacement for giving users of GO the ability to select for proteinogenic (which also includes Gly) vs. non-proteinogenic (which can include any number of L-amino acids that are not among the 19).