ebi-chebi / ChEBI

Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds.
https://www.ebi.ac.uk/chebi
Creative Commons Attribution 4.0 International

Amino acid families, antigens and more #2749

Closed muthuvenkat closed 8 years ago

muthuvenkat commented 11 years ago

Dear ChEBI curators,

I'm going through a list of GO metabolism terms that don't have cross-products to CHEBI terms yet, trying to add them in where appropriate. I have a few questions that I hope you may help me with. Some of them are related (amino acid families, antigens). They stem from the following GO terms:

GO:0009067 aspartate family amino acid biosynthetic process Def: The chemical reactions and pathways resulting in the formation of amino acids of the aspartate family, comprising asparagine, aspartate, lysine, methionine and threonine. Should there be a CHEBI grouping term for these?

GO:0046145 D-alanine family amino acid biosynthetic process Def: The chemical reactions and pathways resulting in the formation of D-alanine and related amino acids. Its only child is GO:0030632 D-alanine biosynthetic process and it maps to D-alanine (CHEBI:15570). What family of compounds does GO:0046145 refer to? It doesn't have any annotation, so I can't tell how it was meant to be used. Should we get rid of it and just leave its child?

GO:0009084 glutamine family amino acid biosynthetic process Def: The chemical reactions and pathways resulting in the formation of amino acids of the glutamine family, comprising arginine, glutamate, glutamine and proline. Should there be a CHEBI grouping term for these?

GO:0009076 histidine family amino acid biosynthetic process Def: The chemical reactions and pathways resulting in the formation of amino acids of the histidine family. No direct annotation; no children other than 'histidine biosynthetic process'. Is there really such a family? Should there be a grouping term in CHEBI, or is the GO term useless?

GO:0009079 pyruvate family amino acid biosynthetic process Def: The chemical reactions and pathways resulting in the formation of any amino acid that requires pyruvate for its synthesis, e.g. alanine. Its only child is 'alanine biosynthetic process'. Is this grouping term really meaningful, and if so is there any CHEBI compound I could map it to?

GO:0009070 serine family amino acid biosynthetic process Def: The chemical reactions and pathways resulting in the formation of amino acids of the serine family, comprising cysteine, glycine, homoserine, selenocysteine and serine. Should there be a grouping term in CHEBI?

GO:0042872 D-galactarate biosynthetic process Def: The chemical reactions and pathways resulting in the formation of D-galactarate, the D-enantiomer of the anion of galactaric acid. Could we please have CHEBI:NEW D-galactaric acid anion is enantiomer of galactaric acid anion (CHEBI:48871) synonym D-galactarate

GO:0009246 enterobacterial common antigen biosynthetic process Def: The chemical reactions and pathways resulting in the formation of the enterobacterial common antigen, an acidic polysaccharide containing N-acetyl-D-glucosamine, N-acetyl-D-mannosaminouronic acid, and 4-acetamido-4,6-dideoxy-D-galactose. A major component of the cell wall outer membrane of Gram-negative bacteria. Would 'enterobacterial common antigen' belong in CHEBI? Asking because I see quite a few specific children of CHEBI 'antigen'.

GO:0009248 K antigen biosynthetic process Def: The chemical reactions and pathways resulting in the formation of a K antigen, a capsular polysaccharide antigen carried on the surface of bacterial capsules that masks somatic (O) antigens. Would 'K antigen' belong in CHEBI? I see quite a few children of 'antigen', but I don't think this one is in (I might have missed it; it's not an easy term to search for).

GO:0046951 ketone body biosynthetic process Def: The chemical reactions and pathways resulting in the formation of ketone bodies, any one of the three substances: acetoacetate, D-3-hydroxybutyrate (beta-hydroxybutyrate) or acetone. Biosynthesis involves the formation of hydroxymethylglutaryl-CoA, which is cleaved to acetate and acetyl-CoA. Would a grouping term 'ketone body' belong in CHEBI? I presume not based on Wikipedia '...although beta-hydroxybutyric acid is not technically a ketone but a carboxylic acid'?

GO:0010025 wax biosynthetic process Def: The chemical reactions and pathways involving wax, a compound containing C16 and C18 fatty acids. Would a grouping term 'wax' belong in ChEBI?

Thanks in advance for your feedback. I might have more questions in the near future, but hopefully not too many. Cheers, Paola.

Reported by: paolaroncaglia

muthuvenkat commented 11 years ago

Hi Paola,

1) Amino acid families. We now have:
a) aspartate family amino acid (CHEBI:22658) [children are the L-enantiomers of aspartic acid, asparagine, lysine, methionine, threonine and isoleucine].
b) glutamine family amino acid (CHEBI:24318) [children are the L-enantiomers of glutamic acid, glutamine, proline and arginine].
c) pyruvate family amino acid (CHEBI:26463) [children are the L-enantiomers of alanine, valine and leucine].
d) serine family amino acid (CHEBI:26650) [children are glycine and the L-enantiomers of serine, cysteine and homocysteine].
e) erythrose 4-phosphate/phosphoenolpyruvate family amino acid (CHEBI:73690) [children are the L-enantiomers of phenylalanine, tyrosine and tryptophan].
I have not created a class for the histidine amino acid family, as histidine would be the only member, so it seems a bit pointless! I think the same would also apply to D-alanine.
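For anyone reconciling the two resources, here is a minimal sketch that compares the members named in the GO definitions quoted in this ticket with the children listed for the new ChEBI classes. The dictionaries are transcribed by hand from this thread; the stereo prefixes are dropped and the acid names are normalised to the anion names used by GO, both of which are assumptions made only so the sets can be compared. The function name `compare` is hypothetical.

```python
# Members named in the GO definitions quoted in the original ticket.
GO_FAMILIES = {
    "aspartate": {"asparagine", "aspartate", "lysine", "methionine", "threonine"},
    "glutamine": {"arginine", "glutamate", "glutamine", "proline"},
    "serine": {"cysteine", "glycine", "homoserine", "selenocysteine", "serine"},
}

# Children listed for the ChEBI classes in the reply above.
# Assumption: "aspartic acid" / "glutamic acid" are written as the anion
# names so that they match the GO wording; L- prefixes are dropped.
CHEBI_FAMILIES = {
    "aspartate": ("CHEBI:22658", {"aspartate", "asparagine", "lysine",
                                  "methionine", "threonine", "isoleucine"}),
    "glutamine": ("CHEBI:24318", {"glutamate", "glutamine", "proline", "arginine"}),
    "serine": ("CHEBI:26650", {"glycine", "serine", "cysteine", "homocysteine"}),
}


def compare(family):
    """Report which members appear in only one of the two groupings."""
    go_members = GO_FAMILIES[family]
    chebi_id, chebi_members = CHEBI_FAMILIES[family]
    return {
        "chebi_id": chebi_id,
        "only_in_go": sorted(go_members - chebi_members),
        "only_in_chebi": sorted(chebi_members - go_members),
    }


if __name__ == "__main__":
    for name in GO_FAMILIES:
        print(name, compare(name))
```

On the data as transcribed, the glutamine family agrees, the ChEBI aspartate family additionally contains isoleucine, and the serine family differs in both directions (homoserine and selenocysteine in GO; homocysteine in ChEBI), so those definitions may be worth a second look before mapping.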

2) D-galactarate - not sure what is wanted here. Galactaric acid contains a plane of symmetry, so the terms D-galactaric acid and L-galactaric acid are meaningless - it is a meso compound with no optical rotation. In principle, it would be possible to selectively deprotonate the pro-R (or pro-S) carboxy group, but I'm not sure which one would need to be deprotonated to give D-galactarate. (Deprotonating both would give a meso-dianion). So this is still pending.
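The symmetry argument can be illustrated with a small sketch. It represents an open-chain aldaric acid by the side (L or R) of each hydroxy group in a Fischer projection, and calls the compound meso when its mirror image can be recovered by rotating the projection 180 degrees in the plane, which is only a permitted comparison when both chain ends are identical. The hydroxy patterns below are taken from the standard Fischer projections of D-galactose and D-glucose (C2 to C5) and are not stated in this thread; the function names are hypothetical.

```python
def mirror(fischer):
    """Mirror image: every substituent swaps sides."""
    swap = {"L": "R", "R": "L"}
    return tuple(swap[side] for side in fischer)


def rotate_180(fischer):
    """In-plane 180-degree rotation: order is reversed and sides are swapped."""
    return mirror(tuple(reversed(fischer)))


def is_meso(fischer):
    """True if the mirror image is the same molecule, assuming identical chain ends."""
    if not fischer:
        return False  # no stereocentres, so "meso" does not apply
    return mirror(fischer) == rotate_180(fischer)


# Assumption: OH sides at C2..C5, from the usual Fischer projections of the
# parent sugars, with both chain ends oxidised to carboxy groups.
GALACTARIC = ("R", "L", "L", "R")
GLUCARIC = ("R", "L", "R", "R")

if __name__ == "__main__":
    print("galactaric is meso:", is_meso(GALACTARIC))
    print("glucaric is meso:", is_meso(GLUCARIC))
```

Under these assumptions galactaric acid comes out as meso and glucaric acid does not, which is consistent with D-glucarate being a meaningful name while D-galactarate is not. The check no longer applies once the two ends differ, for example after deprotonating only one carboxy group.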

3) Antigen terms - should be OK - I'll try to get these sorted next week.

4) Ketone body - now done (CHEBI:73693).

5) Wax - Now done (CHEBI:73702), but not easy to classify, as a wax may be a pure single substance, or a mixture, which only join at the very top of the tree. Will have to rely on classification of child terms to classify as organic, etc.

Cheers, Gareth

Original comment by: G-Owen

muthuvenkat commented 11 years ago

Dear Gareth,

Many thanks for your feedback. I'm going through the list and making the necessary changes in GO.

I have a question on 'D-alanine family amino acid'. You wrote: "I have not created a class for histidine amino acid family, as histidine would be the only member, so it seems a bit pointless! I think the same would also apply to D-alanine." Unfortunately in my initial ticket I only listed 'D-alanine family amino acid biosynthetic process', whose only child is GO:0030632 D-alanine biosynthetic process and it maps to D-alanine (CHEBI:15570). However, GO:0046144 D-alanine family amino acid metabolic process (defined as 'The chemical reactions and pathways involving D-alanine and related amino acids') has children for D-alanine, D-arginine, D-glutamate, D-glutamine and D-ornithine. Are these correct? If so, would this warrant the creation of a CHEBI term for 'D-alanine family amino acid'?

Cheers, Paola

Original comment by: paolaroncaglia

muthuvenkat commented 11 years ago

Hi again Gareth,

About 'pyruvate family amino acid'. The metabolic term in GO is defined as 'The chemical reactions and pathways involving any amino acid that requires pyruvate for its synthesis, e.g. alanine'. Alanine is indeed the only child term. Does this grouping term have a meaning? Otherwise I'll obsolete all GO terms referring to 'pyruvate family amino acid' because they don't have any direct annotation at all.

Thanks! Paola

Original comment by: paolaroncaglia

muthuvenkat commented 11 years ago

Hi Paola

The 2 antigen terms have now been added to CHEBI:

K antigen (CHEBI:73772)
enterobacterial common antigen (CHEBI:73774)

These will be visible to you after our next release scheduled for June 2nd.

Cheers, Steve

Original comment by: stevet7

muthuvenkat commented 11 years ago

Hi Paola,

Re: amino acid families. My knowledge of the biosynthetic origins of amino acids is only as good as Wikipedia's ( http://en.wikipedia.org/wiki/Amino_acid_synthesis ), so I am open to correction here, but they reckon that the pyruvate family should contain alanine, valine and leucine, so I made these child terms of CHEBI:26463. All of the classes they list contain several members, with the exception of histidine, whose origins are apparently unlike those of any other amino acid. That is why I didn't create a histidine family term: a class that can only ever have a single child term seems a bit silly, ontology-wise.

I have no authoritative knowledge about how D-amino acids are produced in nature, but had just assumed that they were made from the corresponding L-amino acids via an 'oxidation alpha- to the carboxy group followed by enantioselective reduction' -type process. If my assumption is correct, then either: 1) The D-amino acids should be members of the same families as the corresponding L-amino acids (so that the children of aspartate family amino acid (CHEBI:22658) would be both the L- and the D-enantiomers of aspartic acid, asparagine, lysine, methionine, threonine and isoleucine), or 2) There would be separate D-amino acid families (D-aspartate family amino acid; D-glutamate family amino acid; etc.) whose members would be the D-enantiomers of the members of the corresponding L- families.

Do you have a preference (or any other option!)? - as I say, my knowledge of how these things are produced and what aspects are important to biologists is extremely limited, so I am happy to go along with any workable solution that meets user needs.
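To make the two options concrete, here is a rough sketch of the class-to-children mapping each one would produce for the aspartate family. The member list is the one given for CHEBI:22658 earlier in this thread; the helper names `option_1` and `option_2` are hypothetical, and neither layout is claimed to be what ChEBI actually adopted.

```python
ASPARTATE_FAMILY = "aspartate family amino acid"
MEMBERS = ["aspartic acid", "asparagine", "lysine",
           "methionine", "threonine", "isoleucine"]


def option_1(family, members):
    """One class per family, holding both the L- and the D-enantiomers."""
    children = [f"{prefix}-{name}" for name in members for prefix in ("L", "D")]
    return {family: children}


def option_2(family, members):
    """Separate L- and D- family classes, one enantiomer series each."""
    return {
        family: [f"L-{name}" for name in members],
        f"D-{family}": [f"D-{name}" for name in members],
    }


if __name__ == "__main__":
    print(option_1(ASPARTATE_FAMILY, MEMBERS))
    print(option_2(ASPARTATE_FAMILY, MEMBERS))
```

Option 1 keeps the number of grouping classes fixed but doubles the children of each; option 2 doubles the grouping classes and keeps each one enantiomerically uniform.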

Cheers, Gareth

Original comment by: G-Owen

muthuvenkat commented 11 years ago

Dear Gareth (and Steve), thanks for your work!

Re. D-galactarate: summing up:

[my query:] GO:0042872 D-galactarate biosynthetic process Def: The chemical reactions and pathways resulting in the formation of D-galactarate, the D-enantiomer of the anion of galactaric acid. Could we please have CHEBI:NEW D-galactaric acid anion is enantiomer of galactaric acid anion (CHEBI:48871) synonym D-galactarate

[Gareth's reply]: D-galactarate - not sure what is wanted here. Galactaric acid contains a plane of symmetry, so the terms D-galactaric acid and L-galactaric acid are meaningless - it is a meso compound with no optical rotation. In principle, it would be possible to selectively deprotonate the pro-R (or pro-S) carboxy group, but I'm not sure which one would need to be deprotonated to give D-galactarate. (Deprotonating both would give a meso-dianion). So this is still pending.

I'm afraid I'm not qualified to make statements here. However, I looked into how D-galactarate terms are (manually) annotated in GO. I found that D-galactarate metabolism and catabolism have been used to annotate bacterial proteins, based on PMID:9772162, PMID:4887503 and PMID:10952301. I'm not quite sure if the use of 'D-galactarate' is justified here. I'd be very grateful if you could please have a quick look at those papers and advise. If 'D-galactarate' shouldn't exist, I'll merge the corresponding terms into their parent 'galactarate' terms so we don't lose the annotations.

Many thanks, Paola

Original comment by: paolaroncaglia

muthuvenkat commented 11 years ago

Hi Paola,

I have had a look at the three papers you mention. The first one (Biochemistry, 1998, 37, 14369-14375) uses the phrase "(D)-glucarate/galactarate catabolic pathway" and in one scheme erroneously uses the label (D)-galactarate for a structure. But it is clear that this is an error: in a later scheme, they compare the same structure, this time correctly labelled "Galactarate", with that of "(D)-altronate". In all cases, they are talking about the dianion, which cannot be D. The second one (J. Bacteriol., 1969, 97, 1227-1233) only uses the term "galactarate" in reference to earlier work (Biochem. Biophys. Res. Commun., 1963, v11, 239-243), which in turn only mentions "galactarate" in reference to a 1958 paper (Bacteriol. Proc., 101) which I can't find. The third (Nature, 2000, v406, 477-483) seems to make no mention of galactarate. So I reckon "D-galactarate" shouldn't exist!

Cheers, Gareth

Original comment by: G-Owen
