geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

fatty acid biosynthesis vs elongation #10773

Open gocentral opened 10 years ago

gocentral commented 10 years ago

Hi GO

sorry to trouble you but I'd like to get some advice about a few GO terms relating to fatty acid biosynthesis.

GO includes terms for fatty acid elongation as shown below (fatty acids are synthesized by stepwise addition of 2-carbon building blocks):

GO:0030497 fatty acid elongation
GO:0019367 fatty acid elongation, saturated fatty acid
GO:0019368 fatty acid elongation, unsaturated fatty acid
GO:0034625 fatty acid elongation, monounsaturated fatty acid
GO:0034626 fatty acid elongation, polyunsaturated fatty acid

But there are also a number of other process terms, a bit higher level:

GO:0006633 fatty acid biosynthetic process (parent of GO:0030497 fatty acid elongation).

What is the difference in usage between the two? When would one use "GO:0006636 unsaturated fatty acid biosynthetic process" instead of "GO:0019368 fatty acid elongation, unsaturated fatty acid"?

Thanks for your help,

Best, Alan

Reported by: alanbridge

Original Ticket: geneontology/ontology-requests/10586

gocentral commented 10 years ago

Original comment by: paolaroncaglia

gocentral commented 10 years ago

Hi Alan,

I'm looking at a section of the 2002 edition of 'Biochemistry' (Stryer):

http://www.ncbi.nlm.nih.gov/books/NBK22554/

"...

  1. The enzymes of fatty acid synthesis in higher organisms are joined in a single polypeptide chain called fatty acid synthase. In contrast, the degradative enzymes do not seem to be associated.
  2. The growing fatty acid chain is elongated by the sequential addition of two-carbon units derived from acetyl CoA. The activated donor of two-carbon units in the elongation step is malonyl ACP. The elongation reaction is driven by the release of CO2. ...
  3. Elongation by the fatty acid synthase complex stops on formation of palmitate (C16). Further elongation and the insertion of double bonds are carried out by other enzyme systems."

So in terms of usage I'd annotate fatty acid synthase to GO:0006633 fatty acid biosynthetic process, and 'other enzyme systems' as above to GO:0030497 fatty acid elongation. My understanding is that unsaturated fatty acids may be synthesized and elongated by different enzymatic systems (http://www.ncbi.nlm.nih.gov/books/NBK22401/). In this sense, current manual experimental annotations to GO:0006636 unsaturated fatty acid biosynthetic process are more numerous and different from those to GO:0019368 fatty acid elongation, unsaturated fatty acid (with the exception of Wormbase elo-1). I'm trying to look these pathways up in Reactome but their server is down at the moment... Please let me know if you are still unconvinced of the need for separate synthesis and elongation terms, and I'll search for further details.

Thanks, Paola

Original comment by: paolaroncaglia

gocentral commented 10 years ago

I'm unconvinced... val

Original comment by: ValWood

gocentral commented 10 years ago

OK, I've gone back and looked into this further. The Reactome view is possibly a bit too rich for this, so I've gone back to the basics i.e. Stryer's Biochemistry again.

http://www.ncbi.nlm.nih.gov/books/NBK22554/

Section 22.4.3 shows that the biosynthesis of fatty acids occurs in cycles, which are referred to as 'elongation cycles'; this may cause some confusion. The first cycle yields a 4-carbon compound which, according to the fatty acid nomenclature we borrowed from ChEBI, is a short-chain fatty acid. But it is also the precursor for fatty acids in general, so I'd say the enzymes in table 22.2 should be annotated to 'fatty acid biosynthetic process'. [ChEBI nomenclature of fatty acids according to number of carbon atoms: <6=short, 6-12=medium, 13-22=long, >22=very long.]

The last paragraph in 22.4.3 describes the synthesis of fatty acids containing between 6 and 16 carbon atoms. I assume the same enzymes as in table 22.2 carry out these steps, so these enzymes really cover:

fatty acid biosynthetic process
short-chain fatty acid biosynthetic process
medium-chain fatty acid biosynthetic process
long-chain fatty acid biosynthetic process, IF the number of C atoms is between 13 and 16

but also:

fatty acid elongation, saturated fatty acid, IF the number of C atoms is not higher than 16

Because of the mix in number of C atoms, and because the same enzymes take part in these cyclic steps, it's hard to add part_of or has_part links between the biosynthesis and elongation terms, if you see what I mean.
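To make the chain-length classes in the bracketed note concrete, here is a minimal Python sketch. The function name is made up for illustration, and it assumes that "very long" starts immediately above C22, so that no chain length is left unclassified. It is not taken from ChEBI or GO tooling.

```python
def chain_length_class(n_carbons: int) -> str:
    """Return the chain-length class for a fatty acid with n_carbons carbon atoms.

    Cut-offs follow the ChEBI-style ranges quoted in the thread:
    <6 short, 6-12 medium, 13-22 long, above 22 very long (assumed boundary).
    """
    if n_carbons < 1:
        raise ValueError("a fatty acid needs at least one carbon atom")
    if n_carbons < 6:
        return "short-chain"
    if n_carbons <= 12:
        return "medium-chain"
    if n_carbons <= 22:
        return "long-chain"
    return "very long-chain"


# Chain lengths reached by successive cycles on the synthase, C4 through C16
for n in range(4, 17, 2):
    print(f"C{n}: {chain_length_class(n)}")
```

Running the loop shows why one set of enzymes ends up spanning several chain-length-specific biosynthesis terms: the intermediates pass through the short, medium and long classes before reaching C16.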

Then... "The synthesis of longer-chain fatty acids is discussed in Section 22.6." http://www.ncbi.nlm.nih.gov/books/NBK22401/ This section shows that enzymes different from the ones in table 22.2 catalyze elongation to yield fatty acids with more than 16 Cs and/or insertion of double bonds (desaturation). Elongation and desaturation reactions are often combined... In the example shown, NADH-cytochrome b5 reductase, cytochrome b5, and desaturase may be annotated to 'fatty acid elongation, monounsaturated fatty acid'. I guess the same enzymes may play a role in any desaturation reaction but I'm not sure. But maybe not, because "Mammals lack the enzymes to introduce double bonds at carbon atoms beyond C-9 in the fatty acid chain.".

I think we may need the opinion of a biochemist to get the (need for) terms and links straight... shall I ask Peter to take a look? Any other suggestion?

Thanks, Paola

Original comment by: paolaroncaglia

gocentral commented 10 years ago

Hi Paola

thanks for your reply, I'm still reading up on this issue myself. My current understanding is that the actual 4 reaction "biochemical cycle" of fatty acid elongation:

condensation to form a 3-keto-acyl
reduction of 3-keto-acyl to 3-hydroxy-acyl
dehydration of 3-hydroxy-acyl to 2,3-trans-enoyl
reduction of 2,3-trans-enoyl to 2,3-saturated-acyl

is thought to be the same for all lengths. The first reaction in this cycle is the actual "elongation" step, and the 3 subsequent reactions convert the 3-keto intermediate back to a regular fatty acid.

I think as you suggested that "synthesis" might have been designed to refer to "de novo" FA synthesis from 2C building blocks (repeated cycling to build 2C,4C,6C,8C.... an elongation process) by cytoplasmic fatty acid synthases, while "elongation" might have been intended to refer to the subsequent continued elongation of FA synthesized in this way (or eaten in the diet) in microsomes. You could imagine a fatty acid removed from a phospholipid (phospholipase A), attached to a CoA molecule, then subsequently fed back into the elongation pathway again.

I wonder if it might be easier to define processes by their start and end points using a TermGenie template?

That way you could define each elongation cycle as a process.

GO BP: elongation of hexadecanoyl-CoA to octadecanoyl-CoA

start point CHEBI:nnnnn hexadecanoyl-CoA

GO MF: condensation of hexadecanoyl-CoA and malonyl-CoA to form 3-keto-octadecanoyl-CoA
GO MF: reduction of 3-keto-octadecanoyl-CoA to 3-hydroxy-octadecanoyl-CoA
GO MF: dehydration of 3-hydroxy-octadecanoyl-CoA to 2,3-trans-octadecenoyl-CoA
GO MF: reduction of 2,3-trans-octadecenoyl-CoA to octadecanoyl-CoA

end point CHEBI:nnnnn octadecanoyl-CoA

I think on balance I'd define a process of FA biosynthesis, and break it up into steps like this. That would make precise enzyme annotation easier: elongases tend to have defined length ranges, so you would have specific process annotations for each elongase.
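As a rough illustration of the "one process per elongation cycle" idea, the Python sketch below builds the process label and the four function steps for any starting chain length. The class and function names are hypothetical, the generic "C16 acyl-CoA" style labels stand in for proper ChEBI names, and this is not how TermGenie templates are actually implemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ElongationCycle:
    """One round of two-carbon extension of a saturated acyl-CoA (hypothetical model)."""
    start_carbons: int  # chain length of the acyl-CoA entering the cycle

    @property
    def end_carbons(self) -> int:
        # Each cycle adds one two-carbon unit
        return self.start_carbons + 2

    def label(self) -> str:
        return (f"elongation of C{self.start_carbons} acyl-CoA "
                f"to C{self.end_carbons} acyl-CoA")

    def steps(self) -> List[str]:
        s, e = self.start_carbons, self.end_carbons
        return [
            f"condensation of C{s} acyl-CoA and malonyl-CoA to form 3-keto-C{e} acyl-CoA",
            f"reduction of 3-keto-C{e} acyl-CoA to 3-hydroxy-C{e} acyl-CoA",
            f"dehydration of 3-hydroxy-C{e} acyl-CoA to 2,3-trans-C{e} enoyl-CoA",
            f"reduction of 2,3-trans-C{e} enoyl-CoA to C{e} acyl-CoA",
        ]


def cycles_between(start: int, end: int) -> List[ElongationCycle]:
    """List the cycles needed to extend a chain from start to end carbons."""
    if end <= start or (end - start) % 2 != 0:
        raise ValueError("end must exceed start by a positive multiple of two")
    return [ElongationCycle(n) for n in range(start, end, 2)]


for cycle in cycles_between(16, 20):
    print(cycle.label())
    for step in cycle.steps():
        print("  GO MF:", step)
```

An elongase with a defined length range would then map to a contiguous run of such cycles, which is the precision gain described above, at the cost of a large number of very similar terms.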

NB Although the above shows acyl-CoA, these reactions can in some instances take place on acyl carrier protein too (and in many cases it's not really known, so you could just define more general versions of these functions). Desaturation can also occur in a variety of lipids (-CoA, -ACP, and actually in situ in phospholipids themselves) and it's not all that well understood.

Thanks for your help Paola, would also be interested to hear what the reactome guys think too.

Cheers, Alan

Original comment by: alanbridge

gocentral commented 10 years ago

Thanks Alan. I'll email Peter. I like your suggestion but we'd have to use different types of relationships: start point CHEBI:nnnnn corresponds more or less to our has_input (that we currently use for catabolism terms) and end point CHEBI:nnnnn corresponds to our has_output (that we use for biosynthesis terms).

Let's see what Peter suggests, thanks, Paola

Original comment by: paolaroncaglia

gocentral commented 10 years ago

thanks Paola, one last thing - I wondered would it be possible (maybe in termgenie?) to define (metabolic) pathways using both has_input and has_output - so you aren't obliged to classify pathways as catabolic or anabolic?

the catabolism of one molecule is often linked to the synthesis of another:

e.g. "production of hexadecanoate from sphingosine-1-phosphate"
has_input sphingosine-1-phosphate
has_output hexadecanoate

otherwise you might be obliged to annotate a protein with

catabolism of sphingosine-1-phosphate
biosynthesis of hexadecanoate
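A small Python sketch of the difference between the two styles, using the example above. The `Pathway` class and its method are invented for illustration; only the has_input / has_output relation names come from the discussion.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Pathway:
    """A pathway described by both ends (hypothetical model, not a GO data structure)."""
    label: str
    has_input: str
    has_output: str

    def split_annotations(self) -> Tuple[str, str]:
        """The two separate labels needed when a term can name only one end."""
        return (f"catabolism of {self.has_input}",
                f"biosynthesis of {self.has_output}")


p = Pathway(
    label="production of hexadecanoate from sphingosine-1-phosphate",
    has_input="sphingosine-1-phosphate",
    has_output="hexadecanoate",
)

print(p.label)                # one annotation that carries both ends
print(p.split_annotations())  # the pair of annotations it would replace
```

With both relations on a single term, one annotation records the link between the two molecules, which the pair of catabolism and biosynthesis annotations loses.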

Cheers, Alan

Original comment by: alanbridge

gocentral commented 10 years ago

Hi Alan, that links to a wider discussion among editors, I'll ask if anyone wants to comment on this. Sorry I can't offer a more speedy resolution but it's useful discussion!

Paola

Original comment by: paolaroncaglia

gocentral commented 10 years ago

One thing that comes straight to my mind is that the biosynthesis terms can be used for named entities, that is, they can be used to describe the biosynthesis of a chemical that is a child of fatty acid in ChEBI. The elongation terms on the other hand would not be used in this way because the elongating chains would not be used to describe the final product. The output of the biosynthesis would be the ChEBI entity. The output of an elongation process would be a chain plus two. It may be that we want the input of the biosynthesis process to be further upstream than a chain minus two and we may want other subprocesses included in the biosynthesis of a given fatty acid.

In general, we are working on modeling metabolic pathways with both inputs and outputs, as well as functions that are steps in those pathways. We have just begun to try this and are still working on it.

David

Original comment by: ukemi

gocentral commented 10 years ago

Thanks David, maybe we could discuss this in Texas some more.

Original comment by: alanbridge

gocentral commented 10 years ago

I've just caught up with this issue and don't have anything extra to add except that using has_input and has_output would help clarify the terms. There has already been talk about revisiting enzymes/metabolism at the Texas meeting so the use of these relationships in GO and how we will incorporate them would be a good agenda item.

Original comment by: tberardini

gocentral commented 10 years ago

It seems that most topics in this ticket would require further discussion, possibly at the next GOC meeting. I'll leave it open for the time being - Peter emailed me that the issue is quite complex and involves taxon differences - he'll add his comments here but it might take some time. Alan, I hope this is ok with you. And happy holidays to everyone :-)

Paola

Original comment by: paolaroncaglia

gocentral commented 10 years ago

Sure, that sounds great. Enjoy the holidays.

Original comment by: alanbridge

gocentral commented 10 years ago

Original comment by: paolaroncaglia

gocentral commented 10 years ago

The FAS (fatty acid synthase) enzyme of yeast, chickens, and several mammals is a large cytosolic protein with separate domains that catalyze each of the reactions needed to add a 2-carbon unit to a growing fatty acid chain. Throughout the process, the chain is covalently attached to the ACP domain of FAS by a thioester linkage. Until the growing fatty acid reaches a length sufficient to make it a substrate of the thioesterase domain of FAS or of a separate thioesterase enzyme found in lactating mammary gland tissue, the only option open to it is further rounds of chain extension. The synthetic process is most efficiently primed with a two-carbon unit (acetyl CoA) and its major product outside the mammary gland is palmitic acid (C16). The process can also be primed less efficiently in vitro by a three- or four-carbon unit (propionyl CoA, butyryl CoA) to yield C17 or C18 products, respectively. The extent of these alternative reactions in vivo is unclear and probably varies according to taxon, tissue, and metabolic state.

For the default version of de novo fatty acid synthesis, best studied in mammalian and chicken liver, then, the major inputs are acetyl CoA as a source of carbon atoms and NADPH as a source of reducing equivalents, and the major outputs are palmitate (the 16-carbon unbranched fully saturated fatty acid), CoA, and NADP+. While shorter fatty acids and derivatives of them appear as covalently bound intermediates in this synthesis, they are not normal reaction products any more than any other reaction intermediate that features a derivative of a substrate covalently attached to the enzyme would be.
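The input/output bookkeeping for de novo synthesis can be sketched in a few lines of Python. This assumes the usual textbook stoichiometry (one malonyl unit consumed and one CO2 released per cycle, and two NADPH per cycle for the two reduction steps) and ignores the ATP spent making malonyl CoA. The function name is made up for illustration.

```python
def de_novo_requirements(product_carbons: int, primer_carbons: int = 2) -> dict:
    """Count cycles and cofactors needed to build a saturated chain on the synthase.

    Assumes each cycle adds two carbons from one malonyl unit, releases one CO2,
    and uses two NADPH (textbook stoichiometry, stated here as an assumption).
    """
    extra = product_carbons - primer_carbons
    if extra <= 0 or extra % 2 != 0:
        raise ValueError("product must exceed the primer by a positive multiple of two carbons")
    cycles = extra // 2
    return {
        "cycles": cycles,
        "malonyl_units": cycles,
        "NADPH": 2 * cycles,
        "CO2_released": cycles,
    }


print(de_novo_requirements(16))                    # palmitate from an acetyl primer
print(de_novo_requirements(17, primer_carbons=3))  # C17 from a propionyl primer
print(de_novo_requirements(18, primer_carbons=4))  # C18 from a butyryl primer
```

Under these assumptions all three cases take seven cycles, so the alternative primers change the product length without changing the number of rounds on the enzyme.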

Here are some references for the “liver” version of the process. Lin and Smith 1978 (PMID:416021) – thioesterase domains purified from rat liver and rat lactating mammary gland FAS proteins had virtually indistinguishable physical and enzymatic properties. In particular, both polypeptides efficiently cleaved palmitoyl-CoA (C16) and stearoyl-CoA (C18) to the corresponding free fatty acid, but had no activity against medium chain length (C10, C12, C14) fatty acyl CoAs. Mattick et al. 1983 (PMID:6654913) likewise showed that a thioesterase domain purified from chicken liver FAS catalyzed the cleavage of palmitoyl-CoA but did not report tests of activity against other fatty acyl CoAs. Pazirandeh et al. 1989 (PMID:2681189) showed that a recombinant construct corresponding to the acyl carrier and thioesterase domains of chicken FAS is active against C16 and C18 fatty acyl esters and much less active against C14 or C20 fatty acyl esters. Modeling studies of the thioesterase domain of human FAS determined by X-ray crystallography are likewise consistent with specificity of the thioesterase activity for palmitoyl (C16) acyl substrates.

In a variant process that occurs in lactating mammary glands of non-ruminant mammals and in the uropygial glands of some birds, a thioesterase enzyme encoded by a gene distinct from the one that encodes FAS interacts with FAS molecules bearing a fatty acyl group of moderate chain length (C8 – C12) and catalyzes the release of these chains as free medium chain length fatty acids (Libertini and Smith 1987 (PMID:627544); Mikkelsen et al. 1987 (PMID:3805044)).

Both the “liver” and “mammary” versions of de novo fatty acid biosynthesis yield saturated unbranched fatty acids. In mammals, saturated unbranched fatty acids with more than 18 carbons are synthesized by addition of 2-carbon units in a sequence of reactions exactly like that performed by FAS. These reactions, however, are catalyzed by separate enzymes associated with the endoplasmic reticulum. Double bonds can also be introduced into saturated fatty acids, and the two kinds of events (further elongation, desaturation) can probably occur in either order.

To account for mammalian fatty acid synthesis, it would be sufficient to have process terms for the generation of saturated medium- and long-chain fatty acids, and for the desaturation and further elongation of the latter. While the enzymes that catalyze the desaturation and further elongation may well have substrate preferences (e.g., to extend preferentially fatty acids desaturated at a particular position, or to desaturate fatty acids of a particular chain length or which already contain a double bond at a particular position), GO terms fine-grained enough to capture these specificities and distinctions may come dangerously close to describing activities or processes associated with specific proteins. It may be sufficient to have terms for the synthesis of medium- and long-chain saturated fatty acids, and for the elongation and desaturation of fatty acids.

Original comment by: deustp01

gocentral commented 10 years ago

Hi all,

Many thanks to Peter for his detailed summary and references. My understanding is that there are significant taxon differences in these processes and I wonder if this topic would/should be material for modular annotation (LEGO, but we should stop using that name). But as for having useful and valid terms in the ontology, these are the key points by Peter:

"To account for mammalian fatty acid synthesis, it would be sufficient to have process terms for the generation of saturated medium- and long-chain fatty acids, and for the desaturation and further elongation of the latter. While the enzymes that catalyze the desaturation and further elongation may well have substrate preferences (e.g., to extend preferentially fatty acids desaturated at a particular position, or to desaturate fatty acids of a particular chain length or which already contain a double bond at a particular position), GO terms fine-grained enough to capture these specificities and distinctions may come dangerously close to describing activities or processes associated with specific proteins. It may be sufficient to have terms for the synthesis of medium- and long-chain saturated fatty acids, and for the elongation and desaturation of fatty acids."

So, necessary terms would be:

1) generation (synthesis) of saturated medium- and long-chain fatty acids

GO already has these:

GO:0051792 medium-chain fatty acid biosynthetic process
is_a fatty acid biosynthetic process
Def: The chemical reactions and pathways resulting in the formation of any fatty acid with a chain length of between C6 and C12.
[no mention of saturation though]

GO:0042759 long-chain fatty acid biosynthetic process
is_a fatty acid biosynthetic process
Def: The chemical reactions and pathways involving long-chain fatty acids. A long-chain fatty acid is a fatty acid with a chain length between C13 and C22.
[no mention of saturation though]

Are the defs and parentage ok? Should we add anything in the defs. or def. comments? Are the terms the needed ones, in the end?

2) desaturation and further elongation of (long-chain?) fatty acids

GO already has:

GO:0006636 unsaturated fatty acid biosynthetic process
is_a fatty acid biosynthetic process
Def: The chemical reactions and pathways involving an unsaturated fatty acid, any fatty acid containing one or more double bonds between carbon atoms.
Should this term have a different parent & name - is it really it? (Note for self: if it's edited, make sure its reg. children are too.)

GO:0030497 fatty acid elongation
is_a fatty acid biosynthetic process
Def: The elongation of a fatty acid chain by the sequential addition of two-carbon units.
Is this term ok, or should it have a different placement - and should any of its children be obsoleted? Or can they be useful to annotate non-mammalian proteins?

Should the 4 terms I just listed above be restricted to Mammalia? In general, should any of the current children/grandchildren of GO:0006633 fatty acid biosynthetic process be obsoleted or restricted to one or more taxa? Can we add annotation examples?

Lastly, could David or Peter please add the relevant topics from this discussion to the agenda for the Texas meeting (general issues on metabolism terms and relationships).

Thanks! Paola

Original comment by: paolaroncaglia

gocentral commented 10 years ago

A parent process, "Biosynthesis of any sort of long chain fatty acid", could have three children, "saturated long-chain fatty acid biosynthetic process", "unsaturated fatty acid biosynthetic process", and "long chain fatty acid elongation process". In mammals, but also in yeast (Stryer 4th edition, page 618), the second and third processes take as their input the output of the first one. In other taxa, this order constraint may not exist but I think the sibling arrangement accommodates that. When a fatty acid is to be both elongated and desaturated in a mammal, I don't know what the order is - perhaps either is possible or even back and forth, a desaturation step, a few elongation steps, another desaturation.

Maybe a good way to accommodate synthesis of medium and short chain-length fatty acids would be to have a "saturated fatty acid biosynthetic process" term (agnostic as to chain length), which itself could have chain length-specific child terms. Desaturation could also have child terms for fatty acids of particular chain lengths or preexisting double bonds. These last two refinements would be useful for supporting taxon constraints - just as chain length varies between species and physiological state (e.g., mammalian exception for lactation), there are constraints for mammals about where a double bond can be put that definitely do not apply to plants (that's why some plant-derived unsaturated fatty acids are essential human dietary constituents).

Original comment by: deustp01

gocentral commented 10 years ago

Paola just posted a link to this ticket on this other one

https://sourceforge.net/p/geneontology/ontology-requests/10634/

I think they are different enough to not mark one as duplicate of, since 10634 is about the related activity terms, but perhaps whatever is done will or should resolve both at the same time, as there is the question of what kinds of evidence would lead one to use process vs activity terms.

I dislike the idea of having separate processes for each round of the elongation cycle, unless there is a biologically relevant scenario where one would separate them (i.e. I would not count in vitro systems that break the cycle as biological processes)

Original comment by: jimhu

gocentral commented 10 years ago

Original comment by: paolaroncaglia

gocentral commented 10 years ago

Also see https://sourceforge.net/p/geneontology/ontology-requests/7887/

Original comment by: paolaroncaglia

gocentral commented 10 years ago

Original comment by: ukemi

ValWood commented 3 years ago

@ukemi what is the action here?

ValWood commented 1 year ago

The original question was:

What is the difference in usage between the two? When would one use "GO:0006636 unsaturated fatty acid biosynthetic process" instead of "GO:0019368 fatty acid elongation, unsaturated fatty acid"?

causing this problem:

[Screenshot, 2022-11-19 at 10:28:53; image not available]

(multiple axes of classification). We need to decide whether both are required (and, if so, add the missing parentages).
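One way to see the missing parentage is a small ancestor check in Python over the is_a links mentioned in this thread. The edges for the children of GO:0030497 are an assumption read off the term listing in the original question, not a dump of the current ontology, and the helper function is made up for illustration.

```python
# is_a edges: child -> list of parents
IS_A = {
    # stated explicitly in the thread
    "GO:0030497": ["GO:0006633"],  # fatty acid elongation
    "GO:0006636": ["GO:0006633"],  # unsaturated fatty acid biosynthetic process
    "GO:0051792": ["GO:0006633"],  # medium-chain fatty acid biosynthetic process
    "GO:0042759": ["GO:0006633"],  # long-chain fatty acid biosynthetic process
    # assumed from the term listing (not verified against the ontology file)
    "GO:0019367": ["GO:0030497"],  # fatty acid elongation, saturated fatty acid
    "GO:0019368": ["GO:0030497"],  # fatty acid elongation, unsaturated fatty acid
    "GO:0034625": ["GO:0019368"],  # fatty acid elongation, monounsaturated fatty acid
    "GO:0034626": ["GO:0019368"],  # fatty acid elongation, polyunsaturated fatty acid
}


def ancestors(term, is_a):
    """Collect every term reachable from `term` by following is_a links upwards."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


# Is "elongation, unsaturated" classified under "unsaturated biosynthesis"?
print("GO:0006636" in ancestors("GO:0019368", IS_A))  # False with the edges above
```

With these edges the two terms meet only at GO:0006633, so an annotation to the elongation term is not picked up by a query on the unsaturated biosynthesis term. Adding a second is_a parent would close that gap, if both axes are kept.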