cmungall / disjointness-analysis

biosynthesis and catabolism #2

Open ukemi opened 4 years ago

ukemi commented 4 years ago

At first glance, it might seem that biosynthesis and catabolism would be disjoint; however, I don't think this is always true. It seems to me that it would be valid to have a process that catabolizes one compound and at the end of the process has an output of a compound that would be considered synthesized. This is probably not always the case, but it is sometimes. @deustp01 do you agree? It seems kind of funny when you think about biosynthesis/anabolism processes in the energy flow aspects of the classic biochemistry textbooks. The alternative would be to classify a pathway as only biosynthesis or catabolism and simply have the respective inputs and outputs be just that. They would only be double-classified if the pathway had both a primary input and a primary output.

ukemi commented 4 years ago

All the ones in category #2 would be easy fixes if we decide catabolism and biosynthesis are disjoint. @deustp01 here are some of the examples: tyrosine catabolic process to fumarate is asserted to be a dicarboxylic acid biosynthetic process because of the fumarate output. I'm not sure this sits well with me. L-xylitol catabolic process to xylulose 5-phosphate is asserted to be a xylulose 5-phosphate biosynthetic process. I think I did this, but now I'm not sure it was a good idea.
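To make the kind of violation described above concrete, here is a minimal sketch of a disjointness check. It assumes a hand-written toy hierarchy: the `IS_A` edges below are simplified for illustration and are not pulled from GO, and the function names are hypothetical. A term is flagged when it has both "biosynthetic process" and "catabolic process" among its ancestors.

```python
# Toy is_a edges (child -> parents). Hand-written for illustration only;
# the real GO hierarchy has more intermediate terms.
IS_A = {
    "tyrosine catabolic process to fumarate": [
        "catabolic process",
        "dicarboxylic acid biosynthetic process",
    ],
    "dicarboxylic acid biosynthetic process": ["biosynthetic process"],
    "L-xylitol catabolic process to xylulose 5-phosphate": [
        "catabolic process",
        "xylulose 5-phosphate biosynthetic process",
    ],
    "xylulose 5-phosphate biosynthetic process": ["biosynthetic process"],
    "canonical glycolysis": ["catabolic process"],
}


def ancestors(term, is_a):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def disjointness_violations(is_a, left, right):
    """List terms that descend from both `left` and `right`."""
    return sorted(t for t in is_a if {left, right} <= ancestors(t, is_a))


violations = disjointness_violations(
    IS_A, "biosynthetic process", "catabolic process"
)
print(violations)
```

In this toy data the two terms quoted above are flagged, while a term classified only under catabolism is not. A real check would run a reasoner over the ontology with the disjointness axiom added, rather than walking a dictionary.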

ukemi commented 4 years ago

LOL. Looks like I made a self-referential ticket. Usually I catch that.

deustp01 commented 4 years ago

In humans excess phenylalanine is hydroxylated to form tyrosine, which is further metabolized to acetoacetate (ketone body), which can be converted to CO2 + H2O plus energy (catabolism; I think there are no other fates for acetoacetate, at least in mammals). The other final product of tyr catabolism is fumarate, a TCA cycle intermediate, which could be fully oxidized via the TCA cycle (catabolism) or could be pulled out of the cycle to provide carbons for gluconeogenesis, a biosynthesis, or, probably, other more obscure biosynthetic products. If someone's diet is supplying lots of both phe and tyr, then all of these steps look catabolic. If that diet provides excess phe and insufficient tyr, then the first step looks biosynthetic and the rest (if there's more excess phe than is needed to meet the tyr requirement) still look catabolic.

It's easy to take advantage of all the reversible steps in the pentose phosphate pathway to make xylulose 5-phosphate either an intermediate in the biosynthesis of ribose-5-phosphate or an intermediate in glycolysis via fructose-6-phosphate and glyceraldehyde-3-phosphate.

We could escape the phe-tyr trap by noting that the conversion of phe to tyr actually involves a couple of molecular functions, so we could have a phe-to-tyr metabolic process, a tyr catabolic process, and no phe catabolic process. That works for humans but maybe not for other taxa, and anyway isn't a generally useful strategy.

But the mess can be generalized. All the "glucogenic" amino acids are called that because, like tyrosine, some or all of their carbon atoms can be used in humans as starting materials for gluconeogenesis, even though they are also intermediates on pathways of catabolic energy generation.

And we are avoiding (for myself out of ignorance) all the wonderful alternative fates provided by secondary metabolic processes of plants.

The biological way out may be to note that in a given physiological state, substrate abundance and regulatory factors interact to drive all of these processes that appear to have multiple plausible outcomes that mix biosynthesis and catabolism, in single directions that are cleanly biosynthetic or cleanly catabolic. So, looking at the level of a single molecular function, we may often be unable to determine whether it contributes to biosynthetic or catabolic processes or both. But when we look at it in the context of a whole biological process AND the regulatory processes affecting that process AND the particular non-primary inputs and outputs involved due to physiological context, then the direction of the function - biosynthesis or catabolism - is set.

Stated that way, the whole problem is one of missing information. We can't say whether the conversion of phe to tyr is biosynthetic or catabolic in isolation any more than we can say whether the biosynthetic fate of any given molecule of phe is tyr or a hormone or melanin.

ukemi commented 4 years ago

Thanks @deustp01. If I am reading into this correctly, despite similarities like separate branches and directions, the biosynthetic pathways and the catabolic pathways would never be identical, right? They would always either branch off or go in the reverse direction. So for example, you would never want to say that canonical glycolysis (a glucose catabolic pathway) is a pyruvate biosynthetic pathway, right?

deustp01 commented 4 years ago

I think that what I'm saying is that if you fiddle with the boundaries right, catabolic processes / pathways should always be distinguishable from biosynthetic ones, but I'm worried that some of the needed fiddles may be out of scope for GO reasoning. E.g., how do all the physiological regulatory factors that determine whether aldolase activity is catabolic or biosynthetic get worked into the logical definition of GO:0004332? Or is the answer that molecular functions can be agnostic, with direction to be resolved by looking at the biological processes they are parts of?

ukemi commented 4 years ago

So at the end of the day, if we fiddle around with it enough, we could make these disjoint classes. It would be interesting to query the Reactome models to see if we get violations. Once again, we might be able to use them as a seed.

I think the scope, whether we like it or not, gets addressed historically at the level of the ontology. For molecular functions that can work in both directions, the logical definition is agnostic and will be represented by the direction-agnostic Rhea reaction. However, these functions will be used in models where the direction is specified. We already have examples of this from the Reactome import. We have a model for glycolysis (catabolic) and a separate model for gluconeogenesis (biosynthetic). So the answer to your last question is yes.
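A rough sketch of that split, using the aldolase reaction as the example: the reaction itself carries no direction, and each model that uses it fixes one. The class and field names here (`Reaction`, `DirectedStep`, `left_to_right`) are made up for illustration and are not the GO-CAM or Rhea data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """Direction-agnostic reaction: two sides, no substrate/product roles."""
    name: str
    left: frozenset
    right: frozenset


@dataclass(frozen=True)
class DirectedStep:
    """One use of a reaction inside a model, where the direction is fixed."""
    reaction: Reaction
    left_to_right: bool

    def substrates(self):
        return self.reaction.left if self.left_to_right else self.reaction.right

    def products(self):
        return self.reaction.right if self.left_to_right else self.reaction.left


# The function is defined once, with no direction attached.
aldolase = Reaction(
    name="fructose-bisphosphate aldolase activity",
    left=frozenset({"D-fructose 1,6-bisphosphate"}),
    right=frozenset({"glycerone phosphate", "D-glyceraldehyde 3-phosphate"}),
)

# Each model reuses the same reaction but commits to a direction.
glycolysis_step = DirectedStep(aldolase, left_to_right=True)
gluconeogenesis_step = DirectedStep(aldolase, left_to_right=False)

print(sorted(glycolysis_step.substrates()))
print(sorted(gluconeogenesis_step.substrates()))
```

Both steps point at the same `Reaction` object, so the function stays agnostic while the catabolic and biosynthetic readings live only in the models.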

I think it is in the models where we will take the physiological regulatory factors into consideration. But the physiology also brings up the issue of proposing that metabolic process can only be cellular. Anything that goes beyond things happening in a cell will necessarily have to be relegated to regulation, I think. I really think this will limit those of us who work on multicellular organisms, but we will need to figure out a way to deal with it. Think about how complex carbohydrates get catabolized in you: saliva, intestine. Do we relegate that to digestion and save catabolism for only processes that occur in a single cell? But there are other examples too.

deustp01 commented 4 years ago

More thought and examples / cases needed for the [cellular] metabolism part but maybe this is not a problem.

Complex carbohydrates are digested in the extracellular region, enabled by enzymes secreted from neighboring cells, so plain metabolism plus anatomy terms get all the information. That may be true in general for processes that involve different cell types but sequentially, like the various kinds of bile acid synthesis where different cells initiate specific versions of the process distinguished by initial hydroxylation site, then export intermediate products. These all get taken up by hepatocytes, which carry out the rest of the process. Again, cellular metabolism plus transport gets the whole job done, with assistance from anatomy terms.

ukemi commented 4 years ago

Actually, this supports what I had been thinking about too! Do we even need a cellular versus supercellular distinction in the ontology? What if in GO-CAM models we just specified where things happen and determined the level of granularity through queries? It would alleviate the whole cellular process conundrum, but it would not provide a single cellular process ontology term for slimming.