SysBioChalmers / Human-GEM

The generic genome-scale metabolic model of Homo sapiens
https://sysbiochalmers.github.io/Human-GEM-guide/
Creative Commons Attribution 4.0 International

Model-based assessment of metabolic functionalities using omics data #280

Open rasools opened 2 years ago

rasools commented 2 years ago

Description of the issue:

Richelle et al., in a recent research article, suggested an approach for interpreting omics data using GEMs. The key idea is to connect changes in transcriptomics/proteomics data to changes in cell functions by defining metabolic tasks. To do this, we first need to define a list of metabolic tasks that human cells can accomplish (similar to the tasks used by automatic GEM reconstruction approaches such as tINIT) and then extract the gene sets associated with each metabolic task. Finally, we need to overlay the omics dataset and measure the pathway usage for each metabolic task. The developed functions could potentially be used alongside the functions in GeneSetAnalysisMatlab to provide more biological insight from changes in transcriptomics data.
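As a rough illustration of the final step, here is a minimal Python sketch that overlays expression values on per-task gene sets. The scoring rule (mean expression of the measured genes in each task), the function name `score_tasks`, and the toy gene sets and values are all assumptions made for illustration; this is not the scoring used by Richelle et al. or by any Human-GEM function.

```python
from statistics import mean

def score_tasks(task_genes, expression):
    """Return {task: mean expression of its measured genes, or None}."""
    scores = {}
    for task, genes in task_genes.items():
        # Keep only the genes that were actually measured.
        measured = [expression[g] for g in genes if g in expression]
        scores[task] = mean(measured) if measured else None
    return scores

# Hypothetical gene sets and expression values (not taken from Human-GEM).
task_genes = {
    "task_A": ["gene1", "gene2"],
    "task_B": ["gene3", "gene4"],
    "task_C": ["gene5"],
}
expression = {"gene1": 10.0, "gene2": 20.0, "gene3": 4.0}

scores = score_tasks(task_genes, expression)
print(scores)
```

A task with no measured genes gets `None` rather than zero, so that missing data is not mistaken for low activity.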

Questions to answer:

  1. The minimum reaction set associated with each metabolic task depends on the allocated boundary fluxes. Which fluxes should be open when we want to find the minimum set of reactions associated with a metabolic task?
  2. What approach should we use to score genes associated with each metabolic task based on their expression in transcriptomic data?

Expected feature/value/output:

Developing a pack of functions for performing metabolic tasks-based gene set analysis.

cherkaos commented 2 years ago

This is a great idea. Comparing Cellfie's tasks with Human1's, you notice that theirs are more likely to occur in humans. For example, in Human1's essential tasks, de novo synthesis of nucleotides uses ammonia as input. This is more closely linked to bacterial metabolism than to human metabolism. Did you think about integrating Cellfie's tasks into your task collection? I started importing Cellfie tasks so that we could use them with the functions developed for Human1. This is my current attempt: CONSENSUS_TASKS_tINIT_2.xlsx. However, some tasks are currently not working. If it is something you are looking into, we could try to make it work in Human1.

haowang-bioinfo commented 2 years ago

@cherkaos thank you for the nice input.

> Did you think about integrating Cellfie's tasks into your task collection? I started importing Cellfie tasks so that we could use them with the functions developed for Human1. This is my current attempt: CONSENSUS_TASKS_tINIT_2.xlsx. However, some tasks are currently not working. If it is something you are looking into, we could try to make it work in Human1.

It would be beneficial to integrate the CellFie tasks into Human-GEM for general usage. Please upload the file to the folder data/metabolicTasks/ with a PR that states the source citation and explains the adjustments made.

rasools commented 2 years ago

@cherkaos and @Hao-Chalmers, thanks for your inputs and suggestions. The first step in addressing this issue is to define the desired set of metabolic tasks. Sarah, as you have suggested, a good start would be to develop a version of Human1 that can satisfy the maximum number of both CellFie tasks and the tasks in data/metabolicTasks/. I will investigate why some CellFie tasks are not satisfied by Human1 and try to solve the problem. This way, we can probably also generate a list of suggestions for improving Human1.

cherkaos commented 2 years ago

Great, I'm glad to hear that it is of interest to you.

@Rasools - So far, I've only converted Cellfie's metabolites (Recon IDs) into Human1 IDs. Only Tyr_ggn could not be converted (Task 37, Glycogen Biosynthesis). I also converted the compartment [x] into [s], as it was not accepted as input in Human1. Metabolites that were listed in both IN and OUT also generated an error when using checkTasks.m, so I removed them from OUT. It was mainly H2O.

@Hao-Chalmers - Sure, I can push the document I shared above to data/metabolicTasks/. However, it is not working in its current form. Is that okay?
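The two adjustments described above (mapping [x] to [s], and dropping from OUT any metabolite that is also in IN) can be sketched in Python as follows. The dict layout of a task, the metabolite names, and the function name `clean_task` are hypothetical and only serve to illustrate the clean-up; they do not reflect the actual task file format.

```python
def clean_task(task):
    """Apply the two adjustments to one task definition.

    `task` is a hypothetical dict with "in" and "out" lists of
    metabolite names that carry a compartment suffix such as "[c]".
    """
    def fix_compartment(met):
        # Map the [x] compartment to [s]; leave other compartments alone.
        return met[:-3] + "[s]" if met.endswith("[x]") else met

    inputs = [fix_compartment(m) for m in task["in"]]
    outputs = [fix_compartment(m) for m in task["out"]]
    # Drop from OUT any metabolite that is also listed in IN.
    outputs = [m for m in outputs if m not in inputs]
    return {**task, "in": inputs, "out": outputs}

# Toy task, not taken from the Cellfie list.
example = {
    "id": "toy",
    "in": ["glucose[x]", "H2O[c]"],
    "out": ["pyruvate[x]", "H2O[c]"],
}
cleaned = clean_task(example)
print(cleaned)
```

The function returns a new dict and leaves the original task untouched, so the unmodified definitions stay available for comparison.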

haowang-bioinfo commented 2 years ago

> @Hao-Chalmers - Sure, I can push the document I shared above in the data/metabolicTasks/. However, it is not working in its current form. Is that okay?

@cherkaos it's okay to begin with this, which can be improved in follow-up PRs.

mihai-sysbio commented 2 years ago

@cherkaos another option is to mark the PR as a draft if you prefer, and then mark it as ready for review whenever you feel it has reached that state.

cherkaos commented 2 years ago

By the way, I noticed some strange results when using the essential tasks (data/metabolicTasks/metabolicTasks_Essential.xlsx) to create the models and the full tasks (data/metabolicTasks/metabolicTasks_Full.xlsx) for functional comparison. For example, heme biosynthesis is in both lists, but it was reported in the Human1 manuscript that it doesn't pass in blood. How could that be if it is essential?

[Image: Blood_HemeBiosynthesis]

@Rasools @JonathanRob

JonathanRob commented 2 years ago

@cherkaos

> For example, heme biosynthesis is in both lists but it was reported in the Human1 manuscript that it doesn't pass in blood. How could that be if it is essential?

This is due to a minor difference in the heme biosynthesis tasks between the two task files. I suspect it is because the task in the "full" task list requires production and excretion (to the extracellular compartment) of heme, whereas the "essential" task only requires that it is produced (in the cytosol).

I'm not saying that this was intentional or how I believe it should be formulated - we simply implemented these task lists as provided. But this highlights some issues with this approach, in particular some inconsistencies/overlaps between the two lists that likely could use some curation.

cherkaos commented 2 years ago

Thanks for the clarification!

CadavidJoseL commented 2 years ago

This is an issue I am very interested in. I think one question to reconcile is how we define and check for tasks, since the Richelle et al paper does it without relaxing the pseudo steady state assumption and only constrains inputs/outputs at the flux level.

haowang-bioinfo commented 2 years ago

@CadavidJoseL the tasks collected in Human-GEM are normally defined according to textbooks and/or published literature. They are checked by the checkTasks function, which basically simulates the model with the input/output constraints defined in a task file.

mihai-sysbio commented 2 years ago

And here is the link to the corresponding section in Human-GEM-guide.

CadavidJoseL commented 2 years ago

Thanks for your answers; those issues were clear to me. I should have been clearer. What I mean is that the way tasks are checked in this related paper by Richelle et al. (and in the COBRA toolbox in general) differs slightly from the RAVEN toolbox in how the LP is set up:

"We also propose to define a metabolic task as the capacity of producing a defined list of output products when only a defined list of input substrates is available. However, we modified the way to implement it from the RAVEN toolbox. Instead of relying on the relaxation of the steady-state assumption, we take an approach more similar to that proposed by [14] by imposing constraints only at the flux level. Therefore, a model successfully passes a task if the associated LP problem is still solvable when the sole exchange reactions allowed carrying flux in the model are temporary sink reactions associated with each of the inputs and outputs listed in the task".

I will check whether the LPs are equivalent, but maybe this can be a source of discrepancy?

haowang-bioinfo commented 2 years ago

> I will check whether the LPs are equivalent

Look forward to the comparison

JonathanRob commented 2 years ago

Hi @CadavidJoseL, I think I partially answered this question in the Gitter chat, but let me know if not. It's true that the checkTasks and tINIT algorithms modify the pseudo steady state assumption (i.e., the b value) for metabolites rather than adjusting reaction flux bounds. The optimization problem works out to be effectively the same for this approach as for the flux-based method, since they're both testing feasibility of the LP for each task. It should be relatively straightforward to implement the tasks from Richelle et al. using the same formulation as expected by checkTasks; it may just require some additional testing.
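The equivalence can be written out explicitly. The notation below is generic and assumed for illustration, not taken from checkTasks or from Richelle et al.: $S$ is the stoichiometric matrix of the closed model, $v$ the flux vector with bounds $l \le v \le u$, $T$ the set of metabolites listed as inputs or outputs of the task, and $b_i^{\mathrm{lo}}, b_i^{\mathrm{up}}$ the bounds on net production of metabolite $i$ implied by the task (negative values meaning uptake). In the second formulation, $e_i$ is the flux of a temporary sink reaction that consumes metabolite $i$.

```latex
\begin{aligned}
\text{(relaxed steady state)}\quad
  & (S v)_i = b_i,\quad b_i^{\mathrm{lo}} \le b_i \le b_i^{\mathrm{up}} && i \in T,\\
  & (S v)_i = 0 && i \notin T,\\
  & l \le v \le u.\\[4pt]
\text{(temporary sink reactions)}\quad
  & (S v)_i - e_i = 0,\quad b_i^{\mathrm{lo}} \le e_i \le b_i^{\mathrm{up}} && i \in T,\\
  & (S v)_i = 0 && i \notin T,\\
  & l \le v \le u.
\end{aligned}
```

Setting $e_i = b_i$ maps any solution of one system to a solution of the other, so under these assumptions the two LPs are feasible for exactly the same flux vectors $v$.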

cherkaos commented 2 years ago

Hi @CadavidJoseL I started doing the import from Cellfie Consensus Tasks to Human-GEM but a non-negligible number of them are failing (not just the ones that are supposed to fail) and I was wondering why. I originally thought it was due to compartments but maybe the point you raised about the differences in checking tasks also matters. Hi @JonathanRob - Thanks for the explanation. Where is the Gitter chat? Would like to see what you wrote.

mihai-sysbio commented 2 years ago

@cherkaos here is the link to SysBioChalmers/Human-GEM on Gitter.

rasools commented 2 years ago

@cherkaos thank you Sara for translating the tasks from the Cellfie paper so they can be tested with Human1. I checked these tasks on Human1 to see how many of them reproduce the expected results (pass/fail). Out of the 195 tasks included in the Cellfie Consensus Tasks, 168 passed (86%), 17 failed, and 10 generated errors.

Error-making tasks are mainly because of undetected metabolites either among substrates or products of the task.

However, the failed tasks probably need more investigation to find the reason, and each could be treated as a separate issue. Here is a list of failed tasks based on my analysis. Could you please share your results so we can see whether you get a similar list of failed tasks?

Task status | Task name
-- | --
FAIL | Deoxyguanosine triphosphate synthesis (dGTP)
FAIL | Deoxyuridine triphosphate synthesis (dUTP)
FAIL | Deoxythymidine triphosphate synthesis (dTTP)
FAIL | 3'-Phospho-5'-adenylyl sulfate synthesis
FAIL | Degradation of guanine to urate
FAIL | Conversion of 1-phosphatidyl-1D-myo-inositol 4,5-bisphosphate to 1D-myo-inositol 1,4,5-trisphosphate
FAIL | Arginine synthesis
FAIL | Aspartate synthesis
FAIL | Synthesis of taurine from cysteine
FAIL | Glutamate synthesis
FAIL | Glutamine synthesis
FAIL | Glycine synthesis
FAIL | Conversion of lysine to L-2-Aminoadipate
FAIL | Methionine degradation
FAIL | Tyrosine synthesis (need phenylalanine)
FAIL | Triacylglycerol synthesis
FAIL | Synthesis of palmitoyl-CoA

The other question is whether all the defined tasks in the list are expected to pass. Can you confirm this based on the tasks defined in the Cellfie paper? I didn't check the tasks there and just used the tasks listed in the Cellfie Consensus Tasks.

cherkaos commented 2 years ago

@Rasools Great. Thank you for testing. I tried using the latest Human-GEM. I had 184 tasks passed (94%), 10 errors and 1 failed. Maybe our differences in failed tasks come from the different GEM versions. Which one did you use?

'Error-making tasks are mainly because of undetected metabolites either among substrates or products of the task.' I agree. But I think it has to do with localization. Do you think we should change compartments? What are your ideas on that end?

Task status Task name
Error Glycogen biosynthesis
Error Glycogen degradation
Error Starch degradation
Error Taurochenodeoxycholate synthesis
Error Glycochenodeoxycholate synthesis
Error tauro-cholate synthesis
Error glyco-cholate synthesis
Error Synthesis of bilirubin
Error Glucosaminyl-acylphosphatidylinositoll to deacylated-glycophosphatidylinositol (GPI)-anchored protein
Error Biosynthesis of g3m8masn
Fail Tyrosine synthesis (need phenylalanine)

Yes, I confirm. All these 195 tasks should pass. I wanted to clarify that I was using Cellfie's list which contains tasks that should pass, in comparison to Human-GEM's full tasks where some should fail.

rasools commented 2 years ago

@cherkaos, thanks for sharing the results. Yes, I am using the same model for checking tasks. Because the error-making tasks are the same in our results, let's first focus on the failed/passed tasks to find the origin of the difference. Did you add boundary metabolites to the model prior to checking tasks? If the model includes boundary metabolites, the total number of metabolites should be 10035. If boundary metabolites are not included, the number of metabolites is 8370.

cherkaos commented 2 years ago

No, I haven't added boundary metabolites to the model (I have 8371 metabolites).

rasools commented 2 years ago

@cherkaos, so that is probably why almost all tasks that do not raise errors pass for you. For checking tasks, the model needs to be in its closed form (containing boundary metabolites). You can add boundary metabolites to the model using the addBoundaryMets function.
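The idea of closing a model can be illustrated with a toy example in Python. This is a conceptual sketch only, not the RAVEN implementation of addBoundaryMets: the dict-based model format, the reaction names, the `[boundary]` suffix, and the rule that a reaction with a single metabolite is an exchange reaction are all assumptions for illustration.

```python
def add_boundary_mets(reactions):
    """Return a copy in which every single-metabolite reaction is balanced
    by a new boundary metabolite, so that the model is closed."""
    closed = {}
    for rxn_id, stoich in reactions.items():
        stoich = dict(stoich)
        if len(stoich) == 1:
            met, coeff = next(iter(stoich.items()))
            base = met.split("[")[0]
            # Balance the exchange reaction with a boundary counterpart.
            stoich[base + "[boundary]"] = -coeff
        closed[rxn_id] = stoich
    return closed

def count_mets(reactions):
    """Count the distinct metabolites used across all reactions."""
    return len({m for stoich in reactions.values() for m in stoich})

# Toy open model: one exchange reaction and one transport reaction.
open_model = {
    "EX_glc": {"glc[s]": -1},
    "transport": {"glc[s]": -1, "glc[c]": 1},
}
closed_model = add_boundary_mets(open_model)
print(count_mets(open_model), count_mets(closed_model))
```

The metabolite count grows when the model is closed, which is why comparing the number of metabolites is a quick way to tell which form a model is in.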

cherkaos commented 2 years ago

Okay, I get the same results as you.

Task status Task name
Error Glycogen biosynthesis
Error Glycogen degradation
Error Starch degradation
Error Taurochenodeoxycholate synthesis
Error Glycochenodeoxycholate synthesis
Error tauro-cholate synthesis
Error glyco-cholate synthesis
Error Synthesis of bilirubin
Error Glucosaminyl-acylphosphatidylinositoll to deacylated-glycophosphatidylinositol (GPI)-anchored protein
Error Biosynthesis of g3m8masn
Fail Deoxyguanosine triphosphate synthesis (dGTP)
Fail Deoxyuridine triphosphate synthesis (dUTP)
Fail Deoxythymidine triphosphate synthesis (dTTP)
Fail 3'-Phospho-5'-adenylyl sulfate synthesis
Fail Degradation of guanine to urate
Fail Conversion of 1-phosphatidyl-1D-myo-inositol 4,5-bisphosphate to 1D-myo-inositol 1,4,5-trisphosphate
Fail Arginine synthesis
Fail Aspartate synthesis
Fail Synthesis of taurine from cysteine
Fail Glutamate synthesis
Fail Glutamine synthesis
Fail Glycine synthesis
Fail Conversion of lysine to L-2-Aminoadipate
Fail Methionine degradation
Fail Tyrosine synthesis (need phenylalanine)
Fail Triacylglycerol synthesis
Fail Synthesis of palmitoyl-CoA

CadavidJoseL commented 2 years ago

Friends, I made a quick function that checks the tasks by constraining only input and output fluxes: it creates temporary exchange reactions involving only the metabolites listed as inputs or outputs of each task (instead of relaxing the pseudo-steady state), and I get the same results (errors and fails). As @JonathanRob explained, both ways of checking the tasks seem to be equivalent, since relaxing the pseudo-steady state constraint for a metabolite (when the system is in closed form) is equivalent to keeping the constraint but having an unbalanced reaction.

That being said, I took a closer look at task 15 as a starting point (Deoxyguanosine triphosphate synthesis (dGTP)) and relaxed the bounds of the outputs (lower 0, upper 100). What I found is that the problem is then feasible, albeit with somewhat different fluxes! The production of H+[c] should be constrained to 15, not 14; Pi[c] should be 6, not 4; and PPi[c] should be 1, not 2. With those changes the task passes! I haven't checked the other tasks, but it might just be that the stoichiometry they are using is not exactly satisfied by Human-GEM.
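The effect of exact output bounds can be reproduced with a toy calculation in Python. The sketch assumes, purely for illustration, that dGTP can only be made through one overall conversion with fixed by-product ratios, so a task is feasible only if a single scale factor satisfies every output bound at once. The by-product amounts (15 H+, 6 Pi, 1 PPi) and the original bounds (14, 4, 2) are the values reported above; treating them as amounts per unit of dGTP, the dGTP bound of 1, the metabolite labels, and the function name `feasible_scale` are assumptions.

```python
def feasible_scale(yields, bounds, tol=1e-9):
    """Return True if some v >= 0 satisfies lb <= yields[m] * v <= ub for all m.

    yields: amount of each product formed per unit of the conversion (> 0).
    bounds: {metabolite: (lb, ub)} on the amount produced.
    """
    lo, hi = 0.0, float("inf")
    for met, coeff in yields.items():
        lb, ub = bounds[met]
        # Each bound restricts v to the interval [lb / coeff, ub / coeff].
        lo = max(lo, lb / coeff)
        hi = min(hi, ub / coeff)
    return lo <= hi + tol

yields = {"dGTP": 1, "H+": 15, "Pi": 6, "PPi": 1}

fixed = {"dGTP": (1, 1), "H+": (14, 14), "Pi": (4, 4), "PPi": (2, 2)}
relaxed = {"dGTP": (1, 1), "H+": (0, 100), "Pi": (0, 100), "PPi": (0, 100)}
corrected = {"dGTP": (1, 1), "H+": (15, 15), "Pi": (6, 6), "PPi": (1, 1)}

for name, b in [("fixed", fixed), ("relaxed", relaxed), ("corrected", corrected)]:
    print(name, feasible_scale(yields, b))
```

A real model has many alternative routes, so the actual check is an LP rather than an interval intersection, but the failure mode is the same: exact bounds that do not match the model's stoichiometry leave no feasible solution.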

haowang-bioinfo commented 2 years ago

@CadavidJoseL excellent!

Please keep up the good work. I look forward to a thorough investigation of the failed Cellfie Consensus Tasks using your function.

CadavidJoseL commented 2 years ago

Using this approach of relaxing the boundaries on the outputs, the following tasks now pass:

Deoxyguanosine triphosphate synthesis (dGTP)
Deoxyuridine triphosphate synthesis (dUTP)
Deoxythymidine triphosphate synthesis (dTTP)
Arginine synthesis
Aspartate synthesis
Glutamate synthesis
Glutamine synthesis
Synthesis of palmitoyl-CoA

I am attaching an Excel file with the fluxes returned by the LP for each task (highlighted in red). Bear in mind that these are likely not the only boundary values that work; a proper range of bounds for the outputs could be defined by FVA, but I haven't done so yet. I just wanted to highlight that the model is indeed capable of passing these tasks. I will investigate the other failed tasks a bit further.

metabolicTasks_Cellfie_corrected.xlsx

rasools commented 2 years ago

@CadavidJoseL it is a very interesting finding that, by changing the output bounds for some of the metabolites, these tasks pass with Human1. Do you have any idea what could be the origin of this difference between Human1 and the model used in the Cellfie paper? Although the tasks can pass with this approach, I wonder which of the different output bound values that allow a task to pass is the correct one. @JonathanRob do you have any thoughts on this observation?

JonathanRob commented 2 years ago

Thank you for all of your help on this, @cherkaos, @Rasools, and @CadavidJoseL.

The tightly defined flux bounds on these tasks are a bit more constrained than in the typical tasks that I have used, since they enforce very specific flux ratios on the model. As long as the bounds are justified, this is OK, but be aware that setting such rigid bounds can sometimes result in numerical errors (i.e., a problem should be solvable, but rounding errors result in an infeasibility). Generally, I would recommend relaxing the input/output bounds for tasks where the flux ratios are not critical to the definition of the task (e.g., allow a range of fluxes, or even just set the minimum flux values).

Regarding some of the differences highlighted by @CadavidJoseL in terms of flux bounds that had to be changed for Human-GEM, it seems that many of these involve reactants/products related to ATP phosphorylation and hydrolysis. Since Human-GEM uses some different coefficients in ATP synthesis and proton pumping reactions compared to other models in order to satisfy the overall energy balance of this process, it may result in slightly different production/consumption of some related metabolites (e.g., protons, phosphate, etc.). Maybe the source of the discrepancy is elsewhere in the model, but this is my first guess.

haowang-bioinfo commented 2 years ago

random thoughts: might be good to convert Metabolic Tasks files from excel to other plain text format (e.g. tsv).

rasools commented 2 years ago

thanks @JonathanRob for clarifications on this.

rasools commented 2 years ago

> random thoughts: might be good to convert Metabolic Tasks files from excel to other plain text format (e.g. tsv).

I also like the idea; with this change, task files could be edited in text-based environments.
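For illustration, a plain-text task file could be written with Python's standard `csv` module as sketched below. The column names and the example row are hypothetical and are not claimed to match the layout expected by parseTaskList; the point is only that a tab-separated file is easy to produce and to compare between versions.

```python
import csv
import io

# Hypothetical columns and task row, for illustration only.
columns = ["ID", "DESCRIPTION", "IN", "OUT", "SHOULD FAIL"]
tasks = [
    {"ID": "T1", "DESCRIPTION": "toy task", "IN": "A[s];B[s]",
     "OUT": "C[s]", "SHOULD FAIL": ""},
]

def tasks_to_tsv(tasks, columns):
    """Serialize task dicts to a tab-separated string with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(tasks)
    return buffer.getvalue()

tsv_text = tasks_to_tsv(tasks, columns)
print(tsv_text)
```

Writing to an in-memory buffer keeps the sketch free of file paths; the same writer works on a file opened with `newline=""`.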

JonathanRob commented 2 years ago

@Hao-Chalmers @Rasools, I believe @johan-gson has already modified the parseTaskList function to enable the use of .tsv files, so maybe take a look and try out that functionality.

haowang-bioinfo commented 2 years ago

@JonathanRob good to know this.

@Rasools could you also add a txt version of the "CellfieConsensus" task file to #327?

mihai-sysbio commented 1 year ago

The Cellfie package now has a web interface at http://immcellfie.renci.org (see article).

rasools commented 1 year ago

It is really cool! Could be a nice teaching material for the Cobra section in the NBIS Omics integration course.

cherkaos commented 1 year ago

Thanks, Mihai. Congrats on the fast tINIT paper!