Closed edkerk closed 1 year ago
@edkerk note that measureAbundance is not meant for constraining the model with proteomics, but for reading Pax-DB data and estimating the fraction [g/g] of a group of enzymes with respect to the total. That fraction is needed (later) to set a constraint on the protein pool for enzymes that don't have a proteomic measurement. The function you would want to use for the purpose you describe is constrainEnzymes, which accepts pIDs and data as inputs, as you suggest.
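To make the [g/g] fraction concrete, here is a minimal sketch in Python (not the toolbox's MATLAB code). It assumes that abundances are relative values such as the ppm reported by Pax-DB and that each protein's mass contribution is abundance times molecular weight. The function name, argument names and numbers are hypothetical and only for illustration.

```python
def enzyme_mass_fraction(abundance, mol_weight, model_enzymes):
    """Mass fraction [g/g] of `model_enzymes` relative to all measured proteins.

    abundance:     dict protein ID -> relative abundance (e.g. ppm)
    mol_weight:    dict protein ID -> molecular weight [g/mol]
    model_enzymes: iterable of protein IDs that appear in the model
    """
    # Mass contribution of each protein: abundance * molecular weight.
    # Assumption: proteins without a known molecular weight are skipped.
    mass = {p: a * mol_weight[p] for p, a in abundance.items() if p in mol_weight}
    total = sum(mass.values())
    if total == 0:
        raise ValueError("no protein mass to normalise against")
    # Model enzymes that were not measured contribute zero mass.
    in_model = sum(mass.get(p, 0.0) for p in set(model_enzymes))
    return in_model / total


# Made-up example data: three measured proteins, two of them in the model.
abundance = {"P1": 100.0, "P2": 300.0, "P3": 600.0}
mol_weight = {"P1": 50000.0, "P2": 20000.0, "P3": 10000.0}
f = enzyme_mass_fraction(abundance, mol_weight, ["P1", "P2"])
```

Because the abundances are only used as a ratio, their absolute scale does not matter here, which is why a relative dataset is enough for this step.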
That being said, I agree with the problem of hard-coded locations (e.g. of prot_abundance.txt), and we will change this at the toolbox level. The idea would be to make adding the /geckomat* and /databases* paths a requirement for using GECKO; that way we can avoid any relative paths and use those functions/data from any other folder. Let us know if you have any thoughts on this proposal.
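As a rough illustration of the idea (a Python sketch, not how the toolbox itself does or will do it), a data file can be located from a user-supplied databases folder instead of a path relative to the current directory. The helper name and its arguments are hypothetical.

```python
from pathlib import Path


def resolve_data_file(databases_dir, filename="prot_abundance.txt", must_exist=True):
    """Return an absolute path to `filename` inside a user-supplied folder."""
    # Resolving against an explicit folder makes the result independent of
    # the directory the caller happens to be working in.
    path = Path(databases_dir).expanduser().resolve() / filename
    if must_exist and not path.is_file():
        raise FileNotFoundError(f"expected data file not found: {path}")
    return path
```

A caller would pass the databases folder once (for example from a parameter file) and reuse the returned path everywhere the file is read.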
Thanks for the explanation, this could go directly in the documentation! :) What if no Pax-DB data is available for my organism of interest?
Requiring the locations of the /geckomat* and /databases* paths to be defined as parameters sounds like a good solution.
What if no Pax-DB data is available for my organism of interest?
I guess it would have to be replaced with some proxy, e.g. the fraction of enzymes from the total, although that would underestimate metabolic enzymes... @IVANDOMENZAIN any ideas here?
@edkerk I have used relative proteomics datasets (when available) as a substitute for the prot_abundance.txt file. The f values that I have obtained for different organisms with this approach range from 0.3 to 0.48. As the f value is used for constraining the protein pool, I think that using a high value such as 0.48 or 0.5 also makes some sense, because the protein pool then becomes a limitation only for growth at very high rates (simulating batch conditions) and not for chemostat simulations with microbial models.
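To show how the choice of f feeds into the constraint, here is a small Python sketch. It assumes the pool upper bound is the product f * Ptot * sigma, with Ptot the total protein content [g/gDW] and sigma an average enzyme saturation factor. The Ptot and sigma values below are made up; only the two f values come from the range mentioned above.

```python
def protein_pool_bound(f, p_tot, sigma):
    """Upper bound on the enzyme pool [g/gDW], assuming bound = f * Ptot * sigma."""
    return f * p_tot * sigma


# Hypothetical Ptot and sigma; f spans the range reported in this thread.
for f in (0.3, 0.48):
    bound = protein_pool_bound(f, p_tot=0.5, sigma=0.5)
    print(f"f = {f}: pool bound = {bound:.3f} g/gDW")
```

The bound scales linearly with f, so moving from 0.3 to 0.48 loosens the pool by 60%, which only matters in simulations where the pool is actually the limiting constraint.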
Something to address here brought up by @sulheim:
parameter file can also provide the paths for the required database-files, cultivation data etc.
This will be completely revamped with GECKO3, the discussion here is obsolete.
In measureAbundance, the location of the proteomics data is hard-coded. This is inconvenient if one has, for instance, proteomics data for many conditions and wants to make a model for each of these conditions: the user would have to replace the databases/prot_abundance.txt file. Probably better to have genes and abundance as parameters to the function.