anna-pa-m / Metabolic-EP

Code and data for analysis of metabolic networks

Question about Humongous outputs 'a' generated from MetabolicEP #15

Open kath000 opened 3 years ago

kath000 commented 3 years ago

Dear Dr. Muntoni,

This is Kao Wei Chen from the Maastricht Centre for Systems Biology (MaCSBio) at Maastricht University, Netherlands. We are currently using the analytic flux space approximation method that you published in Nature Communications in 2017, and we came across some questions that we hope you can help us with. We have also emailed Prof. Braunstein about this, but figured we might ask you directly on GitHub as well.

The problem we came across is that the “metabolicEP” function occasionally returns humongous values in the output variable ‘a’. Tracing the execution of the code, the origin of these values appears to be the update statement in line 98 of metabolicEP.m:

new_a = av + (av - mu) .* new_d .* s1

It turns out new_d occasionally becomes extremely large (1e50). In our calculation for a human-scale network, this happened 162 times. In 160 of these cases the term av - mu was exactly zero, but in two cases it did not cancel exactly, leaving a small but nonzero residual of about 1e-19; the product (av - mu) .* new_d then still comes out very large, presumably propagating to the final result. Since new_d is capped by maxvar in line 97, we have been able to circumvent the issue by decreasing maxvar from 1e50 to 1e19, which prevented the blowup. However, we are now wondering whether this “runaway” new_d is expected with certain models (we observed the blowup with several models), or whether it should not actually occur.
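To make the mechanism concrete, here is a minimal MATLAB sketch of the arithmetic we believe is happening; the numbers are made up, and s1 stands in for whatever scaling factor line 98 actually uses:

```matlab
% Illustrative reproduction of the blowup (values are made up, not from a model).
maxvar = 1e50;                 % cap applied to new_d in line 97
new_d  = min(1e60, maxvar);    % runaway value, capped at 1e50

av = 0;                        % chosen so that av - mu = 1e-19 exactly
mu = -1e-19;                   % the small nonzero residual we observed
s1 = 1;                        % placeholder for the factor in line 98

new_a = av + (av - mu) .* new_d .* s1   % ~1e31: tiny residual times huge new_d

% With the tighter cap we tried, the same update stays moderate:
new_d = min(1e60, 1e19);
new_a = av + (av - mu) .* new_d .* s1   % ~1
```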

In this context, we were also wondering about line 96, which computes new_d as 1./(1./va - 1./s). In our calculation, some values of va and s are identical; when that occurs, new_d becomes infinite and is capped by maxvar on the next line. We were wondering whether this is in fact expected behavior, for which we can simply decrease maxvar if problems occur, or whether something might have gone wrong if new_d became infinite to begin with.
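As a self-contained illustration of that line (again with made-up numbers):

```matlab
% Sketch of line 96 when va and s happen to coincide.
maxvar = 1e50;
va = 0.5;  s = 0.5;               % identical marginal and tilted variances
new_d = 1 ./ (1 ./ va - 1 ./ s)   % 1/0 -> Inf in IEEE arithmetic
new_d = min(new_d, maxvar)        % the cap in line 97 turns Inf into 1e50
```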

Best regards,

Wei Chen, Kao

anna-pa-m commented 3 years ago

Dear Kao Wei Chen, this is a known issue of the iterative method. The "d" variables can take huge or very small values in the first iterations of the algorithm: this is why we impose a minimum and a maximum value for them. As a consequence, the "a" variables are also affected by the behavior of the "d" variables within the first iterations. Nevertheless, the issue cures itself, and at convergence (a, d) should be well behaved. The value 1e50 is arbitrarily large and you can of course change it: the results you get at convergence (i.e. the means and variances "av" and "va" of the marginal probability densities, the marginal profiles parametrized by "mu" and "s", and the lower/upper flux bounds) should be (reasonably) independent of the large/small number you pick (1e19 could work). In any case, to avoid this kind of issue, I suggest pre-processing your model before applying MetabolicEP, and using the implementation in MetabolicEPT0, which seems to be more numerically stable than our first implementation.
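For instance, you could rerun EP with two different caps and compare the converged moments; the call below is only schematic (the argument list is a guess), so please adapt it to the actual interface in metabolicEP.m:

```matlab
% Schematic check of insensitivity to the cap (argument list is a guess;
% see metabolicEP.m for the real signature).
[mu1, s1, av1, va1] = metabolicEP(S, b, lb, ub, 1e50);   % maxvar = 1e50
[mu2, s2, av2, va2] = metabolicEP(S, b, lb, ub, 1e19);   % maxvar = 1e19

fprintf('max |av| difference: %g\n', max(abs(av1 - av2)));
fprintf('max |va| difference: %g\n', max(abs(va1 - va2)));
```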

Best! A