BioSTEAMDevelopmentGroup / Bioindustrial-Park

BioSTEAM's Premier Repository for Biorefinery Models and Results
MIT License

Consultation about Spearman results in uncertainty analysis and global sensitivity analysis #129

Open zasddsgg opened 1 month ago

zasddsgg commented 1 month ago

Hello, may I ask you some questions about Spearman results in uncertainty analysis and global sensitivity analysis? Thanks for your help. Wish you a good day.

a) In the Spearman results, why is GWP correlated with an economic parameter (e.g., the price of a stream), i.e., why is rho not 0? [Figure 1]

b) In uncertainty analysis and global sensitivity analysis, the number of parameters seems to affect the sampled data and therefore the Spearman results. As the figures below show, rho for the same parameter changes with the number of parameters. For example, rho for oxygen unit price is -0.009 when the Blank parameter is retained and 0.14698 when it is removed. The sign can even flip: rho for plant uptime is -0.049 with the Blank parameter and 0.008 without it. I have set the random number seed, so why does this happen, and which result should I use? A similar change occurs when I add or remove other parameters, such as Feedstock unit price. The number of parameters also changes the uncertainty results (5%, 25%, 50%, 75%, and 95% quantiles); which result should I use there? If the number of parameters affects both the Spearman results and the uncertainty results, should the Blank parameter be included? [Figure 2]

[Figure 3]
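To make question b) concrete, here is a minimal sketch of one way a fixed seed can still give different samples when the parameter count changes. It does not reproduce BioSTEAM's actual sampler; `draw_samples`, `toy_metric`, and `rho_table` are hypothetical helpers, and the row-by-row uniform sampling is an assumption made only for illustration. Parameter 1 has no effect on the toy metric, so any nonzero rho it receives is sampling noise.

```python
import random

def draw_samples(n_samples, n_params, seed):
    """Draw an (n_samples x n_params) table of uniform samples, row by row."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]

def ranks(values):
    """Return 1-based ranks (no tie handling; continuous samples rarely tie)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mean_rank = (len(x) + 1) / 2
    cov = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(rx, ry))
    var = sum((a - mean_rank) ** 2 for a in rx)
    return cov / var

def toy_metric(row):
    # Hypothetical metric: driven by parameter 0 only; parameter 1 is irrelevant.
    return 10.0 * row[0]

def rho_table(n_params, n_samples=1000, seed=1234):
    """Return rho of parameters 0 and 1 against the toy metric."""
    samples = draw_samples(n_samples, n_params, seed)
    metric = [toy_metric(row) for row in samples]
    return [spearman_rho([row[j] for row in samples], metric) for j in range(2)]

with_blank = rho_table(n_params=3)     # parameters 0 and 1 plus a "blank"
without_blank = rho_table(n_params=2)  # parameters 0 and 1 only
print("with blank:   ", with_blank)
print("without blank:", without_blank)
```

In this sketch the seed is identical in both runs, but the extra column shifts which random numbers land in which column, so the irrelevant parameter gets a different small rho each time, possibly with a different sign. The influential parameter keeps rho = 1 in both runs.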

c) Some Spearman rho values have signs that seem inconsistent with reality. For example, if the feedstock unit price goes up, the MPSP should go down, so rho should be negative, yet it is positive in the results. [Figure 2]

d) Does the number of simulations affect the Spearman results and the uncertainty results (5%, 25%, 50%, 75%, and 95% quantiles)? Should I run 200, 1,000, or 2,000 simulations?
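As a rough guide for question d), the sketch below estimates how large |rho| can be purely by chance for a given number of simulations. It relies on the standard large-sample result that, when there is no real association, Spearman's rho has a standard error of about 1/sqrt(N - 1). The function name `rho_noise_floor` and the 95% band (z = 1.96) are illustrative choices, not something taken from BioSTEAM.

```python
import math

def rho_noise_floor(n_simulations, z=1.96):
    """Approximate |rho| below which a correlation looks like pure noise.

    Assumes no true association, so rho has a standard error of roughly
    1 / sqrt(N - 1); z = 1.96 corresponds to a ~95% band.
    """
    return z / math.sqrt(n_simulations - 1)

for n in (200, 1000, 2000):
    print(n, round(rho_noise_floor(n), 3))
# 200  -> 0.139
# 1000 -> 0.062
# 2000 -> 0.044
```

Under these assumptions, a rho of 0.1 cannot be distinguished from zero with 200 simulations, while with 1,000 or more it can. The noise floor only shrinks with the square root of N, so doubling the number of simulations narrows it by about 30%.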

e) The paper “Sustainable Lactic Acid Production from Lignocellulosic Biomass” states: “A total of 27 parameters were selected for sensitivity analysis (full list, distributions, and ρ values included in Table 5 in the Supporting Information) and only parameters with absolute values of ρ ≥ 0.1 for MPSP are shown here. Crossed circles indicate that WP100 or FEC were not appreciably (absolute value of ρ < 0.05) affected by the parameters”. Should parameters with an absolute rho between 0.05 and 0.1 be shown in the figure? What does rho mean in each range (below 0.05, 0.05 to 0.1, above 0.1)? What does it mean for rho to be just above 0.1, or just below it? Does such a parameter have a significant impact on the indicator, and should it be shown in the graph? The article appears to use only rho, not Spearman's p-value. Is rho alone enough to judge whether a parameter has a significant impact on an indicator, without looking at the p-value? What does the p-value mean?
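For the p-value part of question e), the following sketch shows how rho and the p-value relate at a fixed sample size. It uses the large-sample normal approximation, in which rho * sqrt(N - 1) is treated as a standard normal variable when there is no true association. `spearman_p_value` is a hypothetical helper written for this illustration, and N = 1000 is an assumed number of simulations.

```python
import math

def spearman_p_value(rho, n_simulations):
    """Two-sided p-value for rho under the large-sample normal approximation."""
    z = abs(rho) * math.sqrt(n_simulations - 1)
    # For a standard normal Z, P(|Z| >= z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

n = 1000  # assumed number of Monte Carlo simulations
for rho in (0.03, 0.05, 0.1, 0.15):
    print(f"rho = {rho:<5} p = {spearman_p_value(rho, n):.4f}")
```

The p-value is the probability of seeing a rho at least this large in magnitude if the parameter had no effect at all. Because it depends on both rho and N, a fixed rho cutoff such as 0.05 or 0.1 corresponds to a different p-value for every sample size. Rho measures the strength of the association, while the p-value only indicates whether it can be told apart from noise.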

f) Regarding “A fake parameter serving as a "blank" in sensitivity analysis to capture fluctuations due to converging errors” in https://github.com/BioSTEAMDevelopmentGroup/Bioindustrial-Park/blob/master/biorefineries/lactic/models.py#L224-L225: above what rho value of the blank parameter should I conclude that the convergence errors are large? If they are large, can the rho values of the other parameters and the uncertainty results (5%, 25%, 50%, 75%, and 95% quantiles) still be used, or do I need an additional calculation to obtain corrected values?

g) If I add bst.ntutorial() to my code, will I still receive an error message when an error occurs during the Monte Carlo simulation?

h) I ran 1,000 Monte Carlo simulations, so why does the raw data in the Excel results contain only 998?

i) Does a row in the raw data of the Excel output mean that the corresponding simulation ran without errors? If some simulations are missing from the raw data, does that mean those simulations failed?

j) In a triangular distribution, does the midpoint (i.e., the point highlighted on the graph) represent the most likely value? [Figure]

k) Is the following representation correct: in a triangular distribution, the midpoint (equal to the baseline value) is one vertex of the triangle, and the other two vertices are the lower and upper bounds? [Figure]
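Questions j) and k) can be checked numerically with a small sketch. It uses Python's standard-library `random.triangular`, not BioSTEAM's own distribution objects, and the bounds and baseline are hypothetical values chosen for illustration. The value called the midpoint in the questions is treated here as the mode, i.e., the peak of the triangle, which need not be halfway between the bounds.

```python
import random

def sample_triangular(lower, mode, upper, n_samples=100_000, seed=0):
    """Draw samples from a triangular distribution with the given peak."""
    rng = random.Random(seed)
    # Note the stdlib argument order: (low, high, mode).
    return [rng.triangular(lower, upper, mode) for _ in range(n_samples)]

lower, mode, upper = 0.8, 0.9, 1.2  # hypothetical bounds and baseline value
samples = sample_triangular(lower, mode, upper)

# Crude histogram: the most populated bin should sit next to the mode.
n_bins = 20
width = (upper - lower) / n_bins
counts = [0] * n_bins
for s in samples:
    counts[min(int((s - lower) / width), n_bins - 1)] += 1
peak_bin = counts.index(max(counts))
peak_center = lower + (peak_bin + 0.5) * width

sample_mean = sum(samples) / len(samples)
expected_mean = (lower + mode + upper) / 3
print("most populated bin is centred near", round(peak_center, 3))
print("sample mean:", round(sample_mean, 3), "expected:", round(expected_mean, 3))
```

In this sketch the density rises linearly from the lower bound to the peak and falls linearly to the upper bound, so the peak is the most likely value and the three values are the vertices of the triangle. When the peak is off-centre, as here, the mean differs from it and equals (lower + mode + upper) / 3.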