Best practice for evaluating TEA parameters

yalinli2 commented 4 years ago

Description

Assume I want to evaluate how the minimum product selling price (MPSP) changes with the internal rate of return (IRR) under uncertainty, and I want to depict this using a figure where the x-axis is IRR, the y-axis is MPSP, and the IRR vs. MPSP correlation is a curve with an error band to express uncertainty (similar to Figure 4 in the paper Cortes-Peña et al., ACS Sustainable Chem. Eng. 2020, 8 (8), 3302–3310).

I can use the evaluate_across_coordinate function of the Model object, but I'm wondering if this is the best practice. I ask because, unlike Figure 4 in the paper, where changing the feedstock lipid content affects the system, changing TEA parameters like IRR won't affect the system. So for each Monte Carlo scenario, there's no need to re-simulate the system to get the MPSP; re-calculating the cash flow should be enough.

My impression from reading the documentation is that BioSTEAM creates Block objects that only simulate the system downstream of the change, so I'm wondering whether, in my case, this means that only the cash flow will be re-calculated when the IRR changes (which is what we want)?

Thanks!!!

yoelcortes commented 4 years ago

Great question! In the evaluate_across_coordinate method, the Monte Carlo samples are evaluated in sequence at each coordinate. With every change in the coordinate (in this case the IRR), all samples are evaluated again, so we are running a full Monte Carlo for each coordinate point.

You could save a lot of time if you perform this evaluation without using the evaluate_across_coordinate method. One way of working around this is to make an MPSP metric for each IRR point and evaluate Monte Carlo only once (using the evaluate method):

import numpy as np
import biosteam as bst
from biosteam.evaluation import Metric
from biosteam.evaluation.evaluation_tools import plot_montecarlo_across_coordinate

# `tea`, `stream`, `model`, `start`, and `stop` are placeholders for your own objects and values.
def create_MPSP_metric(IRR):
    def MPSP_metric():
        tea.IRR = IRR  # only the cash flow depends on the IRR
        return tea.solve_price(stream)
    return Metric(f'MPSP; IRR={IRR}', MPSP_metric, 'USD/kg')

IRRs = np.linspace(start, stop)
MPSP_metrics = [create_MPSP_metric(i) for i in IRRs]
# Then create your model with these metrics, load samples, and evaluate.
# Afterwards, we can make the plots by getting the metric values.
MPSP_indices = [metric.index for metric in MPSP_metrics]
MPSP_data = model.table[MPSP_indices]
plot_montecarlo_across_coordinate(IRRs, MPSP_data.values) # You should see a nice plot :)

This would actually save you a lot of RAM too. Although it really doesn't matter however you get it done, I think this would be the best practice.
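To make the cost argument concrete, here is a minimal, BioSTEAM-free sketch of the same closure pattern. `ToyTEA`, its simplified `solve_price` formula (no taxes or depreciation), `simulate_system`, and all numbers are hypothetical stand-ins invented for illustration, not BioSTEAM API. The point is only that the expensive simulation runs once per sample, while each IRR metric re-solves the cheap cash flow.

```python
import random

class ToyTEA:
    """Hypothetical stand-in for a TEA object (not the BioSTEAM class)."""
    def __init__(self):
        self.IRR = 0.10
        self.capital = 0.0        # USD, set by the "simulation"
        self.annual_cost = 0.0    # USD/yr, set by the "simulation"
        self.production = 1.0e6   # kg/yr
        self.years = 20

    def solve_price(self):
        # Price at which NPV = 0 in this toy model: annualized capital
        # plus operating cost, divided by annual production.
        r, n = self.IRR, self.years
        capital_recovery = r / (1 - (1 + r) ** -n)
        return (self.capital * capital_recovery + self.annual_cost) / self.production

tea = ToyTEA()
n_simulations = 0

def simulate_system(sample):
    """Pretend this is the expensive process simulation."""
    global n_simulations
    n_simulations += 1
    tea.capital, tea.annual_cost = sample

def create_MPSP_metric(IRR):
    def MPSP_metric():
        tea.IRR = IRR             # only the cash flow depends on the IRR
        return tea.solve_price()
    return MPSP_metric

IRRs = [0.05, 0.10, 0.15, 0.20]
metrics = [create_MPSP_metric(i) for i in IRRs]

random.seed(0)
samples = [(random.uniform(4e6, 6e6), random.uniform(2e5, 4e5)) for _ in range(100)]

table = []
for sample in samples:
    simulate_system(sample)                 # once per sample
    table.append([m() for m in metrics])    # one cheap solve per IRR

print(n_simulations)  # one simulation per sample, regardless of len(IRRs)
```

Each row of `table` then holds the MPSP at every IRR for one sample, which is the shape needed to draw a curve with an uncertainty band.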

Thanks for asking!

yalinli2 commented 4 years ago

Oh yeah indeed setting IRR as metrics is a much better/ingenious way! Thanks!!!

yalinli2 commented 4 years ago

A side question that almost drove me crazy, I don't think it's caused by BioSTEAM, but would really appreciate your help.

So I'm wondering if you know why this worked:

def create_IRR_metric(IRR):
    def get_IRR_based_MSP():
        # Is it necessary to change all IRRs? Or just IRR for CombinedTEA is enough?
        orgacids_tea.IRR = orgacids_sys_no_boiler_tea.IRR = boiler_sys_tea.IRR = IRR
        return orgacids_tea.solve_price(lactic_acid, orgacids_sys_no_boiler_tea)
    return Metric(f'MSP at IRR={IRR}', get_IRR_based_MSP, 'USD/kg')

But this would lead to an error as in the screenshot? (I wanted to use the int function so that I wouldn't have some weird numbers like "0.35000000000000003" in the column label).

def create_IRR_metric(IRR):
    def get_IRR_based_MSP():
        orgacids_tea.IRR = orgacids_sys_no_boiler_tea.IRR = boiler_sys_tea.IRR = IRR
        return orgacids_tea.solve_price(lactic_acid, orgacids_sys_no_boiler_tea)
    return Metric(f'MSP at IRR={int(IRR*100)}%', get_IRR_based_MSP, 'USD/kg')

yalinli2 commented 4 years ago

Oh BTW, using IRR as metrics indeed is very fast and I do get a pretty plot! 🎉

[Image: Figure 2020-05-12 160408]

yoelcortes commented 4 years ago

Converting a float to an int is only safe when the float is already a whole number, because int() truncates toward zero instead of rounding. For example, int(1.0) is safe, but int(0.9999999999) gives 0. Also, biosteam assumes that the metric results are floats (values are stored in arrays), so it's best to keep the results as floats. Instead, it's better to format the string:

>>> f"IRR={0.01:.0%}"
'IRR=1%'

You can read up online on formatting numbers; it's pretty useful.
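As a quick, self-contained illustration of that advice (the IRR values are arbitrary examples): `int` truncates rather than rounds, so a label built from `int(IRR*100)` can silently show a different number, whereas a format spec rounds only the text and leaves the float untouched.

```python
IRR = 0.29

# The floating-point product falls just below 29, and int() truncates toward zero.
truncated = int(IRR * 100)   # 28
rounded = round(IRR * 100)   # 29

# Format spec '.0%': multiply by 100, round to 0 decimals, append '%'.
label = f"MSP at IRR={IRR:.0%}"

# The same spec cleans up the "weird" floats that arithmetic tends to produce.
noisy = f"{0.35000000000000003:.0%}"

print(truncated, rounded, label, noisy)  # 28 29 MSP at IRR=29% 35%
```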

yoelcortes commented 4 years ago

Ahh, just noticed your other question.

Is it necessary to change all IRRs? Or just IRR for CombinedTEA is enough?

We only need to change the CombinedTEA IRR.

Thanks!

yoelcortes commented 4 years ago

@yalinli2 I just realized what your error is. You must have multiple metrics with the same name. Try adding more significant figures to the name:

>>> f"IRR={0.01123:.4%}"
'IRR=1.1230%'

yalinli2 commented 4 years ago

I think you meant that I had multiple metrics with the same name so I ran into this error (not that I should have metrics with the same name?)

@yalinli2 I just realized what your error is. You must have multiple metrics with the same name. Try adding more significant figures to the name:
>>> f"IRR={0.01123:.4%}"
'IRR=1.1230%'

Anyway, thanks for the clue! I was initially confused by this because I printed the length of var_indices(self._metrics) and values and they were the same, then I realized the problem is in pandas. When the error was triggered, my metrics looked like this with four 28% (length=44):

Whereas using orgacids_model_IRR.metrics gave me two 28% (length=42):

This was because I didn't know int(0.99) would return 0 instead of 1 😂. So when generating the IRR metrics, there were two 28% because the latter 28% was supposed to be 29%, with a real value of probably 28.9999999+%. At this stage this shouldn't have given the error, because the number of metrics was still the same (42). But somewhere in pandas, because the real value was lost, the two metrics became indistinguishable and the indices repeated themselves for the two 28%, so the total length of the metrics became 44, which triggered the error.
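A small reproduction of this effect, assuming the metric table behaves like an ordinary pandas DataFrame whose column labels are the metric names (the IRR values and table contents below are made up): when two columns share a label, selecting by the list of labels returns every match for each occurrence, so the result has more columns than there are metrics.

```python
from collections import Counter
import pandas as pd

IRRs = [0.27, 0.28, 0.29, 0.30]

# Truncating labels: 0.29 * 100 is slightly below 29, so int() gives 28.
labels = [f"MSP at IRR={int(i * 100)}%" for i in IRRs]
duplicates = [name for name, n in Counter(labels).items() if n > 1]
print(duplicates)  # the label that appears more than once

# One column per metric, as a stand-in for the results table.
table = pd.DataFrame([[1.0, 2.0, 3.0, 4.0]], columns=labels)

# Each duplicated label selects both matching columns.
selected = table[labels]
print(len(labels), selected.shape[1])  # 4 labels, 6 columns

# Formatting instead of truncating keeps every label unique.
unique_labels = [f"MSP at IRR={i:.0%}" for i in IRRs]
print(len(set(unique_labels)) == len(unique_labels))
```

This is consistent with the 42-versus-44 mismatch described above: two metrics sharing one label each pick up both columns, which adds two extra columns.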

Anyway, spent like six hours to learn the importance of data type, guess still worth it haha. Thanks for the help!!!

yalinli2 commented 4 years ago

Ahh, just noticed your other question.

Is it necessary to change all IRRs? Or just IRR for CombinedTEA is enough?

We only need to change the CombinedTEA IRR.

Thanks!

Just a quick note that changing the CombinedTEA is enough to get the correct MSP, but NPV of the CombinedTEA won't be correct because BioSTEAM calculates combined NPVs by adding up NPV of the individual TEA objects in the CombinedTEA. Thanks!!!

I've tested this with my code: the MSPs are the same in both cases, but the NPVs are not. (I also noticed that I'd have to simulate the cash flow five times to get a good NPV, although I'm not sure whether that has to do with the very large molar_tolerance I set for the system for some temporary purposes.)

Only changing IRR of CombinedTEA:

Changing all IRRs:
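The NPV difference can be illustrated with a toy calculation. This is a simplified sketch, not BioSTEAM's implementation: `npv` is the textbook discounted sum, and the two cash-flow lists are invented numbers standing in for the component TEAs. If the combined NPV is the sum of component NPVs, each discounted at its own IRR, then changing only the combined IRR leaves the component rates stale.

```python
def npv(cashflows, rate):
    """Textbook net present value; cashflows[t] falls in year t (year 0 undiscounted)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# Invented cash flows (USD) for two hypothetical component TEAs.
process_cf = [-1000.0, 400.0, 400.0, 400.0, 400.0]
boiler_cf = [-300.0, 100.0, 100.0, 100.0, 100.0]

old_IRR, new_IRR = 0.10, 0.20

# Case 1: only the combined IRR changed; the components still discount at old_IRR.
npv_stale = npv(process_cf, old_IRR) + npv(boiler_cf, old_IRR)

# Case 2: every component IRR is updated to the new value.
npv_updated = npv(process_cf, new_IRR) + npv(boiler_cf, new_IRR)

# Discounting the summed cash flows at the combined IRR matches case 2.
combined_cf = [a + b for a, b in zip(process_cf, boiler_cf)]
npv_combined = npv(combined_cf, new_IRR)

print(round(npv_stale, 2), round(npv_updated, 2), round(npv_combined, 2))
```

In this sketch, cases 2 and 3 agree while case 1 does not, which matches the observation that the NPV only comes out right when all IRRs are changed.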

yoelcortes commented 4 years ago

@yalinli2 Thanks for finding this bug! I fixed it so that a CombinedTEA object calculates NPV based on its IRR (and not the IRR of its components).

If you pip install biosteam==2.17.6, you'd be installing the latest version of thermosteam too. There have been a couple of bug fixes:

- No more cache when creating chemicals.
- Fixed a problem with loading "radical" chemicals from the database. This was not a problem for most chemicals we currently use, but some chemicals I am using had this problem. I basically improved the algorithm for identifying chemicals (it's faster and shorter too).
- Fixed an energy balance problem that may or may not be affecting your biorefinery. It didn't affect the lipidcane or cornstover results, but in some cases it creates a problem when setting stream enthalpies.

Also, BioSTEAM can now create fitted models (basically machine learning models) with high R² values (~0.95). But it's best if you don't look into this for another month or two, until I have good examples and a tutorial on this.

Thanks!

yalinli2 commented 4 years ago

@yalinli2 Thanks for finding this bug! I fixed it so that a CombinedTEA object calculates NPV based on its IRR (and not the IRR of its components).

Yep it's fixed! Thanks!

Though I think you need to fix some bugs in chemicals.py for the cornstover biorefinery, now there are properties missing for P4O10:

If you pip install biosteam==2.17.6, you'd be installing the latest version of thermosteam too. There have been a couple of bug fixes:

No more cache when creating chemicals

I like this very much! In the past I had to restart the kernel/reload chemicals from time to time, because once I modified chemical properties, the properties of tmo.Chemical('ChemicalName') would be changed as well. Now tmo.Chemical('ChemicalName') gives the properties in the database 👏

Also, BioSTEAM can now create fitted models (basically machine learning models) with high R² values (~0.95). But it's best if you don't look into this for another month or two, until I have good examples and a tutorial on this.

Looking forward to this!!!

yoelcortes commented 4 years ago

Ahh, I forgot to update the biorefineries package. It's a pretty easy fix. You can update to biorefineries==0.13.0 and it should work.

yalinli2 commented 4 years ago

Yep it’s fixed! Thanks!

On May 13, 2020, at 2:33 PM, Yoel notifications@github.com wrote:

Ahh, I forgot to update the biorefineries package. Its a pretty easy fix. You can update to biorefineries==0.13.0 and it should work


BioSTEAMDevelopmentGroup / biosteam

Best practice for evaluating TEA parameters #23