Compute FIM with Pyomo.DoE using class for regression model

adowling2 commented 1 year ago

Action items:

[x] @bbefort prototype class to generalize PR_model for the cases, place in same folder as current semi-generalized functions (linear, cubic)
[x] @jialuw96 demonstrate Pyomo.DoE with the function for one system and one experiment (in notebook)
[x] @jialuw96 creates a function in the generalized class that calls Pyomo.DoE
[ ] Code review

bbefort commented 1 year ago

@jialuw96 @adowling2 I think I have fixed up a generalized model .py file that Jialu should be able to use. Jialu, please take a look and let me know what you think and what else you might need from me. Thanks!

jialuw96 commented 1 year ago

@bbefort I am using the generalize_function to create a model and am having an attribute error as the following:

I am accomplishing the PR-3params-Opt1 model; The way I set up the parameter inputs are: # Model estimated params, read from csv params = pd.read_csv('./emimtf2n/R32/Final_Results/MBDoE/Params/PR_params_3params_Opt1.csv',header=None)

# This contains 3 parameters, kappa A [comp1, comp2], kappa A [comp2, comp1], kappaB [comp1, comp2]. I use 0 to fill for other parameters: all_params = [params[0][1], params[0][0], 0, params[0][2], 0, 0, 0, 0]

And then I tested the create_model function, getting an error as: ---------------------------------------------------------------------------AttributeError Traceback (most recent call last)/tmp/ipykernel_844149/251971107.py in 5 print(model_creation.PR_kappa_A_comp_1_comp_2) 6 ----> 7 mod = model_creation.create_model(data_exp.iloc[1])/media/psf/Home/dowlinglab/extractive-distillation2/modsel/paramest/generalize_functions.py in create_model(self, data, eps) 94 m.fs.properties.PR_kappa_C[self.comp_1, self.comp_1].fix(0) 95 m.fs.properties.PR_kappa_C[self.comp_1, self.comp_2].fix(self.PR_kappa_C_comp_1_comp_2)---> 96 m.fs.properties.PR_kappa_D[self.comp_2, self.comp_2].fix(0) 97 m.fs.properties.PR_kappa_D[self.comp_2, self.comp_1].fix(self.PR_kappa_D_comp_2_comp_1) 98 m.fs.properties.PR_kappa_D[self.comp_1, self.comp_1].fix(0)~/anaconda3/envs/pyomo/lib/python3.7/site-packages/pyomo/core/base/block.py in getattr(self, val) 520 # throw the "normal" AttributeError 521 raise AttributeError("'%s' object has no attribute '%s'"--> 522 % (self.class.name, val)) 523 524 def setattr(self, name, val):AttributeError: '_ScalarGenericParameterBlock' object has no attribute 'PR_kappa_D'

Do you see this error message before? Seems like IDAES cannot find PR_kappa_D.

I already commited code where I used this generalization function to this repo; The notebook is MBDOE-3Params-Opt1.ipynb in modsel/paramest.

Thank you!

bbefort commented 1 year ago

@jialuw96 I think what is happening is that the non-polynomial models do not have configurations which include the D parameter. I just pushed a new version of generalize_functions.py that uses an if statement to screen whether there is a polynomial model. So I think you will just need to add this if statement in as false for all of the models except the polynomial and then add it in as true for the polynomial models. Let me know if this solves the problem or if there are more issues. Thanks!

jialuw96 commented 1 year ago

@bbefort Thank you for adding this! I now get another error about the generalize_function.create_model() about initialization:

InitializationError Traceback (most recent call last) /tmp/ipykernel_884292/251971107.py in 5 print(model_creation.PR_kappa_A_comp_1_comp_2) 6 ----> 7 mod = model_creation.create_model(data_exp.iloc[1])

/media/psf/Home/dowlinglab/extractive-distillation2/modsel/paramest/generalize_functions.py in create_model(self, data, eps, polynomial) 121 # Initialize the flash unit 122 #m.fs.state_block.initialize(outlvl=idaeslog.CRITICAL) --> 123 m.fs.state_block.initialize(outlvl=50) 124 125 # Fix the state variables on the state block

/media/psf/idaes-pse/idaes/generic_models/properties/core/generic/generic_property.py in initialize(blk, state_args, state_vars_fixed, hold_state, outlvl, solver, optarg) 1435 res.solver.status != SolverStatus.ok)): 1436 raise InitializationError( -> 1437 f"{blk.name} failed to initialize successfully. Please check " 1438 f"the output logs for more information.") 1439

InitializationError: fs.state_block failed to initialize successfully. Please check the output logs for more information.

It looks this is because we use unproper value to initialize IDAES model?

Also, if needed, the part before "use pyomo.doe" in the notebook MBDOE-3params-Opt1.ipynb can be used to debug the generalize_function; the inputs are defined for this function and I used this to test if the model can be created.

Thank you!

bbefort commented 1 year ago

@jialuw96 I think this issue is that your parameter values are switched, which is my fault because I saved them in a different way than before to attempt to make this more straightforward. Now, the csv you are drawing the parameters from all have the same form and the parameters are listed in the order of (see below) with comp1=R32, comp2=emimtf2n:

kappa_A[comp1,comp2] kappa_A[comp2,comp1] kappa_B[comp1,comp2] kappa_B[comp2,comp1] kappa_C[comp1,comp2] kappa_C[comp2,comp1] kappa_D[comp1,comp2] kappa_D[comp2,comp1]

For a given model, if a parameter does not have a value (ie wasnt fit), it is listed as 0 in the csv. So, what I think you need in cell 7 of the MBDOE-3params-Opt1.ipynb notebook is:

new_params = [params[0][0], params[0][1], params[0][2], 0, 0, 0, 0, 0]

which will be in the correct parameter order you need.

Hopefully this works, and if it doesn't I have one more idea of what it could be. Let me know! Thanks!

bbefort commented 1 year ago

@jialuw96 I just updated the generalize_functions.py file with the correct PR model type names. Hopefully this helps, please let me know if you have any questions. Thanks!

jialuw96 commented 1 year ago

@adowling2 @bbefort After going over all experiments, we get a total FIM and its information (for the PR_3params_opt1 model):

======Result summary====== Four design criteria log10() value: A-optimality: 12.492773871971401 D-optimality: 32.01044208719705 E-optimality: [12.48982448 9.23427853 10.28633908] Modified E-optimality: 3.2555459511933216 FIM: [[ 6.76486223e+11 -1.00621591e+11 1.27065009e+12] [-1.00621591e+11 3.13878627e+10 -1.78804528e+11] [ 1.27065009e+12 -1.78804528e+11 2.40222247e+12]]

The condition number is around 1000 so it's fine. I have two questions right now:

What do we need to record for results? I am thinking one csv file for the total FIM (do we need to record FIM for each experiment or just one total FIM)?
Some experiments failed to initialize in IDAES. Please see the following error message:

I used a catch block to skip those experiments which cannot be initialized. For this specific system, data_exp [4, 10. 11, 12, 18, 19, 26] cannot initialize.

Thank you!

adowling2 commented 1 year ago

@jialuw96 Let's store in a json file with:

a list containing the FIM for each experiment. If you had to skip it, store None
the total FIM

I prefer a json file because is stores more digits than a csv file.

E-optimality should be a single number, not three numbers. (Or am I missing something?)

What version of IDAES are you using? Let's make sure it is the same as @bbefort.

What happens if you run your "single experiment" function for one of the data entries where the IDAES initialization failed? Does the initialization fail still? (I expect it will.) You can work with @bbefort to look at the log. It is possible there is a simple bound error. Another possibility is that we are reading parameters from a csv file. Writing to a csv file lost some of the digits of accuracy (although I hope this is not the case). Are you using MA27 or MA57 with Ipopt?

adowling2 commented 1 year ago

Also, can you push your code if you have not already?

bbefort commented 1 year ago

@jialuw96 This is good!

I am using idaes-pse version: 1.13.0.dev0

The initialization is very finicky, I tried to find this file where this initialization error is being located, but I don't see it. Can we meet sometime today to chat about this? Usually, we just need to initialize the flash column for a given point using slightly different values and it will run.

jialuw96 commented 1 year ago

====A note for initialization points working for models====

Initial point for PR_linTdep: init_temp = 303, init_pressure = 399300 , init_x_c1 = 0.45, polynomial = False For all other PR models except for poly: init_temp = 283.1, init_pressure = 399300 , init_x_c1 = 0.45, polynomial = False For PR_poly: init_temp_option=343, init_pressure_option= 150000, init_x_c1_option = 0.45,

Initial point for all SRK models except for poly: init_temp_option=323, init_pressure_option= 399800,init_x_c1_option = 0.5 SRK poly: init_temp_option=343, init_pressure_option= 150000, init_x_c1_option = 0.4

jialuw96 commented 1 year ago

@jialuw96 Let's store in a json file with:

a list containing the FIM for each experiment. If you had to skip it, store None

the total FIM

I prefer a json file because is stores more digits than a csv file.

E-optimality should be a single number, not three numbers. (Or am I missing something?)

What version of IDAES are you using? Let's make sure it is the same as @bbefort.

What happens if you run your "single experiment" function for one of the data entries where the IDAES initialization failed? Does the initialization fail still? (I expect it will.) You can work with @bbefort to look at the log. It is possible there is a simple bound error. Another possibility is that we are reading parameters from a csv file. Writing to a csv file lost some of the digits of accuracy (although I hope this is not the case). Are you using MA27 or MA57 with Ipopt?

E-optimality: I am printing all eigenvalues, so the minimal one should be E-optimality (changing this in code) IDAES version: we use the same version 1.13.0 dev 0 Initialization: we are working on it. Bridgette adds toggles to change initialization point in create_model. I am recording a list of initialization numbers working for different models.

bbefort commented 1 year ago

To Discuss in 10/12 Meeting:

[x] JW checkin with AD on updates
[x] Discuss BB thoughts -- what to include for the paper? DOE analysis for just one system, a couple of models? Note: there may be other changes to initialization values that JW and I will have to look at for individual models/systems -- but, how to calculate identifiability for all the models? BB thought: we just do this for one system to limit general initialization hassle for JW

bbefort commented 1 year ago

Summary:

-- BB: update init in the generalize_functions.py, add in SRK to the general functions

-- Paper: do full FIM analysis for R32/emimTF2N systems; apply FIMs to the weird cases for the other two systems

-- JW: calculate FIMs for R32/emimTF2N models, send results to BB for identifiability analysis in json file form

-- BB, JW: send KW data for one model for info content analysis

-- JW: heat map for one model (same as KW) for MBDoE

"favorite model": PR-quad

bbefort commented 1 year ago

@jialuw96 I updated generalize function with the initialization changes and to include the SRK model. Please let me know if you have any issues using this or any other questions. Thanks!

jialuw96 commented 1 year ago

@bbefort Thank you for adding SRK modules! The json files are in paramest/emimtf2n_FIM_info.

Currently all the PR model FIMs are ready in the folder except the one for the PR_polyTdep model (Wrong initialization; we need to find an initialization point for it). Other PR models can run all experiments with no error.
SRK model has an error message:

==========Experiment index: 0 ===============

AttributeError Traceback (most recent call last) /tmp/ipykernel_1217775/639202642.py in 5 6 #exp_set = 0 ----> 7 total_fim = mbdoe_obj.sumDOE(exp_set, scale_opt=True, record_name="SRK_1Param_Opt1") 8 #total_fim = mbdoe_obj.doe(exp_set, scale=True)

/media/psf/Home/dowlinglab/extractive-distillation2/modsel/paramest/MBDOE.py in sumDOE(self, exp_idx_set, scale_opt, record_name, init_temp_option, init_pressure_option, init_x_c1_option) 60 61 res = self.doe(i, scale=scale_opt, init_temp_opt = init_temp_option, ---> 62 init_pressure_opt = init_pressure_option, init_x_c1_opt = init_x_c1_option) 63 64 record[str(i)] = res.FIM.tolist()

/media/psf/Home/dowlinglab/extractive-distillation2/modsel/paramest/MBDOE.py in doe(self, exp_idx, scale, init_temp_opt, init_pressure_opt, init_x_c1_opt) 99 100 createmod = self.create_model_object.create_model(self.data_exp.iloc[exp_idx], init_temp=init_temp_opt, --> 101 init_pressure = init_pressure_opt, init_x_c1 = init_x_c1_opt) 102 103 # Control time set [h]

/media/psf/Home/dowlinglab/extractive-distillation2/modsel/paramest/generalize_functions.py in create_model(self, data, eps, init_temp, init_pressure, init_x_c1, polynomial) 272 if polynomial == False: 273 # parameters --> 274 m.fs.properties.SRK_kappa_A[self.comp_2, self.comp_2].fix(0) 275 m.fs.properties.SRK_kappa_A[self.comp_2, self.comp_1].fix(self.SRK_kappa_A_comp_2_comp_1) 276 m.fs.properties.SRK_kappa_A[self.comp_1, self.comp_1].fix(0)

~/anaconda3/envs/pyomo/lib/python3.7/site-packages/pyomo/core/base/block.py in getattr(self, val) 520 # throw the "normal" AttributeError 521 raise AttributeError("'%s' object has no attribute '%s'" --> 522 % (self.class.name, val)) 523 524 def setattr(self, name, val):

AttributeError: '_ScalarGenericParameterBlock' object has no attribute 'SRK_kappa_A'

I've committed all code in my machine. Thank you!

adowling2 commented 1 year ago

@jialuw96 I think you can start making the heatmap for FIM while Bridgette debugs the SRK model. We want to consider the current FIM (all experiments) as the prior for the heatmap. @bbefort can give you the ranges for the design parameters. We might have to iterate a few times.

bbefort commented 1 year ago

@jialuw96 Thanks!

a) polynomial initialization: let's try initializing at T = 323 K. l think this will work, but let me know if it doesn't.

b) SRK model error: this is coming from the fact that you are not using the correct configuration file. You can find the correct file at emimtf2n/R32/FinalResults/MBDoE/hfc32_emimTf2n_SRK.py. However, I was looking for where you called the configuration for the PR systems and couldn't find it... we should take a look at this.

c) Let's try these ranges for the T, x design parameters: T= 273-360 K x=0.1-0.9

I can meet this afternoon to discuss these, just let me know. Thanks!

bbefort commented 1 year ago

Also, @jialuw96, where is the folder/file in which you are running the FIM code? I see the json results, but not where you ran the code to get them.

jialuw96 commented 1 year ago

Also, @jialuw96, where is the folder/file in which you are running the FIM code? I see the json results, but not where you ran the code to get them.

@bbefort They are in the same repository as generalize_function.py; Codes are in MBDOE.py, fim_doe.py, and they are run in MBDOE-system-PR3paramOpt1.ipynb.

jialuw96 commented 1 year ago

@jialuw96 Thanks!

a) polynomial initialization: let's try initializing at T = 323 K. l think this will work, but let me know if it doesn't.

b) SRK model error: this is coming from the fact that you are not using the correct configuration file. You can find the correct file at emimtf2n/R32/FinalResults/MBDoE/hfc32_emimTf2n_SRK.py. However, I was looking for where you called the configuration for the PR systems and couldn't find it... we should take a look at this.

c) Let's try these ranges for the T, x design parameters: T= 273-360 K x=0.1-0.9

I can meet this afternoon to discuss these, just let me know. Thanks!

@bbefort Thank you!

b) I now define configuration files as this:

model_option = "PR" if_poly = True

if model_option=="PR": if not if_poly: from hfc32_emimtf2n_PR import configuration else: from hfc32_emimtf2n_PR_polynomial import configuration

elif model_option=="SRK": if not if_poly: from hfc32_emimtf2n_SRK import configuration else: from hfc32_emimtf2n_SRK_polynomial import configuration

Is this right?

a) 323 K still doesn't work; I am looking at if I am setting the polynomial model settings wrong. Let's talk this noon.

c) Get it.

jialuw96 commented 1 year ago

@bbefort .json files for all models are in the emimtf2n_fIM_info folder, excpet for one model: SRK polyTdep. The model parameters are:

{'fs.properties.SRK_kappa_A["R32", "emimTf2N"]': 2.918662512031448, 'fs.properties.SRK_kappa_A["emimTf2N", "R32"]': 4.749648760703987, 'fs.properties.SRK_kappa_B["R32", "emimTf2N"]': -7.785346397132874, 'fs.properties.SRK_kappa_B["emimTf2N", "R32"]': -10.1775966763575, 'fs.properties.SRK_kappa_C["R32", "emimTf2N"]': -2.0625627470895584, 'fs.properties.SRK_kappa_C["emimTf2N", "R32"]': -1.93209724179498, 'fs.properties.SRK_kappa_D["R32", "emimTf2N"]': 0.0, 'fs.properties.SRK_kappa_D["emimTf2N", "R32"]': 0.0}

There are two 0.0s here, is this right? Thank you!

bbefort commented 1 year ago

Awesome, thanks @jialuw96. I just pushed the updated parameters for the polyTdep SRK system. Let me know if there are problems!

jialuw96 commented 1 year ago

@adowling2 @bbefort Here are heatmaps for PR_quad model. Fixed design variables are Pressure (fixed at 549300). Prior information is the total FIM we added up for all experiments (The json file 'total'):

I am wondering:

Do you have questions about the heatmaps?
Is there anything I should change about the figure format? I found that the Temperature axis labels can be too crowded and am going to figure this out. Do you prefer more grids here? (I used ~10*10 grids and can draw maybe 20*20 to make this heatmap look more smooth)
For next step, I plan to: 1. save the heatmap data; 2. should I begin to do the summing up FIM for other systems , or continue to do heatmaps for other models of this system 'emimtf2n'?

Thank you!

adowling2 commented 1 year ago

@jialuw96 A few comments:

How is the FIM scaled? 10^68 is a huge number. It may raise concerns about the reliability of the calculation.
How long does it take to generate the heatmap?
Please separate the calculations from the plotting. We want to save the total FIM for each point separately. This way we can fine tune the plotting code.
For the publication (and my talk early next week), we will want to improve the axis labels. For example, the numbers under temperature a running together.
Instead of heatmaps, can we make contour plots? See https://ndcbe.github.io/data-and-computing/notebooks/01/Publication-Quality-Figures.html for good examples.
Overall, we should follow the plotting guidelines on the above link.

@bbefort

Can you think of a physical justification that 360 K and 0.8 (mole fraction??) is the most informative next experiment?

bbefort commented 1 year ago

This is interesting, thanks @jialuw96 for pulling these together!

@adowling2 what I am thinking is that higher temperatures/mole fractions could be getting us into an LLE region? I'm not sure why 0.8 specifically here, but that would likely be solidly in the LLE region. I know they have looked at LLE for other systems (R134a) with the emimTF2N and this T/x combination is where it would about lie.

@jialuw96 when you run this again to fix the plots, could you increase the temperature range from 273-360K to 273-400K? I think this may be more inclusive of potential HFC/IL temperature measurement ranges (I have an email out to our KU collaborator to confirm this).

@jialuw96 if you can focus on getting these plots cleaned up by tomorrow, we can then chat in our meeting about what we want to focus on next. Thanks!

adowling2 commented 1 year ago

@bbefort Does our model predict LLE? (What is the phase prediction at these conditions?). Is this far away from our previous data?

@jialuw96 @bbefort When we clean up the plots, we can overlay the training data onto the plot. We could also make a heatmap/contour plot that shows the predicted pressure as a function of composition and temperature. Those predictions should be calculated already as part of the calculations that go into the FIM heatmaps. Moreover, checking the predicted pressure would help give us confidence in the calculations.

bbefort commented 1 year ago

@adowling2 we have not tried to do LLE predictions with our model, but Yokozeki has previously showed that cubic EoS (RK, vdW) can make predictions in the LLE range. Our current data goes up to 348 K and the largest mole fraction measured at 348K is ~0.3. I will see how our model performs at 360K, 0.8 mole fraction. It might explode and this might not be feasible.

Also, I would like to chat in our meeting tomorrow (JW is attending) about what it means to fix pressure at a certain value, eg ~0.5 MPa. I'm not sure how that impacts the results if we are just checking the Tx combinations at a certain pressure?

jialuw96 commented 1 year ago

@adowling2 @bbefort I use ranges for Temperature. from 273 to 400 K here:

Answers to questions:

Let's talk in our meeting. Thank you!

dowlinglab / HFC-IL-thermodynamic-model-selection

Compute FIM with Pyomo.DoE using class for regression model #13

==========Experiment index: 0 ===============