Pre-Processing: Sensitivity Analysis #17

Closed by p-lauer 5 years ago

In GitLab by @hehnen1 on Nov 22, 2017, 18:49

Based on the input file, perform a sensitivity analysis on the input parameters. The aim is to reduce the number of input parameters for the optimisation process. Furthermore, it should give an idea of which parameters have the largest influence on the result.

Possible methods could be ANOVA or FAST.

Create an artificial TGA case to check whether the method used is correct and useful.
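
A minimal sketch of what such an artificial TGA case could look like, assuming a single-step Arrhenius reaction at a constant heating rate. It is neither ANOVA nor FAST: it uses a plain one-at-a-time perturbation, only to show the idea of a test case with a known answer. The model, the parameter values and the `dummy` parameter (which the model ignores on purpose) are assumptions made for illustration.

```python
import math

R = 8.314  # universal gas constant in J/(mol K)


def tga_mass_curve(params, t_start=300.0, t_end=900.0, steps=600,
                   heating_rate=10.0 / 60.0):
    """Residual mass fraction for a single-step Arrhenius reaction.

    Temperatures are in K, heating_rate in K/s. params["dummy"] is
    deliberately not used, so its sensitivity must come out as zero.
    """
    pre_exp = params["A"]        # pre-exponential factor in 1/s
    act_energy = params["E"]     # activation energy in J/mol
    d_temp = (t_end - t_start) / steps
    d_time = d_temp / heating_rate
    mass = 1.0
    curve = []
    for i in range(steps):
        temp = t_start + i * d_temp
        rate = pre_exp * math.exp(-act_energy / (R * temp)) * mass
        # Explicit Euler step, clipped so the mass never turns negative.
        mass = max(mass - rate * d_time, 0.0)
        curve.append(mass)
    return curve


def rmse(curve_a, curve_b):
    """Root mean square difference between two curves of equal length."""
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(curve_a, curve_b)) / len(curve_a))


def one_at_a_time(nominal, rel_step=0.1):
    """Shift each parameter by rel_step and measure the curve change."""
    base = tga_mass_curve(nominal)
    effects = {}
    for name in nominal:
        shifted = dict(nominal)
        shifted[name] = nominal[name] * (1.0 + rel_step)
        effects[name] = rmse(tga_mass_curve(shifted), base)
    return effects


# Illustrative values only, not taken from any real material.
nominal = {"A": 1.0e10, "E": 1.5e5, "dummy": 1.0}
effects = one_at_a_time(nominal)
```

Any method under test should reproduce the same qualitative picture on this case: no influence from `dummy`, and a much stronger influence from `E` than from `A`.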

In GitLab by @hehnen1 on Nov 22, 2017, 18:49

assigned to @vinayak1

In GitLab by @hehnen1 on Nov 23, 2017, 10:07

@vinayak1 There is already a script propti_pre-processing.py in the directory propti. Maybe you can expand on it.

In GitLab by @vinayak1 on Nov 23, 2017, 10:28

@hehnen1 Thanks for the help!

In GitLab by @vinayak1 on Nov 23, 2017, 11:32

Are we open to using SALib (http://salib.readthedocs.io/en/latest/#)? It seems to provide good functionality, and the Git repository seems to be functional. It provides a rather good set of sensitivity estimation algorithms.

In GitLab by @hehnen1 on Nov 23, 2017, 11:38

@vinayak1 I was thinking a little about this topic. Up to now the pre-processing script is only a collection of different functions/methods without direct connection to the pickle files, like propti_run.py.

However, the sensitivity analysis we are envisioning would be connected to pickle.init, since it is supposed to use information from there. This raises the question of whether it makes sense to split the pre-processing into a "direct" and an "indirect" script. I could imagine situations where I would need functions from pre-processing without having set up a pickle.init, because I would need the information from pre-processing to do so.

In GitLab by @hehnen1 on Nov 23, 2017, 11:43

Well, I think if it provides us with good functionality then it is fine.

It should then be implemented in a similar way as the connection to SPOTPY, thus we could add connections to different packages as well.

Maybe the different functions should be prefixed with the package's name, like "SALib_FAST". This way we could prevent mix-ups when multiple connected packages provide (different) implementations of the same algorithm/method.
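
One way this naming scheme could be sketched is a small registry keyed by "<package>_<method>". The registry, the decorator and the placeholder function bodies are hypothetical; a real wrapper would call into SALib or SPOTPY instead of returning a string.

```python
# Hypothetical registry: maps "<package>_<method>" to a callable.
SENSITIVITY_METHODS = {}


def register(package, method):
    """Decorator that stores a function under '<package>_<method>'."""
    def decorator(func):
        key = f"{package}_{method}"
        if key in SENSITIVITY_METHODS:
            raise ValueError(f"{key} is already registered")
        SENSITIVITY_METHODS[key] = func
        return func
    return decorator


@register("SALib", "FAST")
def salib_fast(setup):
    # Placeholder body; a real wrapper would call SALib here.
    return f"SALib FAST on {setup}"


@register("SPOTPY", "FAST")
def spotpy_fast(setup):
    # Placeholder body; a real wrapper would call SPOTPY here.
    return f"SPOTPY FAST on {setup}"


def run_method(name, setup):
    """Look up a method by its prefixed name and run it."""
    if name not in SENSITIVITY_METHODS:
        available = ", ".join(sorted(SENSITIVITY_METHODS))
        raise ValueError(f"unknown method {name!r}; available: {available}")
    return SENSITIVITY_METHODS[name](setup)
```

With this, `run_method("SALib_FAST", ...)` and `run_method("SPOTPY_FAST", ...)` stay distinguishable even though both packages offer a FAST implementation.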

In GitLab by @vinayak1 on Nov 23, 2017, 12:45

Okay. I'll think about this a little more and come up with a UML diagram to represent what we can do. SALib seems to provide good functionality, its philosophy is similar to that of SPOTPY, and it looks like a good fit at first glance.

In GitLab by @vinayak1 on Nov 29, 2017, 16:09

My idea about this is that we initially run the propti_prepare.py script and use the data from the pickle file as required for the sensitivity analysis.

The sensitivity analysis requires only a range around which the analysis is performed, and since we already provide an input file with this data, we could use it directly.
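
As a rough sketch of how the ranges could be reused: SALib describes a problem as a dictionary with the keys `num_vars`, `names` and `bounds`, so the parameter limits could be mapped onto that. The parameter records below (`name`, `min_value`, `max_value`) and their values are assumptions for illustration, not the actual layout of the propti pickle file.

```python
def build_problem(parameters):
    """Turn parameter records with limits into a SALib-style problem dict."""
    names = []
    bounds = []
    for par in parameters:
        low, high = par["min_value"], par["max_value"]
        if not low < high:
            raise ValueError(f"invalid range for {par['name']}: {low}..{high}")
        names.append(par["name"])
        bounds.append([low, high])
    return {"num_vars": len(names), "names": names, "bounds": bounds}


# Hypothetical parameter records; names and limits are made up.
parameters = [
    {"name": "activation_energy", "min_value": 1.0e5, "max_value": 2.0e5},
    {"name": "pre_exp_factor", "min_value": 1.0e8, "max_value": 1.0e12},
]
problem = build_problem(parameters)
```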

I am currently researching methods by which time-series data can be correctly analysed globally for sensitivity.

See summary in attachment. sensitivity_idea

In GitLab by @hehnen1 on Nov 29, 2017, 16:42

That seems to be a good plan!

In general, I would like to use the pickle file as the start (point 1 actually, point 0 would be the input.py of course...) for most, possibly all, of the propti functionality. In case one needs to provide more information for the sensitivity analysis than is already present in the propti.pickle.init, I would propose we create a new class within DataStructures.py.

EDIT: Corrected the pickle file name.

In GitLab by @hehnen1 on Nov 29, 2017, 16:47

On a slightly different note: Would it maybe be useful to have the possibility to just update propti.pickle.init, instead of performing a full propti_prepare.py run?
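
A minimal sketch of such an in-place update, assuming for illustration that the pickle file holds a plain dictionary. The real content of propti.pickle.init may well be structured differently, and the keys used here are made up.

```python
import pickle
import tempfile
from pathlib import Path


def update_pickle(path, **changes):
    """Load a pickled dict, apply changes, and write it back."""
    path = Path(path)
    with path.open("rb") as handle:
        state = pickle.load(handle)
    state.update(changes)
    # Write to a temporary file first, then swap it in, so a crash
    # in the middle does not leave a half-written pickle behind.
    tmp_path = path.with_name(path.name + ".tmp")
    with tmp_path.open("wb") as handle:
        pickle.dump(state, handle)
    tmp_path.replace(path)
    return state


# Demonstration in a throwaway directory with hypothetical keys.
with tempfile.TemporaryDirectory() as tmp_dir:
    init_file = Path(tmp_dir) / "propti.pickle.init"
    with init_file.open("wb") as handle:
        pickle.dump({"repetitions": 100, "algorithm": "sceua"}, handle)
    updated = update_pickle(init_file, repetitions=500)
    with init_file.open("rb") as handle:
        reloaded = pickle.load(handle)
```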

In GitLab by @lauer1 on Nov 30, 2017, 11:09

http://fb09-pasig.umwelt.uni-giessen.de/spotpy/Tutorial/7-Advanced_hints/#fast-sensitivity-analysis

From spotpy/spotpy/algorithms/fast.py:

:author: Tobias Houska and the SALib team

The presented code is based on SALib

In GitLab by @hehnen1 on Nov 30, 2017, 11:37

How about three different options:

1) optimisation without sensitivity analysis (current process),
2) sensitivity analysis and stop (store the sensitivity analysis data in a pickle file),
3) automatically run the sensitivity analysis and afterwards the inverse modelling process (future implementation).

EDIT: Wording

In GitLab by @hehnen1 on Nov 30, 2017, 15:10

In cases 2 and 3 (from above), propti should create a summary of the sensitivity analysis and store it in a file.
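
The three options and the summary requirement could be sketched roughly as follows. The enum, its member names and the step descriptions are hypothetical and only show the intended control flow.

```python
from enum import Enum


class RunMode(Enum):
    OPTIMISE = 1                    # option 1: current process
    SENSITIVITY = 2                 # option 2: sensitivity analysis and stop
    SENSITIVITY_THEN_OPTIMISE = 3   # option 3: both, one after the other


def plan_steps(mode):
    """Return the ordered list of steps for the given run mode."""
    steps = []
    if mode in (RunMode.SENSITIVITY, RunMode.SENSITIVITY_THEN_OPTIMISE):
        steps.append("run sensitivity analysis")
        steps.append("write sensitivity summary file")
    if mode in (RunMode.OPTIMISE, RunMode.SENSITIVITY_THEN_OPTIMISE):
        steps.append("run optimisation")
    return steps
```

Keeping the summary step tied to the sensitivity step means it cannot be forgotten in either option 2 or option 3.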

In GitLab by @vinayak1 on Dec 7, 2017, 08:51

Hey there, this seems to be coming to a point where I need some good data for a test example.

I am almost done with writing the code and would like to know whether any of you has some kind of FDS file/input.py where you have varied only 2-3 parameters and have a good idea about the sensitivity from manual checks, so that I can test my code. (I think @trettin1 had something to offer here, but anything would do :))

Cheers!

In GitLab by @vinayak1 on Dec 14, 2017, 21:55

Hey there, I've implemented the FAST algorithm using SPOTPY in a branch. I haven't merged it yet. I tried it today on tga_analysis_01 and it seems to give good results. You will find the plot and some other data below.

The first plot shows the comparison of various optimization parameters in the experiment with their nominal (average) values. The 'ranks' in the legend indicate the sensitivity of each parameter. For example, rank1 means highest sensitivity (the sensitivity values can be seen in figure 2; a higher total value means more sensitive). The plots didn't make a lot of sense to me because they didn't have any measure of error associated with them, so I plotted RMS values (figure 3), which indicate the error of the simulation with average values.
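
For reference, a small sketch of the two quantities mentioned here: ranking parameters by their total sensitivity value (rank 1 is the most sensitive) and a root mean square error between two curves. The parameter names and numbers are made up for illustration and are not the values from the figures.

```python
import math


def rank_by_total_sensitivity(total_indices):
    """Map each parameter name to its rank; rank 1 has the highest value."""
    ordered = sorted(total_indices, key=total_indices.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def rmse(simulated, reference):
    """Root mean square error between two curves of equal length."""
    if len(simulated) != len(reference):
        raise ValueError("curves must have the same length")
    squared = sum((s - r) ** 2 for s, r in zip(simulated, reference))
    return math.sqrt(squared / len(reference))


# Made-up total sensitivity values for three hypothetical parameters.
total_indices = {"par_a": 0.10, "par_b": 0.65, "par_c": 0.25}
ranks = rank_by_total_sensitivity(total_indices)

# Made-up curves: every point is off by exactly 1.0.
error = rmse([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
```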

I want to ask you guys whether it makes sense to implement this kind of functionality (or any other ideas), where plots and error values are shown. I'm still thinking about how to implement that and am open to new ideas. The current idea is to include another convenience function in propti_postprocessing.py that handles the plots.

Figure 1: Plot of various TGA simulations Figure_1

Figure 2: Sensitivity Values Screen_Shot_2017-12-14_at_21.44.55

Figure 3: RMSE Screen_Shot_2017-12-14_at_21.46.16

In GitLab by @hehnen1 on Dec 15, 2017, 09:23

It looks like a good start. I think it is a good idea to provide functionality within propti to process the results of a sensitivity analysis. Maybe we could also provide functionality to store the sensitivity results in a CSV file, on the same level (directory) where we store the propti_db.csv. How about creating a new directory there for output? In there, a sub-directory would be created containing propti_sensitivity_db.csv, as well as possible plots.

The propti_sensitivity_db.csv could contain the information to create the plot and the data provided in figures 2 and 3.
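
A minimal sketch of writing such a file, using only the standard library. The file name propti_sensitivity_db.csv is from the proposal above; the directory names `output` and `sensitivity`, the column names and the example row are assumptions for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column layout for the sensitivity results.
FIELDS = ["parameter", "first_order", "total", "rmse"]


def write_sensitivity_db(base_dir, rows, sub_dir="sensitivity"):
    """Write rows to <base_dir>/output/<sub_dir>/propti_sensitivity_db.csv."""
    out_dir = Path(base_dir) / "output" / sub_dir
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / "propti_sensitivity_db.csv"
    with target.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return target


# Demonstration in a throwaway directory with a made-up result row.
with tempfile.TemporaryDirectory() as tmp_dir:
    rows = [{"parameter": "E", "first_order": 0.6, "total": 0.7, "rmse": 0.05}]
    target = write_sensitivity_db(tmp_dir, rows)
    relative_path = target.relative_to(tmp_dir).as_posix()
    lines = target.read_text().splitlines()
```

Plots could be saved into the same sub-directory, so everything belonging to one sensitivity run stays together.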

In GitLab by @hehnen1 on Dec 15, 2017, 09:23

Just to be sure: in figure 1 the different ranks show the influence of a specific parameter. Why does rank 3 have two peaks?

In GitLab by @vinayak1 on Dec 15, 2017, 10:00

The problem currently is that the sample function of the FAST algorithm implemented in SPOTPY doesn't return anything! It only prints output to the screen. I'm speaking to Thouska to try and get this implemented. If that happens, then we could do the CSV file implementation for sure. Additionally, I asked him to implement a restart function in FAST and he has done that as well, so in the case of sensitivity simulations (large number of simulations) we should be good on the cluster. There is still a bug in the restart function for FAST in SPOTPY, and the fix should be up on GitHub by next week I think (depends completely on Thouska :/ )

As far as plotting goes, I was wondering whether it would be a good idea to have a PyQt-based GUI for plotting. This connects to #25 as well, and would give us good functionality for creating plots with a few clicks, without having to mess around with Python every time a plot is needed.

Done
