equinor / semeio

Semeio is a collection of jobs and workflow jobs used in ert (https://github.com/equinor/ert).
https://github.com/equinor/semeio
GNU General Public License v3.0

AHM Analysis Performance #318

Open lars-petter-hauge opened 3 years ago

lars-petter-hauge commented 3 years ago

The work in https://github.com/equinor/semeio/pull/231 will introduce a workflow with some possible caveats with regard to performance.

1) The workflow runs the update step n*2 times, where n is the number of observations. This will take a significant amount of time for cases with many observations.

2) We are loading the grid into a pandas dataframe and applying functions. For large grid files this could result in memory issues.

3) The Field parameters are also not available in the current ERT API, so the job needs to export the parameters to disk and load them from there.

lars-petter-hauge commented 3 years ago

To address the number of observations, we could consider allowing the user to specify a list of observations which they would like to include in the analysis (the remaining observations could all be included or excluded based on preference).
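A minimal sketch of how such a selection could work. The function `select_observations` and the `include_remaining` flag are hypothetical names, not part of semeio or ERT. It also assumes that "included" means the remaining observations are lumped into one extra group, which is only one possible reading of the proposal.

```python
from typing import Iterable, List


def select_observations(
    all_obs: Iterable[str],
    selected: Iterable[str],
    include_remaining: bool = False,
) -> List[List[str]]:
    """Return the observation groups to run an update for.

    Hypothetical helper, not part of semeio or ERT. Each selected
    observation gets its own group. The remaining observations are
    either lumped into one extra group or dropped entirely.
    """
    all_obs = list(all_obs)
    selected_set = set(selected)

    # Fail early on names that do not exist in the case.
    unknown = sorted(selected_set - set(all_obs))
    if unknown:
        raise ValueError(f"Unknown observations: {unknown}")

    # One group per selected observation, in the original order.
    groups = [[name] for name in all_obs if name in selected_set]

    # Assumption: "included" means one combined group for the rest.
    remaining = [name for name in all_obs if name not in selected_set]
    if include_remaining and remaining:
        groups.append(remaining)
    return groups
```

With a selection like this, the number of update runs would scale with the length of the user's list instead of with the total number of observations.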

mareco701 commented 3 years ago

We could also let the user choose whether to include Field parameters in the evaluation, so that the results for the other parameters are at least available if the Field parameters make the script fail.
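A rough sketch of such a switch. The names `evaluate_parameters` and `include_fields`, and the `"FIELD"` type label, are assumptions made for illustration and do not reflect the actual semeio job configuration.

```python
from typing import Callable, Dict


def evaluate_parameters(
    parameter_types: Dict[str, str],
    evaluate: Callable[[str], float],
    include_fields: bool = True,
) -> Dict[str, float]:
    """Evaluate each parameter, optionally skipping Field parameters.

    Hypothetical helper: `parameter_types` maps a parameter name to a
    type label, and `evaluate` is whatever per-parameter calculation
    the job performs.
    """
    results = {}
    for name, kind in parameter_types.items():
        # Skip the expensive Field parameters when the user opts out.
        if kind == "FIELD" and not include_fields:
            continue
        results[name] = evaluate(name)
    return results
```

Setting `include_fields=False` would then give results for the scalar parameters without loading any grid data.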

oyvindeide commented 2 years ago

Now that this has been tested a while, are there any observations in terms of the performance we should address @mareco701?

mareco701 commented 2 years ago

Hi, yes, the memory issue is quite a problem and makes the script fail in quite a few cases with Field parameters (for instance the Drogon synthetic case, where several field parameters are used, and cases where a large grid file is used). The way the script works today, all the field grid parameters for all the n*2+1 observations are stored in one dataframe.
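One possible way to avoid holding everything in one dataframe is to load one updated field at a time and keep only a small summary per update. The sketch below illustrates that idea with plain Python sequences. The helper names are hypothetical, and the mean absolute change is only a stand-in for whatever statistic the job actually computes.

```python
from typing import Callable, Dict, Iterable, Sequence


def mean_abs_change(prior: Sequence[float], posterior: Sequence[float]) -> float:
    """Mean absolute per-cell change (stand-in metric, an assumption)."""
    if len(prior) != len(posterior):
        raise ValueError("prior and posterior must have the same number of cells")
    if not prior:
        raise ValueError("field must contain at least one cell")
    return sum(abs(b - a) for a, b in zip(prior, posterior)) / len(prior)


def summarise_updates(
    prior: Sequence[float],
    update_names: Iterable[str],
    load_posterior: Callable[[str], Sequence[float]],
) -> Dict[str, float]:
    """Reduce each update to one number instead of storing every grid.

    `load_posterior` is a hypothetical loader, e.g. one that reads the
    exported field for a single update from disk.
    """
    summary = {}
    for name in update_names:
        # Only one posterior grid is referenced at any time; the previous
        # one becomes garbage once `posterior` is rebound.
        posterior = load_posterior(name)
        summary[name] = mean_abs_change(prior, posterior)
    return summary
```

Peak memory then depends on the size of one grid rather than on the number of updates, at the cost of reading each field from disk once per update.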

berland commented 11 months ago

@dafeda, do you have any input on whether there is a reason to keep this issue open?

dafeda commented 11 months ago

Performance of the AHM analysis was recently discussed, so this might still be relevant. Any thoughts, @oyvindeide?