Why are they writing to the same scaling constant? The decision was made quite recently in semeio, to use the same scaling constant. What was the decision-making process like?
If I can scale pre-simulation, post-simulation, pre-update, post-update, and at arbitrary times outside the hooking system, it seems to me we need to discuss whether the scaling constant is to intersect or not in these cases.
Note that both the change of operator and the second constant would induce changes in libres. If we change the operator, then we need to change it both in the scaling job and in the ES MDA algorithm. In addition, we would have to make sure that each iteration resets the scaling factor to 1.
A deep dive into the ES_MDA code in libres reveals that the scaling constant is seemingly only updated in an obs_data struct that is later discarded. So the scaling constant is not updated on the real observation. @markusdregi @sondreso could probably ack/nack this.
> Why are they writing to the same scaling constant? The decision was made quite recently in semeio, to use the same scaling constant. What was the decision-making process like?

Note that there already exists a correlation scaling job in libres: enkf_main_std_scale_correlated_obs_JOB. This is the job that will be superseded by SC from semeio. So the design decision might have been influenced a bit ;)
If I understand correctly, for every iteration in one of the updates, enkf_main will copy the observations from the previous iteration. This is done in static void enkf_main_update__. The allocation of the observations in this function uses the global_std_scaling from the analysis_config:
double global_std_scaling = analysis_config_get_global_std_scaling(analysis_config);
obs_data_type * obs_data = obs_data_alloc(global_std_scaling);
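// Every obs_block later allocated from this obs_data carries global_std_scaling
// (it is multiplied back in by obs_block_iget_std, shown further down in the thread).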
It is this global_std_scaling factor in the analysis_config that has been set when running es_mda from ert:
> In particular, the algorithm overwrites the scaling from the scaling job

This is true because the update section in the RunModel sets the global_std_scaling, I suppose, and the workflow with the job itself is run in advance?
ert/ert_shared/models/multiple_data_assimilation.py:
EnkfSimulationRunner.runWorkflows(HookRuntime.PRE_UPDATE, ert=ERT.ert)
self.update( run_context , weights[iteration])
EnkfSimulationRunner.runWorkflows(HookRuntime.POST_UPDATE, ert=ERT.ert)
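# Per the discussion below: the correlated-observation scaling job runs in the PRE_UPDATE
# hook and sets the per-observation std_scaling, while self.update() sets the iteration's
# global_std_scaling, which is picked up when obs_data is allocated.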
There is an enkf_obs_scale_correlated_std (https://github.com/equinor/libres/blob/4e90285db47057096d958decb1f76a126d6bbbfb/lib/enkf/enkf_obs.cpp#L1166), but this is only run if std_scale_correlated_obs in the analysis_config is True (False by default), or when run through the workflow job enkf_main_std_scale_correlated_obs_JOB.
Was wondering about enkf_obs_scale_std as well, as this calls enkf_obs_local_scale_std, which seems to do some scaling. But enkf_obs_scale_std is only called in a test (and I searched across all relevant GitHub repos).
Otherwise, enkf_obs_local_scale_std is only called by the two aforementioned functions, enkf_obs_scale_std and enkf_obs_scale_correlated_std. Although it is exposed in the Python API as localScaleStd, I can't find it being called anywhere (again, searched across relevant repositories). Could it be that it is called within a user-made job?
> introduce a second scaling constant in storage is possible.

Is the std_scaling written to storage? Not even the observations hit disk, if I recall correctly?
> In addition we would have to make sure that each iteration resets the scaling factor to 1.

From what I've seen so far, it doesn't. Rather, the scaling is set explicitly in both code paths and never reset. But it shouldn't be much of a problem to change (given that none of our users expect the behaviour).
> Was wondering about enkf_obs_scale_std as well, as this calls enkf_obs_local_scale_std, which seems to do some scaling. But enkf_obs_scale_std is only called in a test (and I searched across all relevant GitHub repos).

enkf_obs_local_scale_std is the one you want; I'm assuming enkf_obs_scale_std is the result of append only, with the intent of expanding the scaling functionality.

> Otherwise, enkf_obs_local_scale_std is only called by the two aforementioned functions, enkf_obs_scale_std and enkf_obs_scale_correlated_std. Although it is exposed in the Python API as localScaleStd, I can't find it being called anywhere (again, searched across relevant repositories). Could it be that it is called within a user-made job?

There is a Python test calling the Python API IIRC, but my opinion is that these functions should be deprecated as soon as the new jobs are past the testing phase. The way the job was called in the past is through enkf_main_std_scale_correlated_obs_JOB.
> enkf_obs_local_scale_std is the one you want

This is used neither by the es_mda approach, nor with the CS job we have made in semeio, correct? The former sets the global_std_scaling for the observation, which is used on allocation, while the latter uses updateStdScaling, which sets the scaling factors directly on the observation node. Unless I'm missing something?
> > enkf_obs_local_scale_std is the one you want
>
> This is used neither by the es_mda approach, nor with the CS job we have made in semeio, correct? The former sets the global_std_scaling for the observation, which is used on allocation, while the latter uses updateStdScaling, which sets the scaling factors directly on the observation node. Unless I'm missing something?
While studying the code I came to the same conclusion, i.e.: final_obs = global_std_scaling * (std_scaling * obs), where global_std_scaling is set in the update step of es_mda and std_scaling in the CS job via updateStdScaling in the pre-step: EnkfSimulationRunner.runWorkflows(HookRuntime.PRE_UPDATE, ert=ERT.ert)
Therefore, shouldn't it work as is currently, internally?
> > enkf_obs_local_scale_std is the one you want
>
> This is used neither by the es_mda approach, nor with the CS job we have made in semeio, correct? The former sets the global_std_scaling for the observation, which is used on allocation, while the latter uses updateStdScaling, which sets the scaling factors directly on the observation node. Unless I'm missing something?

Right, a bit unclear on my part; it's the one previously used, not in use now, no. But of the two, that's the one that was actually used. All of that should (hopefully) be deprecated now.
> While studying the code I came to the same conclusion, i.e.: final_obs = global_std_scaling * (std_scaling * obs)

Yeah, that was kind of what I got the feeling of as well. But I couldn't find the place where this is returned/calculated..? Is there a line that actually does this part?
What I did find was that the only place the global_std_scaling is used after it is applied to an observation is in obs_data.cpp:L179:
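// The effective std handed to the analysis: the stored std for this observation
// multiplied by the global scaling that obs_data was allocated with.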
double obs_block_iget_std( const obs_block_type * obs_block , int iobs) {
return obs_block->std[ iobs ] * obs_block->global_std_scaling;
}
Given that the obs_block_iget_std is the function that is used further down in the analysis, of course..
The updateStdScaling sets the observation vector's std_scaling, and from what I can see, the std_scaling is used in each of the getter functions for summary_obs, gen_obs and block_obs.
summary_obs.cpp:
summary_obs_get_observations
...
obs_block_iset( obs_block , 0 , summary_obs->value , summary_obs->std * summary_obs->std_scaling);
block_obs.cpp:
block_obs_get_observations
...
obs_block_iset(obs_block , i , point_obs->value , point_obs->std * point_obs->std_scaling );
gen_obs.cpp:
gen_obs_get_observations
...
obs_block_iset( obs_block , iobs , gen_obs->obs_data[iobs] , IGET_SCALED_STD( gen_obs , iobs ));
where
static double IGET_SCALED_STD(const gen_obs_type * gen_obs, int index) {
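// std_scaling is the per-observation factor set by the scaling job via updateStdScaling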
return gen_obs->obs_std[index] * gen_obs->std_scaling[index];
}
So it kind of looks like, depending on which getter function is used, either the std_scaling or the global_std_scaling is applied.. But I do have a feeling that this is not completely true..
> > ..This is used neither by the es_mda approach, nor with the CS job we have made in semeio, correct?
>
> Right, a bit unclear on my part; it's the one previously used, not in use now, no. But of the two, that's the one that was actually used. All of that should (hopefully) be deprecated now.
We agree, the old job uses it: enkf_main_std_scale_correlated_obs_JOB calls enkf_obs_scale_correlated_std, which then uses enkf_obs_local_scale_std.
> All of that should (hopefully) be deprecated now.

As in, deprecation announcements/warnings have been made? The old job has not been removed though..
Either way, the std_scaling vs global_std_scaling still confuses me a bit.
> As in, deprecation announcements/warnings have been made? The old job has not been removed though..

No, no deprecation warnings have been made. While the new job is in stable, I think it's still considered under testing, so when that is done, the old one should be deprecated.
> Yeah, that was kind of what I got the feeling of as well. But I couldn't find the place where this is returned/calculated..? Is there a line that actually does this part?
> What I did find was that the only place the global_std_scaling is used after it is applied to an observation is in obs_data.cpp:L179:
> double obs_block_iget_std( const obs_block_type * obs_block , int iobs) { return obs_block->std[ iobs ] * obs_block->global_std_scaling; }
After some digging starting from enkf_main.update, the global_std_scaling is applied, as you said, via obs_block_iget_std.
That is used in enkf_analysis_deactivate_outliers (https://github.com/equinor/libres/blob/b80fac1230db7bdb64511f4d5af364135929fa06/lib/enkf/enkf_main.cpp#L995).
When it comes to obs_node->std_scaling, this one is actually applied in enkf_obs_get_obs_and_measure_data when loading the obs_data vector. The obs_std data scaled this way is then passed into the aforementioned function enkf_analysis_deactivate_outliers:
https://github.com/equinor/libres/blob/4e90285db47057096d958decb1f76a126d6bbbfb/lib/enkf/enkf_obs.cpp#L467
> So it kind of looks like, depending on which getter function is used, either the std_scaling or the global_std_scaling is applied.. But I do have a feeling that this is not completely true..

The getter function, which is also used in enkf_obs_get_obs_and_measure_data, is determined here:
https://github.com/equinor/libres/blob/4e90285db47057096d958decb1f76a126d6bbbfb/lib/enkf/obs_vector.cpp#L151
where vector->get_obs = either summary_obs_get_observations, block_obs_get_observations or gen_obs_get_observations.
To summarize:
std_scaling (set by the scaling job) is applied by the following chain:
enkf_obs_get_obs_and_measure_data -> enkf_obs_get_obs_and_measure_node -> obs_vector_iget_observations -> obs_vector->get_obs
global_std_scaling (modified by ES MDA) is applied by:
enkf_analysis_deactivate_outliers -> obs_block_iget_std
Meaning this should work as is even for ES MDA, but of course I might have got it wrong too.
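To make the composition concrete, here is a minimal Python sketch of the two application points described in the chains above. This is not libres code; the class and function names only loosely mirror the C functions, and the numbers are made up.

# Minimal sketch (not libres code) of how the two scaling factors compose.
class ObsBlock:
    """Stand-in for obs_block: std is stored already multiplied by the per-obs std_scaling."""

    def __init__(self, global_std_scaling):
        # set on allocation, mirroring obs_data_alloc(global_std_scaling)
        self.global_std_scaling = global_std_scaling
        self.stds = []

    def iset(self, scaled_std):
        # mirrors obs_block_iset: the obs getters pass std * std_scaling here
        self.stds.append(scaled_std)

    def iget_std(self, iobs):
        # mirrors obs_block_iget_std: the global factor is applied on read-out
        return self.stds[iobs] * self.global_std_scaling


def load_observations(base_stds, std_scaling, global_std_scaling):
    """Mirrors enkf_obs_get_obs_and_measure_data -> obs_vector->get_obs."""
    block = ObsBlock(global_std_scaling)
    for std in base_stds:
        block.iset(std * std_scaling)
    return block


if __name__ == "__main__":
    # per-obs scaling as the scaling job would set it, global scaling as ES MDA would set it
    block = load_observations([1.0, 2.0, 3.0], std_scaling=2.0, global_std_scaling=3.0)
    print([block.iget_std(i) for i in range(3)])  # [6.0, 12.0, 18.0]: both factors end up applied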
In the enkf_analysis_deactivate_outliers, the std is only used to calculate whether or not an observation is too far away from the prediction (and to deactivate it if so). It isn't used any further. Do you agree?
The obs_block_iget_std is used when assembling the R (obs_block_initR) and E (obs_block_initE) matrices; I think this is where it comes into use.
I agree with you on how the std_scaling is applied.
But yes, I agree that they don't seem to overwrite each other directly. Could this be happening implicitly somehow, though?
> In the enkf_analysis_deactivate_outliers, the std is only used to calculate whether or not an observation is too far away from the prediction (and to deactivate it if so). It isn't used any further. Do you agree?
> The obs_block_iget_std is used when assembling the R (obs_block_initR) and E (obs_block_initE) matrices; I think this is where it comes into use.
> I agree with you on how the std_scaling is applied.
> But yes, I agree that they don't seem to overwrite each other directly. Could this be happening implicitly somehow, though?
Right, I thought enkf_analysis_deactivate_outliers was something principal in the process, but it only deactivates outliers. Anyway, when assembling the R and E matrices using obs_block_iget_std with obs_data that has previously been scaled by std_scaling as described before, I don't see an issue here. As you said, this can happen if they really get overwritten by one another - this needs to be tested first.
@dotfloat @xjules and I have continued looking at this. We agree that the two scaling values are applied in different sections of the code:
- global_std_scaling is included when allocating the observation vectors and is then used when assembling the matrices
- std_scaling is multiplied in directly when populating the observation vector through the call to enkf_obs_get_obs_and_measure_data (when collected using obs_vector->get_obs, where all of the implementations use std_scaling)
So we set up the poly case now. We split the observations in order for the COS to run, and we set the std to 1, 2, 3, 4, 5.
Observations: POLY_OBS:
2.8532509308 1
7.20311703432 2
21.3864899107 3
POLY_OBS_TWO:
31.5145559347 4
53.5676660405 5
We also hard coded the scaling from the COS to be 50 for all.
And then we run the MDA ES:
Running MDA ES on (weights normalized) 4.583, 2.291, 1.146
All 6 active jobs complete and data loaded.
Scaling factor calculated from (named_dict(index=None, key='POLY_OBS'), named_dict(index=None, key='POLY_OBS_TWO'))
Calculation scaling factor, nr of primary components: 1, number of observations: 5
Keys: ['POLY_OBS', 'POLY_OBS_TWO'] scaled with scaling factor: 50
===============================================================================================================================
Report step...: 0000
Ministep......: ALL_ACTIVE
-------------------------------------------------------------------------------------------------------------------------------
Observed history | Simulated data
-------------------------------------------------------------------------------------------------------------------------------
1 : POLY_OBS 2.853 +/- 229.129 Active | 2.880 +/- 0.830
2 : ... 7.203 +/- 458.258 Active | 5.751 +/- 1.497
3 : ... 21.386 +/- 687.386 Active | 10.808 +/- 3.563
4 : POLY_OBS_TWO 31.515 +/- 916.515 Active | 18.050 +/- 7.112
5 : ... 53.568 +/- 1145.644 Active | 27.479 +/- 12.092
===============================================================================================================================
Note that the uncertainty printed is 229.129 (1 * 4.583 * 50).
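As a quick sanity check (the weight is presumably printed rounded: 229.129 / 50 ≈ 4.58258), the whole "+/-" column above is reproduced by base std * weight * COS factor:

# Reproduce the "+/-" column of the report above: base std times the (unrounded)
# first ES MDA weight times the hard-coded COS factor of 50.
weight = 229.129 / (1 * 50)   # ~4.58258; printed rounded as 4.583
cos_factor = 50.0
for base_std in [1, 2, 3, 4, 5]:
    print(f"{base_std * weight * cos_factor:.3f}")
# 229.129, 458.258, 687.387, 916.516, 1145.645 -- matches the report up to rounding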
The printing is performed in enkf_analysis_fprintf_obs_summary (enkf_analysis.cpp).
So it looks like both scalings have in fact been applied, and we can't find anywhere that one is overwritten by the other.
Do we have a user that has a setup they believe does not include both scalings?
We are going to create a test in libres that demonstrates the different outputs given that each of the scalings is applied.
Perfect! I've never investigated this, I've only been told numerous times that the existing scaling job (the one implemented in libres) and ES MDA did not mix at all and did overwrite each other's scaling factors. If this does not seem to be a problem anymore, then all is good 🚀
We should of course investigate this further to be sure, but it could be that this is a non-issue.
> ...that the existing scaling job (the one implemented in libres) and ES MDA did not mix at all and did overwrite each other's scaling factors.

I can't see that the way it is implemented in libres is any different when it comes to the scaling variables used and how they were applied. There could have been other differences in the past, who knows. But yes, it would be good if this is a non-issue! And I also think it makes sense as is, with a "global std scaling" set from the es_mda (as that really should be global) and a per-obs scaling used by the COS job.
What might not make as much sense is how difficult it was to see when they were applied :yum:
Nevertheless, I agree this should be tested as well.
> We are going to create a test in libres that demonstrates the different outputs given that each of the scalings is applied.

Shouldn't this test be part of semeio.cos_job, as an integration test that exercises both ES MDA and COS?
I am a bit stuck on this issue because, as far as I can tell, there is no easy way for the workflow to know it is running an iterative algorithm.
First, the scaling factor is preserved between iterations, so if it is possible for the workflow to identify that it is running an iterative algorithm, the problem is solved.
However, that is the current problem:
1) The run context resides in ert and is not available to the workflow.
2) It is not possible to pass <ITER> to the config, as that only works for forward models (and also, ARGLIST is not recognized by the internal job config).
3) Libres does not know if it is running iteratively, and that is the information the job has access to.
4) Cannot do caching, as we can't differentiate between running multiple times from different types of experiments.
5) Parsing RUNPATH seems like a really error-prone approach.
Already discussed this with @lars-petter-hauge, @markusdregi, any thoughts?
@oyvindeide Did you come up with a brilliant conclusion when discussing? :) I'm a bit surprised that it is difficult to get the iteration number into a workflow...
No, I laid out the results of my experimentation and we tried to brainstorm other ideas; I have summed up all the things we discussed above. According to both my experimentation and the docs (https://github.com/equinor/ert/blob/b380ec5cca37918320405a93781e483868fb172f/docs/rst/manual/reference/configuration/magic_strings.rst), the magic strings will not work with workflows. We could of course change this on the ert side of things, which is the only solution I can think of at the moment.
Another option is to add additional hooks to ert; I know that has been talked about in the past? Seems like that is a natural way to solve this problem.
> I'm a bit surprised that it is difficult to get the iteration number into a workflow...

So were we..
The initially expected problem of mixing scaling factors seems not to be an issue.
Today both the scaling job and the ES MDA algorithm are writing to the same scaling constant in storage. In particular, the algorithm overwrites the scaling from the scaling job. This should be resolved. Investigate if a "quick fix" where we either change the operator or introduce a second scaling constant in storage is possible. Or possibly something completely different?
Run scaling job only once
The ACO workflow should only be executed after the first iteration to ensure algorithmic stability. This can probably be resolved as a hack (for now) by passing the iteration number to the workflow under the hood. We need to check whether the scaling factors survive from one iteration to the next.