equinor / webviz-ert

ERT webviz plugins
GNU General Public License v3.0

How do we calculate the misfit that is plotted? #377

Open hnformentin opened 1 year ago

hnformentin commented 1 year ago

Is your feature request related to a problem? Please describe. The definition of the misfit calculation matters because there are many ways to compute a misfit. Another visualization tool, from the Drogon tutorial, plots only (obs-realization). webviz-ert seems to use a different equation, and it would be nice if the user could find this information easily.

Describe the solution you'd like I would show the equation on the x-axis, as pointed out in the image. I suspect the equation is simply |observation-realization|/standard_deviation.

Screenshot from 2022-08-18 15-13-07
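
To make the two candidate definitions concrete, here is a minimal sketch in plain Python. The function names and numbers are hypothetical and are not taken from webviz-ert or ert-storage; the sketch only contrasts the raw difference with the suspected absolute standardized form.

```python
def raw_difference(observation, response):
    """Plain difference: observation minus simulated response."""
    return observation - response


def abs_standardized_misfit(observation, response, std):
    """Suspected definition: |observation - response| / standard deviation."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return abs(observation - response) / std


# Hypothetical numbers: one observation with its uncertainty, three realizations.
observation, std = 100.0, 5.0
responses = [95.0, 100.0, 110.0]

raw = [raw_difference(observation, r) for r in responses]
scaled = [abs_standardized_misfit(observation, r, std) for r in responses]
```

The raw values carry the units of the observation, while the scaled values are dimensionless and can be compared across observations with different uncertainties.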

oysteoh commented 1 year ago

If I understand correctly, the misfits are calculated with this implementation in ert_storage/compute/misfits.py.

As for displaying the actual equation, I'm not sure I agree. I would like to think of it as something internal to our application, not something a user should have to care about. On the other hand, I understand that people would like to know the basis of the calculations... 🤔 @sondreso, do you have any input on this?

hnformentin commented 1 year ago

I think there are many reasons why one would want to analyze the misfits, particularly this one, which seems to be the standardised misfit (response-observed)/standard_deviation. One thing it can show is whether the estimate of the uncertainty (standard deviation) is appropriate. For example, if the misfits are too large, the uncertainty is probably underestimated. By "too large" I mean relative to a normal distribution, which has about 99.7% of its points within +-3 standard_deviation. Other things can be analyzed too, for example whether a particular time has a high misfit, or whether all the observations in a given time series have similar values and distribution. For this kind of analysis I think these plots are important, and it is just as important to know what is being plotted.
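
The 3-standard-deviation check described above can be sketched as follows. This is an illustration under stated assumptions, not the webviz-ert calculation: the helper names and the sample numbers are hypothetical, and the misfit is taken to be (response-observed)/standard_deviation as guessed in this thread.

```python
def standardized_misfits(responses, observation, std):
    """(response - observation) / std for every realization's response."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return [(r - observation) / std for r in responses]


def fraction_outside(misfits, k=3.0):
    """Share of misfits whose magnitude exceeds k standard deviations."""
    if not misfits:
        raise ValueError("need at least one misfit")
    return sum(1 for m in misfits if abs(m) > k) / len(misfits)


# Hypothetical ensemble: four realizations against one observation.
observation, std = 100.0, 5.0
responses = [98.0, 101.0, 120.0, 79.0]

z = standardized_misfits(responses, observation, std)
share = fraction_outside(z)
```

If the errors were normally distributed with the stated standard deviation, only about 0.3% of the misfits would be expected beyond 3. A much larger share, as in this made-up ensemble, would hint that the standard deviation is underestimated or that the model is biased.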

oysteoh commented 1 year ago

Do you also mean that we should perhaps have different implementations of the misfit, and that the user should be able to choose or configure the calculation?

hnformentin commented 1 year ago

Good question! First, I think it is good to have a clear definition of what is already plotted. Second, different implementations could be a nice feature, especially if the users are used to evaluating different misfit functions. The example I saw plotted only response minus observation. I honestly don't know how the users interpret it, but they may have a reason. To me, this difference alone does not make a lot of sense, because a key part of the problem specification is the uncertainty around the observations. To decide whether different implementations would be a good feature, I think it would be good to observe the users. Third, if we see that users are analyzing the misfit carefully, we could find a couple of additional misfit calculations to include. In particular, an aggregated measure of misfit, for example per time series or per realization, could be useful.
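
As a rough sketch of what such an aggregated measure could look like, the snippet below groups standardized misfits by realization or by time step. The record layout, the function name and the choice of the mean squared standardized misfit as the aggregate are all assumptions made for illustration; nothing here reflects an existing webviz-ert or ert-storage API.

```python
from collections import defaultdict


def aggregate_misfits(records, key):
    """Mean squared standardized misfit, grouped by records[i][key]."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        z = (rec["response"] - rec["observation"]) / rec["std"]
        sums[rec[key]] += z * z
        counts[rec[key]] += 1
    return {group: sums[group] / counts[group] for group in sums}


# Hypothetical data: two realizations, two time steps of one time series.
records = [
    {"realization": 0, "time": 0, "response": 105.0, "observation": 100.0, "std": 5.0},
    {"realization": 0, "time": 1, "response": 190.0, "observation": 200.0, "std": 10.0},
    {"realization": 1, "time": 0, "response": 100.0, "observation": 100.0, "std": 5.0},
    {"realization": 1, "time": 1, "response": 230.0, "observation": 200.0, "std": 10.0},
]

per_realization = aggregate_misfits(records, "realization")
per_time = aggregate_misfits(records, "time")
```

Grouping by realization highlights ensemble members that fit poorly overall, while grouping by time highlights the observations that the whole ensemble struggles to match.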