equinor / iterative_ensemble_smoother

Algorithms for data assimilation using ensemble methods.
GNU General Public License v3.0

Possible speedup of localization by looping observations #175

Open tommyod opened 11 months ago

tommyod commented 11 months ago

@Blunde1 writes:

One of the great things about the KF is that it runs in $O(m)$ time for updating, instead of $O(m^3)$, when we have $m$ measurements. This is possible when the measurement errors are independent, so that the covariance of the error is diagonal. See Understanding the EnKF, Section 4.2 (Serial Updating): https://folk.ntnu.no/joeid/Katzfuss.pdf. It refers to this paper, which I believe is the origin paper for covariance localization: https://www.researchgate.net/publication/228817876_A_Sequential_Ensemble_Kalman_Filter_for_Atmospheric_Data_Assimilation

Our measurement errors are independent; I believe we have even built that assumption into the API. At least for current practical purposes, $\Sigma_\epsilon$ will be diagonal. So why can we not do sequential updates too? There is probably a good reason that I am not seeing.

Thus, throwing something wild out there: how about iterating over observations, assimilating them sequentially, and then iterating over parameters while doing this sequential update?

for (dj, yj, Sigma_eps_j) in zip(observations, responses, variances):
    H = LLS coefficients of yj on X[mask, :]  # 1 x p, but using X which could be p2 x n for p2 < p depending on mask
    Sigma_yj = H @ X @ X.T @ H.T              # 1 x 1
    Sigma_d = Sigma_yj + Sigma_eps_j          # 1 x 1
    for i in realizations:
        T_ji = (dj - yji) / Sigma_d           # 1 x 1
        for X_ki in realization_i_of_parameters:
            if dj is not masked for xk:
                X_ki = X_ki + cov_xk_yj * T_ji

Is this easily wrong? Or wrong for some other reason? I tried to adjust for the weird reasons that using the full transition matrix was wrong (hence writing H as the LLS coefficients of yj on X[mask, :]). It therefore assumes the masking is pre-computed. Is this unreasonable?

Edit: If you do not have immediate objections (either theoretical or computational), I could try implementing it for the Gauss-Linear notebook example and see if the results make sense.
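
For reference, a minimal self-contained NumPy sketch of such a sequential (observation-by-observation) update on a linear-Gaussian toy problem. This is not the library API; the names (`G`, `sigma_eps`, `n_real`, etc.) are illustrative, and it uses a plain perturbed-observation EnKF update with a scalar division per observation instead of any $m \times m$ inversion:

    import numpy as np

    rng = np.random.default_rng(42)

    n_param, n_obs, n_real = 10, 5, 200
    G = rng.standard_normal((n_obs, n_param))    # linear forward model
    x_true = rng.standard_normal(n_param)
    sigma_eps = np.full(n_obs, 0.5)              # diagonal observation error variances
    d = G @ x_true + rng.normal(0.0, np.sqrt(sigma_eps))

    X = rng.standard_normal((n_param, n_real))   # prior ensemble, N(0, I)

    for j in range(n_obs):                       # assimilate one observation at a time
        Y = G @ X                                # responses of the current ensemble
        y_j = Y[j, :]                            # 1 x N row for observation j

        # Ensemble covariances against the single response y_j
        X_anom = X - X.mean(axis=1, keepdims=True)
        y_anom = y_j - y_j.mean()
        cov_xy = X_anom @ y_anom / (n_real - 1)  # shape (n_param,)
        var_y = y_anom @ y_anom / (n_real - 1)   # scalar

        # Scalar "inversion": no m x m matrix is ever formed
        gain = cov_xy / (var_y + sigma_eps[j])

        # Perturbed-observation update of every realization (rank-1 correction)
        d_pert = d[j] + rng.normal(0.0, np.sqrt(sigma_eps[j]), size=n_real)
        X = X + np.outer(gain, d_pert - y_j)

    print("posterior mean:", X.mean(axis=1))
    print("true parameters:", x_true)

Localization would enter by zeroing out masked entries of `gain` before the rank-1 update.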

tommyod commented 11 months ago

$$X_\text{updated} = \text{cov}(X, X) G^T (G \text{cov}(X, X) G^T + \alpha C_D)^{-1} (D - Y)$$

Let there be $n$ parameters, $m$ responses and $N$ ensemble members. Consider a single $y_j$, a row vector corresponding to response $j$. Then $C_D$ becomes a number (the diagonal entry $(j, j)$) and we have shapes

$$X_\text{updated} = \underbrace{\text{cov}(X, X)}_{n \times n} \underbrace{G^T}_{n \times 1} (\underbrace{G}_{1 \times n} \underbrace{\text{cov}(X, X)}_{n \times n} \underbrace{G^T}_{n \times 1} + \underbrace{\alpha C_D}_{1 \times 1})^{-1}  \underbrace{(d_j - y_j)}_{1 \times N}$$

or

$$X_\text{updated} = \underbrace{\text{cov}(X, X)}_{n \times n} \underbrace{G^T}_{n \times 1} \underbrace{(G \text{cov}(X, X) G^T + \alpha C_D)^{-1}}_{1 \times 1} \underbrace{(d_j - y_j)}_{1 \times N}$$

If we know that $X$ is sampled from a multivariate normal with independent components, then $\text{cov}(X, X)$ is known and diagonal. Assume it is the identity; then

$$X_\text{updated} = \underbrace{G^T}_{n \times 1} (\underbrace{G}_{1 \times n}  \underbrace{G^T}_{n \times 1} + \underbrace{\alpha C_D}_{1 \times 1})^{-1}  \underbrace{(d_j - y_j)}_{1 \times N}$$

which has the structure of a rank-1 update.
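
A quick shape check of that single-observation expression, as a sketch assuming $\text{cov}(X, X) = I$; here `g_j`, `alpha_C_D`, `d_j` and `y_j` are illustrative stand-ins, not library objects:

    import numpy as np

    rng = np.random.default_rng(0)
    n, N = 8, 50                                 # parameters, ensemble members

    g_j = rng.standard_normal((1, n))            # 1 x n row of G
    alpha_C_D = 0.5                              # 1 x 1: inflated observation error variance
    d_j = rng.standard_normal((1, N))            # perturbed observation, 1 x N
    y_j = rng.standard_normal((1, N))            # response row, 1 x N

    # G^T (G G^T + alpha C_D)^{-1} (d_j - y_j): the "inverse" is a scalar division
    delta_X = g_j.T @ ((d_j - y_j) / (g_j @ g_j.T + alpha_C_D))
    print(delta_X.shape)                         # (8, 50), i.e. n x N, a rank-1 correction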

tommyod commented 10 months ago

Looked a bit into looping over observations $y_i$. The problem, as I see it, is that we do not know which parameters correlate with $y_i$. Once we know, those parameters must be loaded from disk. So if $y_1$ is correlated with one parameter set $X_1$, and $y_2$ with $X_2$, etc., we have to load each of these from disk in order to update.

Imagine a situation with little overlap. That means a lot of loading from and writing to disk. Example: we have $10$ parameters in $X$, but can only hold $3$ in memory at a time. $y_1$ says that $\{1, 2, 3, 4\}$ should be updated, so we must load two groups, e.g. $\{1, 2\}$ and $\{3, 4\}$. Then $y_2$ says that $\{1, 2\}$ should be updated; even though we had these in memory a moment ago, they have since been written back to disk, so we must load them again. Then $y_3$ says that $\{1, 7\}$ should be updated, so we pretty much have to fetch 7 from disk as well, probably by writing $\{1, 2\}$ first and then loading $\{1, 7\}$.

Blunde1 commented 10 months ago

The problem you describe is not necessarily a problem, but a trade-off between reading from and writing to disk vs. inverting large dense matrices very many times. In the algorithm above we have a loop over observations, then a loop over realizations, then a loop over parameters. Parameters can be loaded in large batches, so it should not be a huge problem?

There are two problems I see that are unrelated to the above:

Blunde1 commented 10 months ago

Did a first stab at this here: https://github.com/Blunde1/iterative_ensemble_smoother/blob/es-observation-loop/docs/source/LinearRegression.py