EdiGlacUQ / fenics_ice

GNU Lesser General Public License v3.0
5 stars 7 forks source link

new run_* source file needed for VAF sensitivity to velocities #120

Closed dngoldberg closed 5 months ago

dngoldberg commented 1 year ago

A new source file is needed in order to calculate VAF sensitivity to velocities. It will closely model run_errorprop.py, and will likely be very similar up to this line where the equation

$$\sigma_T^2 = \bigg(\frac{\partial QT}{\partial\hat{m}}\bigg)^T H{approx}^{-1} \bigg(\frac{\partial Q_T}{\partial\hat{m}}\bigg)$$

is implemented. This final inner product needs to be replaced by a multiplication of the mixed Hessian to get the result:

$$-H{\hat{p}\hat{m}} H{approx}^{-1} \bigg(\frac{\partial Q_T}{\partial\hat{m}}\bigg)$$

where

$$H_{\hat{p}\hat{m}} = \frac{\partial^2 J}{\partial p_i \partial m_j}$$

where $\hat{m}$ are controls and $\hat{p}$ are velocity cloud values.

update 30 Nov to give origin of above equation

Starting with

$$\frac{\partial J}{\partial\hat{m}} = 0,$$

a total differential yields

$$ \frac{\partial^2 J}{\partial\hat{m}^2} \delta\hat{m} + \frac{\partial^2 J}{\partial\hat{m}\partial\hat{p}} \delta\hat{p} \equiv H{\hat{m}\hat{m}} \delta\hat{m} + H{\hat{m}\hat{p}} \delta\hat{p} = 0. $$

Solving for $\delta\hat{m}$ yields

$$ \delta\hat{m}=- H{\hat{m}\hat{m}}^{-1} H{\hat{m}\hat{p}}\delta\hat{p}, $$

which upon rearrangement gives $\frac{ \delta\hat{m}}{\delta\hat{p}}$. The expression above is then found by

$$ \frac{\partial Q}{\partial \hat{p}} = \frac{\partial Q}{\partial \hat{m}}\frac{\partial \hat{m}}{\partial \hat{p}}$$

as $Q$ has no direct dependence on $\hat{p}$.

end update 30 Nov

There are 2 key things to note: (1) $H_{\hat{p}\hat{m}}$ is not symmetric/self adjoint. So it is not clear how to construct the mixed Hessian operator, and (2) $\hat{p}$ lives in a bespoke function space which was created solely to manipulate point clouds.

This issue depends on annotation of observation points in comp_J_inv.

UPDATE

the cost function formed in comp_J_inv is $J_u + J_v$ where

$$Ju = \frac{1}{2}(\mathbf{P}\mathbf{D}u - u{obs})^T \mathbf{R} (\mathbf{P}\mathbf{D}u - u_{obs})$$

where $\mathbf{D}$ is a projection to a DG1 space, and similar for $J_v$. Expanding this out, the only relevant term is

$$-u^T\mathbf{D}^T\mathbf{P}^T\mathbf{R}u_{obs}$$.

$H_{\hat{p}\hat{m}}$ is thus

$$\frac{\partial}{\partial m}(\mathbf{A}u) = \mathbf{A}\frac{\partial u}{\partial m}$$

where $\mathbf{A} = -\mathbf{R}\mathbf{P}\mathbf{D}$.

update 29 November

The target derivative sought is therefore

$$\mathbf{R}\mathbf{P}\mathbf{D} \frac{\partial u}{\partial m} H_{approx}^{-1} \bigg(\frac{\partial Q_T}{\partial\hat{m}}\bigg)$$

bearecinos commented 1 year ago
reset_manager()
stop_manager(tlm=False)
configure_tml((m, zeta))
solver.forward(m)
tau = function_tml(solver.u, (m, zeta))

Then we compute $$-A.tau$$

Amat_obs_action(solver._cached.P_mat, R_vec, u, solver._cache....)
dngoldberg commented 1 year ago
reset_manager()
stop_manager(tlm=False)
configure_tml((m, zeta))
solver.forward(m)
tau = function_tml(solver.u, (m, zeta))

Then we compute −A.tau

Amat_obs_action(solver._cached.P_mat, R_vec, u, solver._cache....)

I cannot remember how the u_stderr and v_stderr are domain-decomposed.. i wonder if i can just use the variables solver.py or need to cache them?

dngoldberg commented 1 year ago

@jrmaddison I have drafted new source to address this but I am so unsure about the types that Im not prepared to test yet, as errors could arise for so many reasons.

I have pushed a new runscript to my branch, which diverges from run_errorprop.py here where i read the cached variables from solver.py. I would be grateful if you could actually look at lines 162-180 to let me know what you think. I suspect that the direction of the TLM gradient must be a function, which is why I have defined P3. I also assume the return of function_tml (tau) is a function in the same space as solver.u.

Is function_tml in the previous comment a typo?

jrmaddison commented 1 year ago

Yes, function_tlm to access a tangent-linear variable (https://tlm-adjoint.github.io/autoapi/tlm_adjoint/manager/index.html#tlm_adjoint.manager.function_tlm).

Tangent-linear variables are objects in the same space as the original forward, so a tangent-linear variable associated with a primal Function is a primal Function. This is because when we perturb $m \rightarrow m + \varepsilon \delta m$, with $\delta m$ the direction defining the tangent-linear and $\varepsilon$ some scalar, the linear change to a forward variable $u$ is $u \rightarrow u + \varepsilon \tau_u$ where $\tau_u$ is the tangent-linear variable. For the addition to make sense they must be in the same vector space.

Some comments on the draft:

EDIT: Code fix

dngoldberg commented 1 year ago

many thanks for looking james. ill try to create the new python function as suggested.

one thing -- i thought that, as coded, the direction is a (fenics) function:

P3 = Function(space)
P3.vector() = P2.vector() - P1.vector()
...
configure_tml((cntrl, P3))
tau = function_tml(solver.U, (cntrl, P3))

were you implying that this was done incorrectly? or that it is OK but not best practice?

jrmaddison commented 1 year ago

Oops, yes it is.

dngoldberg commented 1 year ago

Hi @jrmaddison @bearecinos i have run the new script with the ISMIPC experiment in serial mode. As of now i can only say that it does not throw runtime errors. James's suggestion to create a separate function for the TLM calculation turned out to be necessary (thanks James!) to avoid TLM runtime errors the second time the TLM is calculated.

I am not creating a PR now -- i will leave this now and work on visualising the outputs for ISMIPC.

James: am I correct that the utilities for Taylor testing will not work for this gradient (of VAF with respect to velocities) calculation, and a test would have to be created manually?

jrmaddison commented 1 year ago

James: am I correct that the utilities for Taylor testing will not work for this gradient (of VAF with respect to velocities) calculation, and a test would have to be created manually?

Taylor verification would probably need to be hand coded, as the observations are not `annotatable' objects and the tlm_adjoint interface functions cannot interact with them.

dngoldberg commented 1 year ago

hi @jrmaddison @bearecinos there are 2 functions nested within fenics_ice/model/vel_obs_from_data(): interpolate() and interp_weights(). I can use them to visualise the observation dependencies, but not if they are nested.. is there any good reason i should not bring them outside of class model? the alternative is caching vertices and weights returned by interp_weights() and duplicating interpolate().

jrmaddison commented 1 year ago

Looks like model.vel_obs_from_data should be factored out. It's both reading data and referencing the data on the model. Those should be separate operations, in separate functions/methods.

dngoldberg commented 1 year ago

ok i will try to do so and then ask you guys to take a look..?

dngoldberg commented 1 year ago

hi @jrmaddison @bearecinos the code is now "working" in parallel, as in, it is not crashing. But the simple test i was using wont' work in parallel (for other reasons) so Im fumbling a bit blindly just now. I have one issue and one query:

ISSUE: the end result is a numpy array of the length of obs_local (the number of observations belonging to the proc). Im not sure if it NEEDS to be this way, but it is the way i wrote it. The issue is that I need to somehow to create a global array from the local arrays. Is there a correct way to do this other than just using mpi4py calls?

UPDATE: i have used an mpi reduce call to gather the local results. this is unreviewed and untested.

QUESTION: @jrmaddison this line from Amat_obs_action contains of matrix-vector multiplication of the local interpolation matrix P by the local vector of the DG(1) function. I assume this is OK, as (by construction) all columns of P corresponding to non-local observation points are zero?