In the formulation of the objective functional, the L1 regularization dampings do not appear. They are also not in the constrained problem; they appear only in the iteration algorithm. Is this correct?
Hi, thanks for the issue. Yes, this is correct. Please refer to the original paper (e.g., eq. 4.1 and the next two equations).
So the initial formulation of the problem does not have a knob to control the L1 regularization, but the modified one does. Can one not just rewrite the original formulation so that the weighting of the L1 term appears? Or, to put it the other way around: how do you select the weight for the L1 regularization if it is not in the original problem formulation?
This is a good question, which goes a bit beyond PyLops' scope. Whilst I agree it generally makes it a bit harder to choose this weighting factor, this is what the authors of the original paper came up with. We don't want our implementation to diverge from the paper (unless you think we are doing something inconsistent with the paper; in that case please let us know and we will be happy to change it accordingly :) ).
Technically speaking, you could have a weighting on the original term, but then, because of the splitting strategy, one more would appear anyway, so maybe they just didn't want to have too many knobs.
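To make that concrete, here is a rough sketch of the splitting as I read it in the Goldstein-Osher paper (the symbols are mine, not PyLops parameter names). The original functional only weighs the data term,

\min_u \; \|Du\|_1 + \frac{\mu}{2}\|Au - f\|_2^2

whereas the split/relaxed problem that is actually iterated,

\min_{u,d} \; \|d\|_1 + \frac{\mu}{2}\|Au - f\|_2^2 + \frac{\lambda}{2}\|d - Du - b^k\|_2^2

introduces the extra weight \lambda on the constraint-relaxation term, and that second knob only exists in the iterations, not in the original functional.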
PS: may I ask which kind of problem you are trying to solve with Split-Bregman? Whilst this is a powerful solver, I have found that for some of the problems I work with, other solvers from the proximal family are slightly more performant. You may want to take a look at this sister library we recently created: https://github.com/PyLops/pyproximal
This solver, which was recently developed by a colleague, may actually do what you want: https://pyproximal.readthedocs.io/en/latest/api/generated/pyproximal.optimization.sr3.SR3.html#pyproximal.optimization.sr3.SR3
Note, though, that a second knob will always appear to relax the intermediate constrained problem.
Thanks for the detailed answer. I will definitely check that information. I wanted to compare Split-Bregman with the Lagged Diffusivity Fixed Point Method, implemented to obtain the total-variation-regularized derivative directly from the data (in pylops, the TVR example recovers the function, not the derivative). I was trying to reproduce the results from the 2011 publication https://www.hindawi.com/journals/isrn/2011/164564/. I noticed that Split-Bregman is super sensitive to the two weights, and the result is really hard to control. The original implementation of Diffusion TVR is very robust (https://github.com/smrfeld/Total-Variation-Regularization-Derivative-Python/tree/main/python). I implemented the algorithm using pylops, but still can't reproduce the results. I think it is due to the structure of the derivative and integral operators in pylops, but I haven't spotted the issue yet. Below is some code I developed; it is still drafty, but it may interest you. (BTW, it would be great to have Galerkin-based derivative operators or FD with boundary conditions; I will post if I manage to implement one.)
import numpy as np
from scipy import integrate
from scipy import optimize
from functools import partial
import pylops
class TrapezoidalIntegration(pylops.LinearOperator):
r"""Integration using trapezoidal rule.
Apply trapezoidal integration to a multi-dimensional array along ``dir`` axis.
"""
def __init__(self, N, dims=None, dir=-1, sampling=1.0, dtype='float64'):
super().__init__()
self.N = N
self.dir = dir
self.sampling = sampling
if dims is None:
self.dims = [self.N, 1]
self.reshape = False
else:
if np.prod(dims) != self.N:
raise ValueError('product of dims must equal N!')
else:
self.dims = dims
self.reshape = True
self.shape = (self.N, self.N)
self.dtype = np.dtype(dtype)
self.explicit = False
self._integrator = partial(integrate.cumtrapz, dx=self.sampling, axis=-1,
initial=0)
def _matvec(self, x):
if self.reshape:
x = np.reshape(x, self.dims)
if self.dir != -1:
x = np.swapaxes(x, self.dir, -1)
y = self._integrator(x)
if self.dir != -1:
y = np.swapaxes(y, -1, self.dir)
return y.ravel()
def _rmatvec(self, x):
if self.reshape:
x = np.reshape(x, self.dims)
if self.dir != -1:
x = np.swapaxes(x, self.dir, -1)
xflip = np.flip(x, axis=-1)
y = self._integrator(xflip)
y = np.flip(y, axis=-1)
if self.dir != -1:
y = np.swapaxes(y, -1, self.dir)
return y.ravel()
class DiffTVR:
def __init__(self, dx: float, *, n: int = None):
"""Differentiate with TVR.
Args:
dx (float): Spacing of data.
n (int): Number of points in data.
"""
self.Dop = None
self.CIop = None
self.eps = 1e-10
self._n = n
self._dx = dx
if self._n is not None:
self.build_matrices()
@property
def n(self): return self._n
@n.setter
def n(self, value):
self._n = value
self.build_matrices()
@property
def dx(self): return self._dx
@dx.setter
def dx(self, value):
self._dx = value
if self._n is not None:
self.build_matrices()
def build_matrices(self):
self.Dop = pylops.FirstDerivative(self._n, kind='centered', edge=True,
sampling=self._dx)
self.CIop = TrapezoidalIntegration(self._n, sampling=self._dx)
def __call__(self, data: np.array, alpha: float, *,
initial_guess: np.array, steps: int,
dx: float = None) -> np.array:
"""Get derivative via TVR over optimization steps
Args:
data (np.array): Data
initial_guess (np.array): Guess for derivative
alpha (float): Regularization parameter
steps (int): No. opt steps to run
Returns:
np.array: estimated derivative
np.array: integrated derivative
"""
self._dx = dx if dx is not None else self.dx
self.n = data.size # triggers update of matrices
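        # normal operator K^H K, with K the causal (trapezoidal) integration operator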
KTK = self.CIop.T * self.CIop
deriv_curr = initial_guess
for s in range(0, steps):
# Compute update
en_diag = self._dx / np.sqrt((self.Dop * deriv_curr) ** 2 + self.eps)
hnOp = KTK + alpha * self.Dop.T * pylops.Diagonal(en_diag) * self.Dop
b = hnOp * deriv_curr - self.CIop.T * data
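            # pylops overloads "/" with an iterative least-squares solve (lsqr):
            # the next line solves H s = -g for the update (here b plays the role of g)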
update = - hnOp / b
# Update solution
deriv_curr += update
# Estimate integral
func = self.CIop * deriv_curr
return deriv_curr, func
import numpy as np
import pylops
from pylops.optimization.sparsity import SplitBregman
nx = 100
x = np.linspace(0, 1, nx)
dx = x[1] - x[0]
Sop = TrapezoidalIntegration(nx, sampling=dx)
Iop = pylops.Identity(nx)
Dop = pylops.FirstDerivative(nx, edge=True, kind='centered', sampling=dx)
rnd = np.random.default_rng(12345)
y0 = np.abs(x - 1/2)
dy0 = np.where(x < 1/2, -1, 1)
sy = 0.05
y = Iop * (y0 + rnd.normal(0, sy, nx))
mu = 4e4
niter_out = 100
niter_in = 5
dy_l1, niter = SplitBregman(Sop, [Dop], y - y[0],
niter_out, niter_in,
mu=mu,
epsRL1s=[5],
tol=1e-3,
**dict(iter_lim=30, damp=1e-5)
)
y_l1 = Sop * dy_l1 + y[0]
print(f'std resid L1: {(y - y_l1).std()}')
diff_tvr = DiffTVR(dx)
dy_tv, y_tv = diff_tvr(data=y - y[0], initial_guess=np.zeros(nx), alpha=2e-3, steps=100)
y_tv += y[0]
print(f'std resid TV: {(y - y_tv).std()}')
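(As an aside, a quick sanity check of the forward pass of TrapezoidalIntegration, independent of the adjoint question: the trapezoidal rule is exact for linear integrands, so integrating x with the Sop defined above should give x**2/2 up to rounding.)
print(np.allclose(Sop * x, x ** 2 / 2))  # forward pass reproduces the exact primitive of a linear function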
Nice :)
Indeed, the tutorials in our documentation use TV to recover a blocky signal; there is no example recovering the derivative of a signal. Let me take a look at the paper you linked, but the idea of casting the derivative as the inverse of a causal integration is logical and should work with pylops :) Give me a couple of days to look at your code and the paper to see if I can spot anything that explains the problem you have. One thing we realized, thanks to another user, is that TV likes forward/backward derivatives, whilst with central derivatives you get some unwanted ringing in the solution. That may be a quick thing to try, to see if your results improve after that change :)
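Concretely, on top of your script the change would be something like this (just a sketch, all other parameters left as you had them):

Dop = pylops.FirstDerivative(nx, edge=True, kind='forward', sampling=dx)  # forward instead of centered
dy_l1, niter = SplitBregman(Sop, [Dop], y - y[0], niter_out, niter_in,
                            mu=mu, epsRL1s=[5], tol=1e-3,
                            **dict(iter_lim=30, damp=1e-5))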
And you are right: so far our Derivative/Integration operators do not accommodate any type of boundary conditions; it would definitely be a great contribution!
Yes, the recast problem is what I implemented in these lines:
dy_l1, niter = SplitBregman(Sop, [Dop], y - y[0], ...)
Sop is the causal integral (the trapezoidal rule in my case) and Dop is the derivative used in the L1 norm.
I think the preference for forward derivatives has to do not with TV but with Split-Bregman. The fixed-point iteration implemented in the repository I linked uses central derivatives (with an extra point outside the domain to improve the boundary value); it is very robust and works well. I also tried using operators of the form 0.25 * R.H D R, where R is the symmetrization operator, to improve the boundary values of the derivative, but it did not improve the behavior of Split-Bregman.
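In pylops terms, what I mean is something like this (a sketch; pylops.Symmetrize maps nx samples to 2*nx - 1, so the derivative in the middle acts on the symmetrically extended signal):

Rop = pylops.Symmetrize(nx)
Dext = pylops.FirstDerivative(2 * nx - 1, kind='centered', edge=True, sampling=dx)
Dsym = 0.25 * Rop.H * Dext * Rop  # nx-by-nx derivative with symmetric extension at the boundaries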
Hi, I think I have a few ideas why both of your pylops codes don't work as well as you expected.
Every time you write a dense matrix, like in the Diffusion TVR code you pointed me at, the adjoint is very easy to obtain: just transpose the matrix. But when you write linear operators this is not the case anymore. I checked whether your TrapezoidalIntegration passes the dottest (https://pylops.readthedocs.io/en/latest/api/generated/pylops.utils.dottest.html#pylops.utils.dottest) and I see that it does not. This means that you have no guarantee that iterative solvers will converge, as they are working with a broken forward-adjoint pair. I think this is because in the adjoint you don't properly handle the fact that, in the forward, the first sample of the trapezoidal integration is summed with a 0.5 scaling. I modified the original CausalIntegration (not yet pushed to the main repo):
import numpy as np
from pylops import LinearOperator
class CausalIntegration(LinearOperator):
r"""Causal integration.
Apply causal integration to a multi-dimensional array along ``dir`` axis.
Parameters
----------
N : :obj:`int`
Number of samples in model.
dims : :obj:`list`, optional
Number of samples for each dimension
(``None`` if only one dimension is available)
dir : :obj:`int`, optional
Direction along which smoothing is applied.
sampling : :obj:`float`, optional
Sampling step ``dx``.
halfcurrent : :obj:`bool`, optional
Add half of current value (``True``) or the entire value (``False``)
trapezoidal : :obj:`bool`, optional
Apply trapezoidal rule (``True``) or not (``False``)
dtype : :obj:`str`, optional
Type of elements in input array.
Attributes
----------
shape : :obj:`tuple`
Operator shape
explicit : :obj:`bool`
Operator contains a matrix that can be solved explicitly (``True``)
or not (``False``)
Notes
-----
The CausalIntegration operator applies a causal integration to any chosen
direction of a multi-dimensional array.
For simplicity, given a one dimensional array, the causal integration is:
.. math::
y(t) = \int x(t) dt
which can be discretised as :
.. math::
y[i] = \sum_{j=0}^i x[j] dt
or
.. math::
y[i] = (\sum_{j=0}^{i-1} x[j] + 0.5x[i]) dt
or
.. math::
y[i] = (\sum_{j=1}^{i-1} x[j] + 0.5x[0] + 0.5x[i]) dt
where :math:`dt` is the ``sampling`` interval. In our implementation, the
choice to add :math:`x[i]` or :math:`0.5x[i]` is made by selecting
the ``halfcurrent`` parameter and the choice to add :math:`x[0]` or
:math:`0.5x[0]` is made by selecting the ``trapezoidal`` parameter.
Note that the integral of a signal has no unique solution, as any constant
:math:`c` can be added to :math:`y`, for example if :math:`x(t)=t^2` the
resulting integration is:
.. math::
y(t) = \int t^2 dt = \frac{t^3}{3} + c
If we apply a first derivative to :math:`y` we in fact obtain:
.. math::
x(t) = \frac{dy}{dt} = t^2
no matter the choice of :math:`c`.
"""
def __init__(self, N, dims=None, dir=-1, sampling=1,
halfcurrent=True, trapezoidal=False, removefirst=False,
dtype='float64'):
self.N = N
self.dir = dir
self.sampling = sampling
self.trapezoidal = trapezoidal
self.halfcurrent = halfcurrent if not trapezoidal else False
self.removefirst = removefirst
if dims is None:
self.dims = [self.N, 1]
self.reshape = False
else:
if np.prod(dims) != self.N:
raise ValueError('product of dims must equal N!')
else:
self.dims = dims
self.reshape = True
self.shape = (self.N-self.dims[self.dir] if self.removefirst else self.N,
self.N)
self.dtype = np.dtype(dtype)
self.explicit = False
def _matvec(self, x):
if self.reshape:
x = np.reshape(x, self.dims)
if self.dir != -1:
x = np.swapaxes(x, self.dir, -1)
y = self.sampling * np.cumsum(x, axis=-1)
if self.halfcurrent or self.trapezoidal:
y -= self.sampling * x / 2.
if self.trapezoidal:
y[1:] -= self.sampling * x[0] / 2.
if self.removefirst:
y = y[1:]
if self.dir != -1:
y = np.swapaxes(y, -1, self.dir)
return y.ravel()
def _rmatvec(self, x):
if self.reshape:
x = np.reshape(x, self.dims)
if self.removefirst:
x = np.insert(x, 0, 0, axis=self.dir)
if self.dir != -1:
x = np.swapaxes(x, self.dir, -1)
xflip = np.flip(x, axis=-1)
if self.halfcurrent:
y = self.sampling * (np.cumsum(xflip, axis=-1) - xflip / 2.)
elif self.trapezoidal:
y = self.sampling * (np.cumsum(xflip, axis=-1) - xflip / 2.)
y[-1] = self.sampling * np.sum(xflip, axis=-1) / 2.
else:
y = self.sampling * np.cumsum(xflip, axis=-1)
y = np.flip(y, axis=-1)
if self.dir != -1:
y = np.swapaxes(y, -1, self.dir)
return y.ravel()
and this is now passing the dottest
from pylops.utils import dottest

Cop = CausalIntegration(4, sampling=1, trapezoidal=True, removefirst=True)
dottest(Cop)
After this I focused on your reimplementation of the original algorithm using the new, fixed integrator. I checked that we get exactly the same matrices as the original code (by calling todense on the operators... this is a good check to do all the time when starting from very small examples) and there everything works fine - see https://github.com/PyLops/pylops_notebooks/blob/master/developement/DerivativeInversion.ipynb.
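For reference, the kind of check I mean is simply (a sketch; A_dense is a placeholder name for the matrix built by the original dense implementation, not a variable defined here):

Cop = CausalIntegration(4, sampling=1, trapezoidal=True, removefirst=True)
print(np.allclose(Cop.todense(), A_dense))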
It was however to my surprise that the final results were very different. I then started to suspect that for this algorithm to work you cannot accept inaccurate solutions of the Hs=g system, which is what you inevitably get when working with operators and iterative solvers (note that / in pylops is overloaded with an iterative solver, e.g. lsqr). So, as a check, I densified the operator and used the same direct solver scipy.linalg.solve, and then again things seem to work fine... I want to play a bit more, as I can't believe that small errors can throw the entire algorithm off, but so far this is what I see :)
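In terms of your __call__ above, the densified variant I tried looks roughly like this (a sketch using the hnOp and b of your loop):

from scipy.linalg import solve
Hdense = hnOp.todense()
update = solve(Hdense, -b)  # direct dense solve of H s = -g, instead of the iterative - hnOp / b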
Finally, I haven't tried Split-Bregman again, but since you used an operator with a wrong adjoint I suggest discarding anything you saw... for the reason explained above, it is not to be trusted, as no solver has guarantees if the forward-adjoint pair is incorrect...
PS: you say "the fixed-point iteration implemented in the repository I linked uses central derivatives". Assuming this is the code you refer to:
"""Make differentiation matrix with central differences. NOTE: not efficient!
Returns:
np.array: N x N+1
"""
arr = np.zeros((self.n,self.n+1))
for i in range(0,self.n):
arr[i,i] = -1.0
arr[i,i+1] = 1.0
return arr / self.dx
Indeed they say they use central differences, but that is not what they implement: what they implement is a first-order forward derivative - see https://en.wikipedia.org/wiki/Finite_difference_coefficient
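A quick way to see it (a small standalone check, rebuilding the matrix from the snippet above):

import numpy as np
n, dx = 5, 0.1
arr = np.zeros((n, n + 1))
for i in range(n):
    arr[i, i] = -1.0
    arr[i, i + 1] = 1.0
f = np.random.default_rng(0).normal(size=n + 1)
print(np.allclose((arr / dx) @ f, np.diff(f) / dx))  # True: it is the first-order forward difference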
Thanks! Indeed this was my suspicion, that the adjoint was wrongly implemented.
And you are also right about the derivative operator in the original code: it is not central.
I wasn't aware of the dot test, thanks. Now that the forward/adjoint pair is right, I will try Split-Bregman again and compare with my reimplementation of the fixed point using central differences; maybe I will see the oscillations too, confirming your suspicion that it is a relation between the functional and the discretized operator, and not the solver.
BTW, instead of the keyword trapezoidal in the causal integral, better to use "kind" as in the derivative; this will make it easier to integrate the "galerkin" kind of operators I want to contribute.
Finally, are you aware of the Python package derivative? It might be an excellent complement to pylops: https://pypi.org/project/derivative/
> BTW, instead of the keyword trapezoidal in the causal integral, better to use "kind" as in the derivative; this will make it easier to integrate the "galerkin" kind of operators I want to contribute.
Totally agree. Let me follow this route; I'll push soon a version with kind and 3 options implemented, and you can add more at any time :) - I may need to keep halfcurrent for backward compatibility, but I can encourage users to prefer using kind…
> Finally, are you aware of the Python package derivative? It might be an excellent complement to pylops: https://pypi.org/project/derivative/
I didn't know about it, looks interesting. I wonder if we could ask the authors whether they/we can add adjoints, and then pylops would simply wrap them?
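Something along these lines, perhaps (a rough sketch; fwd_derivative and adj_derivative are hypothetical functions that such a package would need to expose for this to work):

class WrappedDerivative(pylops.LinearOperator):
    def __init__(self, n, dtype='float64'):
        self.shape = (n, n)
        self.dtype = np.dtype(dtype)
        self.explicit = False
    def _matvec(self, x):
        # hypothetical third-party forward derivative
        return fwd_derivative(x)
    def _rmatvec(self, x):
        # hypothetical matching adjoint provided by the package authors
        return adj_derivative(x)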
@kakila I just added kind to CausalIntegration in https://github.com/PyLops/pylops/pull/251 so that, as you suggested, it is easy to extend beyond these 3 kinds :)
Looking forward to any contribution from you on galerkin kinds of derivative/integration.
I also looked again at the DiffTVR with PyLops operators, and it seems to me that iterative solvers can be used but require a very large number of iterations for the overall method to be stable. Using:
update = lsqr(HOp, -g, atol=0, btol=0, x0=update1, iter_lim=5000)[0]
gives good results, but of course this is nowhere near direct-solver speed.
I think this is a prime example of where approximate solutions and non-dense matrices should not be used unless needed; and by needed I mean: if your problem is so huge that storing the dense matrices and solving a dense system, as required by DiffTVR, is not feasible, then you may want to accept using much cheaper (memory-wise) operator equivalents at the cost of needing an iterative solver for each update... So if you stay in 1D I think you will never face this; if you move to 2D and 3D, the time may come when you need it :)
Hope this helps!