FGP_TV Lipschitz constant 3D

epapoutsellis commented 12 months ago

In the FGP_TV class for the CCPi-Regularisation Toolkit the step-size (the reciprocal of the Lipschitz constant) is wrong.

At the moment it is 1/26, but the correct value for 3D arrays on a unit grid (voxel-size=1.0) is 1/12, because in this case $$||\nabla|| = \sqrt{12}$$ and the step-size is $$1/||\nabla||^2$$.
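
A quick numerical check of that value, as a sketch only: the code below estimates $$||\nabla||^2$$ by power iteration on a small 3D grid. It assumes forward differences with Neumann boundary conditions and unit voxel size; the helper names `grad`, `div` and `grad_norm_squared` are made up for this example and are not part of the toolkit. On a finite grid the estimate stays slightly below 12 and approaches it as the grid grows.

```python
import numpy as np

def grad(u):
    """Forward differences along every axis (Neumann boundary, unit voxel size)."""
    g = np.zeros((u.ndim,) + u.shape)
    for ax in range(u.ndim):
        idx = [slice(None)] * u.ndim
        idx[ax] = slice(0, -1)
        g[(ax,) + tuple(idx)] = np.diff(u, axis=ax)
    return g

def div(p):
    """Negative adjoint of grad, i.e. <grad(u), p> = -<u, div(p)>."""
    d = np.zeros(p.shape[1:])
    for ax in range(d.ndim):
        lo = [slice(None)] * d.ndim
        hi = [slice(None)] * d.ndim
        lo[ax] = slice(0, -1)
        hi[ax] = slice(1, None)
        inner = p[ax][tuple(lo)]
        d[tuple(lo)] += inner
        d[tuple(hi)] -= inner
    return d

def grad_norm_squared(shape, n_iter=200, seed=0):
    """Estimate ||grad||^2 as the largest eigenvalue of grad^T grad = -div(grad(.))."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(shape)
    u /= np.linalg.norm(u)
    est = 0.0
    for _ in range(n_iter):
        v = -div(grad(u))
        est = np.linalg.norm(v)  # never exceeds the largest eigenvalue for unit u
        u = v / est
    return est

L = grad_norm_squared((16, 16, 16))
print(f"||grad||^2 ~ {L:.3f}, step-size 1/L ~ {1.0 / L:.4f} (1/12 = {1 / 12:.4f})")
```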

This can cause problems when the Total Variation prior is used with PDHG/SPDHG/FISTA. @Letizia97 @paskino

When the tolerance is 0.0 (default), every outer iteration runs 100 inner iterations (default) to compute the proximal of TV, which is the FISTA algorithm applied to the dual ROF problem. Since the smaller step-size (1/26) is used every time, the inner solver may not have reached the desired solution at each outer iteration, and in some cases we may observe divergence; see the figure below from Beck and Teboulle. The larger step-size (1/12) will give better results.
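
To make the role of the step-size concrete, here is a minimal NumPy sketch of FISTA on the dual ROF problem (fast gradient projection in the style of Beck and Teboulle). It is not the toolkit's code: the function names and the `lipschitz` argument are hypothetical, and it assumes forward differences with Neumann boundaries, unit voxel size, isotropic TV and no non-negativity constraint. Passing `lipschitz=26.0` instead of the default 12.0 mimics the smaller step-size.

```python
import numpy as np

def grad(u):
    """Forward differences along every axis (Neumann boundary, unit voxel size)."""
    g = np.zeros((u.ndim,) + u.shape)
    for ax in range(u.ndim):
        idx = [slice(None)] * u.ndim
        idx[ax] = slice(0, -1)
        g[(ax,) + tuple(idx)] = np.diff(u, axis=ax)
    return g

def div(p):
    """Negative adjoint of grad."""
    d = np.zeros(p.shape[1:])
    for ax in range(d.ndim):
        lo = [slice(None)] * d.ndim
        hi = [slice(None)] * d.ndim
        lo[ax] = slice(0, -1)
        hi[ax] = slice(1, None)
        inner = p[ax][tuple(lo)]
        d[tuple(lo)] += inner
        d[tuple(hi)] -= inner
    return d

def tv(u):
    """Isotropic total variation ||grad u||_{2,1}."""
    return float(np.sum(np.sqrt(np.sum(grad(u) ** 2, axis=0))))

def rof_objective(x, b, lam):
    """Primal ROF objective 0.5 * ||x - b||^2 + lam * TV(x)."""
    return 0.5 * float(np.sum((x - b) ** 2)) + lam * tv(x)

def fgp_tv(b, lam, n_iter=100, lipschitz=12.0):
    """Approximate prox of lam * TV at b via FISTA on the dual ROF problem.

    `lipschitz` is the assumed bound on ||grad||^2 (12 for 3D, unit grid).
    """
    p = np.zeros((b.ndim,) + b.shape)   # dual variable
    r = p.copy()                        # extrapolated dual variable
    t = 1.0
    for _ in range(n_iter):
        x = b + lam * div(r)                              # primal from dual
        p_new = r + grad(x) / (lipschitz * lam)           # gradient step
        p_new /= np.maximum(1.0, np.sqrt(np.sum(p_new ** 2, axis=0)))  # project onto |p| <= 1
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))  # FISTA momentum
        r = p_new + ((t - 1.0) / t_new) * (p_new - p)
        p, t = p_new, t_new
    return b + lam * div(p)
```

For a fixed number of inner iterations, one can compare `rof_objective(fgp_tv(b, lam, lipschitz=12.0), b, lam)` against the same call with `lipschitz=26.0` on a test volume to see how far each run gets.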

Warm-starting can help, but the subproblem is still not solved optimally, i.e., with the right step-size.

Note: I have tried many times to make this simple change and build it locally, but failed every time. Actually, I managed to make it work on CPU, but building it for GPU was a complete failure. For the datasets I am working with, it usually took 15-20 sec per iteration; after building with this change it took 100 sec per iteration.

epapoutsellis commented 12 months ago

There are two alternatives for this problem. One is free and comes from the TIGRE toolbox, which we support.

One can easily create a wrapper for im_3d_denoise. As in FGP_TV, the dual ROF problem is solved, but not with an accelerated gradient descent. The algorithm is basically a projected gradient descent, described here.

The step-size is not fixed but changes at every iteration: https://github.com/CERN/TIGRE/blob/dbcd848671c96a42b7c9669e75c8e7e19e55ce6d/Common/CUDA/tvdenoising.cu#L445-L446

I believe the implementation is based on Zhu and Chan; maybe @AnderBiguri can confirm.

It works with multiple GPUs and is useful for large 3D datasets.

The other option is to use cucim and dask-cuda.

In both cases, one needs to implement a __call__ method that computes the total variation of every iterate, $$||\nabla x_{k}||_{2,1}$$, which can be done with the GradientOperator and MixedL21Norm.
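
A rough sketch of what such a wrapper could look like. Everything here is hypothetical: `TVDenoiserWrapper`, `tv_value` and the `denoise_fn(x, weight)` callable are illustration names, not TIGRE, cucim or CIL API. The TV value is computed with plain NumPy (forward differences, Neumann boundary, unit voxel size) as a stand-in for GradientOperator plus MixedL21Norm, so that the example runs on its own.

```python
import numpy as np

def tv_value(x):
    """Isotropic TV ||grad x||_{2,1} with forward differences and zero last difference."""
    sq = np.zeros(x.shape)
    for ax in range(x.ndim):
        pad = [(0, 0)] * x.ndim
        pad[ax] = (0, 1)                      # Neumann boundary: pad the last slice with 0
        sq += np.pad(np.diff(x, axis=ax), pad) ** 2
    return float(np.sum(np.sqrt(sq)))

class TVDenoiserWrapper:
    """Hypothetical function-like wrapper around an external TV denoiser.

    denoise_fn(x, weight) is assumed to return an approximation of
    argmin_u 0.5 * ||u - x||^2 + weight * TV(u); how `weight` maps to the
    backend's own regularisation parameter must be checked per backend.
    """

    def __init__(self, denoise_fn, alpha=1.0):
        self.denoise_fn = denoise_fn
        self.alpha = alpha

    def __call__(self, x):
        # Value of the regulariser at the current iterate, used for objective tracking.
        return self.alpha * tv_value(x)

    def proximal(self, x, tau):
        # Delegate the proximal step to the external denoiser.
        return self.denoise_fn(x, tau * self.alpha)
```

Usage would be along the lines of `G = TVDenoiserWrapper(my_denoiser, alpha=0.1)`, then `G(x)` for the objective value and `G.proximal(x, tau)` inside the outer algorithm, where `my_denoiser` is whatever backend call is being wrapped.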

AnderBiguri commented 12 months ago

@epapoutsellis I can confirm! That is indeed the paper. I believe the paper that was followed exactly was: https://link.springer.com/article/10.1007/s10334-010-0207-x

epapoutsellis commented 12 months ago

Thank you @AnderBiguri!!!

In im_3d_denoise, do you normalize so that a default lambda value can be used? I do not think it is necessary.

AnderBiguri commented 12 months ago

@epapoutsellis honestly I wrote that 8 years ago, so I have no idea why I do it. Perhaps it's indeed to allow a default lambda, or perhaps the source code doesn't handle large numbers, no idea really. If you get to test it, let me know if I need to remove it.

paskino commented 12 months ago

I suppose the real solution is to fix the CCPi-Regularisation-Toolkit, see https://github.com/vais-ral/CCPi-Regularisation-Toolkit/pull/179

paskino commented 8 months ago

Currently I am failing at building the CCPi-Regularisation Toolkit...

paskino commented 8 months ago

Lots of changes are happening in the CCPi-Regularisation Toolkit to allow building. https://github.com/vais-ral/CCPi-Regularisation-Toolkit/pull/183

TomographicImaging / CIL

FGP_TV Lipschitz constant 3D #1562