dipy / dipy

DIPY is the paragon 3D/4D+ imaging library in Python. Contains generic methods for spatial normalization, signal processing, machine learning, statistical analysis and visualization of medical images. Additionally, it contains specialized methods for computational anatomy including diffusion, perfusion and structural imaging.
https://dipy.org
Other
699 stars 436 forks source link

Issue with CWLS in dki #3222

Open samcoveney opened 3 months ago

samcoveney commented 3 months ago

I need some help figuring out if there is a problem with constrained WLS in the DKI module. See dipy/reconst/dki.py

https://github.com/dipy/dipy/blob/e5cc120bf6c5c271f32cced62150157abd15a321/dipy/reconst/dki.py#L2611-L2619

So here we see that, in order to solve a weighted least squares problem, we define W as "signal" (such that W is that square root of the weight matrix that would be applied to the squared residuals, which is "signal squared"). Then we redefine y and A so that we can solve the weight least squares problem. This is fine as a way to solve WLS, see https://en.wikipedia.org/wiki/Weighted_least_squares

The problem is that we are performing constrained weighted least squares. The positivity constraints are set up here:

https://github.com/dipy/dipy/blob/e5cc120bf6c5c271f32cced62150157abd15a321/dipy/reconst/dki.py#L1795-L1814

But my understanding is that these constraints are set up with the understanding that y is log(signal). But we redefine y in the constrained weighted least squares.

As detailed in #3170 I am having problems writing the tests for CWLS (several of which were missing...) using noisy data - the tests keep failing. No problem with non-noisy data, probably because the solver detects that the problem is already solved (weighting zero residuals results in zero-residuals i.e. no effect).

For example:

    # testing return of S0 when mask is inputted
    mask_test = dki_S0F.fa > 0
    for fit_method in tested_methods:
        kwargs = {}
        if fit_method in ["CLS", "CWLS"]:
            kwargs = {"cvxpy_solver": cvxpy.CLARABEL}
        dki_S0M = dki.DiffusionKurtosisModel(gtab_2s, fit_method=fit_method,
                                             return_S0_hat=True, **kwargs)
        dki_S0F = dki_S0M.fit(DWI + np.random.normal(size=DWI.shape), mask=mask_test) # NOTE: added noise to track down the problem - if noise added, CWLS solve fails here! Weighted problem not working!
        dki_S0F_S0 = dki_S0F.model_S0

Resulting in:

_solver_opts = {}

    def solve(_solver_opts):

        _settings = CLARABEL.parse_solver_opts(verbose, _solver_opts)
        _solver = clarabel.DefaultSolver(P, c, A, b, cones, _settings)
>       _results = _solver.solve()
E       pyo3_runtime.PanicException: Eigval error: Eigen(1)

Note that I have rebased on master today, and the problem persists.

Given my worries about the problems with the theory (that we have supplied constraints designed for log(signal) to a solver using signal*log(signal), I think we should be worried that the constrained solver is not really working properly for this case.

edit: in addition to my theoretical worries about CWLS, note that adding noise to the signal also produces this error, regularly, for CLS... the plot thickens

skoudoro commented 3 months ago

Thank you for this complete report @samcoveney.

This worries me also. Can you have a look @RafaelNH and @arokem since you are more familiar with DKI?

if it is a solver issue, we might have to contact cvxpy by giving the simplest example as possible. They promoted a lot CLARABEL and deprecated ECOS.

Thank you very much.

samcoveney commented 3 months ago

Thanks. As a point, a lot of work has gone into #3170 which generalized the weights usable in WLS routines in both dti.py and dki.py, so if we can figure out the issue, please don't race ahead and implement a fix that it gonna be difficult for (or incompatible with) #3170 - if we can figure the problem, I can probably fold it into my PR

samcoveney commented 3 months ago

Sorry if not helpful, but trying to help with brain-storming the problem early on. By first doing export RUST_BACKTRACE=1 in the shell, and then running CLS on noisy signal several times until this exact fail, I can get a trace for pyo3_runtime.PanicException as the following:

thread '<unnamed>' panicked at src/solver/core/cones/psdtrianglecone.rs:464:35:
Eigval error: Eigen(1)
stack backtrace:
   0: rust_begin_unwind
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
   2: core::result::unwrap_failed
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/result.rs:1649:5
   3: <clarabel::solver::core::cones::psdtrianglecone::PSDTriangleCone<T> as clarabel::solver::core::cones::Cone<T>>::step_length
   4: <clarabel::solver::core::cones::supportedcone::SupportedCone<T> as clarabel::solver::core::cones::Cone<T>>::step_length
   5: <clarabel::solver::core::cones::compositecone::CompositeCone<T> as clarabel::solver::core::cones::Cone<T>>::step_length
   6: <clarabel::solver::core::solver::Solver<D,V,R,K,C,I,SO,SE> as clarabel::solver::core::solver::IPSolver<T,D,V,R,K,C,I,SO,SE>>::solve
   7: clarabel::python::impl_default_py::_::<impl clarabel::python::impl_default_py::PyDefaultSolver>::__pymethod_solve__
   8: pyo3::impl_::trampoline::trampoline

and so on... (there is more to it)

Sometimes, I get failure of the solver with cvxpy.error.SolverError instead.

samcoveney commented 3 months ago

Okay, no movement on this. I'm chalking it up to the solver failing for now, and so i will turn off the noise in tests in #3170 that is causing failure. I will try to return to the matter in the future.

samcoveney commented 3 months ago

Update: I have reached out to Tom Dela Haije and Evren Özarslan, hopefully I can make some progress

Edit: @RafaelNH and @arokem any thoughts yet?

samcoveney commented 2 months ago

I've now spoken with Tom and Evren separately, and performed some additional experiments with multiple different solvers.

The solution seems to be: scale the problem The consistent way to do this, which should not undermine the constraint formulation (which relies on the parameter vector) is to scale the units i.e. scale the b-values (I have tried with scaling the signal, but it doesn't seem to make any difference).

Given the way things are setup in DiPy, it is probably best for the user to do this scaling before calling the fit function i.e. scale their own b-values into different units... although then they would need to unscale their results. (The reason it's untidy to try to put the scaling into the routines, is that the design matrix and its inverse are calculated outside of the solvers... I will look into a solution that does not require the user to do anything).

Thanks @skoudoro

RafaelNH commented 2 months ago

Hi all! Sorry for my silence lately - I am just passing through some hectic times. My wife is pregnant (at the moment at the 35th week) so please expect that I will be slow in replying in the following months. Anyway, just a suggestion for normalizing the units of the b-values: I actually had to do a similar thing to find the unique b-values of a dataset for the mean signal dki reconstruction. Perhaps you can reuse some of the functions there to normalize the b-value. Give a look to the beginning of msdki.py in dipy.reconst.

samcoveney commented 2 months ago

I understand, have been there myself, and congrats and good luck to both.

Thanks for the advice, I'll take a look at this as an approach. I've got some test branches that I'm unhappy with, hopefully your advice will help.

Either way, I'm glad to have determined the source of the problem.