Open samcoveney opened 3 months ago
Thank you for this complete report @samcoveney.
This worries me also. Can you have a look @RafaelNH and @arokem since you are more familiar with DKI?
if it is a solver issue, we might have to contact cvxpy
by giving the simplest example as possible. They promoted a lot CLARABEL
and deprecated ECOS
.
Thank you very much.
Thanks. As a point, a lot of work has gone into #3170 which generalized the weights usable in WLS routines in both dti.py and dki.py, so if we can figure out the issue, please don't race ahead and implement a fix that it gonna be difficult for (or incompatible with) #3170 - if we can figure the problem, I can probably fold it into my PR
Sorry if not helpful, but trying to help with brain-storming the problem early on. By first doing export RUST_BACKTRACE=1
in the shell, and then running CLS
on noisy signal several times until this exact fail, I can get a trace for pyo3_runtime.PanicException
as the following:
thread '<unnamed>' panicked at src/solver/core/cones/psdtrianglecone.rs:464:35:
Eigval error: Eigen(1)
stack backtrace:
0: rust_begin_unwind
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
1: core::panicking::panic_fmt
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
2: core::result::unwrap_failed
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/result.rs:1649:5
3: <clarabel::solver::core::cones::psdtrianglecone::PSDTriangleCone<T> as clarabel::solver::core::cones::Cone<T>>::step_length
4: <clarabel::solver::core::cones::supportedcone::SupportedCone<T> as clarabel::solver::core::cones::Cone<T>>::step_length
5: <clarabel::solver::core::cones::compositecone::CompositeCone<T> as clarabel::solver::core::cones::Cone<T>>::step_length
6: <clarabel::solver::core::solver::Solver<D,V,R,K,C,I,SO,SE> as clarabel::solver::core::solver::IPSolver<T,D,V,R,K,C,I,SO,SE>>::solve
7: clarabel::python::impl_default_py::_::<impl clarabel::python::impl_default_py::PyDefaultSolver>::__pymethod_solve__
8: pyo3::impl_::trampoline::trampoline
and so on... (there is more to it)
Sometimes, I get failure of the solver with cvxpy.error.SolverError
instead.
Okay, no movement on this. I'm chalking it up to the solver failing for now, and so i will turn off the noise in tests in #3170 that is causing failure. I will try to return to the matter in the future.
Update: I have reached out to Tom Dela Haije and Evren Özarslan, hopefully I can make some progress
Edit: @RafaelNH and @arokem any thoughts yet?
I've now spoken with Tom and Evren separately, and performed some additional experiments with multiple different solvers.
The solution seems to be: scale the problem The consistent way to do this, which should not undermine the constraint formulation (which relies on the parameter vector) is to scale the units i.e. scale the b-values (I have tried with scaling the signal, but it doesn't seem to make any difference).
Given the way things are setup in DiPy, it is probably best for the user to do this scaling before calling the fit function i.e. scale their own b-values into different units... although then they would need to unscale their results. (The reason it's untidy to try to put the scaling into the routines, is that the design matrix and its inverse are calculated outside of the solvers... I will look into a solution that does not require the user to do anything).
Thanks @skoudoro
Hi all! Sorry for my silence lately - I am just passing through some hectic times. My wife is pregnant (at the moment at the 35th week) so please expect that I will be slow in replying in the following months. Anyway, just a suggestion for normalizing the units of the b-values: I actually had to do a similar thing to find the unique b-values of a dataset for the mean signal dki reconstruction. Perhaps you can reuse some of the functions there to normalize the b-value. Give a look to the beginning of msdki.py in dipy.reconst.
I understand, have been there myself, and congrats and good luck to both.
Thanks for the advice, I'll take a look at this as an approach. I've got some test branches that I'm unhappy with, hopefully your advice will help.
Either way, I'm glad to have determined the source of the problem.
I need some help figuring out if there is a problem with constrained WLS in the DKI module. See
dipy/reconst/dki.py
https://github.com/dipy/dipy/blob/e5cc120bf6c5c271f32cced62150157abd15a321/dipy/reconst/dki.py#L2611-L2619
So here we see that, in order to solve a weighted least squares problem, we define W as "signal" (such that W is that square root of the weight matrix that would be applied to the squared residuals, which is "signal squared"). Then we redefine
y
andA
so that we can solve the weight least squares problem. This is fine as a way to solve WLS, see https://en.wikipedia.org/wiki/Weighted_least_squaresThe problem is that we are performing constrained weighted least squares. The positivity constraints are set up here:
https://github.com/dipy/dipy/blob/e5cc120bf6c5c271f32cced62150157abd15a321/dipy/reconst/dki.py#L1795-L1814
But my understanding is that these constraints are set up with the understanding that
y
is log(signal). But we redefiney
in the constrained weighted least squares.As detailed in #3170 I am having problems writing the tests for CWLS (several of which were missing...) using noisy data - the tests keep failing. No problem with non-noisy data, probably because the solver detects that the problem is already solved (weighting zero residuals results in zero-residuals i.e. no effect).
For example:
Resulting in:
Note that I have rebased on master today, and the problem persists.
Given my worries about the problems with the theory (that we have supplied constraints designed for
log(signal)
to a solver usingsignal*log(signal)
, I think we should be worried that the constrained solver is not really working properly for this case.edit: in addition to my theoretical worries about CWLS, note that adding noise to the signal also produces this error, regularly, for CLS... the plot thickens