RussTedrake opened this issue 6 months ago
I ran the code with `deg=4` on my computer; it does proceed to the Clarabel iterations, and I got the printout below. But as you mentioned, it is very slow and consumes a huge amount of RAM (> 100 GB).
Specifically, this line takes about 3 minutes and consumes about 90 GB of RAM: https://github.com/RobotLocomotion/drake/blob/659cf702dff93ea20ee87b18e77169be404568ef/solvers/clarabel_solver.cc#L442
And when executing this line, RAM usage jumps to 120 GB: https://github.com/RobotLocomotion/drake/blob/659cf702dff93ea20ee87b18e77169be404568ef/solvers/clarabel_solver.cc#L444
```
-------------------------------------------------------------
           Clarabel.rs v0.6.0  -  Clever Acronym
                   (c) Paul Goulart
                University of Oxford, 2022
-------------------------------------------------------------

problem:
  variables     = 81927
  constraints   = 84772
  nnz(P)        = 0
  nnz(A)        = 207177
  cones (total) = 15
    : Zero        = 1,  numel = 4384
    : Nonnegative = 1,  numel = 0
    : PSDTriangle = 13, numel = (8001,8001,8001,8001,...,1596)

settings:
  linear algebra: direct / qdldl, precision: 64 bit
  max iter = 200, time limit = Inf, max step = 0.990
  tol_feas = 1.0e-8, tol_gap_abs = 1.0e-8, tol_gap_rel = 1.0e-8,
  static reg : on, ϵ1 = 1.0e-8, ϵ2 = 4.9e-32
  dynamic reg: on, ϵ = 1.0e-13, δ = 2.0e-7
  iter refine: on, reltol = 1.0e-13, abstol = 1.0e-12,
               max iter = 10, stop ratio = 5.0
  equilibrate: on, min_scale = 1.0e-4, max_scale = 1.0e4
               max iter = 10

iter    pcost        dcost       gap       pres      dres      k/t        μ       step
---------------------------------------------------------------------------------------------
```
I waited for 20 minutes and Clarabel hasn't finished the first iteration yet.
I did have some success with Clarabel on mid-sized SOS problems (PSD matrices with fewer than 30 rows). This problem (`deg=4`) has pretty large PSD matrices; here is my printout:
```
INFO:drake:PSD cone size 126
INFO:drake:PSD cone size 126
INFO:drake:PSD cone size 126
INFO:drake:PSD cone size 126
INFO:drake:PSD cone size 126
INFO:drake:PSD cone size 21
INFO:drake:PSD cone size 273
INFO:drake:PSD cone size 21
INFO:drake:PSD cone size 21
INFO:drake:PSD cone size 21
INFO:drake:PSD cone size 21
INFO:drake:PSD cone size 21
INFO:drake:PSD cone size 56
```
With a PSD matrix of 273 rows, the problem is probably too large for Clarabel.
Thanks @hongkai-dai. Can you think of any way of capturing this complexity in our ChooseBestSolver logic? Otherwise, I guess it's just a "won't fix"?
cc @goulart-paul , in case it's of interest?
I can capture the complexity in ChooseBestSolver, but I wonder what the criterion should be for deciding that Clarabel will consume too much memory. As far as I can see, it depends on both the number of PSD constraints and the size of each PSD constraint, so the criterion cannot be a single number. Maybe the Clarabel authors have a better sense of what makes a problem too large for the solver?
The issue comes down to the dimension of the PSD cones being passed and how the per-iteration cost of an interior point solve scales w.r.t. that number.
We store only the upper triangular part of a PSD matrix, so if you have a PSD cone in $\mathbb{R}^{n\times n}$ then each cone will have $M = n(n+1)/2$ elements. This number $M$ is what is reported in the output header of the solver, e.g. where it says something like:
```
cones (total) = 15
  : Zero        = 1,  numel = 4384
  : Nonnegative = 1,  numel = 0
  : PSDTriangle = 13, numel = (8001,8001,8001,8001,...,1596)
```
$M = 8001$ there implies that $n = 126$ for the corresponding cone.
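The mapping between the cone dimension $n$ and the reported numel $M$ can be sketched in a few lines of Python (helper names here are mine, not part of Clarabel or Drake):

```python
import math

def triangle_numel(n: int) -> int:
    # Entry count of the upper triangle of an n x n symmetric matrix:
    # M = n(n+1)/2.
    return n * (n + 1) // 2

def cone_dim_from_numel(m: int) -> int:
    # Invert M = n(n+1)/2: n = (-1 + sqrt(1 + 8M)) / 2.
    return (-1 + math.isqrt(1 + 8 * m)) // 2

# The sizes from the solver header above:
print(triangle_numel(126))        # 8001
print(triangle_numel(56))         # 1596
print(cone_dim_from_numel(8001))  # 126
```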
When we form the Hessian of the barrier function for such a cone, we get a dense block of size $M \times M$ in the KKT system that we factor at each iteration. The actual storage for that block is again $M(M+1)/2$; for a large cone that is a lot of memory. Factoring a system with very large blocks like that will be very slow because:
- we don't presently do anything to exploit sparsity in the PSD constraints. If we add a chordal decomposition method to the solver, it would be a lot faster for PSD constraints that have a lot of zeros.
- the method we use for solving the KKT system is based on direct LDL factorisation, which is not the best choice given the presence of really huge blocks. Some block-elimination or QR-based method might be better.
- the LDL solver we currently have implemented is a simplicial one, so it is quite a bit slower than a supernodal method for large problems or problems with a lot of dense columns.
The first item above already exists in our COSMO solver in Julia, so we intend to port it also to Clarabel this year. We have an interface to a pure Rust supernodal LDL solver in prototype now and are just waiting for the interface to that to stabilise.
In terms of predicted memory use: it is not straightforward to compute, since it would require at least computing the KKT elimination tree, but memory use scales roughly like $n^4$ for $n \times n$ PSD cones.
We could maybe provide some facility for predicting total memory use based on the problem input, but we would have to think a bit about the best way of doing this without actually allocating a lot of memory in the process.
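As a back-of-envelope illustration of that scaling (this is my rough estimate, not Clarabel's actual allocation accounting, and it ignores fill-in and working storage in the factorisation): the dense Hessian block alone for one $n \times n$ cone needs $M(M+1)/2$ float64 entries with $M = n(n+1)/2$.

```python
def triangle(m: int) -> int:
    # m(m+1)/2 entries in a triangular store.
    return m * (m + 1) // 2

def hessian_block_bytes(n: int) -> int:
    # One PSD cone of dimension n contributes an M x M dense Hessian
    # block, stored triangularly as M(M+1)/2 float64 (8-byte) entries.
    m = triangle(n)
    return 8 * triangle(m)

# Cone sizes from the log above: the n = 273 cone alone dominates.
for n in (21, 126, 273):
    print(f"n = {n:3d}: ~{hessian_block_bytes(n) / 1e9:.2f} GB")
```

For the n = 273 cone this already gives roughly 5.6 GB for a single block, before any factorisation fill-in, which is consistent with the overall blow-up observed.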
If you are able to do so, it might be helpful to see an example problem. Can you export to something like AMPL format, or a Python pickle file? I would be interested to see whether the chordal decomposition and supernodal options are likely to help you here. I'd also like to put it through Mosek to see if we can work out what is different there.
Thanks @goulart-paul for the detailed explanation.
I think I can roughly compute ∑ᵢ nᵢ⁴, with nᵢ being the number of rows in the i-th PSD constraint, and set a tolerance on this summation. If the problem size exceeds the tolerance, I can either throw a warning to the user or select an alternative solver.
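A minimal sketch of that heuristic (the threshold value below is invented for illustration; any real limit would need tuning against benchmark problems):

```python
# Hypothetical complexity limit on sum(n_i^4); not a Drake constant.
PSD_COMPLEXITY_LIMIT = 1e9

def clarabel_likely_too_slow(psd_cone_sizes) -> bool:
    """Heuristic: sum n_i^4 over all PSD cones, following the ~n^4
    per-cone memory scaling noted above, and compare to a tolerance."""
    return sum(n**4 for n in psd_cone_sizes) > PSD_COMPLEXITY_LIMIT

# The cone sizes from the log in this issue:
cone_sizes = [126] * 5 + [21] * 6 + [273, 56]
print(clarabel_likely_too_slow(cone_sizes))  # True
print(clarabel_likely_too_slow([21] * 6))    # False
```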
It makes a lot of sense that exploiting the sparsity in the PSD constraints could speed up the computation (and save memory). I believe our SOS problems have abundant sparsity. I look forward to seeing this feature ported to Clarabel!
For the problem data, I currently cannot export it to AMPL format. But I can save the problem

```
min  c'x
s.t. A*x + s = b
     s ∈ K
```

I can dump the `A`, `b`, and `c` data into pickle files, and store the type and size of each cone in `K`. Would that be OK with you? Do you have a preferred format for saving the cone `K`?
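One possible pickle layout (purely a suggestion on my part; the cone-spec encoding and function name here are assumptions, not an agreed format):

```python
import pickle

def pack_conic_problem(A_triplets, b, c, cones) -> bytes:
    """Serialize  min c'x  s.t.  A x + s = b,  s in K  to pickle bytes.

    A_triplets: (rows, cols, vals, shape) COO encoding of the sparse A.
    cones: ordered list of (type, size) pairs matching the row order
           of A, e.g. [("zero", 4384), ("nonneg", 0), ("psd", 126)].
    """
    return pickle.dumps({"A": A_triplets, "b": b, "c": c, "cones": cones})

# Toy example with a 2x2 sparse A and two cones.
blob = pack_conic_problem(
    A_triplets=([0, 1], [0, 1], [1.0, -1.0], (2, 2)),
    b=[0.0, 0.0],
    c=[1.0, 1.0],
    cones=[("zero", 1), ("psd", 1)],
)
print(len(blob) > 0)  # True
```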
What happened?
When I upgraded the drake version in my underactuated notes, one of the examples (the cart-pole example from here), which works fine with Mosek and solves slowly with CSDP, now crashes with Clarabel.
Here is the code to reproduce it:
Note: By changing to `deg=2` in the last line, I can get it to run, but it's shockingly slow (many minutes per iteration). With `deg=4`, it crashes before ever showing the Clarabel console output, so I'm not sure whether it's actually getting to Clarabel or not. I intend to guard this example to make it "mosek only", which is fine for now. In practice, it's pretty horrible without Mosek.
Version
No response
What operating system are you using?
macOS 13 (Ventura)
What installation option are you using?
compiled from source code using Bazel
Relevant log output
When `deg=2` (after 10+ minutes):