JuliaSymbolics / Symbolics.jl

Symbolic programming for the next generation of numerical software
https://docs.sciml.ai/Symbolics/stable/
Other
1.36k stars 153 forks source link

`symbolic_solve` using Groebner OOM crashes #1284

Open AayushSabharwal opened 4 weeks ago

AayushSabharwal commented 4 weeks ago

MWE:

julia> A = [variable(Symbol(:A, i)) for i in 1:6]
julia> b = [variable(Symbol(:b, i)) for i in 1:4]
julia> polys =  [(-(1//1) + b[1] + b[2] + b[3] + b[4])^2
 (-(1//2) + A[1]*b[2] + (A[2] + A[4])*b[3] + (A[3] + A[5] + A[6])*b[4])^2
 (-(1//6) + A[1]*A[4]*b[3] + (A[1]*A[5] + (A[2] + A[4])*A[6])*b[4])^2
 (1//4)*((-(1//3) + (A[1]^2)*b[2] + ((A[2] + A[4])^2)*b[3] + ((A[3] + A[5] + A[6])^2)*b[4])^2)]
julia> symbolic_solve(polys, [A; b])

Memory usage quickly climbs to ~7.5GB for me, and eventually it crashes

n0rbed commented 4 weeks ago

symbolic_solve(polys, [A; b])

can u instead run

symbolic_solve(polys, [A, b])

?

AayushSabharwal commented 4 weeks ago

A and b are arrays of symbolics, not array symbolics. It doesn't work:

julia> symbolic_solve(polys, [A, b])
ERROR: AssertionError: Expected a variable, got Num[A1, A2, A3, A4, A5, A6]
Stacktrace:
 [1] check_x(x::Vector{Num})
   @ Symbolics ~/Julia/SciML/Symbolics.jl/src/solver/solve_helpers.jl:91
 [2] symbolic_solve(expr::Vector{Num}, x::Vector{Vector{Num}}; dropmultiplicity::Bool, warns::Bool)
   @ Symbolics ~/Julia/SciML/Symbolics.jl/src/solver/main.jl:154
 [3] symbolic_solve(expr::Vector{Num}, x::Vector{Vector{Num}})
   @ Symbolics ~/Julia/SciML/Symbolics.jl/src/solver/main.jl:145
 [4] top-level scope
   @ REPL[78]:1

I make them array symbolics, I get the following:

julia> symbolic_solve(polys, [A, b])
┌ Warning: This system can not be currently solved by `symbolic_solve`.
└ @ Symbolics ~/Julia/SciML/Symbolics.jl/src/solver/main.jl:198
n0rbed commented 4 weeks ago

i see. Im off my laptop for the moment but i have the feeling that the problem is with the input. Shouldnt there be commas between the polys here? afaik this looks like an array of one poly, hence our solved cant solve for all those variables

polys = [(-(1//1) + b[1] + b[2] + b[3] + b[4])^2 (-(1//2) + A[1]*b[2] + (A[2] + A[4])*b[3] + (A[3] + A[5] + A[6])*b[4])^2 (-(1//6) + A[1]*A[4]*b[3] + (A[1]*A[5] + (A[2] + A[4])*A[6])*b[4])^2 (1//4)*((-(1//3) + (A[1]^2)*b[2] + ((A[2] + A[4])^2)*b[3] + ((A[3] + A[5] + A[6])^2)*b[4])^2)]

AayushSabharwal commented 4 weeks ago

It's 3 polys with newlines in between, commas are not necessary in this case. It's like writing

x = [1
2
3]

I realize it's still an underdetermined system, but it should be able to solve for a subset of the variables given all others?

n0rbed commented 4 weeks ago

aha ok. Ill check it out asap

AayushSabharwal commented 4 weeks ago

I think it's a Groebner.jl thing. I get the same result if I create the polynomials with DynamicPolynomials.jl and call groebner.

AayushSabharwal commented 4 weeks ago

Smaller MWE:

polys =  [ (-(1//2) + A[1]*b[2] + (A[2] + A[4])*b[3] + (A[3] + A[5] + A[6])*b[4])^2
 (-(1//6) + A[1]*A[4]*b[3] + (A[1]*A[5] + (A[2] + A[4])*A[6])*b[4])^2
 (1//4)*((-(1//3) + (A[1]^2)*b[2] + ((A[2] + A[4])^2)*b[3] + ((A[3] + A[5] + A[6])^2)*b[4])^2)]

symbolic_solve(polys, b[2:4])
AayushSabharwal commented 4 weeks ago

However, for the latter MWE if I solve polys[1] for b[2], substitute that into polys[2] and solve it for b[3], and substitute both into polys[3] and solve for b[4] it solves fine.

n0rbed commented 4 weeks ago

This looks like a Symbolics.groebner_basis issue. Thats the function that hangs.

However, for the latter MWE if I solve polys[1] for b[2], substitute that into polys[2] and solve it for b[3], and substitute both into polys[3] and solve for b[4] it solves fine.

I assume you do this via

symbolic_solve(polys[1], ...)

then resub into polys[2], which uses the univar solver, hence not hanging as it does not compute a groebner basis. Here i added some comments inside the code which computes the groebner basis in our multivar solver.

        @info "before g"
        new_eqs = Symbolics.groebner_basis(new_eqs, ordering=Lex(vcat(vars, params)))
        @info "after g"
julia> symbolic_solve(polys, b[2:4])
[ Info: before g
... hangs

Open an issue in Groebner.jl?

AayushSabharwal commented 4 weeks ago

Thanks for taking a look. Will open an issue there as well.

On a sidenote, what are the tradeoffs of the groebner basis method versus solve-and-substitute one at a time?

n0rbed commented 4 weeks ago

Good question; Lets review a simple example:

julia> eqs = [x^2 + y + z - 1, x + y^2 + z - 1, x + y + z^2 - 1]
3-element Vector{Num}:
 -1 + y + z + x^2
 -1 + x + z + y^2
 -1 + x + y + z^2

julia> symbolic_solve(eqs[1], x)
2-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
 (1//2)*√(4 - 4y - 4z)
 (-1//2)*√(4 - 4y - 4z)

We first solve for x in eqs[1] and lets say we ignore one root and assume x only has one solution x = (1//2)*√(4 - 4y - 4z). We substitute this solution into eqs[2]:

julia> eqs[2] = substitute(eqs[2], Dict(x=>symbolic_solve(eqs[1], x)[1]))
-1 + z + (1//2)*√(4 - 4y - 4z) + y^2

And here we see that we'll have to deal with this square root and be careful about squaring everything. The Groebner basis makes this stuff easier to deal with.

When we use the Groebner basis, we feed it an extra polynomial (extra variable included) we generate which creates a separating form. This attempts to separate all the independent variables' roots so they become linear in relation to our separating variable. For example:

┌ Info: before g
│   new_eqs =
│    4-element Vector{Num}:
│     -1 + y + z + x^2
│     -1 + x + z + y^2
│     -1 + x + y + z^2
└          _T - x - 2y

┌ Info: after g
│   new_eqs =
│    4-element Vector{PolyForm{Real}}:
│     -36(_T^2) + 132(_T^3) - 185(_T^4) + 120(_T^5) - 32(_T^6) + (_T^8)
│     -1764 + 1764z + 588_T - 2459(_T^2) + 24315(_T^3) - 35102(_T^4) + 15960(_T^5) - 995(_T^6) - 543(_T^7)
│     588y - 196_T - 3159(_T^2) + 11799(_T^3) - 13374(_T^4) + 5376(_T^5) - 267(_T^6) - 179(_T^7)
└     294x - 98_T + 3159(_T^2) - 11799(_T^3) + 13374(_T^4) - 5376(_T^5) + 267(_T^6) + 179(_T^7)

The separating variable here is _T. If you checkout the output basis, we see that the only difficult equation to solve is the first one, which is a polynomial of degree 8 (which is easily factorized)

If we are successful in solving this, the rest of the basis are simple linear equations! checkout how [x,y,z] are of degree one and so after substituting _T the equation gets simplified into mx + c. You can read more here:

Rouillier, F. Solving Zero-Dimensional Systems Through the Rational Univariate Representation. AAECC 9, 433–461 (1999). https://doi.org/10.1007/s002000050114

Let me know if anything sounds wrong, as i havent formally studied this, this is just what i learned over the summer.

n0rbed commented 4 weeks ago

However, i do suppose multivariable linear systems are better solved by gaussian elimination. The groebner basis is unnecessary in that case.

n0rbed commented 4 weeks ago

try symbolic_linear_solve

AayushSabharwal commented 4 weeks ago

Thanks for the clarification!