Interacting with the ordering algorithms

baharev / sdopt-tearing

Exact and heuristic methods for tearing

https://sdopt-tearing.readthedocs.io/

BSD 3-Clause "New" or "Revised" License

13 stars 5 forks source link

Interacting with the ordering algorithms #2

Closed EDWhyte closed 5 years ago

EDWhyte commented 7 years ago

@baharev I want to use sdopt-tearing in my Master's project but I am having some difficulty interacting with your code. Currently, I manually create the incidence matrix and pass it to the methods in heap_md.py to get the orderings using either the Hessenberg or Dulmage-Mendelsohn algorithms.

The problem that I have is ordering variables calling a function eg.

a = f(x, y, z)

My initial plan was to add the equation to the incidence matrix having the variable a, x, y, z but I run into situations where this equation is used (for example) to calculate the variable x.

I have read your documentation and see that there are methods that cater to for "bad" eliminations

determine which variables can be explicitly and safely eliminated from which equations

Is it possible for me to use this property to flag variables as bad eliminations? My plan is to create a new variable f and dummy equations eg. f = x*y*z (1) a = (x + y + z)*f (2) and flag variable f as a "bad" elimination in equation 2. I create equation 1 to have the Degrees of Freedom of the system 0.

As a side note, these algorithms are extraordinary Thanks in advance Edgar

baharev commented 7 years ago

Hi Edgar,

Thanks for your feedback, and I apologize that my code is giving you such a hard time. I am sorry!

Note that the code here at GitHub is just the snapshot of what I have in a private repo, and this is a work in progress.

It is not necessary to introduce dummy variables: You can manually flag any elimination you like as "bad" and then those eliminations will be avoided by the ordering algorithms. However, the idea is that the algorithm can figure out automatically which eliminations are "bad", so that you do not have to specify anything.

There is an easier to use API for this code. There you provide your incidence matrix in coordinate format, that is, as (i, j, a) where the row and column indices are i and j, respectively, and the value of a tells the algorithm whether an elimination is considered "bad" or not.

Please be more specific, and specify what you need:

What sparse matrix forms do you need exactly? With the Dulmage-Mendelsohn decomposition you can get the bordered lower triangular form (BLT). It has nothing to do with the (generalized) Hessenberg forms.
Are you trying to implement tearing? For that you would probably need bordered lower triangular forms.
What is your application? What would you like to do with the ordered system?
How are your equations coded? In a modeling language such as AMPL or Modelica? In Python? Or how?

Best wishes,

Ali

EDWhyte commented 7 years ago

Thanks for the speedy response @baharev

I am simulating a chemical plant and have a set of equations and variables which I want to solve in the most efficient way. This is why I want to order the equations.

I want to use the Dulmage-Mendelsohn decomposition as well as the Lower Hessenberg in the spiked form.

I want to use the Lower Hessenberg form with a numerical solver to completely solve the system of equations but want the option to speed up the simulation by using a lower triangular form with a full diagonal. To date, I have tested the hessenberg function in heap_md.py and fine_dulmage_mendelsohn in rpc_api.py

The equations are generated using python and are sympy equations.

I am still working through your paper

DECOMPOSITION METHODS FOR SOLVING SPARSE NONLINEAR SYSTEMS OF EQUATIONS

I apologise if my technical knowledge is still lacking

baharev commented 7 years ago

OK, so you are implementing tearing apparently.

How do you pass the re-ordered equations to the numerical solver? Those are sympy expression trees. Do you generate C code from them? What solver or solvers do you use?

A very important question: Do you have meaningful lower and upper bounds on all of your variables?

I am asking this because I use sympy to automatically find the good and bad eliminations. However, it only works if all variables are bounded from below and above, which, in most engineering applications is either already the case or it is easily achievable. This method that automatically finds the safe eliminations is not exposed through the RPC API yet.

How many equations and variables do you have?

Sympy is unfortunately not fast... It may work acceptably for your Master's project, but I would not recommend it for an industrial strength implementation.

Two side notes:

I am simulating a chemical plant and have a set of equations and variables which I want to solve in the most efficient way.

"The most efficient way" is hard to define. On your machine or on my machine? Which solver? Which compiler? There are way to many factors here, and it is next to impossible to find "the most efficient way". You can only hope that eliminating many variables will speed up solving the problem, but there is absolutely no guarantee. Eliminating variables usually helps and makes things quicker, but nothing is guaranteed.

Please also read Failure modes of tearing and a novel robust approach. It will tell you why tearing is fundamentally flawed, and why it cannot be fixed in the traditional setup.

Don't get me wrong, apply tearing if you have good reasons to think that it is likely to help you, but beware of its fundamental flaws.

EDWhyte commented 7 years ago

The re-ordered equations are derived from the permuted incidence matrix which is passed to a code generator which generates code (python) to simulate the system.

I a still searching for a good open source numerical solver but for my smaller test cases, I am using scipy.optimize.fsolve.

I do have meaningful lower and upper bounds.

For my larger simulations, I have in the vicinity of 1000 equations.

What I meant by

the most efficient way

was instead of solving a whole set of equations numerically to order the equations in such a way that most of the variables can be solved sequentially and only smaller blocks of equations are solved using numerical methods.

baharev commented 7 years ago

OK, this is a sensible approach. scipy.optimize.fsolve is fine.

Could you show me some code how you create and store the sympy expression trees for your equations?

Having meaningful lower and upper bounds is good news. You can probably use my existing algorithm to find the safe eliminations automatically. That's good. I still have to find some time to expose the algorithm through the API though.

order the equations in such a way that most of the variables can be solved sequentially and only smaller blocks of equations are solved using numerical methods.

Hmmm, then it is not clear to me again. We should clear this up.

Do you want traditional tearing (bordered lower triangular form)?
Or do you really want to do block eliminations (spiked form)?

Implementing block eliminations is much harder than traditional tearing, and if you are doing block eliminations, you probably do not need this safe vs. "bad" elimination stuff. In any case, the requirements are very different if you really want some fancy tearing algorithm with block eliminations.

How much time are you supposed to spend on this Master's project?

What should you deliver at the end of this project?

Could you send me some document in e-mail that describes the goals of your Master's project?

Then I could suggest an appropriate ordering algorithm accordingly.

EDWhyte commented 7 years ago

@baharev I appreciate the time you are willing to spend on me

I create the equations by performing mass balances and other related equations given a connectivity graph. The equations are in string form but are converted to sympy equation using sympy.sympify. I know it is easy to get the expression tree by using sympy.srepr.

Just some background I am creating stochastic simulation over a time span using simple Euler integration

Initially, traditional tearing was what I wanted to implement but if I understand correctly it will make use of previous time step data to solve the equations. This is fine but sometimes a more accurate solution may be required. That is why I am searching for other ordering that will make it possible to fully solve the system of equations at any given time step.

I have until the end of October for my master's project.

Thanks

baharev commented 7 years ago

I will do 3 things, the first of which is already done.

(1) Expose that functionality that automatically identifies bad eliminations and flags them when given a sympy expression tree. The code with a simple example is now available here:

https://gist.github.com/baharev/17e9a6b077915d1c00b4471ebbd5d13d

The caveats mentioned in the last two paragraphs of 2.2. Identifying numerically troublesome assignments in Ordering matrices to bordered lower triangular form with minimal border width do apply!

Please let me know whether the example and the code are understandable.

(2) Create a greedy heuristic that takes bad eliminations into account and orders to bordered lower triangular form (without nesting). If I remember correctly, I only have an MILP algorithm that can do it, and that requires Gurobi.

(3) Another greedy heuristic that takes bad eliminations into account and orders to nested bordered lower triangular form (spiked form).

I believe these 3 things should be sufficient to solve your problems.

baharev commented 7 years ago

I am more or less done with (2) too. Please try it out and provide feedback:

https://github.com/baharev/sdopt-tearing/blob/master/bordered.py#L11

It accepts a sparse matrix in coordinate format. The entries of the matrix (values) must be integers. If the entry is not positive, the corresponding (i, j) is considered as a forbidden elimination. The function returns the row and column permutation vectors, and the size of the leading lower triangular part.

The heuristic can be tweaked in many ways, but I would like to get some feedback so that I can adjust it to your needs. I am not going to work on (3) until (1) and (2) are ironed out.

baharev commented 7 years ago

I am summarizing the conclusions of our discussion in e-mail to other future users who might be interested.

The first issue (Expose that functionality that automatically identifies bad eliminations and flags them when given a sympy expression tree) will be resolved in a separate project:

https://github.com/baharev/safe-eliminations