spine-tools / SpineOpt.jl

A highly adaptable modelling framework for multi-energy systems
https://www.tools-for-energy-system-modelling.org/
GNU General Public License v3.0

Looping structure and some benchmarks #72

Closed spine-o-bot closed 3 years ago

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 13, 2019, 11:24

Hi folks,

I think we need to put a lot of thought into the looping structure and how to do it efficiently. In my model, I could think of no other way to do it, except to recreate the model in each loop step. I did some benchmarks, see below.

These benchmarks are for 20 steps each consisting of 48 hours. I have set the optimality gap quite high, so the solver run time will increase in actual runs.

Obviously, recreating the model each time is very inefficient and time consuming, but how else can we do it?

The results writing is also very time consuming - and I'm only writing out a single variable.

So, looking at these times, from a performance point of view, our focus should be on the looping structure and the results writing.

[image: benchmark timings]
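For reference, the loop being benchmarked can be sketched as follows. This is a minimal sketch only: build_model, solve_model and write_results are hypothetical stand-ins for the real SpineModel routines, and the setup mirrors the 20-step, 48-hour windows above.

```julia
# Sketch of the rolling loop: the model is recreated from scratch each step,
# which is exactly the overhead the benchmarks measure.
function run_rolling_horizon(n_steps, window_hours)
    timings = Float64[]
    for step in 1:n_steps
        t = @elapsed begin
            m = build_model(step, window_hours)  # full rebuild every step
            solve_model(m)
            write_results(m)
        end
        push!(timings, t)
    end
    timings
end

# Minimal stubs so the sketch runs standalone (the real functions build and
# solve a JuMP model and write results to the Spine database).
build_model(step, hours) = (step = step, hours = hours)
solve_model(m) = m
write_results(m) = nothing

timings = run_rolling_horizon(20, 48)  # one timing per window
```

The per-step timings make it easy to see how much of each step is model creation versus solving versus results writing.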

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 13, 2019, 11:55

Also wondering if #90 has any bearing on this and if it would help?

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Mar 13, 2019, 12:19

at the moment JuMP_results_to_spine_db creates a new DiffDatabaseMapping object each time, with all the overhead (connecting to the database, creating the diff tables, generating the object relational mapping, etc). But it would be easy to add an option to use an already created DiffDatabaseMapping.

Also, for each run we are creating relationships between the result object and each other object in the variable's dimensions. So reusing the same results object in between runs might save time too.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 17, 2019, 11:44

So I've just added base case load flow equations to the model using PTDFs (calculated by PowerSystems.jl).

The ptdf calculation itself is very fast, but the constraint generation in JuMP takes forever (3.5 minutes, which is over 20 hours for a yearly run).

In the looping structure I am recreating the model each time. My conclusion is that this simply isn't feasible. It's a pity.

One way of doing this is dividing constraints into those which change in each loop and those which don't. Anything with a time-varying parameter would have to be rewritten (deleted and recreated), whereas anything without time-varying data could remain untouched, provided the time indices are fixed... this means writing the constraints with moving offsets for the data time references, while the variables' time indices stay the same from one loop to the next.

This is a pity, as it restricts the generality of the model: we are forced to choose which parameters can vary with time and which can't... for example, I couldn't have seasonal line ratings...

Perhaps a way is to check whether a constraint needs to be rewritten, but even this will impose overhead? Can you modify a constraint for a single index or a subset of indices?

It makes me think that you can be quite lazy in GAMS and don't have to pay so much attention to these things, except in certain cases.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 17, 2019, 18:13

So... a little update - I refactored the model and divided constraints into those that only need to be created once off and those that need to be re-created each loop step. This was just a test - it's not a solution, because it predetermines which parameters can be time varying and which cannot, which in my view would be a deal-breaker.

So it turns out that deleting the time variant constraints (only 3 in my case) and re-adding them takes even longer than recreating the model from scratch.

One reason for this is that you don't have a handle on the group of parameterized constraints, and have to delete them all individually in a loop, as follows:

    for con in q_control_area_balance
        delete(m, con)
    end

Maybe someone can think of a better way than this?

Model modification:

There are two things you can do here: modify the constraint RHS, which is not straightforward (www.juliaopt.org/JuMP.jl/v0.19/constraints/), or modify constraint coefficients.

Neither of these is ideal, because it forces us to aggregate the terms for each variable, and we lose the power of a high-level modelling language entirely... we may as well write the MPS file ourselves from scratch.

I think this is a serious issue... at the moment it would mean that the model isn't usable and I don't see an easy way around it...

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 18, 2019, 09:38

Ok, so I did a major refactoring of the code and put everything in functions. The model generation time has now decreased from 3.5 minutes to 30 seconds... it's a big improvement but we still need to do better. This is for completely regenerating the model each loop step.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 18, 2019, 10:43

So I've timed each individual constraint - this one is taking 60-70% of the model creation time:

function gen_q_baseflow(v_baseflow,v_net_injection,ptdf_b_n)
    @constraint(m,
        q_baseflow[
            b in branch(),
            (t,s) in T_S;
            all([
                monitored(branch=b)==1,
                settings_loadflow_method(settings=:TRANSMISSION)!=:loadflow_disabled
            ]) || continue
        ],
        + v_baseflow[b,t,s]
        ==
        + sum(  ptdf_b_n[b,n] * v_net_injection[n,t,s] for n in injection_nodes if abs(ptdf_b_n[b,n]) > tx_pdtf_tolerance(settings=:TRANSMISSION))
    )
end

Here, ptdf_b_n is an AxisArray{Float64,2,Array{Float64,2},Tuple{Axis{:branch,Array{Symbol,1}},Axis{:node,Array{Symbol,1}}}}

@pvper , @manuelma Anyone any ideas for how to improve this formulation ?
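One possible reformulation, sketched below under illustrative names: the sum above re-filters every node by abs(ptdf_b_n[b,n]) > tolerance for every (b, t, s) combination. Precomputing the significant (node, coefficient) pairs per branch once, before the time loop, removes that repeated filtering. The 0.001 default stands in for the tx_pdtf_tolerance setting; none of this is actual SpineModel code.

```julia
# Precompute, once per model build, the nodes with a significant PTDF for
# each branch, so the constraint generation no longer filters inside the sum.
function significant_ptdf(ptdf::AbstractMatrix, branches, nodes; tol = 0.001)
    sig = Dict{Symbol,Vector{Tuple{Symbol,Float64}}}()
    for (i, b) in enumerate(branches)
        pairs = Tuple{Symbol,Float64}[]
        for (j, n) in enumerate(nodes)
            # keep only entries above the tolerance
            abs(ptdf[i, j]) > tol && push!(pairs, (n, ptdf[i, j]))
        end
        sig[b] = pairs
    end
    sig
end

ptdf = [0.8 0.0002 -0.3;
        0.0 0.5    0.0001]
sig = significant_ptdf(ptdf, [:b1, :b2], [:n1, :n2, :n3])
# sig[:b1] == [(:n1, 0.8), (:n3, -0.3)]; sig[:b2] == [(:n2, 0.5)]
```

The constraint body could then become sum(coef * v_net_injection[n, t, s] for (n, coef) in sig[b]), touching only the non-negligible entries instead of scanning all nodes for every branch and time step.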

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Mar 18, 2019, 12:00

Did you try @butcher?

What you can do is use @macroexpand, as in @macroexpand @constraint(...), to see the 'expanded' code generated by the macro.

(Btw, I actually think || continue shouldn't be there...)

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 18, 2019, 12:01

I tried without the || continue and it didn't work?

I also tried @butcher in a few places but I got errors...

The @macroexpand is a good idea - I'll have a look at that.

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Mar 18, 2019, 12:03

Maybe we can try and debug the @butcher errors? Just create issues. When it works, it saves a lot of time.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 19, 2019, 18:37

So thanks to some major refactoring and @butcher we have the constraint generation time down to just over 4 seconds for a model with base load flows. With N-1 SCUC constraints it goes up to 23 seconds, but I think this is competing very well with GAMS. So, some lessons:

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 21, 2019, 09:56

For reference and hopefully this will be useful to others:

  1. You get a performance improvement if you use non-vectorized constraints... for example, writing

        for u in unit()
            @constraint(m, ...)
        end

     instead of:

        @constraint(m, [u in unit()], ...)

     It's less readable and you can't name the constraint nicely (that I can see) in the more performant version, but maybe worth considering for costly constraints. See more in this discussion: https://github.com/JuliaOpt/JuMP.jl/issues/969

  2. You also get a performance improvement using m = JuMP.direct_model(CPLEX.Optimizer(CPX_PARAM_SCRIND=1)), but you lose some functionality.

More here: http://www.juliaopt.org/JuMP.jl/v0.19.0/solvers/#Direct-mode-1

spine-o-bot commented 3 years ago

In GitLab by @Poncelet on Mar 21, 2019, 13:42

Very nice work Jody and Manuel!

As we are coming at the point of drastically increasing the number of constraints in Spine model, I would like to fully understand the important aspects of the refactoring you did.

Is it correct to say that the way you do the looping to generate similar constraints is absolutely crucial? And if so, what can we consider the best way to do it? Could you copy a piece of code that reflects best practice?

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 21, 2019, 14:18

I will try to summarize the important things to be considered - but one thing is for sure: you need to pay more attention to constraint formulation in JuMP than you do in GAMS. With my refactoring and some reformulation, I got problem formulation (without SCUC) down from 3.5 minutes to under 4 seconds. But with SCUC, it's still 18 seconds, which is the best part of 2 hours for an hourly yearly run... and that's before any solve time or results writing.

So, it's really important to pay attention to these things to get reasonable model creation times, but even then, I have the feeling this may still be an issue for large models and we need to consider other approaches.

I looked at deleting just the constraints that change and recreating them, but this is not viable either. See here: https://discourse.julialang.org/t/deleting-containerized-constraints/21971

I have the feeling that for large models we should consider problem modification.

Here, we can modify the constant term or coefficients of constraints that change from iteration to iteration. Having a look at my model, the vast majority of constraints involve just a change to the constant term.

Since we have a function to generate each constraint, I wonder if it's worth also defining a function that updates it?

With this in mind, we could try and write constraints so all the constant terms are on the RHS... and even write a separate function that calculates the RHS?

@manuelma @pvper do you guys think it may be possible to automate something like this? For example - JuMP_all_out will know what parameter values change with time and which don't... can we use this somehow?
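One way the automation being asked about could look, purely as an illustration (none of these names exist in SpineModel or JuMP_all_out): a registry mapping each time-varying parameter to the constraint-update callbacks that depend on it, firing a callback only when the parameter's value actually changed since the previous iteration.

```julia
# Registry of constraint-update callbacks keyed by parameter name.
struct UpdateRegistry
    callbacks::Dict{Symbol,Vector{Function}}   # param -> update callbacks
    last_value::Dict{Symbol,Any}               # param -> value at last refresh
end
UpdateRegistry() = UpdateRegistry(Dict{Symbol,Vector{Function}}(), Dict{Symbol,Any}())

# Register a callback to run whenever `param` changes.
function register!(r::UpdateRegistry, param::Symbol, cb::Function)
    push!(get!(r.callbacks, param, Function[]), cb)
end

# Each iteration: fire callbacks only for parameters whose value changed.
function refresh!(r::UpdateRegistry, values::Dict{Symbol,<:Any})
    updated = Symbol[]
    for (param, value) in values
        if get(r.last_value, param, nothing) != value
            foreach(cb -> cb(value), get(r.callbacks, param, Function[]))
            r.last_value[param] = value
            push!(updated, param)
        end
    end
    updated
end

r = UpdateRegistry()
log = Symbol[]
register!(r, :demand, v -> push!(log, :demand_constraint_updated))
refresh!(r, Dict(:demand => 100.0))   # first iteration: value is new, fires
refresh!(r, Dict(:demand => 100.0))   # unchanged: callback skipped
refresh!(r, Dict(:demand => 120.0))   # changed: fires again
# log == [:demand_constraint_updated, :demand_constraint_updated]
```

In the real thing, the callback body would call the JuMP modification API (e.g. coefficient updates) on the affected constraint references instead of pushing to a log.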

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Mar 21, 2019, 14:59

Ok, problem modification is a new challenge for us.

I guess we can have functions revisit_constraint...(..., iteration=i) that look at the involved parameter values, check if they changed from the previous iteration, and make the proper adjustments via, say, JuMP.set_coefficient. I don't think it's too crazy.

But what I (at least) completely lack is some experience on what's the best way to modify constraints JuMP-side. It looks like JuMP.delete is pretty underwhelming at the moment. So is set_coefficient the way to go?

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 21, 2019, 15:10

set_coefficient will work for coefficients but it seems like it will be the constant term that most often needs to be changed between iterations.

.fix is apparently the current official way to change the constant term, but this involves adding fixed dummy variables to the problem, which is not efficient.

This issue is looking at better ways to do it: https://github.com/JuliaOpt/JuMP.jl/issues/1890

Whatever way we do it, we should do it in a way that doesn't limit what can be time varying - we never want to have to hard code that into the model.

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Mar 21, 2019, 15:12

I think something like revise_constraint_... could work on our end.

spine-o-bot commented 3 years ago

In GitLab by @Poncelet on Mar 21, 2019, 15:39

But with SCUC, it's still 18 seconds which is the best part of 2 hours for an hourly yearly run...

I see two elements here:

  1. Model generation is slow. This is a problem for ANY user.
  2. Model generation is slow, and the model is used in a looping structure.

In the second case, we can indeed try to get it resolved by modifying the problem instead of re-creating.

But for the first case, I have a number of questions:

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 22, 2019, 08:46

@Poncelet

I think that's a fair breakdown of the issues. Regarding 1, I would rather say that in JuMP it is easy to write a model with poor constraint generation performance... and this has been a problem for other users, who have had to make changes to improve it. See for example: https://discourse.julialang.org/t/slow-generation-of-a-large-lp-model-in-jump/3034 and
https://github.com/JuliaOpt/JuMP.jl/issues/969

The good news for Spine is that thanks to all the talented coders on the project, many of the performance enhancing measures are already implemented. For example:

When I implemented Epiphron, I was a bit lazy with the structure so I had more to do.

From what I've learned so far, these are some of the things that could be done to further improve performance for SpineModel:

I will try and get the equivalent timing for generation of the model in GAMS so we can compare.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 22, 2019, 08:51

I've just come across ParameterJuMP.jl: https://github.com/JuliaStochOpt/ParameterJuMP.jl

This looks really interesting - I think it's doing what we want to do... but I think we could even go a step further and somehow integrate JuMP_all_out parameters with this JuMP extension... or even develop our own!!!

Thoughts @manuelma @pvper ?

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 22, 2019, 14:49

So... the same model takes 3 to 4 seconds to generate in GAMS and around 23 seconds in JuMP, and that's after quite a bit of tuning...

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Mar 22, 2019, 15:01

GAMS has been tuning for years; we have tuned for a couple of weeks. I think we are only beginning. The JuMP community is very active at the moment, and they are solving many issues themselves.

Also, some of the tuning we have done manually so far can be integrated into @butcher, so that future users don't need to worry?

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 22, 2019, 15:04

I'm thinking - could we extend @constraint and work it into JuMP_all_out so that it only generates constraints when the coefficients are non-zero? That way, JuMP_all_out will always know which parameters change with time, so it will also know which constraints need to be modified.

@manuelma you could look at ParameterJuMP.jl for some ideas?

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 22, 2019, 15:11

... so basically combining the functions of JuMP_all_out and ParameterJuMP.jl with constraint generation optimisation... that would be quite an attractive package!

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Mar 22, 2019, 15:49

could we extend @constraint and work it into JuMP_all_out so that it only generates constraints when the coefficients are non-zero?

The @constraint macro already supports skipping a combination of indexes through its second argument [i in indexes(); coefficients(i) != 0]

Can't we just use that, or I'm missing something important?

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 22, 2019, 16:01

But that might be different each iteration, and that's why we currently just regenerate all the constraints... but what if JuMP_all_out only regenerated the constraints which change from iteration to iteration...

Since deleting constraints appears quite expensive, we would probably have to generate the full set of constraints and modify them each iteration using functionality that is essentially a combination of JuMP_all_out and ParameterJuMP.jl

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Mar 22, 2019, 16:49

Look at this (Parametron)... exactly what we want... but what if we could combine this somehow with JuMP_all_out... if we could do that, I think we're onto something very powerful...

https://discourse.julialang.org/t/deleting-containerized-constraints/21971/13?u=di11on

Edit: here's the repo in question... https://github.com/tkoolen/Parametron.jl

Looking at the README, it is exactly what we want, and I'm pretty sure it could be combined with JuMP_all_out so that parameters are automatically updated in time in each iteration.

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Mar 22, 2019, 17:32

Parametron.jl is a very good idea - if only it were an extension to JuMP instead of an entire rewrite of the interface with MOI. Anyway, the code is there; perhaps we can port it into JuMP.

spine-o-bot commented 3 years ago

In GitLab by @Poncelet on Apr 2, 2019, 10:57

Parametron.jl is a very good idea - if only it were an extension to JuMP instead of an entire rewrite of the interface with MOI.

This indeed seems a bit risky to me...

Regarding ParameterJuMP: this seems to be nice. However, some thoughts:

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Apr 2, 2019, 11:13

I believe we should also support problem modification for parameters which appear in constraints as coefficients. There seems to be no support for this anywhere?

Parametron allows it, and of course MathOptInterface. Even JuMP provides some interface - check out set_coefficient. I feel the preferred option would be to keep using JuMP and then interact directly with MOI for problem modification.

So the interface is there, we just need to figure out how to use it.

On top of that, we need to come up with a process to check if a constraint needs to be modified, that is cheaper than generating the whole constraint anew. @DillonJ has some ideas about that I want to try. Of course, all this depends on our ability to reuse the same variables in different solves.

How stable/well maintained will the package be?

I believe we can trust MOI and JuMP, they have a community. AFAIK, ParameterJuMP is a couple of guys and Parametron is one genius guy doing his PhD. But we can always steal some ideas from them, even if we don't use those packages. What others think?

spine-o-bot commented 3 years ago

In GitLab by @Poncelet on Apr 2, 2019, 14:53

I believe we can trust MOI and JuMP, they have a community. AFAIK, ParameterJuMP is a couple of guys and Parametron is one genius guy doing his PhD. But we can always steal some ideas from them, even if we don't use those packages. What others think?

Fully agree that we can trust MOI and JuMP. Given that JuMP might evolve, I'm not sure how much we should invest now in other packages like ParameterJuMP and Parametron. My feeling is that it would be preferable for JuMP to integrate these packages or develop similar functionality, rather than for us to do it...

spine-o-bot commented 3 years ago

In GitLab by @Poncelet on Apr 2, 2019, 14:54

Of course, all this depends on our ability to reuse the same variables in different solves.

I understand. I am having a look at this at the moment. Maybe we could exchange ideas on that tomorrow afternoon?

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 2, 2019, 15:41

My idea is basically this - for a simple implementation:

That's loosely the idea

edit:

http://www.juliaopt.org/JuMP.jl/v0.19.0/constraints/

Constructing constraints without adding them to the model - for advanced use cases:

JuMP.@build_constraint — Macro. @build_constraint(constraint_expr): constructs a ScalarConstraint or VectorConstraint using the same machinery as @constraint, but without adding the constraint to a model.

So the idea is to use this @build_constraint macro to build all the time-invariant constraints, and then, each time we create a new model in the loop, just add these pre-built constraints. For the time-variant constraints, we regenerate them, add them, and do the same - i.e. build them for use in future iterations and then add them to the model.
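A minimal sketch of the build-then-attach pattern from the docs quoted above (JuMP v0.19+ API; the constraint below is a toy example, not a SpineModel constraint):

```julia
using JuMP

model = Model()
@variable(model, x >= 0)

# Build the constraint object without attaching it to any model.
prebuilt = @build_constraint(2x <= 10)

# Later (e.g. on the next loop step), attach the pre-built constraint.
cref = add_constraint(model, prebuilt)
```

One caveat worth keeping in mind: JuMP variables are owned by a specific model, so a pre-built constraint can only be added to the model that owns its variables - re-using pre-built constraints across freshly created models would also require re-using (or re-mapping) the variables themselves.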

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Apr 2, 2019, 15:47

JuMP.@build_constraint should help.

JuMP_all_out knows the functions which call it, so we can make a mapping between constraint generation functions and JuMP_all_out functions/parameters

This is the idea. Something to figure out is where does this mapping live.

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Apr 2, 2019, 17:18

Maybe we could exchange ideas on that tomorrow afternoon?

Do you mean a call? I'm available...

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 3, 2019, 06:57

@DillonJ's proposal above seems like it would achieve the things we'd like to have. If I understand correctly, it would avoid copying variable values, and that would be great.

Just a clarification: what do you mean by time invariant and time variant? Almost everything varies over time - maybe time invariant here means those values that will be realized and don't change with new forecasts?

I've been thinking that the realization is something that we will need to keep, and where we would benefit from being able to roll while keeping the variables with their original time indexes. So that could be generated for the whole model period at the start (the 'full' problem template).

On the other hand, the forecast branches (not realized) are something we don't necessarily need to keep - and even cannot keep, since they would take too much memory and disk space (sometimes you want to see them for debugging, but typically not for more than a couple of solves). So, could those use a time index that gets overwritten by new values? It would be kind of rolling, but going circular. I'm sure nobody understands this...

The point is to keep the variables in place, pointing to the same absolute time, but when a time step recedes into history, it can be re-used as a new time step (in order to avoid constraint creation).
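The circular idea above might be sketched like this (illustrative only): map each absolute time step onto a fixed pool of slots, so a new step overwrites the slot of the step that just receded into history, and no variables or constraints are ever created or deleted.

```julia
# Map an absolute (1-based) time step onto a fixed pool of `horizon` slots.
slot(abs_time, horizon) = mod(abs_time - 1, horizon) + 1

horizon = 48
slot(1, horizon)    # == 1:  hour 1 lives in slot 1
slot(48, horizon)   # == 48: hour 48 lives in slot 48
slot(49, horizon)   # == 1:  hour 49 reuses slot 1 once hour 1 has receded into history
```

Variables and constraints would be indexed by slot, while a separate slot-to-absolute-time mapping tracks what each slot currently represents; rolling the horizon then only means overwriting parameter values in the reused slots.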

spine-o-bot commented 3 years ago

In GitLab by @Poncelet on Apr 3, 2019, 07:07

If I understand correctly it would avoid copying variable values

I don't understand how we could avoid copying variable values if we plan to reuse that variable.

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 3, 2019, 07:29

Maybe I didn't understand Jody right.

I thought he was proposing that we generate a full model template. Then we can copy that template every time we run the model (as long as the session is live). As a downside, this is probably not helpful for runs parallelized across multiple computers.

There is potential for very nice performance gain on problems that are run multiple times with minor changes. However, I don't know how much that awkward RHS modification will hinder this.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 3, 2019, 07:51

To clarify a couple of points...

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Apr 3, 2019, 11:07

  • For constraints that need to be regenerated, we @build_constraint a new version of it.

That's the spirit; however, I'd still like to consider modifying rather than building here. Maybe we can take a constraint that's already built, copy it, modify the coefficients that need to change, and add it to the model? That might be faster than generating a new one?

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 3, 2019, 11:46

Yeah, if we can copy constraints, can't we just populate most of the model by copying (and keep the full timeline of variables)?

We should re-use constraints if there is no good way around it, but I'm just poking you to try to find a way. In principle it just sounds stupid that it wouldn't be possible to have an efficient implementation where things stay in their place. As I indicated earlier, being able to re-run on the same variables/constraints with minor changes can be valuable (especially on the solver side).

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 3, 2019, 12:06

however I'd still like to consider modifying rather than building here.

If you can find an easy way to do it, that would be great, but I think this is way more complicated than it sounds. My feeling is that only a small subset of the parameters will actually be time varying, so we would get 80% or more of the benefit for 20% of the effort by just regenerating the time-variant constraints (and only when they actually vary).

To modify the constraints, we would need to work out the new aggregated coefficients on every variable and aggregate all the constant terms... to me this seems very heavy. You would need to evaluate which JuMP_all_out parameters have changed and by how much, then work out which variables' coefficients have changed and how the parameters impact the RHS.

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Apr 3, 2019, 17:10

Ok, so you mean: if the constraint only involves parameters whose values don't depend on t, we call it a time-invariant constraint, build it once at the beginning, and then attach it to successive models as we loop.

If the constraint involves at least one parameter whose value depends on t, we build it at every iteration. (Even if for some particular t, the value didn't change with respect to previous iteration.)

Is that good for a first approach?

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 3, 2019, 17:58

(Even if for some particular t, the value didn't change with respect to the previous iteration.)

No - this is what I meant by simulating the loop structure with JuMP_all_out... for each parameter we figure out whether it changes, and in which loops it changes... only then do we rebuild the constraint.

spine-o-bot commented 3 years ago

In GitLab by @Poncelet on Apr 9, 2019, 07:59

To summarize based on the call last Friday, the currently open discussion items regarding looping are:

spine-o-bot commented 3 years ago

In GitLab by @mihlema on Apr 23, 2020, 10:11

This issue seems to be resolved

spine-o-bot commented 3 years ago

In GitLab by @mihlema on Apr 23, 2020, 10:11

closed