spine-tools / SpineOpt.jl

A highly adaptable modelling framework for multi-energy systems
https://www.tools-for-energy-system-modelling.org/
GNU Lesser General Public License v3.0
53 stars 13 forks source link

Data structure for connections is inefficient for simple cases #161

Closed spine-o-bot closed 1 year ago

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 10, 2020, 06:15

Having been working with a load flow model with c. 550 connections, I think we can do a lot better.

To define a simple connection with a capacity I need to define the following:

connectionfrom_node(forward) connection__to_node(forward) connectionfrom_node(reverse) connection__to_node(reverse) connectionnodenode(forward) connectionnodenode(reverse) connection_capacity(forward) connection_capacity(reverse) fix_ratio_out_in(forward) fix_ratio_out_in(reverse)

image

image

image

This is not ideal for many reasons:

For a simple connection with a capacity, the following information is all that is necessary to specify:

It would be much nicer to see just this :

image

Of course we want to retain the flexibility to define many different types of connections for different commodities and that is why we did it the way way we did, so I propose we retain the data structure that the model sees and perhaps do some pre-processing of the data, but there are other ways to do it and I am open to suggestion.

The following is a proposal for how we might make it easier for a user in the simple case.

The minimum data required for a connection are:

This is how we could pre-process the data and infer the data and relationships necessary for the model

You can override any of this by defining the individual data items explicitly - the idea is to make the simple case more straight forward for the user.

Thoughts?

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 10, 2020, 06:16

changed the description

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 10, 2020, 06:19

changed the description

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 10, 2020, 06:21

changed the description

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 15, 2020, 07:27

I think at a minimum we should keep the existing structure and use preprocess_data_structure.jl to allow for simple bi-directional connections with just a to and from node and a capacity, and/or a capacity in each direction. We can then add the missing parameters and relationships using preprocess_data_structure.jl. This still allows us to implement more exotic cases when the need arises.

However, I was thinking further about his and several things have occurred to me :

It would clean up the structure a lot given that I believe we can do with two relationships what we are currently doing with six (2 x connectionfrom_node, 2 x connection__to_node, 2 x connectionnode__node).

Thoughts @mihlema @Tasqu @jkiviluo @manuelma @nnhjy ?

spine-o-bot commented 3 years ago

In GitLab by @Tasqu on Apr 15, 2020, 11:07

I'm not sure how your shortcut distinguishes between a directional and a bi-directional connection. If I want to create a single restricted directional connection, I currently make a single connection__from_node and a single connection__to_node, and give it a connection_capacity, but in your example this would automatically create a bi-directional connection instead? Would this be controlled through a method parameter or something?

Personally, I'm still not a fan of having both units and connections so similar from the model's point of view. If it were completely up to me (and there had to be a connection type entitiy in the first place), I would have it be a direct transfer variable between two nodes, e.g. v_trans(connection, from_node, to_node). This way, directional connections would be defined using a single connection__from_node__to_node relationship, and bi-directional connections using two in the opposite directions.

I realize that the above would again be a breaking change to the core structure, and only included it here as "wishful thinking".

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 15, 2020, 12:01

@Tasqu

I'm not sure how your shortcut distinguishes between a directional and a bi-directional connection.

I wasn't really suggesting a complete proposal, just raising the issue for discussion along with some ideas.

If it were completely up to me I would have connectionfrom_node and connection__to_node and a parameter (perhaps "directionality"). This way you avoid redundant information and avoids the situation where the reverse connectionnode__node could be defined insonsistently.

But I do agree with the principle of having a single flow variable, or at most two rather than the 4 we have now. In the vast majority of cases, we just have a single flow. We could still account for losses with a single free variable (or two positive variables) with terms in the node balance equation.

But perhaps there are generic Energy Systems cases that we're not aware of driving the current structure?

All that said, I think it's not ideal that I have to define 6 relationships. I would take two of any kind over that!

spine-o-bot commented 3 years ago

In GitLab by @Tasqu on Apr 15, 2020, 12:23

But perhaps there are generic Energy Systems cases that we're not aware of driving the current structure?

Could be, but I'm still not a huge fan.

In any case, I agree with the sentiment of implementing shortcuts for these types of common simple structures. However, I feel we should make such shortcuts really explicit for the user, e.g. have a dedicated parameter (like bi-directional_simple_connection_capacity or something) that triggers the aforementioned pre-processing. Mainty to try and avoid any potential confusion when any shortcuts are being applied.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 15, 2020, 13:12

Me neither. And by the way, an aspect of what I'm proposing, or at least an option, is using preprocess_data_structure.jl to create some of the redundant relationships that are inferred from the base ones (however you define them), based on parameter values (could be method parameters).

spine-o-bot commented 3 years ago

In GitLab by @mihlema on Apr 15, 2020, 13:54

connection__from_node(forward) connection__to_node(forward) connection__from_node(reverse) connection__to_node(reverse) connection__node__node(forward) connection__node__node(reverse) connection_capacity(forward) connection_capacity(reverse) fix_ratio_out_in(forward) fix_ratio_out_in(reverse) I don't think you need to define the capacity for your case twice, as you already have the fix_.. taking care of that but that's just a side note.

Regarding inconsistent relationships: Some sanity check would be nice, or as proposed an automated creation of the reverse flow, if exactly the parameters for both directions apply. A simple connection__node__node does at first glance seem nice. I think it could be done with pre-processing.

It seems to go back to an early issue #4 . I do agree that it is cumbersome at the moment to create these, but we do have a sort of stable version of SpineModel at the moment which is (more or less) working. Thus, I'd rather not open a discussion again on how to define connections at this stage. Using methods more consistently is a good way forward. I think we need to have a wiki or the like to align on naming etc.. Even though I'm very happy about the pre-processing, I am a bit worried that we are overloading it at this stage of the model development and we end up with quite a few patches.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 15, 2020, 14:41

@mihlema We could do it in a backward compatible way. If we introduce a new method parameter (connection_type for example). Then connection_type=connection_type_simple_bidirectional (or whatever) that would trigger some code in preprocess_data_structure.jl that would add the additional expected relationships.

So from the model point of view, it wouldn't be seeing anything different and databases with connections using the explicit relationships would continue to work

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 15, 2020, 14:44

changed title from Data structure for connections is in{- -}efficient for simple cases to Data structure for connections is inefficient for simple cases

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 16, 2020, 08:39

A more simple graph view would be this: image

I guess that's my part of wishful thinking, since I feel that the 'connection' entity can be expressed by the 'node__node' entity and therefore redundant.

That of course wouldn't address the problem of multiple variables required for implementing losses in a two-way connection where only one direction can be active at a time - which I believe is typical for energy transfers. Using method would be the best way in my view. You have only one connection object where the method triggers necessary variables (or maybe it's the connection__node__node where the method resides).

And if you can do it in backward manner, then that's probably good - however, we should also carrying lot of baggage from the past - it can become heavy to maintain. In that sense, I would probably prefer a migration script to update old data into new format.

spine-o-bot commented 3 years ago

In GitLab by @Tasqu on Apr 16, 2020, 08:50

Well, sometimes one might have need to define multiple connections between the same two nodes with different parameters. If I remember correctly, that was an issue with Backbone at one point.

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 16, 2020, 08:51

Yes, but in Spine Toolbox you can have multiple node__node entities with different names.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 16, 2020, 09:40

Actually, we assume for a single relationship class, you can't define the same relationship twice, even with different names - uniqueness of the members is assumed.

spine-o-bot commented 3 years ago

In GitLab by @manuelma on Apr 16, 2020, 10:33

@DillonJ is right, we can't have two node__node relationships between the same two nodes at the moment. If we release this restriction we need to rethink at least the tree view and the parameter convenience function syntax.

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 16, 2020, 11:15

Right, of course. I suppose there is no appetite to fix those issues right now. The downside is that we need more entities when we don't allow multiple relationships between two entities. Of course it wouldn't be trivial to handle the convenience function syntax issue. Although, I wouldn't be 100% sure it needs to be resolved. Is there any case where you would need just one but not the other? In connection specific constraints both connections would get their own set of constraints and in a node balance flows from both connections would be summed anyway. Spine Toolbox could of course block one connection when needed. E.g. two scenarios: one with AC line and another with DC line. If these were parallel lines (and thus not alternatives), Spine Model could still distinguish between them since they would have different methods.

spine-o-bot commented 3 years ago

In GitLab by @nnhjy on Apr 16, 2020, 15:17

Juha's wishful thinking implies that the connection we want is exactly a relationship between two nodes rather than an object. But since now it seems impractical to loosen the uniqueness of relationship, we should probably stay with the connection being an object. In that sense, Jody's proposal of preprocessing with some method parameter sounds a nice starting point.

Another thing I'm not clear about the present connection structure is whether we assume that one connection links two nodes only. A connection object can technically link to multiple nodes. What would SpineModel do in that case? A connection linking to multiple nodes may be useful in modelling gas/heat networks.

If a connection is exclusively for two nodes, then we might simply use connection__node__node with a directional parameter (consisting of values bidirectional, node1_to_node2 and node2_to_node1) to contain all needed parameters. Directions are linked to the directional parameter value in the pre-processing phase.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 20, 2020, 12:22

Ok, so let's see... basically we have three ways (and variants thereof) to do this:

My preferences is connection__from_node + connection__to_node mainly because :

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 22, 2020, 11:30

It seems there is going be a compromise one way or another.

I guess if using connection__from_node, then the connection object would hold any parameters that are not directional and connection__from_node and connection__to_node would hold the directional data. This might not always be clear to the user. However, if our GUI is intuitive in the way it shows such data, then most of the potential confusion should be possible to avoid. But this might require additional information on the class level - when user chooses a bunch of connections, the UI should show data for the connections and for relevant child objects in intuitive order - also in the tabular view. Connection can also be part of a group or it can be an alternative to other definition of the same connection (e.g. alternative investment choices for a transmission line). The UI should be able to handle these cases without mixing things up for the user.

As a result, I think it might be easier if something like a transmission line would have one entity that holds all the data. For example, a transmission line has 1000 MW in one direction and 1200 MW in another direction (NTC, technical issues). But it can also be expressed with physical data (impedance, etc.) and in this latter case the model would optimize how much flow it allows in each direction. If there is just one entity, then all the data would be automatically shown when that entity is chosen in any of the views. Of course this means that there needs to be more parameters with names like capacity_left_to_right or capacity_bidirectional. But many parameters are never direction dependent anyway.

Another matter is what the Spine Model sees. In there, I would think it's best if there is only one parameter that defines NTC capacity in each direction (often the same number). This would keep the model code more simple, but then data needs to be translated from the specific format to the model format. I suppose that's the way it currently is, using direction dimension to establish a constraint for both directions.

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 22, 2020, 11:30

I find from_node and to_node confusing. My brain refuses to understand that there could be flows going into the from_node. I would rather see something like node_left and node_right.

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 22, 2020, 11:32

Even if we would have node__node relationship as the connection entity, the class could still be named connection. Our GUI still holds a very clear distinction between objects and relationships, but I feel it should be more versatile for the user. You should be able to mix them when that makes sense and see them separate when that makes sense.

Actually, I feel that the bigger distinction we might need is between data holding entitites (like unit, node, connection) and relationships that establish model relevant functionality (e.g. saying that these units belong to this unitgroup).

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 22, 2020, 13:20

I strongly prefer the consistency around having energy system entities at object level and being able to see an object structure that represents an energy system.

Think of investments for example. Units and connections are physical entities that you would invest in and which would have an investment variable associated with them.

Relationships for me represent information about the structure of the energy system (as opposed to it's topology).

I strongly feel we should see connections at object level.

My strong preference is connection__to/from_node as I feel it is the best from a utility point of view. If I had to choose between the connection_node_node model or the node_nodemodel, I would chose connection_node_node. If you liked, you could call the connection__node__node relationship class, node__node :-)

spine-o-bot commented 3 years ago

In GitLab by @mihlema on Apr 22, 2020, 14:15

Thanks for the overview @Jody. I have some clarification questions:

connection__from_node + connection__to_node

  • pros
    • connections are distinct objects, with visibility at the same level as other physical objects like units and nodes etc. (consistency)
    • all parameters can be defined on a per direction basis
    • no risk of inconsistent data
  • cons
    • need at least two relationships This would mean that 1 connection__from_node + 1 connection__to_node would be used for one but also bi-directional transmissions? I agree with @Juha comment that the connection__to_node with an associated parameter directional_loss (or which ever name) might give the impression of a onedirectional flow. – but maybe that is a question of naming. In the case above, a connection can only transfer between 2 nodes as I understand? I think we would need a connection__node__node relationship, too, to allow multiple commodities travel across the same transmission line. Consider for instance an Electricity line for which both Electricity and Reserves are allowed. I think Current versus Alternative No.3:
Current Alternative No. 3
objects: connection Conn Conn_El
Conn_FRR
Connection_Group
nodes BE_El
BE_FRR
FR_El
FR_FRR
Node_Group_BE
Node_Group_FR
BE_El
BE_FRR
FR_El
FR_FRR
Node_Group_BE
Node_Group_FR
relationships: connection__from_node ConnBE_El
Conn__ BE_FRR
Conn
FR_El
Conn__ FR_FRR
Conn__BE_El
ConnBE_FRR
Connection_group
Node_Group_BE
connection__to_node ConnBE_El
Conn__ BE_FRR
Conn
FR_El
Conn__ FR_FRR
Conn__ Node_Group_BE
Conn__FR_El
Conn FR_FRRE
Connection_group
Node_Group_FR
connectionnodenode ConnBE_El__FR_El
Conn
BE_FRRFR_FRR
Conn__ FR_El
BE_El
Conn__FR_FRR__BE_FRR
node__group_node Node_Group_BEBE_El
Node_Group_BE__BE_FRR
Node_Group_FR
FR_El
Node_Group_FR__FR_FRR
Node_Group_BEBE_El
Node_Group_BE__BE_FRR
Node_Group_FR
FR_El
Node_Group_FR__FR_FRR
connection_group__connection Connection_groupConn_FRR
Connection_group
Conn_El

While original suffers from huge amount of connection__to/from, we can use the same connection for multiple flows - I guess the question is where we want to aggragate the flows. Personally I prefer having one connection being connected to multiple nodes.

From the above options, my preference goes for connection__node__node as it is more readable for the user than the current implementation and at the same time only have one loss parameter / fix_ratio (not left_node_loss & right_node_loss). This also allows us to have one connection connecting multiple nodes easily.

spine-o-bot commented 3 years ago

In GitLab by @jkiviluo on Apr 23, 2020, 10:05

I strongly object to a world where entities are discriminated on basis of color or on basis of dimensionality. It should be about function - not if something has one dimension or two dimensions. Naturally, there needs to be a way to see what one needs to see at a given time. For this, we could have a tree where one selects to display 'physical entities' or to display 'model structure' - or to display 'scenario structure' or whatever users and developers will find useful in the future.

However, I did some experimenting with a data store and while a physical node__node connection could work fine, it would be more difficult with units, because the dimensionality can change from unit to unit (multiple inputs/outputs). And it would not be consistent to treat units differently than connections. Hence, it might be best to have a user-facing one-dimensional object whenever the physical entity has data spread across multiple relationships.

The GUI will need improvements though, but the good thing is that they don't need to be there right away. When I select couple of multi-dimensional entities, I'd like to see parameters in good order within one view. This has been discussed before in relation to tabular view (capability to show data in the same table even when some data is for e.g. unit__output_node and some data for input_node__unit, but it should also work in the list view.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 23, 2020, 10:26

This also allows us to have one connection connecting multiple nodes easily

A question here is if we want to allow this... if you have more than two nodes related to a connection, what does it mean? Now you need some sort of balance constraint as the fix ratio constraints only relate two nodes at a time. If you need balance, then it is more like a node and node should be used instead in this case.

I personally think we need node__node to allow what you are effectively trying to do with a connection connected to more than two nodes.

If we had node__node (and associated flow variable), we can make logical connections between nodes where it is an unconstrained logical connection. This gives possibilities to make your system simpler. We already have node__node to handle diffusion but there is no explicit variable... so that is all that is really needed, and it would be nice to see the flow between nodes explicitly, even in the case of diffusion.

In summary:

You can use the combination of these two things to handle complex situations where you effectively want a physically constrained connection connecting more than two nodes.

spine-o-bot commented 3 years ago

In GitLab by @DillonJ on Apr 23, 2020, 10:38

Also... @mihlema

the question is where we want to aggragate the flows

The philosophy behind the "model adjustments" was to rationalise the structure to be a little more coherent based on the idea that the node is the universal aggregator. I think in terms of a commodity networks this makes a lot of sense and we should stick with this idea - that balance always happens in nodes. We could think of our system like this

And this, for me, is why it is nice to have these main system elements appear at the object class level

clizbe commented 1 year ago

Closing due to stale status.

clizbe commented 1 year ago

Closing because stale.