GMLC-TDC / HELICS

Hierarchical Engine for Large-scale Infrastructure Co-Simulation (HELICS)
https://docs.helics.org/en/latest/
BSD 3-Clause "New" or "Revised" License

convergence mini federate (app?) #451

Open phlptp opened 6 years ago

phlptp commented 6 years ago

This issue is to capture a discussion on the potential nature of a helper federate for dealing with some convergence issues that are not fully dealt with by the local iteration control.

The idea would be that federates link into the convergence federate with some domain level API. This will make use of the underlying HELICS mechanics to manipulate synchronization on a multi-federate level.

The real question is which situations this should target (a single helper is unlikely to deal with all possible circumstances), and what mathematical basis should be used for the convergence.

The target is an early version that could be included in the 2.1 release probably in November/early December.

trevorhardy commented 6 years ago

Is there a reason this should be a separate federate (I assume similar to player and tracer) and not integrated into HELICS itself? I didn't realize we had made this decision already and don't know the reasoning behind it.

phlptp commented 6 years ago

I don't know how you would integrate convergence directly without forcing a lot of assumptions on different federates. Convergence is a mathematical issue, which is separate from iteration, which is supported at the lower levels. The core of HELICS doesn't deal with anything about the nature of the data being passed back and forth, so the core can know binary equivalence and timing; that is it. The application API knows some things about the data itself, but only on a local level and for an individual interface. Convergence is a mathematical issue connecting multiple federates, which cannot be dealt with at levels 0 (comm/OS level), 1 (core), or 2 (application API) without breaking many of the rules and principles behind HELICS development, so it needs to be at level 3 (apps/tools), 4 (domain APIs), or 5 (global management). I don't think sufficient information is available at the lower levels to deal with convergence, so I think it must by definition be at a higher level than the application API.

There is a significant distinction between convergence (mathematical) and iteration. Convergence has different definitions in different domains. Iteration can be defined precisely and determined locally and in a distributed fashion (interface update or not).

trevorhardy commented 6 years ago

Not knowing the architecture of your APIs, why couldn't the responsibility of convergence be made a function of the broker (possibly the root broker)? It has access to all the data and part of its configuration could be to define who is converging and when convergence is reached. It is already controlling the time and messages. The challenge of defining convergence and communicating that in a general way to the federation would not be dramatically different than for a separate convergence federate.

I say this not because I necessarily think this is what we should do, but because I don't understand why it isn't an option.

abhyshr commented 6 years ago

"There is a significant distinction between convergence (mathematical) and iteration. Convergence has different definitions in different domains. Iteration can be defined precisely and determined locally and in a distributed fashion (interface update or not)."

I still have not understood the meaning of an "iteration" as it is implemented in HELICS. Considering that we are doing a co-simulation that solves a system of coupled equations spread over multiple federates, the definition of "iteration" would be something defined for the federation, not the federate itself. As the number of iterations increases, the solution for the federation may (or may not) converge to the expected value. So, in that sense, and consistent with the iterative solution of equations, all federates should "converge" to a solution at the same iteration (assuming the solution is convergent). This is the global convergence for the federation.

Now, as Trevor suggests, one possibility (without knowing the intricacies of the HELICS core implementation) would be for the TimeRequestIterative() call to take in the local convergence of each federate, for the broker to do an AND operation to get the global convergence, and then to return the global convergence to each federate.
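The AND reduction described here can be sketched in a few lines of plain Python. This is only an illustration of the idea, not HELICS code: `global_convergence` and `broadcast` are hypothetical names, and nothing below reflects how the broker or `TimeRequestIterative()` is actually implemented.

```python
from typing import Dict, Mapping


def global_convergence(local_flags: Mapping[str, bool]) -> bool:
    """Logical AND over the local convergence flag of every federate."""
    return all(local_flags.values())


def broadcast(local_flags: Mapping[str, bool]) -> Dict[str, bool]:
    """Return the same global result to every federate that reported a flag."""
    result = global_convergence(local_flags)
    return {name: result for name in local_flags}


# Hypothetical federation: A has converged locally, B has not.
print(broadcast({"A": True, "B": False}))
```

Under this scheme a single unconverged federate keeps the whole federation iterating, which is the behaviour the rest of the thread debates.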

For the mathematical basis for convergence, my first thought is that each federate provides a scalar (or vector) real function and the convergence coordinator does a norm operation (the type of norm would be set by the user).

On a separate note, I've also seen a convergence federate being used to pass in the function and Jacobian for solving a nonlinear system using Newton's method. @phlptp: do you envision supporting this as well?
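A minimal sketch of the norm-based check, assuming each federate reports its residual as a list of floats (a scalar being a one-element list). The function names, the `kind` parameter, and the tolerance test are illustrative assumptions, not part of any HELICS API.

```python
import math
from typing import Iterable, Sequence


def combined_norm(residuals: Iterable[Sequence[float]], kind: str = "l2") -> float:
    """Norm of all federate residuals stacked into one vector."""
    flat = [abs(x) for r in residuals for x in r]
    if kind == "l1":
        return sum(flat)
    if kind == "l2":
        return math.sqrt(sum(x * x for x in flat))
    if kind == "inf":
        return max(flat, default=0.0)
    raise ValueError(f"unknown norm kind: {kind!r}")


def is_converged(residuals: Iterable[Sequence[float]], tol: float, kind: str = "l2") -> bool:
    """Declare global convergence when the chosen norm is within tolerance."""
    return combined_norm(residuals, kind) <= tol


# Two hypothetical federates: one scalar residual, one vector residual.
print(is_converged([[1e-9], [2e-9, -1e-9]], tol=1e-6))
```

The choice of norm matters: the infinity norm bounds the worst single residual, while the L1 and L2 norms grow with the number of coupled quantities.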

phlptp commented 6 years ago

In HELICS an iteration is simply a repeated timestep for a single federate. This is determined locally for every federate independently. When a federate requests a time grant it makes a decision: it can select no iteration, guaranteed iteration, or conditional iteration. For a conditional request, the sole condition for determining whether to iterate is whether updates for any of the federate's interfaces were received at the current time. If not, no iteration; if so, iterate.

The notion of iterations must be applicable to every possible federate: communication systems, discrete events, strings, random data, and every possible combination of them. The notion of co-simulation dealing with solutions to algebraic and differential equations is a subset. It is a very important subset that is key to the nature of the problems HELICS was targeted at, but it is still a subset.
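The local decision rule described above can be written down as a small sketch. The enum and function names are hypothetical stand-ins chosen for this illustration (only `iterate_if_needed` appears elsewhere in this thread), and the code models the rule, not the HELICS implementation.

```python
from enum import Enum


class IterationRequest(Enum):
    """Illustrative stand-ins for the three choices a federate can make."""
    NO_ITERATION = "no_iteration"            # never repeat the timestep
    FORCE_ITERATION = "force_iteration"      # always repeat (guaranteed iteration)
    ITERATE_IF_NEEDED = "iterate_if_needed"  # repeat only if updates arrived


def will_iterate(request: IterationRequest, updates_received: bool) -> bool:
    """Decide locally whether this federate repeats the current timestep."""
    if request is IterationRequest.NO_ITERATION:
        return False
    if request is IterationRequest.FORCE_ITERATION:
        return True
    # Conditional case: the only criterion is whether any of this federate's
    # interfaces received an update at the current time.
    return updates_received


print(will_iterate(IterationRequest.ITERATE_IF_NEEDED, updates_received=True))
```

Note that the rule takes no tolerance or residual as input; any notion of "changed enough" lives in the federate's own decision about whether to publish.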

The notion that all federates should have the same number of iterations is fundamentally false, and would only be true in the limited case of a very tightly coupled system with equivalent tolerances on all interfaces.

phlptp commented 6 years ago

Assuming we are limiting ourselves to federates that deal mainly with algebraic or differential equations, I think a convergence API built on top of the application API might be a reasonable thing. That could take a couple of forms: either a global convergence notion, or a more detailed concept that included some residuals and Jacobian information and embedded some mathematics inside of it. I think this could be built as part of one of the domain APIs we have discussed in the past. But I will strenuously object to any notion of global information as part of the core or application API layers.

abhyshr commented 6 years ago

Thanks for the explanation. Based on your comments, would it then be better to reword "iterate" to "repeat" or some other equivalent term to emphasize that the step is repeated? "Iteration" here, at least to me, is somewhat confusing, as it seems to denote the notion that the federation is approaching some limit (convergence) as the iterations increase. But, based on your explanation, it is not necessarily that.

"The notion that all federates should have the same number of iterations is fundamentally false, and would only be true in the limited case of a very tightly coupled system with equivalent tolerances on all interfaces."

Hmm... Most of the use cases that I've come across, through HELICS and some other infrastructure co-simulation projects, have each federate solving some system of equations and coupled with one or more other federates. For such a solution of a coupled system of equations, convergence is typically achieved when all the residuals are within a certain tolerance at some specific iteration. I have not come across a method that declares convergence at different iteration counts for subsets of the equations.

Perhaps I don't understand the logic of how iterations are implemented in HELICS, so I may have misunderstood your comments. I have a few questions on that:

Let's say two federates A and B are repeating a time step (iterating). If A converges (locally) at a time step and requests no iteration, but B has not:
i) Would A be granted the next time step, or would it be blocked until B has moved to the next step?
ii) Would B stop receiving values from A during the time-step repeats?

"That could take a couple forms, either a global convergence notion or a more detailed concept that included some residuals and Jacobian information and embedded some mathematics inside of it."

I think for the short term, the global convergence notion might be better. If we have the convergence federate act as the main solver, then it may be a bit complicated to support different types of solution methods (nonlinear, linear, optimization, time-stepping, etc.).

phlptp commented 6 years ago

In those scenarios:
i) A would be blocked until B has stopped iterating.
ii) I am not sure what you mean here. The nature of the value interfaces in HELICS is such that when you request a value you get the most recent update from a prior iteration or timestep, so B would not get updates, but it would get the previous value.

I am not sure these questions capture the notion of how iterations work, though. Perhaps I can try a slightly modified scenario.

Let's say two federates A and B are iterating. Convergence is defined as the value changing by less than a threshold. A detects that its output value has not changed by more than the threshold, so it doesn't update its output. Both A and B leave control of the iterations to HELICS. B updates its value for the current timestep, then requests iterate_if_needed. No updates have been received, so it waits. A is granted an iteration because B has updated its value. It either detects that this change is below its input threshold, or it loops again and again detects that its output has not changed sufficiently to merit an update. It again requests iterate_if_needed. This time both A and B are requesting this, but neither has updates, so both are granted the next time step.
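The scenario can be mimicked with a toy, single-process simulation. Everything here is an assumption made for illustration: the two linear models, the threshold, the sequential ordering of the federates, and the `pending` flags that stand in for "an update was received at the current time". It does not use or reproduce the HELICS timing machinery.

```python
def run_time_step(tol=1e-3, max_rounds=100):
    """Iterate two toy federates until neither has an unseen update."""
    # Hypothetical coupled models: A computes x from y, B computes y from x.
    models = {"A": lambda y: 0.5 * y + 1.0, "B": lambda x: 0.5 * x}
    other = {"A": "B", "B": "A"}
    published = {"A": 0.0, "B": 0.0}   # last value each federate published
    pending = {"A": True, "B": True}   # both evaluate once at the start
    iterations = {"A": 0, "B": 0}

    for _ in range(max_rounds):
        if not any(pending.values()):
            break  # no federate has updates: both move to the next time step
        for name in ("A", "B"):
            if not pending[name]:
                continue  # nothing new arrived, so this federate just waits
            pending[name] = False
            iterations[name] += 1
            new_value = models[name](published[other[name]])
            # Publish only when the output moved by more than the threshold.
            if abs(new_value - published[name]) > tol:
                published[name] = new_value
                pending[other[name]] = True
    return published, iterations


values, counts = run_time_step()
print(values, counts)
```

With these particular numbers A runs six times and B five: A's last evaluation changes its output by less than the threshold, so it publishes nothing, B is never woken again, and both proceed. Neither federate needed a global convergence flag to stop.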

This gets more interesting when extended to 3 or more federates. In this scenario two federates could iterate several times without any updates from the third. Or they could ping-pong back and forth with updates, with all three being at very different iteration counts.

Getting specific to power systems, say I have a scenario where a transmission network is coupled to distribution networks in several different locations. For rapid changes this requires iteration and some level of convergence. Let's say there is a change on one of the distribution networks. This will require iteration between that distribution system and the transmission network to find a new value. But the buses around the other distribution networks may not change much, if at all, so it makes very little sense to require those networks, with their possibly expensive computation, to iterate as well or to have the same number of iterations.
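A toy version of this transmission/distribution case shows the uneven iteration counts. The model is invented for illustration: each distribution federate's load is `p0 * v`, the transmission federate computes each bus voltage from a strong self term `z` and a weak cross-coupling `zc`, and an update is only published when a value moves by more than `tol`. None of the names or numbers come from HELICS or a real network.

```python
def settle(p0, v_pub, p_pub, pending, tol=1e-3, z=0.05, zc=0.0005, max_rounds=200):
    """Iterate one time step; return how often each federate ran."""
    names = list(p0)
    counts = {n: 0 for n in names}
    counts["T"] = 0
    for _ in range(max_rounds):
        if not any(pending.values()):
            break  # nobody has unseen updates: the time step is finished
        # Distribution federates that received a new voltage recompute load.
        for n in names:
            if pending[n]:
                pending[n] = False
                counts[n] += 1
                p = p0[n] * v_pub[n]
                if abs(p - p_pub[n]) > tol:
                    p_pub[n] = p
                    pending["T"] = True
        # Transmission federate recomputes bus voltages if any load changed.
        if pending["T"]:
            pending["T"] = False
            counts["T"] += 1
            total = sum(p_pub.values())
            for n in names:
                v = 1.0 - z * p_pub[n] - zc * (total - p_pub[n])
                if abs(v - v_pub[n]) > tol:
                    v_pub[n] = v
                    pending[n] = True
    return counts


def demo():
    p0 = {"D1": 1.0, "D2": 1.0, "D3": 1.0}
    v_pub = {n: 1.0 for n in p0}
    p_pub = {n: 0.0 for n in p0}
    pending = {"D1": True, "D2": True, "D3": True, "T": False}
    settle(p0, v_pub, p_pub, pending)      # reach an initial operating point
    p0["D1"] = 1.5                         # load change on one feeder only
    pending["D1"] = True
    return settle(p0, v_pub, p_pub, pending)


print(demo())
```

After the load change only D1 and the transmission federate iterate (three times each in this toy setup); the voltage shift seen at the D2 and D3 buses stays below the threshold, so those federates are never asked to repeat the step.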