Power migration steps #62 (FREEDM-DGI / FREEDM)

rcakella commented 12 years ago

Currently, P* is set to the current power_level of the physical device +/- 1 in the lbAgent::StepPStar() function. However, this makes some 'power migrations' ineffective when the function is called several times in quick succession. For example, for two successive power migration steps A and B that take place within a relatively short time interval, A sets P* relative to the current gateway value, and B also sets it relative to the current gateway value instead of relative to the previous P*. In the current setup, where it takes some time for the simulation to reach the target P* after a command is issued, this problem delays convergence and needs to be fixed, of course by me.
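A minimal sketch of the effect described above, written in Python for brevity rather than in the project's own code. The function names `step_from_gateway` and `step_from_previous` are hypothetical and only illustrate the two ways of choosing the baseline. It assumes the gateway reading does not change between the two calls because the simulation has not reacted yet.

```python
def step_from_gateway(gateway, delta):
    """Behaviour as described: P* is always the current gateway reading + delta."""
    return gateway + delta


def step_from_previous(p_star, delta):
    """Alternative: P* is stepped relative to the previously commanded P*."""
    return p_star + delta


gateway = 10.0  # assumed stale reading: the simulation has not reacted yet

# Two migration steps A and B issued in quick succession.
a = step_from_gateway(gateway, -1)
b = step_from_gateway(gateway, -1)  # same baseline, so B just repeats A
print(a, b)  # 9.0 9.0

a2 = step_from_previous(gateway, -1)
b2 = step_from_previous(a2, -1)     # builds on A's target
print(a2, b2)  # 9.0 8.0
```

With the stale baseline, step B has no additional effect, which is the ineffective migration described in this comment.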

rcakella commented 12 years ago

The options we have for dealing with this situation are: 1) limit the number of migrations to one per LB cycle, or 2) allow multiple migrations per LB cycle but keep track of the set commands that have been issued.
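The two options can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation that was tested: the class and method names are hypothetical, and option 2 assumes that the device has caught up with all commands by the start of the next LB cycle.

```python
class OneMigrationPerCycle:
    """Option 1: ignore further migration requests until the next LB cycle."""

    def __init__(self):
        self.migrated_this_cycle = False

    def start_cycle(self):
        self.migrated_this_cycle = False

    def request(self, gateway, delta):
        if self.migrated_this_cycle:
            return None  # request dropped for this cycle
        self.migrated_this_cycle = True
        return gateway + delta


class TrackedMigrations:
    """Option 2: remember the net change already commanded this cycle."""

    def __init__(self):
        self.pending = 0.0

    def start_cycle(self):
        # Assumption: the device has reached the last target by now.
        self.pending = 0.0

    def request(self, gateway, delta):
        self.pending += delta
        return gateway + self.pending


opt1, opt2 = OneMigrationPerCycle(), TrackedMigrations()
print(opt1.request(10.0, -1), opt1.request(10.0, -1))  # 9.0 None
print(opt2.request(10.0, -1), opt2.request(10.0, -1))  # 9.0 8.0
```

Option 1 is simpler but caps the migration rate. Option 2 keeps the rate but depends on the bookkeeping matching what the device actually does.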

nickmame commented 12 years ago

If you plan to include this for the live demo, it will need to be committed by today at 2 PM.

rcakella commented 12 years ago

I implemented option 2 above, but the results look scary in the PSCAD simulation, primarily because of stability concerns, as can be seen in the graph here. This could differ for RTDS tests, though. For now, I won't push anything unless we all agree it is essential.

mcatanzaro commented 12 years ago

Is this problem internal to load balancing, or is it a device management problem?

E.g. if the power value is P and you Set it to P' and then immediately Get the value, is it P' as expected or P due to real-time delays? The device should be responsible for returning the state it will have after all received commands have been implemented; you as a caller shouldn't have to keep track of all that. Or - perhaps it needs two different functions for both purposes?

If it's not converging fast enough, wouldn't it make sense to just use a p_migrate value greater than 1 until convergence is imminent? That would allow us to converge faster without running LB more often?
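The idea of a larger p_migrate far from convergence can be sketched like this. It is an idealised model only: `choose_p_migrate`, `cycles_to_converge`, and the step and threshold values are hypothetical, the "imbalance" stands for the distance from the target, and the physical delay discussed in this thread is ignored.

```python
def choose_p_migrate(imbalance, large_step=4.0, small_step=1.0, near=4.0):
    """Use the large step while far from balance, the small step when close."""
    return large_step if abs(imbalance) > near else small_step


def cycles_to_converge(imbalance, chooser):
    """Count LB cycles until the imbalance is below one unit."""
    cycles = 0
    while abs(imbalance) >= 1.0:
        step = chooser(imbalance)
        imbalance -= step if imbalance > 0 else -step
        cycles += 1
    return cycles


fixed = cycles_to_converge(20.0, lambda _: 1.0)
adaptive = cycles_to_converge(20.0, choose_p_migrate)
print(fixed, adaptive)  # 20 8
```

In this toy model the adaptive step needs 8 cycles instead of 20 without running LB more often. Whether the larger step stays stable in PSCAD or RTDS is a separate question that the sketch does not address.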

You might push your solution to a side branch if you want us to look at it. ^^

scj7t4 commented 12 years ago

Do we need to consider that our approach to interacting with PSCAD is wrong? Maybe we should try and find similar systems in the real world and emulate them. For example, I know with robotics, typically when you send commands to move through a certain angle to a motor, the system may block until that command completes. Abstracting that to our system, perhaps then we need to establish a maximum time the system has to react to the command being placed, and then the load balancing system should wait until either it observes its command in effect, or the timeout has expired.

You can of course abstract that further to make the procedure asynchronous using callbacks.
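A blocking version of the wait-or-timeout idea could look roughly like the sketch below. The function `wait_for_command` and its parameters are hypothetical, and `read_value` stands in for whatever call returns the measured device state. The clock and sleep functions are injectable only so the logic can be exercised without real hardware.

```python
import time


def wait_for_command(read_value, target, timeout_s, tolerance=0.01,
                     poll_s=0.01, clock=time.monotonic, sleep=time.sleep):
    """Block until read_value() is within tolerance of target, or time out.

    Returns True if the command was observed in effect, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if abs(read_value() - target) <= tolerance:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Simulated device that reaches the target on the third reading.
readings = iter([10.0, 9.5, 9.0])
reached = wait_for_command(lambda: next(readings), 9.0, timeout_s=1.0, poll_s=0.0)

# Simulated device that never responds.
timed_out = wait_for_command(lambda: 10.0, 9.0, timeout_s=0.05)
print(reached, timed_out)  # True False
```

An asynchronous variant would register a callback to be invoked with the same True/False outcome instead of blocking the caller.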

This kind of thing does make me a little nervous about the LB trying to generate too many contracts in large systems.

nickmame commented 12 years ago

The device should not be responsible for tracking its expected value. A call to GetSignal on a device reads the physical state of some piece of hardware. The value it returns should not be a cyber value that doesn't reflect the physical state. If we attempt to compute the expected value, we'll end up implementing a power simulation inside of DGI.

We already have a command table that keeps track of the last value from load balance. If we want to stack migrations, we should just increment / decrement values in the command table instead of setting them. Then when the command gets sent to the simulation, all of the changes made by load balance are reflected.
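The difference between setting and incrementing values in a command table can be sketched as follows. This is not the DGI command table: the class `CommandTable`, its methods, and the key `"gateway"` are hypothetical stand-ins used only to show how stacked migrations accumulate before the command is sent.

```python
class CommandTable:
    """Holds the latest commanded value per signal until it is sent."""

    def __init__(self):
        self._values = {}

    def set(self, key, value):
        """Overwrite the pending command (a second call replaces the first)."""
        self._values[key] = value

    def adjust(self, key, delta, baseline):
        """Stack a migration on top of the pending command.

        baseline is used only when no command is pending for key.
        """
        self._values[key] = self._values.get(key, baseline) + delta

    def pending(self):
        """Return a copy of the commands that would be sent."""
        return dict(self._values)


measured = 10.0  # assumed stale reading from the device

overwrite = CommandTable()
overwrite.set("gateway", measured - 1)
overwrite.set("gateway", measured - 1)   # second migration is lost

stacked = CommandTable()
stacked.adjust("gateway", -1, baseline=measured)
stacked.adjust("gateway", -1, baseline=measured)

print(overwrite.pending(), stacked.pending())  # {'gateway': 9.0} {'gateway': 8.0}
```

When the stacked table is sent to the simulation, a single command carries both migrations.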

mcatanzaro commented 12 years ago

Tom, I understand better now :-) but that solution does not get at the problem of physical delay. If the problematic delay were coming from the device adapter, that would work. Is it? The delay in CClientRTDS will be set by the FPGA. If that's greater than the time to do a LB, say 200ms, then we have a BIG inefficiency because commands will block in CClientRTDS - it will only send one command every 200ms - and extra messages from LB will pile up into a bigger and bigger queue. Your solution would solve that. But if the problem is actually physical delay in the system, that won't help. Correct?

Stephen - we could have devices block on /sending/ commands. i.e. the physical adapter (CClientRTDS) would not let the call to device.Set("setting") finish until it has sent the command, which would prevent LB from sending extra commands. That might make problems for the round-robin scheduler, though (is it smart enough not to schedule LB when it's blocked?). But if the problem is physical delay in the system, that's again not going to help. And perhaps if we implement Tom's idea, we wouldn't need to do that. We should also consider if this could cause other problems (though I can't think of any).

So - what frequency is the FPGA communicating on? Seems like we need to know that.

Here's a scenario. The system contains only DGI A (in supply, but only by a little) and DGI B (in demand, but only by a little). LB runs every 60s, FPGA transmits every 200ms, and the future RTDS adapter compresses duplicate set commands on the command table into just one set command before sending. What if LB then overshoots convergence and transfers too much power, because it's relying on its outdated state? The only way to prevent that would be for LB to track the commands it's sent and "anticipate" the future, which we don't want to do but which Ravi has coded but not pushed?

Replying to Thomas Roth (5/18/2012 8:48 AM): https://github.com/scj7t4/FREEDM/issues/62#issuecomment-5786400

Michael Catanzaro Missouri University of Science and Technology Senior - Computer Science michael.catanzaro@mst.edu

nickmame commented 12 years ago

I'm confused on 'physical delay in the system' - could you restate the problem?

mcatanzaro commented 12 years ago

The delay between CClientRTDS/CLineClient (the "device adapter") sending a command to the FPGA/PSCAD and when the state table is updated. In between those times, the device finishes "responding" to the command.

Scenario 1: LB sends commands to the device adapter more frequently than they are transmitted to the FPGA. The queue of commands therefore increases forever and commands are run even if they no longer make sense. The graphs that were posted don't seem to indicate this problem, but I wouldn't expect them to, because I believe CLineClient passes commands to PSCAD as soon as they are received. CClientRTDS passes commands to the FPGA on a periodic schedule as set by the FPGA, so if Florida's graphs look different than ours (do they?), that might be why, and we could fix it by scheduling LB less frequently.
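Scenario 1 can be illustrated with a toy millisecond-tick model. Everything here is an assumption made for illustration: the function `queue_depth`, the one-command-per-event behaviour, and the 50 ms LB period (the 200 ms send period is the example figure used earlier in this thread). The `coalesce` flag models stacking changes into a single pending command, as suggested for the command table.

```python
def queue_depth(lb_period_ms, send_period_ms, duration_ms, coalesce=False):
    """Return the number of unsent commands after duration_ms."""
    depth = 0
    for t in range(1, duration_ms + 1):
        if t % lb_period_ms == 0:
            # LB issues a command; with coalescing it merges into one entry.
            depth = 1 if coalesce else depth + 1
        if t % send_period_ms == 0 and depth > 0:
            depth -= 1  # adapter transmits one command to the FPGA
    return depth


print(queue_depth(50, 200, 1000))                 # 15
print(queue_depth(50, 200, 2000))                 # 30
print(queue_depth(200, 50, 1000))                 # 0
print(queue_depth(50, 200, 950, coalesce=True))   # 1
```

When LB runs more often than the adapter transmits, the backlog grows linearly with time. Running LB less often than the adapter, or coalescing pending commands, keeps it bounded.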

Scenario 2: The period of LB is set less than the period of the FPGA, so we're dandy, and the delay causing this issue is a reflection of how long it takes a device to respond to commands. That was my initial understanding from the first post.

Replying to Thomas Roth (05/18/2012 04:30 PM): https://github.com/scj7t4/FREEDM/issues/62#issuecomment-5796250

mcatanzaro commented 12 years ago

Correction: in scenario 2, we're dandy if the period of LB is greater than the period of the FPGA, i.e. LB runs less frequently than CClientRTDS.

mcatanzaro commented 12 years ago

By the way, CClientRTDS::Run does get called whenever its async wait expires, right? i.e. the simulation runs on a multicore computer, separately from LB/GM/SC and CClientRTDS won't have to wait an indeterminate amount of time for one of those modules to complete? Because if CClientRTDS has to wait on those modules to send to FPGA, then this is even more complicated and potentially messed up ^^

lfqt5 commented 12 years ago

This problem still exists in the current SuperMassiveMasterFix with the PSCAD simulation. Although there are messages (caught in SC) indicating successful power migration, the power value of the device observed in the Load Table and SC doesn't change until after at least 8 successive successful power migration messages. I wonder if this is due to the delay of the simulation in reaching the target P.

The other problem, setting P* after two successive power migrations, becomes severe in a system with two supply nodes and one demand node. Based on the trace, it seems that both supply nodes send power out, but the demand node only receives the power from one of them. This leads to an incorrect convergence value. For example, if the three nodes are at 50, -20, and -30, the value converges to around 8.

lfqt5 commented 12 years ago

The trace of the above test can be found on the r-facts machine under branch SuperMassiveMasterFix, with the name 0920test5. At the beginning, the Normal value is 0.66607, which is correct. But as the migrations proceed, the Normal value increases to 9.7 and then converges to 9.0, with all three nodes in the normal state.
