Sleeper85 / esphome-jk-bms-can

GNU General Public License v3.0
64 stars 18 forks source link

Automatic Charge Voltage v2 (recently introduced) - Too slow and too aggressive ? #41

Open luckylinux opened 3 months ago

luckylinux commented 3 months ago

The default value of charge_v_factor = 2.0 didn't work at all for me (it actually went "full" power and caused cell 11 to hit OVP as usual).

When charge_v_factor = 3.0 instead it seems to work WONDERS.

However, near the top of the charge and possibly at the end of the day when not enough solar power is available (thus battery charges/discharges/charges/discharges/... constantly), there is a huge ringing with many cells getting dangerously close to OVP.

This is somewhat of a physical issue (it doesn't take much charge / Ah at that high SOC/voltage to increase the voltage a lot, and normally the charge should stop far sooner)

image

image

image

image

image

image

image

image

luckylinux commented 3 months ago

Maybe this is not something that can be fixed directly with the interver ...

Probably tomorrow I'll let it charge up (hopefully it will be done by 11h or 12h), open the breaker and switch-disconnector, then feed in power through some 10A Banana Plugs that I left there for troubleshooting/testing etc. Leave it connected with a Lab Power supply at 56.2V / 5A and let it sort itself out, without doing all the charge/discharge, since it never gets the chance to balance like that.

MrPabloUK commented 3 months ago

Couple of thoughts: Try reducing the refresh interval for the JK BMS component, perhaps 1s. Set the charge voltage to 3.45v or so per cell and attempt to balance lower down the voltage curve.

It might be possible to introduce a smoothing filter, but I'm hesitant to do that, as it'll add some delay. The other option is something similar to that on the auto voltage control, where a reduction in CVL is instant, but an increase is smoothed.

MrPabloUK commented 3 months ago

Although, are you using the BLE connection to the BMS?

luckylinux commented 3 months ago

Do you think it's possible to set this refresh interval to a parameter that can be tuned ? It's a nightmare to update when it's live ... Probably only safe in the early morning when there is no sunlight, otherwise it will go full power (or trip with BMS_Err_Stop).

I think in the case of BLE it's throttle: 3s instead of the wired update_interval: 3s equivalent.

luckylinux commented 3 months ago

Although, are you using the BLE connection to the BMS?

Yes, I'm using BLE.

MrPabloUK commented 3 months ago

Does it need to be adjusted in real time? Just try setting to a lower interval and see if it continues to update as expected. I feel that a refresh interval / throttle of 3s is possibly a little low if you're charging to a voltage higher up the knee, plus running with a lower OVP.

luckylinux commented 3 months ago

Do you mean "I feel that a refresh interval / throttle of 3s is possibly a little high" ? I think you meant the refresh frequency is too low, right ?

Nothing needs to be adjusted in real time in an ideal situation. Yet I'm doing 2-3 reflashes each day ....

MrPabloUK commented 3 months ago

I have an idea for a different / enhanced approach, I'll see if I can code this tomorrow. Once ready, I'd like if you could test some specific cases I'll outline at the time. As a result (and this feels odd to say) please don't attempt to get your battery fully balanced just yet. Perhaps just keep charge voltage to 3.45v a cell for now.

luckylinux commented 3 months ago

OK I'll cancel the charge then

It was starting to SLOWLY balance now, max: 3.503V, min: 3.420V.

luckylinux commented 3 months ago

The Battery was actually quite full around midnight when I stopped already, but still heavily unbalanced.

Sitting at roughly 70% now, charging quite fast (knock on wood).

Let me know how you want to proceed. In the end I did NOT update the 3s -> 1s yet nor reflash the firmware. Will do that this evening hopefully after sunset ...

MrPabloUK commented 3 months ago

This issue is being caused by the Auto Voltage Control being a proportional-only control system, that is, it only applies a proportional response to the error (or voltage delta to target). Initially, it looks like a default gain (or factor) of 2.0 was too low, preventing the control system from sufficiently clamping down on the voltage. Increasing this to 3.0 worked in one situation, but was too high a gain when the situation (loading) changed yet again.

You might suppose that setting it to the middle ground of 2.5 would help, which it might. The issue with P control is that it always needs a voltage delta in order to be effective. No delta = no response. Additionally, there's no concept of how fast the delta is developing, so that can also result in oscillations when the rate of input is low (3s).

As a result, I'm going to explore adding an Integral element to this control logic. It'll likely slow the rate of change but should damp out oscillations.

luckylinux commented 3 months ago

I switched to throttle 1s now. Actually, the Deye apparently talks to the CANbus even when in voltage mode, so there is no "loss of control" when switching back to Lithium Mode.

I know about proportional controllers, they can be quite unstable, especially in case of big sampling/processing delays and/or high "KP" gains. Remember back then in my university days and power electronics projects :+1:. Basically if the delay is too big, it could be that instead of helping the system to stabilize, they amplify the oscillation, since the control output now adds up to the original input.

If you want some more stable controller a PI (not "a lot of KP") or FF+I would be better.

And yeah, a sample rate of 1s (either of the measurement or the control algorithm) might be too low indeed ...

EDIT 1: Also keep in mind that the delay of the whole loop (measurement + control algorithm) might be, in the worst case (since I don't think these are synchronized like e.g. ADC sampling in case of Power Electronics PWM Control) to tDelay = tSampleMeasurement + tSampleAlgorithm (+ tDurationAlgorithmComputation).

So that might now be around 2s instead of the previous 4s (slighly more, depending on how long the control section takes to execute, maybe 0.05s extra or so).

luckylinux commented 3 months ago

I'm also wondering if the issue is related (first ... due to unbalance) to the fact that the "Requested Charge Voltage" has only 1 decimal ...

For 16s and assuming 5 mOhm total equivalent resistance (10kA short-circuit current), a 0.1V step in voltage could translate into 0.1V / 0.005 Ohm = 20 A. Which of course can quicky overcharge the last cell.

If all cells share that 0.1V it shouldn't be a problem. Also if 14 out of 16 share it (like I observed in the last days) that is also OK. But at 20A, that is a lot of "surge" and if one cell takes almost all the voltage increase, then that cell (almost ?) hits OVP, and the controller tries to counter that, resulting in a step in the opposite direction, triggering a -0.1V step, triggering a discharge.

And the cycle repeats.

Are we sure that the Deye Inverter / PYLON Battery Protocol only allows for 1 decimal digit ??? That could be another way to tackle the issue. With 2 digits, it's going to be much more smoother, e.g. 2A charge current near the top of the charge instead of 20A.

luckylinux commented 3 months ago

image

image

luckylinux commented 3 months ago

Looking at the code

  // +--------------------------------------+
                      // | CAN messages                         |
                      // +--------------------------------------+

                      // Byte [00:01] = CVL : Charge Limit Voltage    (0.1 V)
                      // Byte [02:03] = CCL : Charge Limit Current    (0.1 A)
                      // Byte [04:05] = DCL : Discharge Limit Current (0.1 A)
                      // Byte [06:07] = DVL : Discharge Limit Voltage (0.1 V)

                      uint8_t can_mesg[8];

                      can_mesg[0] = uint16_t(charging_v * 10) & 0xff;
                      can_mesg[1] = uint16_t(charging_v * 10) >> 8 & 0xff;
                      can_mesg[2] = uint16_t(charging_a * 10) & 0xff;
                      can_mesg[3] = uint16_t(charging_a * 10) >> 8 & 0xff;
                      can_mesg[4] = uint16_t(discharging_a * 10) & 0xff;
                      can_mesg[5] = uint16_t(discharging_a * 10) >> 8 & 0xff;
                      can_mesg[6] = uint16_t(discharging_v * 10) & 0xff;
                      can_mesg[7] = uint16_t(discharging_v * 10) >> 8 & 0xff;

Not fully sure why there is a & 0xff (that will do a bitwise AND with 0xff, i.e. 0x00ff).

So is that used to "disregard" the left part of the value (right part of the value for e.g. can_mesg[1] since you have a bitwise shift >> 8) ?

So, effectively, to "split" the uint16 value into 2xuint8, and assign these pieces to the CAN message ?

luckylinux commented 3 months ago

So basically it is representing, for my inverter, the Battery voltage (let's make simple here: 40 VDC to 60 VDC to have round values) into 400 "V" ... 600 "V", then convert these to 2 x Bytes (8 bit), i.e. 400 and 600 should correspond into 0..65535 binary values.

It seems a bit weird that this is the way the protocol is coded. Why go up to a range of requested charge voltage of 0 VDC ... 65536 / 10 = 6554 VDC ?

image CAN_Battery_Protocol_Resolution.ods

MrPabloUK commented 3 months ago

Both the SMA and Pylontech protocols specify a 0.1v precision (0.1A also). More precision would be a good thing, but it's not supported. Discussing why it is set up this way is out of the scope of this particular issue, and not something that the developers of this project can impact.

luckylinux commented 3 months ago

I can understand. No matter what we do, the firmware on the inverter is as it is.

I was just wondering why ... Seems to lose lots of resolution for nothing ...

MrPabloUK commented 3 months ago

@luckylinux, branch automatic-charge-voltage-v.2.1 has been updated with adjusted auto charge voltage logic. This is not PI control, it's an interim approach which may provide the needed results.

There are two adjustable factors via substitutions, if you want to make them adjustable in real-time, feel free.

Factor to adjust the aggression of the auto voltage control logic. The higher the setting, the greater the voltage penalty applied to cells exceeding the target voltage (bulk voltage). charge_v_factor: "1.5"

Factor to adjust the speed of the auto voltage control logic. The higher the setting, the longer the period of time that voltage changes are applied over. charge_t_factor: "5"

This logic uses a static penalty multipler as opposed to the power function previously used. This will make it respond less aggressively to higher deltas. Requested CVL values are changed over time with a sliding window average, by default this is set to 5 seconds. If not fast enough, this could be either reduced or changed to an exponential moving average.

Can you please test with the default parameters and the following settings?

When you provide the graphs, please filter them to the period where CVL is being adjusted by the code.

luckylinux commented 3 months ago

@MrPabloUK: I'll test tomorrow early morning, as this will require both turning off CAN to reflash as well as turning off BLE to reconfigure the 2 BMSs.

OVPR = 3.55 VDC up from current OVRP = 3.52 VDC -> What about OVP ? should that also be increased to OVP = 3.60 VDC from the current OVP = 3.57 VDC ?

A moving average, to some extent, corresponds to an "averaged" response. So it's effectively a Proportional controller with a "averaged" input (filtered), correct ?

Instead of having a PI with an instantaneous input and (somewhat) a filtered output.

MrPabloUK commented 3 months ago

If you want, you can go higher with OVP. My setup has the following settings: OVP = 3.65v OVPR = 3.60v

The BMS is there as a protection layer, I don't want it triggering prematurely when the inverter can be told to reduce current.

As for the code, the averaging is applied to the output. Proportional control on input, raw output which is filtered. It works out lighter in terms of performance and maintainability.

luckylinux commented 3 months ago

Oh, so you filter the output ? Sorry, then I misunderstood.

But yeah, it gives a similar "smoothing" effect as a PI controller I would guess (of course with comparable coefficients etc).

My fear, with high values for OVP and OVPR is:

  1. In case of fast transients (load steps, i.e. loads abrubply stops), you can hit OVP just because of the extra ESR * Current voltage generated by that phenomenon
  2. The regulation slow and filtered becomes even more critical at higher voltage, because your time to react and tell the inverter to reduce current / voltage becomes shorter. In essence there are more volts per each coulomb (Ah or As in this case) at 3.60 VDC than at 3.55VDC or at 3.50VDC ... At constant current IIRC (and only 5A or so) I think the voltage raised like 0.01 - 0.02 VDC / s or so above 3.50 VDC. So a 1s delay (or 5s filtering moving average) can definitively be a problem. I'll make it tunable via Home Assistant, my gut feeling is that a value of 2-3 is probably better off at that high cell voltage.

Don't you regularly hit OVP on your battery ? Or you mean that, once balanced, then every cell shouldn't exceed 3.50 VDC anyway ...

MrPabloUK commented 3 months ago

The last time I hit OVP, it was on purpose months ago during testing. I charge to 3.45v, giving a healthy buffer to OVPR. Even though I've suffered imbalance at points (possibly similar to yours), charging at 3.45v has given the code ample time to back off power as needed for balancing.

If you're pushing the voltage higher (3.51v IIRC), you're already reducing the batteries headroom to deal with transients. The same goes for cell imbalance, a higher bulk voltage and lower OVP means there's not much buffer. https://louisvdw.github.io/dbus-serialbattery/faq/#why-is-the-max-cell-voltage-set-to-345v

With well configured battery + inverter communication, the OVP becomes protection only. The current reduces as bulk voltage is reached, leaving OVP just for when something has gone wrong.

I think there's a shift needed in your thinking. OVP & OVPR should never be hit in normal charging. Instead, you control via bulk voltage with appropriate current reduction via CVL or CCL control.

Give it a try at 3.45v bulk and the above settings (plus 3.43v balance) and just see what happens. I would be very surprised if you hit OVP, especially if you also enable current control with default settings.

luckylinux commented 3 months ago

I agree with you. You should not hit OVP regularly, if the Inverter talks to the BMs like you do here. It should only be a protection of last resort.

I hit OVP regularly WITHOUT Inverter <-> BMS communication. But that's also tied to the unbalance problem.

The problem is right now it doesn't seem to be able to balance properly. That's why I raised the bulk voltage, which was also not enough to do it. And potentially created further instabilities ....

So should I test with your code as it is with the outlined modifications ?

Or should I charge to say 98%, cut off the breaker & switch disconnector, then only CHARGE with a small lab supply (so there are no transients charge/discharge/charge/discharge/... either due to the controller, sun/shading or load steps) ?

MrPabloUK commented 3 months ago

I think you should:

Lots of changes but I'm confident we can get your battery behaving itself automatically.

luckylinux commented 3 months ago

Alright, I'll backport your changes to my YAML file. I'd still like to have the option to change parameters, should we need to :).

Should I update only Battery 02 JK BMS settings or also Battery 01 JK BMS settings ? Particularly the 3.43 VDC for balance ... You told me that for your (balanced) battery you were using 3.45 VDC.

But also OVP, OVPR, etc. Should also Battery 01 be updated "in the same manner" ?

One thing that I wondered about was why it couldn't be a user defined parameter instead of OVPR, to determine the "error" to which apply the proportional controller.

And now raising OVPR ... It's true that the "reaction time" at such level becomes shorter. But I guess your point is rather: since the bulk voltage is "normal" now, this means effectively that the cells get more time to balance out (combined with the 3.43 VDC starting balance ?

And the delta balance of 0.010 VDC is that to transition from bulk to float "early on" (EOC stop condition once balance finishes) ?

MrPabloUK commented 3 months ago

I would set up both BMSs and ESP32s with the same settings. In an ideal world, I'd say disconnect your uncontrolled battery so there's one less variable to deal with, during testing anyway.

For my battery, I bulk at 3.45v, balance starts at 3.43v. This works well for one of my lazy cells, so I think it will for yours too.

As for OVPR, that's used in the auto charge current control as it forces users to set up their BMS appropriately. I don't want to debug both code and BMS settings if someone raises an issue. It's not used in the voltage control code - that's purely bulk voltage based.

Your point about raising OVPR, that is exactly what we're aiming for.

The delta balance at 10mv is generally the standard. You could go for 5mv or something, but frankly it's small amounts of energy at that point.

luckylinux commented 3 months ago

Ah so you always start balancing at 3.43 VDC ? I thought that was a temporary setting for you, then you moved back to 3.45 VDC. Alright ...

I need to charge up Battery 01 in the morning, but after charged 90%+, I can disable both charge and discharge on it via Home Assistant.

Anyways the problem is at the top of the charge, don't you agree ?

luckylinux commented 3 months ago

Sigh, window_size apparently doesn't allow to use dynamic variables / templates :(.

        This option is not templatable!.
        window_size: !lambda |-
          return int(max(1.0 , id('charge_t_factor_setting').state));

I tried both of:

    update_interval: 1s
    filters:
    - sliding_window_moving_average:
        send_every: 1
        send_first_at: 1
#        window_size: !lambda return int(max(1.0 , id(charge_t_factor_setting).state));
        window_size: ${charge_t_factor}
    - round: 1

I also tried with global variables and updating charge_t_factor global value in the auto charge voltage algorithm:

charge_t_factor = int(id(charge_t_factor_setting).state);

It doesn't recognize it either:

  update_interval: 1s
  filters: 
    - sliding_window_moving_average: 
        send_every: 1
        send_first_at: 1

        Expected integer, but cannot parse ${charge_t_factor} as an integer.
        window_size: ${charge_t_factor}
    - round: 1
luckylinux commented 3 months ago

I flashed the code and changed the settings on both BMS early this morning according to what we discussed.

But then of course the sun had other plans ...

Charged the battery up to ~85%, then the sun went away.

It seems the next 4 days are also bleak.

I'll keep you updated when I'll get some sunlight. Thanks for your help.

luckylinux commented 3 months ago

I'm trying to push it with the charger (right now the charger is NOT respective the charge voltage, so I just set it to 55.0 VDC constant), because not much sun lately / for the next few days.

I can see WILD oscillations of the Charge Voltage to the Inverter though. Basically 54.3 V ... 55.3 V which is quite a lot.

The highest cell is 3.46 VDC at the moment, so that's probably why (bulk is 55.2 VDC / 55.3 VDC).

I still think the controller is very aggressive though ...

luckylinux commented 3 months ago

Max Cell is now 3.577 VDC, which is WAY above bulk (55.2 VDC / 16 = 3.45 VDC / cell).

It seems the newest Auto Charge voltage isn't really working or ?

MrPabloUK commented 3 months ago

You say you're running at a fixed input voltage? Not using CAN control?

luckylinux commented 3 months ago

Inverter follows Requested Charge Voltage. My charger does not (currently using an EXCEL input table based on time of day). Fixed at 54.4 VDC which is the minimum of the Automatic Charge Voltage anyways ...

luckylinux commented 3 months ago

Lowering bulk for inverter to 54.0 VDC now ... Almost hit 3.60 VDC on one cell.

Lowering charger voltage to 54.0 VDC.

Raising back Bulk for Inverter to 55.2 VDC to test if auto charge voltage works ...

EDIT 1: it seems that like this requested charge voltage = 54.2 V, so it's essentially giving time for the JK BMS to balance the difference and discharge the highest cell ...

EDIT 2: I guess I'll need to play with paho-mqtt in my python script so that the charger does not go above the required charge voltage :(

luckylinux commented 3 months ago

This peak at 3.60 VDC on one cell is way too close for confort :(. It's true that I had calibrated the voltage measurement but still ...

image

luckylinux commented 3 months ago

It seems to be working now keeping that cell at around 3.55 VDC, requested charge voltage increased to 54.3 VDC (Charger is set to 54.1 VDC only, so basically to prevent the battery from falling lower now that the day is almost over).

I still do not get this cell #11: why does it shoot up so high, then go lower than any other cells, then increase again ?

Is this something you observed on your side as well sometimes @MrPabloUK ?

luckylinux commented 3 months ago

I take it back. It doesn't work well at all :(.

Current Charge Control reduced to approx 20A (for 2 batteries) as one cell was almost 3.60 VDC.

But Voltage Charge Control still increased its reference. The only solution was to lower the bulk voltage to 54.0 VDC to somehow keep it under control (still, the cell is sitting at 3.56 VDC !).

I'll need to widen the range of allowed values. 54.0 VDC is barely enough as a minimum in case this situation occurs.

luckylinux commented 3 months ago

@MrPabloUK: another thing that seems weird is the bulk / absorbtion offset. It does NOT seem to have any influence when I tune it. Instead, there is a fixed 0.1 V volt added to whatever bulk voltage I set.

Is this something that was recently changed ?

EDIT 1: looking at the code, sometimes the global constant inverter_offset_v is used. sometimes the state absorption_offset is used. Is this intentional ?

luckylinux commented 3 months ago

I think I'll need to revert to a previous state where the voltage control, although aggressive, was working.

EDIT 1: Or possibly doing a PI controller.

MrPabloUK commented 3 months ago

Can you provide graphs of this data please? I'm not really able to follow along effectively. I'm particularly interested in seeing requested voltage, actual inverter provided voltage and all the individual cell voltages. Please filter the graphs to the best time period that shows this behaviours for maximum data resolution.

luckylinux commented 3 months ago

I don't think I have Inverter Provided Voltage. The Inverter is currently NOT monitored by Home Assistant (I'll do that soon once I receive the RS485 Isolated Adapters that Digikey shipped me this week).

luckylinux commented 3 months ago

Overview (only Battery 02 shown for the cell voltages since that's the only one influencing the algorythm ... Battery 01 is "dumb" voltage controlled but much more balanced anyways): image image image

Zoom 15h40-15h55 (only Battery 02 shown for the cell voltages since that's the only one influencing the algorythm ... Battery 01 is "dumb" voltage controlled but much more balanced anyways): image image image

If I'd have to pick one place, it's 15h54. There the cell voltage is at its peak, yet the Requested Charge Voltage keeps increasing: image image

After that I manually lowered the bulk voltage, since the Automatic Charge Control was keeping charging that poor cell 11 .... image

MrPabloUK commented 3 months ago

Thanks @luckylinux, any chance you could export the cell voltage data for that 15 minute period to CSV? There's a handy download feature recently added to HA.

luckylinux commented 3 months ago

Sure thing, @MrPabloUK, thank you for your help :+1: .

cells_voltages_bat02_20240331_allday.csv.zip

cells_voltages_bat02_20240331_15h40_15h55.csv

MrPabloUK commented 3 months ago

I think I just spotted the issue, I had one calculation reversed, left over from an earlier version of voltage control. I've just updated the branch if you want to pull into your repository.

luckylinux commented 3 months ago

Yeah, with 2.1.1 you ADDED actually (Cell Voltage - 3.45 VDC bulk) to the Pack Voltage, whereas now with 2.1.2 you will SUBTRACT (multiplied by charge_v_factor proportional gain) to the current battery voltage that value. So if cell is now 3.55 VDC and charge_v_factor = 2, then your charge voltage will get (in 2.1.2) to: charge voltage = pack voltage + 2.0 * (3.45 - 3.55) = pack voltage - 0.2V.

In 2.1.1 that would have been charge voltage = pack voltage + 2.0 * (3.55 - 3.45) = pack voltage + 0.2V !

EDIT 1: So instead of a runaway effect, this (v2.1.2) should cause a stabilizing effect.

EDIT 2: Although I guess that now we might be (again) in the situation where we might not be able to charge at all (since the voltage is lower than the pack voltage). Or you hope this is still taken care by the ceiling() applied to the calculated voltage ?

EDIT 3: I implemented a basic function in my Charger to listen to the topic jk-bms-bat02/sensor/jk-bms-bat02_requested_charge_voltage/state and do not exceed that value. Didn't test it, but the script seems to be working at least. So now also the charge should obey the BMS. It's much slower though (IIRC 30 seconds for the charger to react), so I just set the reference voltage of the charger at most requested_charge_voltage - 0.2 VDC (where the -0.2VDC is a bit of butter).

MrPabloUK commented 3 months ago

I'm hopeful that a default v factor of 1.5 (down to 1 if it responds well) would be sufficient to slow / pause charging enough to reduce the max cells, then resume afterwards.

Nice work on making the standalone charger dynamic also, we shall see how this adjusted algorithm (without the inadvertent error) works for your misbehaving cell.

Also, the ceiling function has been removed in v2.1 of the charging logic, it's not still in your version is it?

luckylinux commented 3 months ago

Indeed, I actually left it in the code DISABLED. So you were right :)

            // Use minimum of dynamic charge voltage or bulk voltage
            dyn_chg_v = min(id(bulk_voltage).state, target_total_v);
            // Return automatic charge voltage
            // return ceil(dyn_chg_v * 10) / 10; // Charge Logic v2.0 - Can lead to oscillations
            return dyn_chg_v;                    // Charge Logic v2.1.1 - Testing

Yeah, the MQTT charger adaptation is really rough for now ... No error catching, doesn't check if esphome status is offline, etc.

https://github.com/luckylinux/solar-controller/blob/main/strategy/strategy.py

luckylinux commented 3 months ago

I'm currently testing with the charger so it's extremely slow to update (and I removed these -0.2VDC offset for the charger for now).

image

It's very aggressive, even with the aggressivity factor 1.0.

In my view, once that "mode" (overvoltage prevention) has been triggered, the voltage should be SLOWLY increased.

Basically set a maximum slew rate for the voltage "recovery" to go back to bulk.

Here it basically does: 54.3 VDC ... 55.3 VDC ... 54.3 VDC ... 55.3 VDC ...

As soon as the voltage goes down, battery gets discharged, max cell voltage IMMEDIATELY drops, thus the output also drops immediately.

This is the "classic" all-or-nothing controller basically.

image