nitrousnrg commented 2 years ago

The battery level (SOC) code was relying solely on instantaneous battery voltage. Even at light loads battery sag produced large changes in the battery % displayed in the gauges. At higher loads, the SOC would sag as much as 80%.

Here is an example of the old ways vs the new approach

Note that this is an early dataset with very light filtering. The code proposed in this commit uses stronger filtering so those small drops are less noticeable. Both traces use the same 14Ah battery, same drive unit and same rider, although the intensity might have been different.

As a reference for the filtering, this is the response to voltage vs. time when it's relying only on the input voltage (the motor has been idle for long enough). I'm using a lab power supply to change the input voltage.

The drops are caused by the recovery time of the cell. The resting time should be longer than the short 5 seconds used now, but it can be increased later. If the resting time is set to 60 seconds, there is a risk of significant drift over time if the drive never stops for those 60 seconds. Also, batteries have less capacity at low temperatures and high currents, which we're not compensating for, so we'd better have some data points in between.

And here is a more complete picture showing the periods without battery current and how the input voltage takes some time to recover, affecting the SOC estimation

Resting time has a big impact on the estimation, so it would make sense to expose it in mcconf.

vedderb commented 2 years ago

I think this is problematic for applications where you are going to use a lot of throttle for a long duration at a time. For example on boats you put the throttle in one position and leave it there with small adjustments sometimes for the entire charge.

Have you tried just using a really slow low-pass filter? That is what I do on the BMS and it seems to work quite well. The problem with that is when you have high average draw the SOC will be estimated lower, but the rest of the SOC won't be available to use then anyway.

nitrousnrg commented 2 years ago

For an application like that we can expose this parameter to vesc tool:

    #define MIN_RESTING_TIME_SECONDS 5.0

If set to 0.0 seconds, it will never use the Wh tracker and will behave as usual, plus the extra filtering.

Zero seconds could be the default to avoid breaking workflows.
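To make the idea concrete, here is a minimal sketch of a resting-voltage plus Wh-tracking estimator as described in this thread. It is an interpretation, not the code from the PR: `soc_cfg_t`, `soc_state_t`, `soc_update`, the current threshold `i_rest_max` and the linear voltage-to-SOC map are all assumptions made for illustration.

```c
#include <math.h>

typedef struct {
    float min_rest_s;   // resting time required; <= 0 disables the Wh tracker
    float i_rest_max;   // |battery current| below this counts as resting (assumed)
    float capacity_wh;  // configured pack capacity in Wh
    float v_empty;      // pack voltage treated as 0 %
    float v_full;       // pack voltage treated as 100 %
} soc_cfg_t;

typedef struct {
    float rest_time_s;  // how long the battery current has been near zero
    float soc_anchor;   // SOC (0..1) taken from the last resting-voltage reading
    float wh_anchor;    // cumulative Wh drawn at the time of that reading
} soc_state_t;

static float clamp01(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

// Simplified linear map from pack voltage to SOC.
static float soc_from_voltage(float v, const soc_cfg_t *c) {
    return clamp01((v - c->v_empty) / (c->v_full - c->v_empty));
}

// wh_used is the cumulative energy drawn so far; i_batt > 0 means discharge.
static float soc_update(soc_state_t *s, const soc_cfg_t *c,
                        float v_in, float i_batt, float wh_used, float dt) {
    float soc_v = soc_from_voltage(v_in, c);
    if (c->min_rest_s <= 0.0f) {
        return soc_v;  // Wh tracker disabled: voltage-only behaviour
    }
    if (fabsf(i_batt) < c->i_rest_max) {
        s->rest_time_s += dt;
    } else {
        s->rest_time_s = 0.0f;
    }
    if (s->rest_time_s >= c->min_rest_s) {
        // Rested long enough: trust the voltage and re-anchor the tracker.
        s->soc_anchor = soc_v;
        s->wh_anchor = wh_used;
        return soc_v;
    }
    // Under load or still recovering: subtract the energy used since the anchor.
    return clamp01(s->soc_anchor - (wh_used - s->wh_anchor) / c->capacity_wh);
}
```

In this sketch the caller is expected to seed `soc_anchor` from a first voltage reading at boot; otherwise the estimate stays at the initial anchor until the first rest period completes.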

nitrousnrg commented 2 years ago

For an application like that we can expose this parameter to vesc tool:

    #define MIN_RESTING_TIME_SECONDS 5.0

If this is viable I'll extend this PR with a commit that exposes it next to si_battery_ah

If you would like to also expose the filter constant or ramp time let me know and I'll put them together.

surfdado commented 2 years ago

Even for the E-boat example, isn't this proposed approach still going to perform better than the current pure voltage-based approach? The instant you push that throttle (or a few seconds later with a low-pass filter) the SoC estimates will be too low.

danilolattaro commented 2 years ago

Agreed, SOC estimation would be off for the given example anyway

surfdado commented 2 years ago

I'm now testing your commit in my build, and I noticed a few things:

1) When powering on the board it takes very long (almost a minute?) for the initial percentage estimate to stabilize. If I start riding immediately I'm stuck with 0%...
2) During easy continuous riding on flat road (going 15mph at less than 3-5 battery amps) the percentage dropped faster than I expected, and then when taking a break after two miles it went back up by a few percentage points. Not a big deal, but I still wanted to let you know.

However, what I do like is how the percentage remains unaffected by acceleration, braking, or small hills

Big hill testing is next... but overall - I really like it, thank you!

nitrousnrg commented 2 years ago

when powering on the board it takes very long (almost a minute?) for the initial percentage estimate to stabilize. If I start riding immediately I'm stuck with 0%...

Solving this is trivial, it's like 1 line of code in mc_interface.c to shorten the filter time constant at boot. I might have added it locally after the PR, I can take a look.

the SoC percentage during easy continuous riding on flat road (going 15mph at less than 3-5 battery amps) the percentage dropped faster than I expected and then when taking a break after two miles went back up by a few percentage points, not a big deal but I still wanted to let you know

I have hundreds of data logs from many different riders and I always see that. It's a tough compromise: the Li-ion cell takes some time to fully recover its resting voltage, let's say something like 10 minutes. However, this code only waits a couple of minutes, so when it switches to voltage mode the cell is not fully rested and the algorithm gets a bit confused.

The solution is to wait longer... but I just didn't want to risk relying too much on the Ah tracker, as over time it could backfire if the battery capacity is not correct. I think exposing the wait time as a configurable parameter is a good solution for everyone, and as it is now it doesn't miss by more than a few %. Down the road I can work on a better battery capacity estimation that takes into account discharge rate, temperature and age.

surfdado commented 1 year ago

Solving this is trivial, it's like 1 line of code in mc_interface.c to shorten the filter time constant at boot. I might have added it locally after the PR, I can take a look.

Would you mind sharing that "trivial" change? May be trivial to you but not to me....

nitrousnrg commented 1 year ago

Would you mind sharing that "trivial" change? May be trivial to you but not to me....

Oh I'm sorry if my comment landed in a bad way.

I just checked and the code to speed up the convergence is already included at the end of this PR commit. I can't find a way to link to the exact line but it looks like this

    // Force a quick convergence in the first 10 seconds
    float current_time = (float)chVTGetSystemTimeX() / (float)CH_CFG_ST_FREQUENCY;
    if( current_time < 10.0 ) {
        ramp_time = 0.5;
        filter_constant = 0.5;
    }

The difference I have between that and my local code is that I started to use these quicker constants:

    // Force a quick convergence in the first 6 seconds
    float current_time = (float)chVTGetSystemTimeX() / (float)CH_CFG_ST_FREQUENCY;
    if( current_time < 6.0 ) {
        ramp_time = 0.01;
        filter_constant = 1.0;
    }

Let me know if you try it and it still doesn't converge quickly. There could be something else going on, like offset compensation or other stuff delaying the OS startup too much.

surfdado commented 1 year ago

Awesome, will try the suggested ramp time and report back - thank you!

surfdado commented 1 year ago

Let me know if you try it and it still doesn't converge quickly. There could be something else going on, like offset compensation or other stuff delaying the OS startup too much.

So unfortunately the quicker constants haven't helped - it's almost worse now, but hard to tell. Either way it's pretty much unusable the way it is. It's only accurate if I first wait 1-2 minutes after powering on the board.

nitrousnrg commented 1 year ago

Let me know if you try it and it still doesn't converge quickly. There could be something else going on, like offset compensation or other stuff delaying the OS startup too much.

So unfortunately the quicker constants haven't helped - it's almost worse now, but hard to tell. Either way it's pretty much unusable the way it is. It's only accurate if I first wait 1-2 minutes after powering on the board.

Oh ok I'll take a look

nitrousnrg commented 1 year ago

Rebased this branch to the latest master... although I can't reproduce @surfdado's issue of the SOC not converging quickly on initialization. What hardware are you using?

In order to avoid problems in setups that have not defined the battery capacity, I defaulted the resting time to zero, which avoids using the W.hr tracker. This keeps the heavy filtering on Vin, so the settling time is about 60 seconds, but it converges quickly during the first 10 seconds.

The W.hr tracker is used to try to ignore the sag effect only if the hardware header defines MIN_RESTING_TIME_SECONDS. In our testing we saw the SOC dropping by 80% without this PR and by about 5% when this code is used.

I'm not quite sure how to approach this. I've been told several times that the original SOC is unusable, so if the PR is rejected I need to start learning Lisp to get it done, as it's an absolute must for us. If exposing the time constants to vesc tool is the way to go I can still do it, but it might be rejected if there's a feature freeze.
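The opt-in described above can be expressed with the usual `#ifndef` default pattern. This is only a sketch of that pattern: where the default actually lives in the PR is not shown in this thread, and `wh_tracker_enabled` is a hypothetical helper name.

```c
// A hardware header would opt in by defining the macro before this point,
// for example:
//   #define MIN_RESTING_TIME_SECONDS 5.0
// It is deliberately left undefined here to show the default path.

#ifndef MIN_RESTING_TIME_SECONDS
#define MIN_RESTING_TIME_SECONDS 0.0  // default: Wh tracker off
#endif

// Hypothetical helper: nonzero only when a positive resting time is configured.
static int wh_tracker_enabled(void) {
    return MIN_RESTING_TIME_SECONDS > 0.0;
}
```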

vedderb commented 1 year ago

I don't like that the state of the filter is inside this function, so the behavior will depend on how often it is called. It would be better to have some thread run the estimation then and have this function just get the estimation state from that thread (e.g. the timer thread).
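One way to read this suggestion is sketched below: the filter state is advanced only from a single fixed-rate context, and the public getter merely returns the latest value. The names `soc_estimator_tick` and `soc_estimator_get` are hypothetical, and the thread wiring is omitted; real firmware would also need to consider how the value is shared between threads.

```c
// Hypothetical module-level state: written only by the periodic tick,
// read by everyone else.
static volatile float m_soc_filtered = 0.0f;

// Call from one fixed-rate context only (for example a timer thread),
// with dt equal to that context's period in seconds.
void soc_estimator_tick(float soc_raw, float dt, float tau) {
    float alpha = dt / (tau + dt);  // discrete first-order low-pass gain
    m_soc_filtered += alpha * (soc_raw - m_soc_filtered);
}

// Pure read: calling this more or less often does not change the result.
float soc_estimator_get(void) {
    return m_soc_filtered;
}
```

Because the getter has no side effects, a gauge polling at 1 Hz and a logger polling at 100 Hz see the same estimate.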

Also, did you have a chance to just try a heavy lowpass-filter on the input voltage?

vedderb commented 1 year ago

Here is a try at a simple low-pass filter that can be configured from vesc tool: https://github.com/vedderb/bldc/commit/50aa16d20083d53af9fd324ae9c4138a2b382527

nitrousnrg commented 1 year ago

I don't like that the state of the filter is inside this function, so the behavior will depend on how often it is called. It would be better to have some thread run the estimation then and have this function just get the estimation state from that thread (e.g. the timer thread).

Ah good point. Makes sense to run it in the timer thread.

Also, did you have a chance to just try a heavy lowpass-filter on the input voltage?

Yes, but not so heavy. Between a low-pass and a ramp, I preferred a ramp, because the low-pass moves pretty quickly with large input variations and I wanted to avoid that hesitation in the gauge. When I went too heavy on the filtering it started detaching from the SOC calculated with the W.hr tracker, but I hadn't tried as many filter values as I did yesterday and today.
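The difference between the two behaviours can be seen in a small sketch. Both functions are illustrative helpers with hypothetical names, not the PR's code: the ramp limits how far the output may move per step regardless of the size of the input change, while the low-pass moves by a fixed fraction of the error, so a large input change produces a large immediate move.

```c
// Ramp (slew-rate limiter): move towards target by at most max_rate * dt.
static float ramp_step(float current, float target, float max_rate, float dt) {
    float max_step = max_rate * dt;
    float diff = target - current;
    if (diff > max_step) {
        diff = max_step;
    } else if (diff < -max_step) {
        diff = -max_step;
    }
    return current + diff;
}

// First-order low-pass: move by a fixed fraction alpha of the error.
static float lowpass_step(float current, float target, float alpha) {
    return current + alpha * (target - current);
}
```

For example, starting at an SOC of 0.80 with the raw value suddenly reading 0.20, a ramp limited to 0.01 per step moves to 0.79, whereas a low-pass with `alpha = 0.1` jumps to 0.74 in the same step.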

Here is a try at a simple low-pass filter that can be configured from vesc tool: https://github.com/vedderb/bldc/commit/50aa16d20083d53af9fd324ae9c4138a2b382527

Checking your last commit, here is the default value of 45.0. I get a drop from 58% to 39% in about 20 seconds. These are allegedly Samsung 30Q cells.

And 55% to 36%

Filter = 60.0, still too much variation

At filter value 70.0 it works quite well in different dyno runs

At filter=80.0 it becomes too unresponsive

Zoomed in, I see it only reacts to large variations, and low loads are not accounted for in time. I don't like the scenario of an SOC that shows more % than it really has; it would leave people stranded.

I just broke the dyno so that's it for today.

Bottom line is that I can work with this, I'll update my hardware mcconfs asap to use a filter value of 70.0 which works much better on this 14s4p battery. I imagine 70 is better than 45 for most cases. Over time I'll be getting more real use logs and will be able to comment further.

Feel free to close the issue, I think if we want more accurate SOC than this it should come directly from the BMS

vedderb commented 1 year ago

You could also argue that the value under load is the SOC that you can count on as it will stop working when that value goes to 0 (at least the power will start to get reduced then). If you want to get rid of the ESR-effect it might make sense to add a high-pass filter with a current threshold that tries to estimate the ESR and compensates for it on the input voltage.
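A rough sketch of the ESR idea follows. It is a simplification and not an implementation from the firmware: the difference between consecutive samples stands in for the high-pass filter, and `esr_state_t`, `esr_compensate`, `di_min` and `gain` are hypothetical names. Positive `i_batt` is taken to mean discharge.

```c
#include <math.h>

typedef struct {
    float v_prev;   // previous input voltage sample
    float i_prev;   // previous battery current sample
    float esr;      // running ESR estimate in ohms
    int has_prev;   // 0 until one sample has been stored
} esr_state_t;

// Returns the input voltage with the estimated resistive drop added back.
// di_min: current change needed before the ESR estimate is updated, in amps
// gain:   0..1, how quickly the ESR estimate follows new samples
static float esr_compensate(esr_state_t *s, float v_in, float i_batt,
                            float di_min, float gain) {
    if (s->has_prev) {
        float di = i_batt - s->i_prev;
        float dv = v_in - s->v_prev;
        if (fabsf(di) > di_min) {
            // A current step of di causing a voltage step of dv implies
            // ESR = -dv / di; ignore non-physical (negative) samples.
            float sample = -dv / di;
            if (sample > 0.0f) {
                s->esr += gain * (sample - s->esr);
            }
        }
    }
    s->v_prev = v_in;
    s->i_prev = i_batt;
    s->has_prev = 1;
    return v_in + i_batt * s->esr;
}
```

The compensated voltage could then be fed to the same filter and voltage-to-SOC mapping as before, so that sag from load current has less effect on the displayed percentage.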

I'm closing this one for now, we can look into it again in the next beta.

vedderb / bldc

Stabilize battery level with resting voltage and Watt hour tracking #477
