New code on ecm_redefine is too complex

victorventuri commented 3 months ago

Class to hold on to HealthVariable objects has is too complex. How can we rewrite it such that the following requirements are met

[ ] It contains an unspecified number of HealthVariable objects, determined by the end application (such as multiple RC components)
[ ] It knows which HealthVariable objects can be updatable
[ ] It can update specified objects from arrays, similarly to what HealthVariable does
[ ] For a hypothetical end user, it can be treated as a HealthVariable object, seeing as these are conceptually equivalent

The questions that I have are:

Can we simply implement all of this functionality inside of HealthVariable, like on commit b1c53bc?
If not, can we have attributes be HealthVariables or lists/tuples of HealthVariables? This way, we could clean up a lot of the dynamic allocation code
(in the not-so-distant future, can we make sure we can have collections of collections of HealthVariables in order to implement efficient operations with numpy.ndarrays?)

Additional comments:

Part of the reason why I did it dynamically was the following hypothetical situation: imagine that, in the future, we still want to use the ECM, but with some physics included, say, we want to incorporate loss of lithium inventory and loss of active material. We could simple '+=' those HealthVariables into the ECMASOH without needing to rewrite another whole class to deal with this. However, perhaps it would be best if these were all completely separate classes as well.
A perhaps not good benefit is that, for example, the RC components can be stored in a HVCollection composed of individual RC components, each also being its own HVCollection composed of a Resistor and a Capacitor

@WardLT What other things am I missing from this?

WardLT commented 3 months ago

Should we also have the ability to mark which variables are "updatable" vs which are not?

victorventuri commented 3 months ago

That's requirement 2 above, no?

WardLT commented 3 months ago

Yep🤦

victorventuri commented 3 months ago

Adding here a simple code snippet of how it currently works for reference.

Code (Note: ignoring warnings because I chose to warn when model fields are reset, but that adds clutter here):

from asoh.models.base import HealthVariableCollection as HVCol
from asoh.models.base import HealthVariable as HV
import warnings
warnings.filterwarnings('ignore')

class DumHV(HV):
    name: str

class DumHVCol(HVCol):
    name: str

r0 = DumHV(base_values=[1,2], name='r0')
c0 = DumHV(base_values=3, name='c0', updatable=False)
r1 = DumHV(base_values=[4,5], name='r1')
c1 = DumHV(base_values=6, name='c1')
r2 = DumHV(base_values=[7,8], name='r2')
c2 = DumHV(base_values=[9,10], name='c2', updatable=False)
rc1 = DumHVCol(name='rc1')
rc1 += [r1, c1]
rc2 = DumHVCol(name='rc2')
rc2 += [r2, c2]
asoh = DumHVCol(name='asoh')
asoh += [r0, c0, rc1, rc2]
print('Updatable: ', asoh.updatable)
print('ASOH len: ', len(asoh))
print('Updatable values: ', asoh.get_updatable_parameter_values())
print('Changing values!')
asoh.update(new_values=[10,20,30,40,50,60,70])
print('New values: ', asoh.get_updatable_parameter_values())
for param_name in asoh.updatable:
    print('-----------------')
    print('Parameter ', param_name)
    param = getattr(asoh, param_name)
    if isinstance(param, DumHVCol):
        print('Sub-parameters: ', param.updatable)
    print('Updatable vals: ', param.get_updatable_parameter_values())
print('---------------------------------')
print('All accessible fields: ', list(asoh.model_fields.keys()))

Output:

Updatable:  ('r0', 'rc1', 'rc2')
ASOH len:  7
Updatable values:  [1, 2, 4, 5, 6.0, 7, 8]
Changing values!
New values:  [10, 20, 30, 40, 50, 60, 70]
-----------------
Parameter  r0
Updatable vals:  [10, 20]
-----------------
Parameter  rc1
Sub-parameters:  ('r1', 'c1')
Updatable vals:  [30, 40, 50]
-----------------
Parameter  rc2
Sub-parameters:  ('r2',)
Updatable vals:  [60, 70]
---------------------------------
All accessible fields:  ['updatable', 'name', 'r0', 'c0', 'rc1', 'rc2']

victorventuri commented 3 months ago

Going to try to summarize my thoughts here for a more streamline discussion over Teams: It seems like, no matter what we do, we cannot escape the dynamic addition of more information for the HealthVariableCollection class. After all, we need to accommodate any natural number of RC components. If the need for dynamic allocation is not real, then we need to consider more than just these two options I am about to discuss below. Note: I will abbreviate HealthVariable to HV below

Option 1: employ python's inherent dynamic allocation capabilities in Tuples and List ('+=') for the task

Pros:

we can define field names pre-instantiation and enforce them
no need to build our own dynamic allocation
peace-of-mind of knowing that no matter what, it will work

Cons:
most of the inheritance from HealthVariable will need to be re-written (__len__, get_updatable_parameter_values, _update_single_param, update), to the point that maybe it shouldn't even inherit from it, or we should change how HV is defined.
if in the future we decide to add more metrics to a collection (such as incorporating physics-informed loss of lithium inventory, loss of active material, electrode potential degradation, etc.), we will need to create a brand new class for that.

My thoughts:
- the Pros are very strong
- in my simple mind, a collection of HVs is conceptually indistinguishable from a multivariate HV (similar to how HV has base_values, the concept of resistance is conceptually equivalent to the concept of resistance at 50% SOC). This makes it hard for me to overlook Con#1

Option 2: trim down clutter in current implementation

Pros:

most of the functionality already implemented
conceptually more sound: an HVCollection is a collection of HV and HVCollection objects
more easily adaptable to other models
can (nearly) be treated as a sequence of HV-like objects

Cons:
no control over field names, have to hope for the best out of the user (==me), who more often than not, has a knack for dumbfounding the programmer with overcomplicated "solutions" (xkcd 530)
will have to properly test and maintain our dynamic allocation capabilities
will have to de-clutter the code

My thoughts:
- conceptually, this is pleasing to me, but, as you can tell, the code feels like forcing a square peg through a round hole (insert another xkcd comic relief here)
- I believe that, as long as we provide good templates for what these collections should look like, we can mostly wash our hands of Con#1

Final remarks:

I will start this by saying I am positively biased towards Option 2
I believe the code can be decluttered in the following manner: change the GenericContainer to have some of the basic functionality, such as update, get_updatable_parameter_values, __len__, and force others (such as _update_single_param) to be abstractmethods, and make most things that will be updated a child of this
If we want to make dynamic allocation easier, we can, instead of adding new fields to pydantic's BaseModel children, use the extra option, and then update the appropriate methods and properties to check model_fields.keys() and model_extra.keys(), though I am not sure what impact that would have on subsequent serialization
Consider adding/overwriting __setitem__ and __getitem__ methods for looping through and setting attributes

ROVI-org / auto-soh