pybamm-team / PyBaMM

Fast and flexible physics-based battery models in Python
https://www.pybamm.org/
BSD 3-Clause "New" or "Revised" License
1.07k stars 529 forks source link

[Roadmap area] Model refactoring #3908

Open brosaplanella opened 6 months ago

brosaplanella commented 6 months ago

We don't have as many contributors adding models to PyBaMM as we'd like. There are a few reasons for this

We propose "model entry points" as a solution to this. Using entry points, contributors can create models in their own repositories and make them easily available to the rest of the community, without the models having to be added to PyBaMM itself. This will enable a community for battery models that use the PyBaMM infrastructure but are not owned by the PyBaMM team.

Advantages

How to start

Later changes

After this is done, we will be in a much better position to consider fully separating the model code and PDE code

Originally posted by @tinosulzer in https://github.com/pybamm-team/PyBaMM/issues/3839#issuecomment-1966614301

valentinsulzer commented 3 months ago

Some additional thoughts on refactoring submodels. Perhaps deserves its own issue (or three) but relevant to the roadmap.

Although in general we should move towards encouraging more standalone models, the composability of submodels is still useful. The things that make the submodel structure complex are:

  1. The relationship between options strings and submodels is opaque
  2. Submodels are nested (e.g. BaseParticle -> FickianDiffusion)
  3. Separation into get_fundamental_variables, get_coupled_variables, set_rhs, etc

Each of these can be addressed and simplified, although in all cases it's a big undertaking.

  1. Get rid of options and instead only allow creating models by passing in a dictionary of models. This is heavily breaking and may not be worth it. The options handle quite a few special cases and putting that onus on the user could be quite annoying.
  2. The role of nested submodels is to add a bunch of extra variables, but this could be done with helper functions instead to make it less confusing - it doesn't need to be object oriented. This can be done in a non-breaking way for the front end.
  3. This is the hard one. It would be great to be able to create a model in a single function instead, but the challenge is that there are coupled dependencies. However, we could get around this as follows: (i) define the equations as if all the variables exist (ii) assign any variables that don't exist yet to a symbolic CoupledVariable (iii) at the end, go through and replace any CoupledVariable objects with the corresponding variable with that name. If there are any circular dependencies, raise an error. Here's a minimum working example of this:
    
    import pybamm

class CoupledVariable(pybamm.Variable): def init(self, *args, *kwargs): super().init(args, **kwargs)

class Variables(dict): def getitem(self, key):

if a variable is not found, return a CoupledVariable

    if key not in self:
        return CoupledVariable(key)
    return super().__getitem__(key)

class SubmodelA: def build(self, variables): a = pybamm.Variable("a") b = variables["b"] self.rhs = {a: -b} self.initial_conditions = {a: 1} self.variables = {"a": a}

class SubmodelB: def build(self, variables): a = variables["a"] b = pybamm.Variable("b") self.rhs = {b: -a} self.initial_conditions = {b: 1} self.variables = {"b": b}

model = pybamm.BaseModel() variables = Variables()

submodel_a = SubmodelA() submodel_a.build(variables) variables.update(submodel_a.variables) model.rhs.update(submodel_a.rhs) model.initial_conditions.update(submodel_a.initial_conditions) submodel_b = SubmodelB() submodel_b.build(variables) variables.update(submodel_b.variables) model.rhs.update(submodel_b.rhs) model.initial_conditions.update(submodel_b.initial_conditions) model.variables = variables

def replace_coupled_variables(d, variables): items = list(d.items()) for key, value in items: d[key] = process_symbol(value, variables) return d

def process_symbol(symbol, variables):

replace CoupledVariable with the actual variable from the dictionary

if isinstance(symbol, CoupledVariable):
    return variables[symbol.name]
elif isinstance(symbol, pybamm.UnaryOperator):
    new_child = process_symbol(symbol.child, variables)
    return symbol._unary_new_copy(new_child)
else:
    return symbol

replace_coupled_variables(model.rhs, variables) replace_coupled_variables(model.initial_conditions, variables)

sim = pybamm.Simulation(model) sim.solve([0, 10]) sim.plot(["a", "b"])


A side benefit of this is that it might make it possible to print the equations of a standalone submodel, which until now has not been possible.
rtimms commented 3 months ago
  1. I think we should keep options to help perform these sanity checks. Then composing models from submodels can be "user-beware!"
  2. I think one level of nesting is ok (e.g. base model takes the variables you solve for and then makes all the different averages, surface values etc.) but more than that gets hard to follow.
  3. This would be great and the most impactful change.
valentinsulzer commented 3 months ago

Ok, let's focus on 3 first then

brosaplanella commented 3 months ago

My thoughts:

  1. I agree, but I think it probably makes sense to keep options for fairly standard models, i.e. let's restrict to SPM, SPMe and DFN with a few options (e.g thermal, SEI...) but anything more complex let's use the dictionary. Probably we would need functionality to read from/print to dictionary (the latter would be useful for reproducibility).
  2. Not sure I follow the point completely, but I think what Rob said is very reasonable.
  3. Agree, that would be very useful. Would we then merge all the set_rhs, set_initial_conditions, etc. into the same function or would this only replace get_fundamental_variables and get_coupled_variables?
rtimms commented 3 months ago

I wonder if 1 is simplified by having more models but fewer options. E.g. should particle size distributions models be separate models (MPM, MPMe, MPDFN) rather than having a particle size option? Not sure where the distinction is between a new model and an option, or if this is more confusing than what we currently have.

It would make documenting easier as we can more easily say “this is the model” rather than “the model is this for option A and this for option B”.

brosaplanella commented 2 months ago

According to the docs we have 32 different options we can set. Below there is a potential classification of the options to make sense of them. It might also be helpful to think which of these options would be useful in other chemistry models and which not.

Some general thoughts:

Classification of options

Options that should probably stay

Additional physics

Here I have included additional physics that are common enough that people would want to add on electrochemical models quite often.

Additional stuff we might want to calculate when solving

Does not change the core of the model but rather how we write/solve it.

Options that should probably go

Stuff that should be a separate model

Niche/expert options that should be better accessed as a submodel dictionary

Includes model-specific options (e.g. lead acid, MSMR) which could potentially be set as model specific options (i.e. not defined at BaseBatteryModel).

Options that could be set through the parameter values

Unclear to me

Hysteresis

There are hysteresis options that not sure where they would fit better, also because for OCP it includes MSMR too.

(Sub)model specific options

These are clearly options but that only apply to certain (sub)models.

Other