cb-geo / mpm

CB-Geo High-Performance Material Point Method
https://www.cb-geo.com/research/mpm
Other
243 stars 83 forks source link

Refactor particle to handle different types #637

Closed kks32 closed 4 years ago

kks32 commented 4 years ago

Summary

Refactor Particle data to have the generic functions with the ability to fetch and modify generic scalar and vector properties, map, and interpolate these properties to nodes. Add a separate implementation functions to operate on these properties of the particle. These functions could be grouped under the namespace of solid particle functions, fluid-particle function, and mixed particle functions. The particle agnostic design of functions would allow for mix-and-matching different functions in the Solver class, without having to deal with multiple-inheritance or repetition.

Motivation

Refactoring and separating the Particle data from the implementation of algorithm enables handling different particle data types, without having to have a lot of virtual functions in the ParticleBase and other subsequent derived particles. Addresses the comments raised in RFCs #633 and #634. For example, deriving FluidParticle from Particle would result in extra data needed for FluidParticle, which is not needed in the case of Particle conversely, FluidParticle won't need certain Particle data. However, the important consideration is that the functions of a FluidParticles have to be virtualized in the abstract ParticleBase and every time a new class is derived, more abstract functions have to be added to the base and other existing derived classes. These functions, which are not useful, must, therefore, throw an error. The Abstract Factory design with inconsistent derived functions is inherently prone to errors when a developer fails to add functions to all the derived classes. This RFC proposes an alternate way of deriving particle data and keeping the implementation separate and reusable. Separation of implementation allows for the composition of different functions in the Solver class.

Design Detail

Particle class has a generic ordered map of scalar and vector properties. An open addressing scheme will be used to store the map properties, that the data stored in here will remain continuous in memory. These properties can be accessed using scalar_property and vector_property. Particle types will still control the initialization of these different properties, i.e., derived classes will be used to create new types of Particles, but all these classes will have a consistent functional interface. This avoids the problem of too many virtual functions. The ParticleBase will be altered to have the following. A consistent interface allows transferring particles across MPI ranks.

// Scalar properties
tsl::ordered_map<std::string, double> scalar_properties_;
tsl::ordered_map<std::string, Eigen::Matrix<double, Tdim, 1>> vector_properties_;

Scalar and vector properties can be set and returned using generic functions.

void assign_property(const std::string& property, double value);
double property(const std::string& property) const;

Nodes will also have a similar property pool structure, which will allow us to write generic map and interpolation functions.

//! Map scalar properties to nodes
template <unsigned Tdim>
void mpm::Particle<Tdim>::map_scalar_properties_to_nodes(const std::string& property) noexcept {
  auto value = scalar_properties_.at("property");
  // Check if scalar property is not max
  assert(value != std::numeric_limits<double>::max());

  // Map mass and momentum to nodal property taking into account the material id
  for (unsigned i = 0; i < nodes_.size(); ++i) {
    value = value * shapefn_[i];
    nodes_[i]->update_property(true, property, value);
  }
}

This property pool structure provides a generic interface to assign and get properties. We can now define a collection of functions under a namespace, such as particle::fluid to define functions that operate on fluid properties. The initialization of the property pool for the Particle will be done by the derived Particle class. Each particle type implementation will have access to the property pool data.

  // Assign projection parameter
  template <unsigned Tdim>
  void mpm::particle::fluid::projection_parameter(std::shared_ptr<mpm::ParticleBase<Tdim>> particle, double value) {
    particle->assign_scalar_property("projection", value);
  } 

In the custom solver for the fluid particle, we'll have:

// Iterate over all the particles and assign a projection parameter
double value = 2.4;
mesh->iterate_over_particles(
       [&value](std::shared_ptr<mpm::ParticleBase<Dim>> ptr) {
          return mpm::particle::fluid::projection_parameter<Dim>(ptr, value);
 });

To operate on a specific set of particle types, we could use iterate_over_particles_with_predicate functionality that can filter particles with a particular type.

With this approach, a solver class can build on different types of Particle data and functions pooled from multiple namespaces.

The proposal is to start moving new Particle types to use this design, and the existing solid particle will be refactored in phases.

Drawbacks

The implementation operating on the derived data types is separated from the actual data, which is good in some ways, but also it might be seen as breaking encapsulation. Considering the fact that if something owns a Particle type, then it should be able to modify the data in the Particle. This separation of data and implementation may give some developers a false feeling that their implementation could work on all Particle types, however, as soon as the code is run or compiled with debug mode, it will be known that the data type is unsupported and cannot be used. The phase refactoring of Solid Particles will break a lot of existing code, however, it will minimize the number of update_nodes and interpolate_from_nodes design.

Rationale and Alternatives

Advantages

Other designs considered

What is the impact of not doing this?

Prior Art

Most large codes avoid a lot of derived classes and Abstract class design with inconsistent interfaces across derived classes.

Dealii uses an extreme case of all properties being double and provide particle handle. Dealii Particle

Uintah also uses a generic datawarehouse to store properties.

No Paradigm programming

Free your functions

Unresolved questions

What parts of the design do you expect to resolve through the RFC process before this gets merged?

New Particle types will follow this new design, while the existing solid particle has to be slowly refactored.

What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?

The refactoring of solver class to make it more composable will be discussed in a separate RFC.

Unresolved questions

How to handle MPI data transfer with an arbitrary number of particle data types? Since the ParticleBase will be derived for each type of Particle, we will have enough information to transfer particles. Suggest creating serialization and deserialization schemes. Additionally, since the number of active Particle types are going to be a handful at most, we can iterate through different types when doing MPI halo exchange.

Still undecided if we need to derive a new Particle for each type, so the advantage of this method is that the derived Particle type knows its variables and is easy for instantiation. However, it creates additional vtables. On the other hand, if we move the variable instantiation outside of the Particle, we don't have to derive new types, but initialization and MPI transfer may become harder?

Changelog

[Edit #1]: Update iterate_over_mesh function and formatting. [Edit #2]: Add std::variant as alternatives and prior art [Edit #3]: Unresolved question on deriving new Particle types [Edit #4]: Discussion on superfunctions as an alternative

ezrayst commented 4 years ago

Thanks @kks32. I think the plan on doing that for two-phase first is good, and then refactor the current single-phase MPM slowly. You can assign different tasks to us actually, and I think it would be a good exercise to have someone new like Joel to be part of this refactoring.

I do have a few comments/questions on the design:

I will drop comments as they come. I will be thinking about this too.

kks32 commented 4 years ago
* Previously, we need to have a new `Particle ` derived class every time we have a new solver. I guess with this, we will construct many solvers by pulling different functions from `particle::solid`, `particle::fluid` and (one day) `particle::gas`. Is this the spirit of the new design?

Still undecided if we need to derive a new Particle for each type, so the advantage of this method is that the derived Particle type knows its variables and is easy for instantiation. However, it creates additional vtables. On the other hand, if we move the variable instantiation outside of the Particle, we don't have to derive new types, but initialization and MPI transfer may become harder?

* What about state variables? Would we store them within the `double` ordered map?

Same way as we store an unordered_map right now, it's a change in data structure, won't affect the design as a whole.

tianchiTJ commented 4 years ago

I'm not sure if I have understood correctly of all the design. But if we define the functions for different phases in different namespace of Particle class, then it means there will be no new deriving Particle class from ParticleBase, what's the meaning of the current Particle class? Why don't we define all things in ParticleBase straightforward?

kks32 commented 4 years ago

@tianchiTJ yes, that's correct. I'm still undecided if we need to derive datatypes or we could initialize and transfer Particles without having derived classes, this would require knowing the size. If we can achieve serialization then there is no need for deriving particles, and we can move everything to Particle class and remove the base class altogther.

tianchiTJ commented 4 years ago

Shall we keep the current ParticleBase-Particle-TwoPhaseParticle(or other deriving type) structure, and only put the variables storage, initialization functions and some other necessary functions (like the compute_mass, compute_update_position), then move all other specific function into a new class, like ParticleFunction, then create different namespace into the ParticleFunction for different type particles, like SolidParticleFunction, FluidParticleFunction.

kks32 commented 4 years ago

Hi @tianchiTJ what would be the reasoning behind deriving the types, because the only difference would be the instantiation functions. We can't really have fixed data types in the derived class and all the variables would be in scalar_properties_ or vector_properties_

tianchiTJ commented 4 years ago

I just think not change the current structure too much, but you are right, maybe it's not necessary to use the deriving types. Anyway, could we decide it as soon as possible? Or we can not work on the implementation of some important features. @kks32 @bodhinandach

kks32 commented 4 years ago

I'd like to go ahead with this design. Let's put to a vote, if you are happy add a :+1: to this comment, if not :-1: .

bodhinandach commented 4 years ago

Thanks, @kks32 for the RFC. I some comments (or questions) for the proposed RFC:

  1. Will stresses and strains will be stored in vector_properties_too? The type is tsl::ordered_map<std::string, Eigen::Matrix<double, Tdim, 1>> so it wont fit.
  2. The way you update the property in the node: nodes_[i]->update_property(true, property, value) means that we will not use phase id in node anymore?
  3. I see that this proposal is generally good to treat mapping, assignment, and get functions of different fluid, solid, and mixture properties. How about other functions only useful for fluid/two-phase solvers? For instance, in two-phase in we might have to update intermediate liquid velocity with function: bool compute_updated_liquid_velocity(double dt); How to treat such function?
  4. I am interested to know what do you think about the particle transfer:

    How to handle MPI data transfer with an arbitrary number of particle data types? Since the ParticleBase will be derived for each type of Particle, we will have enough information to transfer particles. Suggest creating serialization and deserialization schemes. Additionally, since the number of active Particle types are going to be a handful at most, we can iterate through different types when doing MPI halo exchange.

Any further thoughts on this?

  1. Can we have similar mapping functions from particle to cell for elemental matrices? We need that.

I think I agree with @tianchiTJ that we need to minimize the changes as much as possible. To my knowledge, the current RFC only fixes the function derivations for common processes in particles. But for special functions, I am still not sure how this will help.

kks32 commented 4 years ago

Will stresses and strains will be stored in vector_properties_too? The type is tsl::ordered_map<std::string, Eigen::Matrix<double, Tdim, 1>> so it wont fit.

No, stresses and strains will be there as usual. Nothing is changing in the base particle class, we'll slowly refactor only scalar and vector properties.

The way you update the property in the node: nodes_[i]->update_property(true, property, value) means that we will not use phase id in node anymore?

Thanks, we'll pass an additional argument of phase to update_property.

I see that this proposal is generally good to treat mapping, assignment, and get functions of different fluid, solid, and mixture properties. How about other functions only useful for fluid/two-phase solvers? For instance, in two-phase in we might have to update intermediate liquid velocity with function: bool compute_updated_liquid_velocity(double dt); How to treat such function?

I can't see why this will be any different from assigning the projection parameter example shown in the RFC. We define a function that accepts a particle pointer and can set and get velocities.

I am interested to know what do you think about the particle transfer: How to handle MPI data transfer with an arbitrary number of particle data types? Since the ParticleBase will be derived for each type of Particle, we will have enough information to transfer particles. Suggest creating serialization and deserialization schemes. Additionally, since the number of active Particle types is going to be a handful at most, we can iterate through different types when doing MPI halo exchange. Any further thoughts on this?

At the moment the idea is to serialize the object data and transfer each particle type separately.

Can we have similar mapping functions from particle to cell for elemental matrices? We need that.

The particle class has a pointer to the cell, so I don't see why not. If we need to store element matrices, it might be better to do something similar to the contact algorithm with a common nodal properties pool.

I think I agree with @tianchiTJ that we need to minimize the changes as much as possible. To my knowledge, the current RFC only fixes the function derivations for common processes in particles. But for special functions, I am still not sure how this will help.

Particle will be a data container, so any function can be defined that updates the particle properties. So no function will be special. There will be no derived classes.

bodhinandach commented 4 years ago

@kks32 @tianchiTJ These are the functions collected from Particle, FluidParticle, and TwoPhaseParticle. I made the categories, but please modify and add or remove it accordingly.

@kks32 some functions listed here are not available in the current develop branch, so if you can't find it, please ignore them as Tianchi and I will organize them according to our PR.

@tianchiTJ Can you check and list things that are not available here?

Some comments:

  1. Many functions are derived or override from the functions listed in General Functions, mostly in TwoPhase particles. The lists below only contain distinct functions.
  2. Please ignore the pure and non-pure virtual functions indication, I just copied them from the header files.

General Functions:

Common Properties (Geometric, Kinematics & Material):

Mapping Functions:

Particle Constraints & BC

Fluid:

Mapping Functions (to cell):

TwoPhase:

Mapping Functions (to cell):

Mapping Functions (to node):

kks32 commented 4 years ago

@bodhinandach Thanks for grouping functions based on type. However, we want to group them by the order of execution in the solver, so we may be able to write bigger functions, which will be generic for all derived types. The exercise was to try to see if we could actually group these functions in an order they appear in the solver, so we don't end up having non-pure virtual functions or specific derived classes having extra public functions. Looking at MPMExplicit and Navier-Stokes solver I can't see a potential set of global particle functions than can then call local private functions in the derived type.

Additional context

@kks32 would like to avoid derived classes behaving like a superset to the base class or have virtual functions that are not defined in all the derived classes.

@bodhinandach suggested modifying this list of functions in different particle types into a superset of functions, which are available in all particle types, such as initialize, compute_stress.

This discussion and the listing of functions is to check if such a grouping of particle functions is feasible, leaving no virtual functions.

The benefit is potentially avoiding virtual functions and has safe overriding. Even if we assume this approach is feasible, hiding away all the functions as private would mean a smaller exposed interface, which will not be easy to extend and add functionality. It is also difficult to write tests because all these functions will be private. Testing would then require calling friend functions, which is not a good design.

kks32 commented 4 years ago

After some more consideration, I am wondering about the possibility to derive particle types, but only keep the common particle functions within the class, and merge all independent/specialized functions to be outside the class. This will help to initialize the classes with the right types, and also in MPI transfer, as the type is known. We derived data of individual Particle types but keep specialized particle functions separately. Maybe the particle functions may have to stay in the same namespace.

kks32 commented 4 years ago

This design was 30% slower than private variables so closing this RFC. See #654 for more details