cb-geo / mpm

CB-Geo High-Performance Material Point Method
https://www.cb-geo.com/research/mpm
Other
245 stars, 82 forks

[Refactor] Particle variables and functions handling #654

Closed kks32 closed 4 years ago

kks32 commented 4 years ago

Describe the PR

This is an alternative implementation of the particle refactor.

After some more consideration, I am wondering whether we could derive particle types but keep only the common particle functions within the class, and move all independent/specialized functions outside the class. This would help to initialize the classes with the right types, and would also help in MPI transfer, as the type is known. In this PR we derive the data of the individual Particle types but keep the specialized particle functions separate. The particle functions may have to stay in the same namespace.
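The layout described above could look roughly like the following sketch. All names here (`mpm_sketch`, `FluidParticle`, `kinetic_energy`, `scale_pressure`) are hypothetical and are not taken from the PR; they only illustrate derived types that own their data while specialized behaviour lives in free functions.

```cpp
#include <array>
#include <cassert>

namespace mpm_sketch {  // hypothetical namespace, not the real mpm one

// Base particle: data and functions common to every particle type.
template <unsigned Tdim>
class ParticleBase {
 public:
  explicit ParticleBase(double mass) : mass_{mass} {}
  virtual ~ParticleBase() = default;
  double mass() const { return mass_; }
  const std::array<double, Tdim>& velocity() const { return velocity_; }
  void assign_velocity(const std::array<double, Tdim>& v) { velocity_ = v; }

 private:
  double mass_{0.};
  std::array<double, Tdim> velocity_{};
};

// Derived type: adds only the data a fluid particle needs.
template <unsigned Tdim>
class FluidParticle : public ParticleBase<Tdim> {
 public:
  FluidParticle(double mass, double pressure)
      : ParticleBase<Tdim>(mass), pressure_{pressure} {}
  double pressure() const { return pressure_; }
  void assign_pressure(double p) { pressure_ = p; }

 private:
  double pressure_{0.};
};

// Free function that works on any particle type through the base interface.
template <unsigned Tdim>
double kinetic_energy(const ParticleBase<Tdim>& p) {
  double v2 = 0.;
  for (double vi : p.velocity()) v2 += vi * vi;
  return 0.5 * p.mass() * v2;
}

// Free function that only accepts the derived type it is specialized for.
template <unsigned Tdim>
void scale_pressure(FluidParticle<Tdim>& p, double factor) {
  p.assign_pressure(p.pressure() * factor);
}

// Small usage example; unqualified calls would also resolve from outside
// the namespace through argument-dependent lookup.
inline void usage() {
  FluidParticle<2> p(2.0, 10.0);
  p.assign_velocity({3.0, 4.0});
  assert(kinetic_energy(p) == 25.0);
  scale_pressure(p, 0.5);
  assert(p.pressure() == 5.0);
}

}  // namespace mpm_sketch
```

Keeping the free functions in the same namespace as the particle classes lets callers write `kinetic_energy(p)` without qualification, because argument-dependent lookup searches the namespace of the argument's class.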

Related Issues/PRs

#637 and #650

Additional context

Specialized functions will still be separate.

Following are the run-times for 3D hydrostatic column 10k steps on Stampede2

| Branch | TACC runtime (s) |
| --- | --- |
| develop | 742.6 |
| prop map + free fn | 1280 |
| vector + free fn map | 865 |
| flatmap + free fn | 972 |

codecov[bot] commented 4 years ago

Codecov Report

Merging #654 into develop will increase coverage by 0.00%. The diff coverage is 97.45%.

@@            Coverage Diff            @@
##           develop     #654    +/-   ##
=========================================
  Coverage    96.66%   96.66%            
=========================================
  Files          123      124     +1     
  Lines        25375    25641   +266     
=========================================
+ Hits         24527    24785   +258     
- Misses         848      856     +8     

| Impacted Files | Coverage Δ |
| --- | --- |
| include/node_base.h | `100.00% <ø> (ø)` |
| include/particles/particle_base.h | `100.00% <ø> (ø)` |
| include/solvers/mpm_base.tcc | `73.21% <50.00%> (-0.48%)` :arrow_down: |
| include/particles/particle.tcc | `93.24% <95.56%> (-0.89%)` :arrow_down: |
| include/node.tcc | `95.12% <96.18%> (-0.86%)` :arrow_down: |
| include/mesh.tcc | `83.94% <100.00%> (+0.04%)` :arrow_up: |
| include/node.h | `100.00% <100.00%> (ø)` |
| include/particles/particle.h | `100.00% <100.00%> (ø)` |
| include/particles/particle_base.tcc | `100.00% <100.00%> (ø)` |
| include/particles/particle_functions.tcc | `100.00% <100.00%> (ø)` |

... and 8 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 3b20da5...a0bbd15.

bodhinandach commented 4 years ago

Dear all, I think this refactoring branch is ready for review. Could you all please take some time to look at it? We can certainly discuss it in the next developer meeting. Here are some notes to ease the checking process:

As this is a breaking change for all ongoing development, could you please test it by running some of your simulations and let us know if you encounter any issues? Many thanks in advance for the help. Any comments and questions are appreciated.

tianchiTJ commented 4 years ago

Why is `initialise_particle(const HDF5Particle& particle, const std::shared_ptr<Material<Tdim>>& material)` defined as a virtual function in particle.h?

bodhinandach commented 4 years ago

> Why is `initialise_particle(const HDF5Particle& particle, const std::shared_ptr<Material<Tdim>>& material)` defined as a virtual function in particle.h?

Good point. I will remove it, we should either use virtual or override, but not both in one go.
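The `virtual` versus `override` point can be illustrated with a minimal sketch. The class names and the zero-argument signature are hypothetical simplifications; the real `initialise_particle` takes an `HDF5Particle` and a material pointer, as quoted above.

```cpp
#include <cassert>
#include <memory>
#include <string>

namespace override_sketch {

struct ParticleBaseSketch {
  virtual ~ParticleBaseSketch() = default;
  // `virtual` introduces the function in the base class.
  virtual std::string initialise_particle() { return "base"; }
};

struct ParticleSketch : ParticleBaseSketch {
  // `override` alone is enough: the function is implicitly virtual, and the
  // compiler rejects the declaration if no matching base function exists.
  // Writing `virtual ... override` is legal C++ but redundant.
  std::string initialise_particle() override { return "derived"; }
};

// Calls through a base pointer to show that dynamic dispatch still happens.
inline std::string initialise_via_base() {
  std::unique_ptr<ParticleBaseSketch> p = std::make_unique<ParticleSketch>();
  return p->initialise_particle();
}

}  // namespace override_sketch
```

The practical benefit of `override` is the compile-time check: if the base signature changes, the derived declaration stops compiling instead of silently becoming a new, unrelated function.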

tianchiTJ commented 4 years ago

> > Why is `initialise_particle(const HDF5Particle& particle, const std::shared_ptr<Material<Tdim>>& material)` defined as a virtual function in particle.h?
>
> Good point. I will remove it, we should either use virtual or override, but not both in one go.

This one also exists in the develop branch

bodhinandach commented 4 years ago

@tianchiTJ the most recent commit should fix this, thanks

jgiven100 commented 4 years ago

I'm getting matching outputs running a PEA simulation on develop and refactor/particles 👍

ezrayst commented 4 years ago

Anyways, how is the performance of using ordered map on different scalar and vector properties? Any comparison done yet?

kks32 commented 4 years ago

The 3D hydrostatic column benchmarks show this branch is about 1.5 - 2x slower.

develop: 615931 ms
refactor-particles: 980017 ms

TACC develop: 742655 ms
TACC refactor-particles: 1280757 ms

kks32 commented 4 years ago

> Anyways, how is the performance of using ordered map on different scalar and vector properties? Any comparison done yet?

We have to look at that now separately to see if that gives us any benefit.

bodhinandach commented 4 years ago

@kks32 I created a partial-revert branch, refactor/particles_2, which moves the functions in particle_functions.tcc back into the particle class. I just want to see whether the slowdown is caused by restructuring the properties or by moving the functions out to a separate file and passing a pointer argument. I am re-running the same 3D hydrostatic benchmark locally.

kks32 commented 4 years ago

Looks like the map function is the reason

| Branch | Local runtime (s) | TACC runtime (s) |
| --- | --- | --- |
| develop | 615.9 | 742.6 |
| particle (prop map + free fn) | 980 | 1280 |
| particle2 (only prop map) | 872.9 | 1250 |

kks32 commented 4 years ago

(image)

Definitely the map lookup is the reason for the slowness.
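To make the cost difference concrete, here is a small sketch that contrasts a private member variable with a property map. It assumes the properties are held in a `std::map` keyed by an enum; the names `Scalar`, `ParticleMembers`, and `ParticleMap` are hypothetical, and the actual containers and keys in the branch may differ.

```cpp
#include <cassert>
#include <map>

namespace map_sketch {

// Hypothetical property keys.
enum class Scalar { Mass, Volume, MassDensity };

// Private-variable style: each access is a load from a fixed offset.
class ParticleMembers {
 public:
  double mass() const { return mass_; }
  void assign_mass(double m) { mass_ = m; }

 private:
  double mass_{0.};
};

// Property-map style: one generic accessor, but every access walks the
// map's tree (O(log n) comparisons plus pointer chasing).
class ParticleMap {
 public:
  double scalar_property(Scalar key) const { return scalars_.at(key); }
  void update_scalar_property(Scalar key, double value) {
    scalars_[key] = value;
  }

 private:
  std::map<Scalar, double> scalars_;
};

// Both styles return the same value; only the access cost differs.
inline void usage() {
  ParticleMembers a;
  a.assign_mass(2.5);
  ParticleMap b;
  b.update_scalar_property(Scalar::Mass, 2.5);
  assert(a.mass() == b.scalar_property(Scalar::Mass));
}

}  // namespace map_sketch
```

A single lookup is cheap, but a solver reads such properties for every particle at every step, so the per-access overhead accumulates over a 10k-step run.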

bodhinandach commented 4 years ago

@kks32 Thanks for checking! I was worried about this before. Any thoughts on changing the data structure?

kks32 commented 4 years ago

I'm trying out a few different data structures, I'll let you know once I have the results.

kks32 commented 4 years ago

I tried a few different structures:

| Branch | TACC runtime (s) |
| --- | --- |
| develop | 742.6 |
| prop map + free fn | 1280 |
| only prop map | 1250 |
| only robin map | 1012 |
| only vector | 850 |

The vector implementation seems close to develop. Let me try the vector implementation with free functions.

kks32 commented 4 years ago

> I tried a few different structures:
>
> | Branch | TACC runtime (s) |
> | --- | --- |
> | develop | 742.6 |
> | prop map + free fn | 1280 |
> | only prop map | 1250 |
> | only robin map | 1012 |
> | only vector | 850 |
>
> The vector implementation seems close to develop. Let me try vector implementation with free functions

Vector with free functions is 865s.

bodhinandach commented 4 years ago

Thanks @kks32, that makes it 1.16 times slower than the original. Are you checking the OpenMP/MPI performance as well? As the current refactor is created purposely for fluid and two-phase implementation, do you want me to check the vector one with the navier-stokes branch too?
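For reference, the factor quoted above follows from the TACC runtimes reported earlier in the thread (865 s for vector with free functions against 742.6 s for develop):

```latex
\frac{865}{742.6} \approx 1.165
```

That is a slowdown of roughly 16% relative to develop.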

kks32 commented 4 years ago

Vector implementation profiling

profile

kks32 commented 4 years ago

The problem with the vector, as you said, is that we may have to fill every entry; otherwise we have an access issue.
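The access issue can be sketched as follows, assuming the vector is indexed by an enum of property slots. The names `ScalarIndex`, `Particle`, and `access_fails` are hypothetical and not taken from the branch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

namespace vector_sketch {

// Hypothetical property indices; Count must stay last.
enum ScalarIndex : std::size_t { Mass = 0, Volume, Pressure, Count };

class Particle {
 public:
  // Fill every slot up front, even for properties this particle never uses.
  Particle() : scalars_(Count, 0.) {}
  // Allocate only the first n slots, to demonstrate the failure mode.
  explicit Particle(std::size_t n) : scalars_(n, 0.) {}

  // at() throws std::out_of_range for an unfilled slot; plain operator[]
  // would be undefined behaviour instead.
  double scalar_property(ScalarIndex i) const { return scalars_.at(i); }
  void update_scalar_property(ScalarIndex i, double v) { scalars_.at(i) = v; }

 private:
  std::vector<double> scalars_;
};

// Returns true if reading property i from p throws.
inline bool access_fails(const Particle& p, ScalarIndex i) {
  try {
    p.scalar_property(i);
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}

}  // namespace vector_sketch
```

Index access is constant time, which fits the runtimes being close to develop, but every particle then carries storage for all properties, including ones that only some particle types need.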

kks32 commented 4 years ago

| Branch | TACC runtime (s) |
| --- | --- |
| develop | 742.6 |
| prop map + free fn | 1280 |
| vector + free fn map | 865 |
| AssocVec + free fn | 972 |

Profiling of develop: (image)

Profiling of robinhashmap: (image)

Profiling of vector: (image)

Profiling of AssocVector: (image)
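For readers unfamiliar with the AssocVector / flat-map idea, the following is a minimal sketch of the usual technique: a sorted `std::vector` of key-value pairs searched with binary search. It is an illustration under that assumption, not the container used in the benchmarks, and `FlatMap` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace flat_sketch {

template <typename Key, typename Value>
class FlatMap {
 public:
  using Entry = std::pair<Key, Value>;

  // Insert a new entry at its sorted position, or overwrite an existing one.
  void insert_or_assign(const Key& key, const Value& value) {
    auto it = std::lower_bound(data_.begin(), data_.end(), key, less_key);
    if (it != data_.end() && it->first == key)
      it->second = value;
    else
      data_.insert(it, Entry{key, value});
  }

  // Binary search over contiguous storage; returns nullptr if absent.
  const Value* find(const Key& key) const {
    auto it = std::lower_bound(data_.begin(), data_.end(), key, less_key);
    return (it != data_.end() && it->first == key) ? &it->second : nullptr;
  }

  std::size_t size() const { return data_.size(); }

 private:
  // Comparator for lower_bound: orders a stored entry against a bare key.
  static bool less_key(const Entry& entry, const Key& key) {
    return entry.first < key;
  }
  std::vector<Entry> data_;  // kept sorted by key at all times
};

inline void usage() {
  FlatMap<int, double> m;
  m.insert_or_assign(3, 30.0);
  m.insert_or_assign(1, 10.0);
  m.insert_or_assign(3, 33.0);  // overwrites the existing key
  assert(m.size() == 2);
  assert(*m.find(3) == 33.0);
  assert(m.find(2) == nullptr);
}

}  // namespace flat_sketch
```

Compared with a node-based map, lookups touch contiguous memory, and only the properties actually inserted take up space, which avoids the fill-everything requirement of the plain vector at the price of a search on each access.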

kks32 commented 4 years ago

After considering the speed-ups and different approaches, the use of scalar and vector properties still is 30% slower than using private variables. The advantage of scalar/vector properties is we'll have fewer functions and less repetition of code. On the other hand, defining private variables and deriving particle classes has more function definitions and repeating code, but is faster. I'd like to propose a vote, use emojis to vote between different options, please use comments below if you like to elaborate on your thoughts. @bodhinandach @tianchiTJ @thiagordonho @cw646 @ezrayst @jgiven100 @WeijianLIANG

  1. Use derived particle types with private variables (:rocket:)
  2. Solid particles will remain as is, while we add scalar and vector properties to derived particle types (:smile:) PR #673
  3. Cleaner but slower design with free functions and scalar/vector properties for all particle types (:heart:) PR #654
  4. Go back to the drawing board (:eyes:)

WeijianLiang commented 4 years ago

Thanks Krishna, I would prefer the use of scalar and vector properties, which has a clearer structure.

kks32 commented 4 years ago

@WeijianLIANG is that option 2 or 3? Could you please vote with a reaction emoji :smile: for option 2 and :heart: for option 3. Thanks!

yliang-sn commented 4 years ago

> After considering the speed-ups and different approaches, the use of scalar and vector properties still is 30% slower than using private variables. The advantage of scalar/vector properties is we'll have fewer functions and less repetition of code. On the other hand, defining private variables and deriving particle classes has more function definitions and repeating code, but is faster. I'd like to propose a vote, use emojis to vote between different options, please use comments below if you like to elaborate on your thoughts. @bodhinandach @tianchiTJ @thiagordonho @cw646 @ezrayst @jgiven100 @WeijianLIANG
>
>   1. Use derived particle types with private variables (🚀)
>   2. Solid particles will remain as is, while we add scalar and vector properties to derived particle types (😄)
>   3. Cleaner but slower design with free functions and scalar/vector properties for all particle types (❤️)
>   4. Go back to the drawing board (👀)

For option 2, are we only adding the scalar and vector properties to the derived particle types? If we have other types of quantities, do we still use private variables?

kks32 commented 4 years ago

@yliang-sn everything will be held within scalar/vector/tensor/boolean

yliang-sn commented 4 years ago

> @yliang-sn everything will be held within scalar/vector/tensor/boolean

Can I vote, too? If the private variables have the best efficiency, I prefer 1. The structure is also clear enough to be read and extended. However, 2 is also fine for me.

kks32 commented 4 years ago

@yliang-sn yes, I couldn't find your handle, please vote.

kks32 commented 4 years ago

@thiagordonho @tianchiTJ @cw646 @WeijianLIANG could you please use emoji reactions to vote on this comment: https://github.com/cb-geo/mpm/pull/654#issuecomment-662429917

ezrayst commented 4 years ago

Both @kks32 and @bodhinandach have been working hard on this issue, so I appreciate your work. I think it has been about 2 months now.

However, I feel that we need a unanimous decision on this refactor, or at least a 2/3 majority. The reason is that I don't want us to be in this position again within 6 months, thinking about another refactor. I think the timeline on this issue has shown that this is a lot of work and that everyone's progress is halted. Personally, if we are not even sure about (2) or (3), or anyone still has reservations about them, wouldn't it be better to keep it as is (1)? Yes, particle_base will be messy because of the many functions from two_phase particles, but at least we get the speedup. Yes, we might need to refactor in the future, but isn't that already the assumption, that we will refactor again? Lastly, if you are sure about (2), we might as well go for (3) so we don't come back to it again later. So my vote will be (3) now!

WeijianLiang commented 4 years ago

@kks32 Thanks, I also vote for (2)

bodhinandach commented 4 years ago

After considering @ezrayst's comment, I changed my vote to option (3). I know that the code will be slower, but it is structurally much better and tidier. Also, if moving to (3) currently makes us 30% slower than develop, I think we should look for other ways to improve the speed. For instance, PR #677 has already improved the speed by 4% (thanks to @kks32). I think we can regain the current performance in a short time by improving efficiency in other parts.

yliang-sn commented 4 years ago

After talking with Ezra and Nanda, I think I will change my vote to 3 if we can improve the efficiency in other aspects.

kks32 commented 4 years ago

Other containers and implementations such as Abseil, folly, vector of vectors, static_vector, and small_vector were all slower than or almost identical in performance to the flat_map. Due to the 30% reduction in performance, we are closing this issue. https://trello.com/c/fGF3MrKg/110-plan-for-next-week-notes