kks32 closed this pull request 4 years ago.
Merging #654 into develop will increase coverage by 0.00%. The diff coverage is 97.45%.
```
@@           Coverage Diff            @@
##           develop     #654   +/-  ##
=========================================
  Coverage    96.66%   96.66%
=========================================
  Files          123      124      +1
  Lines        25375    25641    +266
=========================================
+ Hits         24527    24785    +258
- Misses         848      856      +8
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| include/node_base.h | 100.00% <ø> (ø) | |
| include/particles/particle_base.h | 100.00% <ø> (ø) | |
| include/solvers/mpm_base.tcc | 73.21% <50.00%> (-0.48%) | :arrow_down: |
| include/particles/particle.tcc | 93.24% <95.56%> (-0.89%) | :arrow_down: |
| include/node.tcc | 95.12% <96.18%> (-0.86%) | :arrow_down: |
| include/mesh.tcc | 83.94% <100.00%> (+0.04%) | :arrow_up: |
| include/node.h | 100.00% <100.00%> (ø) | |
| include/particles/particle.h | 100.00% <100.00%> (ø) | |
| include/particles/particle_base.tcc | 100.00% <100.00%> (ø) | |
| include/particles/particle_functions.tcc | 100.00% <100.00%> (ø) | |
| ... and 8 more | | |
Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Powered by Codecov. Last update 3b20da5...a0bbd15. Read the comment docs.
Dear all, I think this refactoring branch is ready for review. Could you all please take some time to look at it? We can discuss it for sure in the following developer meeting. Here are some notes to ease the checking process:

- The `particle` and `node` classes are refactored with the addition of `mpm_properties` and `particle_functions`.
- Specialized `particle` functions are moved out to `particle_functions`.

As this change is breaking for all ongoing development, could you please test it by running some of your simulations and let us know if you encounter any issues. Many thanks in advance for the help. Any comments and questions are appreciated.
Why is `initialise_particle(const HDF5Particle& particle, const std::shared_ptr<Material<Tdim>>& material)` defined as a virtual function in `particle.h`?
> Why is `initialise_particle(const HDF5Particle& particle, const std::shared_ptr<Material<Tdim>>& material)` defined as a virtual function in `particle.h`?

Good point. I will remove it; we should use either `virtual` or `override`, but not both in one go.
> Why is `initialise_particle(const HDF5Particle& particle, const std::shared_ptr<Material<Tdim>>& material)` defined as a virtual function in `particle.h`?
>
> Good point. I will remove it; we should use either `virtual` or `override`, but not both in one go.
This one also exists in the `develop` branch.
@tianchiTJ the most recent commit should fix this, thanks.
I'm getting matching outputs running a PEA simulation on `develop` and `refactor/particles` 👍
Anyway, how is the performance of using an ordered map for the different scalar and vector properties? Has any comparison been done yet?
The 3D hydrostatic-column benchmark shows this branch is about 1.5-2x slower:

| Machine | develop (ms) | refactor-particles (ms) |
|---|---|---|
| Local | 615931 | 980017 |
| TACC | 742655 | 1280757 |
> Anyway, how is the performance of using an ordered map for the different scalar and vector properties? Has any comparison been done yet?

We have to look at that separately now to see if that gives us any benefit.
@kks32 I created a partial-revert branch `refactor/particles_2` to move the functions in `particle_functions.tcc` back into `particle`. I just want to see whether the slowdown is caused by restructuring the properties or by moving the functions out to a separate file and passing a pointer argument. I am re-running the same 3D hydrostatic benchmark locally.
Looks like the map function is the reason:

| Branch | Local runtime (s) | TACC runtime (s) |
|---|---|---|
| develop | 615.9 | 742.6 |
| particle (prop map + free fn) | 980 | 1280 |
| particle2 (only prop map) | 872.9 | 1250 |
The map lookup is definitely the reason for the slowdown.
@kks32 Thanks for checking! I was worried about this before. Any thoughts on changing the data structure?
I'm trying out a few different data structures, I'll let you know once I have the results.
I tried a few different structures:

| Branch | TACC runtime (s) |
|---|---|
| develop | 742.6 |
| prop map + free fn | 1280 |
| only prop map | 1250 |
| only robin map | 1012 |
| only vector | 850 |

The vector implementation seems close to develop. Let me try the vector implementation with free functions.
Vector with free functions is 865s.
Thanks @kks32, that makes it 1.16x slower than the original. Are you checking the OpenMP/MPI performance as well? As the current refactor was created specifically for the fluid and two-phase implementation, do you want me to check the vector version against the navier-stokes branch too?
Vector implementation profiling

The problem with the vector is, like you said, that we may have to fill every slot; otherwise we have an access issue.

| Branch | TACC runtime (s) |
|---|---|
| develop | 742.6 |
| prop map + free fn | 1280 |
| vector + free fn | 865 |
| AssocVec + free fn | 972 |
Profiling of develop
Profiling of robinhashmap
Profiling of vector
Profiling of AssocVector
After considering the speed-ups and the different approaches, the use of scalar and vector properties is still 30% slower than using private variables. The advantage of scalar/vector properties is that we'll have fewer functions and less repetition of code. On the other hand, defining private variables and deriving particle classes means more function definitions and repeated code, but is faster. I'd like to propose a vote: use emoji reactions to vote between the different options, and please use the comments below if you'd like to elaborate on your thoughts. @bodhinandach @tianchiTJ @thiagordonho @cw646 @ezrayst @jgiven100 @WeijianLIANG
Thanks Krishna, I would prefer the use of scalar and vector properties, which has a clearer structure.
@WeijianLIANG is that option 2 or 3? Could you please vote with a reaction emoji :smile: for option 2 and :heart: for option 3. Thanks!
The voting options are:
- Use derived particle types with private variables (🚀)
- Solid particles will remain as is, while we add scalar and vector properties to derived particle types (😄)
- Cleaner but slower design with free functions and scalar/vector properties for all particle types (❤️)
- Go back to the drawing board (👀)
For 2, are we just moving the scalar and vector properties to the derived particle types? If we have other types of quantities, do we still use private variables?
@yliang-sn everything will be held within scalar/vector/tensor/boolean
Can I vote, too? If the private variables have the best efficiency, I prefer 1. The structure is also clear enough to read and extend. However, 2 is also fine for me.
@yliang-sn yes, I couldn't find your handle, please vote.
@thiagordonho @tianchiTJ @cw646 @WeijianLIANG could you please use emoji reactions to vote on this comment: https://github.com/cb-geo/mpm/pull/654#issuecomment-662429917
Both @kks32 and @bodhinandach have been working hard on this issue, so I appreciate your work. I think it has been about 2 months now.

However, I feel that we need a unanimous decision on this refactor, or at least a 2/3 decision. The reason is that I don't want us to be in this position again within 6 months, thinking about refactoring once more. The timeline on this issue has shown that this is a lot of work and everyone's progress is halted. Personally, if we are not even sure about (2) or (3), or anyone still has reservations about them, wouldn't it be better to keep it as is (1)? Yes, `particle_base` will be messy because of the many functions from `two_phase` particles, but at least we get the speedup. Yes, we might need to refactor in the future, but isn't that always the assumption? Lastly, if you are sure about (2), you might as well go for (3) so we don't come back to it again later. So my vote will be (3) now!
@kks32 Thanks, I also vote for (2)
After considering @ezrayst's comment, I changed my vote to option (3). I know that the code will be slower, but it is structurally much better and tidier. Also, if moving to (3) currently makes us 30% slower than develop, I think we should look for other ways to improve the speed. For instance, PR #677 has already improved the speed by 4% (thanks to @kks32). I think we can regain the current performance in a short time by improving efficiency in other parts.
After talking with Ezra and Nanda, I think I will change my vote to (3) if we can improve the efficiency in other aspects.
Other containers (Abseil, folly, vector of vectors, static_vector, and small_vector implementations) all gave slower or almost identical performance to the flat_map. Due to the 30% reduction in performance, we are closing this PR. https://trello.com/c/fGF3MrKg/110-plan-for-next-week-notes
Describe the PR
This is an alternative implementation of the particle refactor.
After some more consideration, I am wondering about the possibility of deriving particle types but keeping only the common particle functions within the class, moving all independent/specialized functions outside the class. This will help to initialize the classes with the right types, and also helps in MPI transfer, as the type is known. We derive the data of individual particle types but keep the specialized particle functions separate. The particle functions may have to stay in the same namespace.
Related Issues/PRs
#637 and #650
Additional context
Specialized functions will still be separate.
Following are the run-times for the 3D hydrostatic column (10k steps) on Stampede2: