adamantine-sim / adamantine

Software to simulate heat transfer for additive manufacturing
https://adamantine-sim.github.io/adamantine/
Other
31 stars 9 forks source link

Add new template parameter MaterialStates #265

Closed Rombur closed 3 months ago

Rombur commented 3 months ago

This PR is massive but I think it can be reviewed by ignoring the first and the last commit. The first commit is very much mechanical. It adds the new template parameter to all the classes and instantiates all the classes.

The problem with instantiating so many template parameters is that the code takes more than 10 minutes to compile and about 8GB of RAM. The last point is a problem for the CI since the machines have 8GB of RAM. The solution is to split the instantiations in multiple files. This improves the parallelism of the compilation and reduces the highwater memory required. With this change, we can compile the code in about 4 minutes and we require just above 2GB of RAM. The convention I've use to create the instantiations files is the following XXX.cc becomes XXX.templates.hh and the instantations files are XXXInstSHost.cc (Instantiate Solid on the Host), XXXInstSLHost.cc (Instantiate SolidLiquid on the Host), XXXInstSLPHost.cc (Instantiate SolidLiquidPowder on the Host) and similarly for the device. I am very much open to use a different naming scheme. I am not a fan of the current scheme but I couldn't come up with something better. This change is done in the last commit so you don't need to review that commit.

Almost all the tests are using the SolidLiquidPowder MaterialStates. I have only changed two integration tests to use SolidLiquid. The issue is that we now have so many parameters it is hard to test all the combinations. I am open to use Solid and SolidLiquid in more tests. Just let me know.

The two main performance enhancement in this PR are for update_state_ratios and get_inv_rho_cp. update_state_ratios is simplified in the Solid and SolidLiquid cases because we don't need to keep track of the powder state that can only become liquid and is never created. For get_inv_rho_cp, I noticed that we had an if that checked that we were in the mushy zone by checking that the liquid ratio was in (0, 1). This was very slow, instead I created a mushy variable that is the product of the liquid and the solid ratio. This variable is non-zero in the mushy zone. This simplifies the if condition and provides a good speedup.

There are still a few things we can do to speedup the code. The next thing to look at is the evaluation of the source term. This is looks like the next bottleneck. This is sth that I wanted to look at for the GPU code anyway. The other thing is to reorder the order of parameters in the multidimensional Views in MaterialProperty. We could get a better access pattern if we reorder the parameters.

Rombur commented 3 months ago

nvcc has annoying rules when using constexpr in a lambda. The last commit works around the issue by creating a temporary variable.

Rombur commented 3 months ago

Retest this please