cholla-hydro / cholla

A GPU-based hydro code
https://github.com/cholla-hydro/cholla/wiki
MIT License

Reconstruction Kernel Fusion 2: New Loading and Conversion Utility Functions #375

Closed bcaddy closed 1 month ago

bcaddy commented 4 months ago

Summary

The primary purpose of this PR is to introduce four new utility functions for loading grid data and converting between the primitive and conserved variables. It builds on PR #371, so the diff will show changes from both that PR and this one until #371 is merged into dev. Most of the relevant changes are in the following files:

The two data loading functions are templates that take the direction as a template parameter, defaulting to dir = 0. Per Issue #308, we've seen that making the direction a template parameter can improve performance noticeably. After integrating these templated functions into PLMC and PPMC (and turning those kernels into templates in the same way), I measured a ~6% improvement in performance on a single V100. (timing data)
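As a rough illustration of why a compile-time direction can help, the sketch below resolves the direction branch at compile time, so each instantiation contains only the index arithmetic for its own axis. `Neighbor_Index`, its parameters, and the `sketch` namespace are hypothetical names chosen for this example and are not part of this PR; `if constexpr` assumes C++17.

```cpp
#include <cassert>

namespace sketch
{
// Hypothetical helper (not Cholla's API): returns the flat index of the cell
// `offset` cells away from (x, y, z) along direction `dir` in an nx-by-ny-by-nz
// grid stored with x varying fastest.
template <int dir = 0>
int Neighbor_Index(int x, int y, int z, int nx, int ny, int offset)
{
  static_assert(dir >= 0 && dir <= 2, "dir must be 0 (x), 1 (y), or 2 (z)");
  assert(nx > 0 && ny > 0);

  // Only one of these branches exists in each instantiation.
  if constexpr (dir == 0) {
    x += offset;
  } else if constexpr (dir == 1) {
    y += offset;
  } else {
    z += offset;
  }

  return x + y * nx + z * nx * ny;
}
}  // namespace sketch
```

Calling `sketch::Neighbor_Index(x, y, z, nx, ny, 1)` without a template argument uses the default `dir = 0`, mirroring the default described above.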

hydro_utilities::Load_Cell_Conserved

This function loads the conserved data from the dev_conserved array and returns an assembled Conserved object.
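A minimal sketch of what such a load might look like is given below. It is not the code from this PR: the struct members, the function signature, and the memory layout (five fields stored back to back, each `n_cells` long) are assumptions made for the example, and MHD and scalar fields are left out. The momentum components are read in rotated order so that `momentum.x` always lies along `dir`.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch
{
// Assumed stand-ins for hydro_utilities::Vector and hydro_utilities::Conserved.
struct Vector {
  double x, y, z;
};
struct Conserved {
  double density;
  Vector momentum;
  double energy;
};

// Assumed layout of dev_conserved:
// [density | momentum_x | momentum_y | momentum_z | energy], n_cells each.
template <int dir = 0>
Conserved Load_Cell_Conserved(const double* dev_conserved, std::size_t cell_id, std::size_t n_cells)
{
  static_assert(dir >= 0 && dir <= 2, "dir must be 0 (x), 1 (y), or 2 (z)");
  assert(cell_id < n_cells);

  // Field numbers of the momentum components, cyclically shifted by dir:
  // dir = 0 -> (1, 2, 3), dir = 1 -> (2, 3, 1), dir = 2 -> (3, 1, 2).
  constexpr std::size_t o1 = 1 + (dir + 0) % 3;
  constexpr std::size_t o2 = 1 + (dir + 1) % 3;
  constexpr std::size_t o3 = 1 + (dir + 2) % 3;

  Conserved out;
  out.density    = dev_conserved[0 * n_cells + cell_id];
  out.momentum.x = dev_conserved[o1 * n_cells + cell_id];
  out.momentum.y = dev_conserved[o2 * n_cells + cell_id];
  out.momentum.z = dev_conserved[o3 * n_cells + cell_id];
  out.energy     = dev_conserved[4 * n_cells + cell_id];
  return out;
}
}  // namespace sketch
```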

hydro_utilities::Load_Cell_Primitive

This function calls hydro_utilities::Load_Cell_Conserved and then hydro_utilities::Conserved_2_Primitive to load the conserved data and convert it to primitive variables, then returns the result as a Primitive object.

hydro_utilities::Conserved_2_Primitive

Converts the conserved variables in a Conserved object into primitive variables in a Primitive object and returns it.

hydro_utilities::Primitive_2_Conserved

Converts the primitive variables in a Primitive object into conserved variables in a Conserved object and returns it.
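For readers unfamiliar with the two variable sets, here is a hedged sketch of the pair of conversions for pure hydro with an ideal-gas equation of state. The struct members and the `gamma` parameter are assumptions for this example rather than the signatures in this PR, and magnetic fields, scalars, and any pressure floor are omitted.

```cpp
#include <cassert>

namespace sketch
{
// Assumed stand-ins for the hydro_utilities types.
struct Vector {
  double x, y, z;
};
struct Conserved {
  double density;
  Vector momentum;
  double energy;
};
struct Primitive {
  double density;
  Vector velocity;
  double pressure;
};

// Conserved -> primitive: v = m / rho, P = (gamma - 1) * (E - rho * |v|^2 / 2).
inline Primitive Conserved_2_Primitive(const Conserved& c, double gamma)
{
  assert(c.density > 0.0);
  const Vector v       = {c.momentum.x / c.density, c.momentum.y / c.density, c.momentum.z / c.density};
  const double kinetic = 0.5 * c.density * (v.x * v.x + v.y * v.y + v.z * v.z);
  return {c.density, v, (gamma - 1.0) * (c.energy - kinetic)};
}

// Primitive -> conserved: m = rho * v, E = P / (gamma - 1) + rho * |v|^2 / 2.
inline Conserved Primitive_2_Conserved(const Primitive& p, double gamma)
{
  assert(gamma > 1.0);
  const Vector m       = {p.density * p.velocity.x, p.density * p.velocity.y, p.density * p.velocity.z};
  const double kinetic = 0.5 * p.density *
                         (p.velocity.x * p.velocity.x + p.velocity.y * p.velocity.y + p.velocity.z * p.velocity.z);
  return {p.density, m, p.pressure / (gamma - 1.0) + kinetic};
}
}  // namespace sketch
```

The two functions are inverses of each other, so converting a state to primitive variables and back should reproduce the original conserved state up to floating-point rounding.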

math_utils::Cyclic_Permute_Once and math_utils::Cyclic_Permute_Twice

These two functions take a hydro_utilities::Vector object and cyclically permute its members once or twice. They are used in the Load_Cell_Conserved function; I split them out on their own for clarity and because I thought they might be more generally useful. I tried to think of a single function that would do this more elegantly but couldn't figure out a way. If you have suggestions, I would be happy to hear them.
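One possible way to fold both permutations into a single function is to make the number of shifts a template parameter, as sketched below. This is only an illustration, not the PR's implementation: `Cyclic_Permute`, the `Vector` stand-in, and the permutation direction (one shift maps (x, y, z) to (y, z, x)) are all assumptions, and `if constexpr` assumes C++17.

```cpp
namespace sketch
{
// Assumed stand-in for hydro_utilities::Vector.
struct Vector {
  double x, y, z;
};

// Hypothetical single entry point for both permutations.
// shifts = 1: (x, y, z) -> (y, z, x)
// shifts = 2: (x, y, z) -> (z, x, y)
// Any multiple of 3 returns the input unchanged.
template <int shifts>
Vector Cyclic_Permute(const Vector& v)
{
  static_assert(shifts >= 0, "shifts must be non-negative");
  if constexpr (shifts % 3 == 0) {
    return v;
  } else if constexpr (shifts % 3 == 1) {
    return {v.y, v.z, v.x};
  } else {
    return {v.z, v.x, v.y};
  }
}
}  // namespace sketch
```

With this shape, a loader templated on `dir` could call `Cyclic_Permute<dir>` directly instead of branching between two named functions, at the cost of a slightly less self-describing call site.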

Other