trilinos / Trilinos

Primary repository for the Trilinos Project
https://trilinos.org/

Panzer: DofManager working with device mesh data structures #12380

Open maartenarnst opened 11 months ago

maartenarnst commented 11 months ago

@trilinos/panzer

The Panzer DOFManager and ConnManager are great tools for numbering degrees of freedom in finite element models.

However, as mesh data structures move to device, challenges arise:

As a result, when the mesh data structure is on device, the computation does not exploit the parallelism of the device, and data transfers to the host are needed.

It seems that addressing this issue is of current interest at Sandia. E.g., we have seen recent work in Albany related to making the DOFManager and ConnManager work with the Omega_h device data structures.

On our side, we use the Panzer DOFManager and ConnManager in an in-house FE code that we develop in our group. We have experimented a bit with device implementations (using Kokkos StdAlgorithms and Views) and have seen speedups of more than a factor of 20.
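The experimental implementation itself is not shown in this thread. As a rough, host-only illustration of why DOF numbering lends itself to parallel algorithms, the sketch below assigns each mesh entity its first global index with an exclusive scan over per-entity DOF counts. The function name `firstDofIndex` and the data layout are assumptions made for illustration and are not part of Panzer. A device version would presumably swap the standard-library scan for a comparable scan algorithm operating on Views.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not a Panzer function): given the number of DOFs
// carried by each mesh entity, return the first global index assigned to
// each entity when DOFs are numbered contiguously in entity order.
std::vector<std::size_t> firstDofIndex(const std::vector<std::size_t>& dofsPerEntity)
{
    std::vector<std::size_t> offsets(dofsPerEntity.size());
    // An exclusive scan is a classic data-parallel primitive: offsets[i] is
    // the sum of dofsPerEntity[0..i-1], so every entity obtains its starting
    // index without a hand-written sequential loop.
    std::exclusive_scan(dofsPerEntity.begin(), dofsPerEntity.end(),
                        offsets.begin(), std::size_t{0});
    return offsets;
}
```

For example, per-entity counts of 3, 1 and 2 give starting indices 0, 3 and 4.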

The goal of this issue is therefore to ask for the Panzer team's view on this. Would it be of interest to have a discussion with users/developers of the Panzer DOFManager to try to align ideas?

Just as a suggestion, starting points for a discussion might be:

@rppawlo @romintomasetti @CamelliaDPG @mperego @bartgol @cwsmith.

rppawlo commented 11 months ago

We definitely would like to move all computations possible to device. Backwards compatibility is important. Will any of you be attending the TUG meeting in Albuquerque at the end of this month? We could have a side meeting there. Otherwise I can set up a virtual meeting over Teams.

maartenarnst commented 11 months ago

Hi @rppawlo. I won't be attending the TUG. I'd be happy to meet over Teams.

maartenarnst commented 11 months ago

Hi @rppawlo. I've gone ahead and made a first PR that declares some of the member functions of the DOFManager as virtual:

This way, we can get the ball rolling. And this first step may not be the step where significant discussion is needed.
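To illustrate the general idea of making member functions virtual, here is a minimal sketch. The class names `SketchDofManager` and `SketchDeviceDofManager`, the member `elementGIDs`, and the numbering rules are all hypothetical; they do not reproduce the Panzer interface or the contents of the PR. The point is only that code written against the base class picks up a derived implementation without being changed, which keeps existing users working.

```cpp
#include <vector>

// Hypothetical stand-in for a DOF manager; names and signatures are
// illustrative assumptions, not the Panzer API.
class SketchDofManager {
public:
    virtual ~SketchDofManager() = default;

    // Declared virtual so that a derived class can supply its own numbering.
    virtual std::vector<long> elementGIDs(int localElement) const
    {
        // Toy host numbering: two contiguous DOFs per element.
        return {2L * localElement, 2L * localElement + 1};
    }
};

// A derived class can override the numbering, e.g. to read results that
// were computed on device (represented here by a placeholder rule).
class SketchDeviceDofManager : public SketchDofManager {
public:
    std::vector<long> elementGIDs(int localElement) const override
    {
        return {100L + localElement};
    }
};

// Existing code written against the base class needs no changes.
std::vector<long> gidsOf(const SketchDofManager& manager, int localElement)
{
    return manager.elementGIDs(localElement);
}
```

With this layout, `gidsOf` returns the base numbering for a `SketchDofManager` and the overridden numbering for a `SketchDeviceDofManager`.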

A follow-up could be to think about how to generalise the ConnManager, so as to allow the data to arrive in types other than std::vector. I have the impression that discussion would be needed there to find the best way forward.
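One possible direction, sketched here only to make the question concrete, is an interface that hands out a non-owning view of the connectivity instead of a std::vector, so that the storage behind it can be any contiguous buffer. Everything in this sketch (`ConnView`, `SketchConnManager`, `FlatConnManager`, the flat offsets-plus-ids layout) is a hypothetical assumption and not the Panzer ConnManager interface. A device-aware design would likely need a view type that also carries memory-space information.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Non-owning view of a contiguous run of global ids (hypothetical type).
struct ConnView {
    const long* data;
    std::size_t size;
    long operator[](std::size_t i) const { return data[i]; }
};

// Hypothetical interface: callers see only a view, not the container type.
class SketchConnManager {
public:
    virtual ~SketchConnManager() = default;
    virtual ConnView connectivity(std::size_t localElement) const = 0;
};

// One possible backing store: a flat id array plus per-element offsets,
// where element e owns ids_[offsets_[e]] up to ids_[offsets_[e + 1] - 1].
class FlatConnManager : public SketchConnManager {
public:
    FlatConnManager(std::vector<std::size_t> offsets, std::vector<long> ids)
        : offsets_(std::move(offsets)), ids_(std::move(ids)) {}

    ConnView connectivity(std::size_t e) const override
    {
        return {ids_.data() + offsets_[e], offsets_[e + 1] - offsets_[e]};
    }

private:
    std::vector<std::size_t> offsets_;
    std::vector<long> ids_;
};
```

Because the interface exposes only a pointer and a length, the backing store could later be replaced without changing the callers.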

Would you have a moment to take a look at #12432? Thanks in advance!