This PR adds the axisymmetric flow solver mode to the _GPU_ code path, which resolves #104. Doing so required two primary updates:

1. Add the position to the precomputed information available on the faces (e.g., via `struct interiorFaceIntegrationData`), which makes it possible to set the radius correctly at each quadrature point.
2. Modify `AxisymmetricForcing::updateTerms` to do the main computation of the axisymmetric source terms on the GPU in the usual way (i.e., using `MFEM_FORALL`).

Various other minor updates were also required. None is especially noteworthy, with the exception of some interesting CUDA behavior discovered when extending `TransportProperties::GetViscosities` to be a `__device__` function. Very briefly, it seems that when a `__device__` `virtual` function calls another `__device__` `virtual` function with a sufficiently similar signature, `nvlink` concludes that the function may be recursive and fails to size the stack automatically. This can lead to problems unless one manually sizes the stack correctly, which `tps` does not presently do. It appears that the same problem was observed in #184 (and was avoided rather than resolved in #193) but was not understood at that time. I think it is also responsible for the NVIDIA linker warnings associated with destroying a derived class via the base class destructor. For more discussion, see 1cac8e69d7d1c4d81e178fe9bbbb3e92e078c19a.
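For readers unfamiliar with the call pattern, here is a minimal sketch of a `__device__` `virtual` function calling another one, together with one way the per-thread stack could be sized by hand using the CUDA runtime calls `cudaDeviceGetLimit` and `cudaDeviceSetLimit`. The class names, the `0.6` factor, the helper names `ensureStackSize` and `runExample`, and the stack size passed in are all hypothetical and are not taken from `tps`. Whether `nvlink` actually warns for this particular snippet may depend on the toolkit version and optimization level.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical stand-in for a transport-properties hierarchy (not the tps classes).
class Transport {
 public:
  __device__ virtual ~Transport() {}
  __device__ virtual double viscosity(double T) const = 0;
  // A virtual function that calls another virtual function with a similar
  // signature: the call pattern discussed above. The 0.6 factor is arbitrary.
  __device__ virtual double bulkViscosity(double T) const { return 0.6 * viscosity(T); }
};

class ConstantTransport : public Transport {
 public:
  __device__ double viscosity(double /*T*/) const override { return 1.8e-5; }
};

__global__ void evalViscosity(double T, double *out) {
  // Construct the object on the device so its vtable pointer is valid there.
  ConstantTransport tp;
  const Transport *base = &tp;
  out[0] = base->bulkViscosity(T);
}

// Raise the per-thread stack limit to at least `bytes`.
// Returns the limit in effect afterwards, or 0 if a runtime call failed.
size_t ensureStackSize(size_t bytes) {
  size_t current = 0;
  if (cudaDeviceGetLimit(&current, cudaLimitStackSize) != cudaSuccess) return 0;
  if (current < bytes &&
      cudaDeviceSetLimit(cudaLimitStackSize, bytes) != cudaSuccess) return 0;
  if (cudaDeviceGetLimit(&current, cudaLimitStackSize) != cudaSuccess) return 0;
  return current;
}

// Size the stack, launch the kernel on one thread, and return the result.
// Returns -1.0 if any runtime call fails.
double runExample(size_t stackBytes) {
  if (ensureStackSize(stackBytes) < stackBytes) return -1.0;

  double *dOut = nullptr;
  if (cudaMalloc(&dOut, sizeof(double)) != cudaSuccess) return -1.0;

  evalViscosity<<<1, 1>>>(300.0, dOut);

  double hOut = -1.0;
  // cudaMemcpy synchronizes with the kernel and reports any launch error.
  cudaError_t err = cudaMemcpy(&hOut, dOut, sizeof(double), cudaMemcpyDeviceToHost);
  cudaFree(dOut);
  return (err == cudaSuccess) ? hOut : -1.0;
}
```

Calling `runExample(4096)` from `main` would set the stack limit before the first kernel launch. A suitable size for `tps` is not known here and would have to be determined from the actual call depth.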
NB: The GitLab failure is the result of a runner issue, not a real test failure. The tests have been run manually and passed, so this is ready from that point of view.