Port axisymmetric solver to gpu

This PR adds the axisymmetric flow solver mode to the _GPU_ code path, which resolves #104. Doing so required two primary updates:

1. Add the position to the precomputed information available on the faces (e.g., via `struct interiorFaceIntegrationData`), which makes it possible to set the radius correctly at each quadrature point.
2. Modify `AxisymmetricForcing::updateTerms` to do the main computation of the axisymmetric source terms on the GPU in the usual way (i.e., using `MFEM_FORALL`).
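
For readers unfamiliar with the pattern, the two updates can be sketched roughly as follows. This is not the code in this PR: `FacePointSketch`, `SKETCH_FORALL`, and `UpdateSourceSketch` are hypothetical names, the macro is a serial host stand-in that only mimics the `MFEM_FORALL(i, N, {...})` call shape, and the `pressure / radius` term is a placeholder rather than the actual axisymmetric source terms.

```cpp
// Serial host stand-in with the same call shape as MFEM_FORALL(i, N, {...}).
// Hypothetical: the real macro dispatches the body to the device.
#define SKETCH_FORALL(i, N, ...) \
  for (int i = 0; i < (N); ++i) { __VA_ARGS__ }

// Hypothetical per-quadrature-point data stored with the other precomputed
// face information. Keeping the position here lets the loop body read the
// radius directly instead of recomputing it.
struct FacePointSketch {
  double r;       // radial coordinate of the quadrature point
  double z;       // axial coordinate of the quadrature point
  double weight;  // quadrature weight
};

// Placeholder source term: pressure / radius at each quadrature point,
// set to zero on the axis (r == 0) to avoid dividing by zero.
// This is an assumption for illustration, not the solver's formula.
void UpdateSourceSketch(int n, const FacePointSketch *pts,
                        const double *pressure, double *source) {
  const double r_min = 1e-12;  // assumed tolerance for "on the axis"
  SKETCH_FORALL(i, n, {
    const double r = pts[i].r;
    source[i] = (r > r_min) ? pressure[i] / r : 0.0;
  });
}
```

The point of the sketch is only that once the radius travels with the face data, the loop body is a pure per-point computation and needs no mesh lookups inside the kernel.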

Various other minor updates were also required, none of them noteworthy except for some interesting CUDA behavior discovered when extending `TransportProperties::GetViscosities` to be a `__device__` function. Briefly, it seems that when a `__device__` virtual function calls another `__device__` virtual function with a sufficiently similar signature, nvlink suspects that the call may be recursive and fails to size the stack automatically. This can lead to problems unless the stack is sized manually, which tps does not presently do. The same problem appears to have been observed in #184 (and was avoided rather than resolved in #193) but was not understood at the time. I think it is also responsible for the NVIDIA linker warnings associated with destroying a derived class through the base class destructor. For more discussion, see 1cac8e69d7d1c4d81e178fe9bbbb3e92e078c19a.
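
A minimal sketch of the call pattern described above, assuming hypothetical class and function names (`TransportSketch`, `GetViscositiesSketch`, `ComputeShearSketch`) that do not reproduce the tps signatures. The `SKETCH_HOST_DEVICE` macro expands to nothing in a host-only build so the example compiles as plain C++. `SetDeviceStackSizeSketch` is likewise hypothetical; in a CUDA build it wraps the CUDA runtime call `cudaDeviceSetLimit(cudaLimitStackSize, ...)`, which is one way to size the stack manually.

```cpp
#include <cstddef>

#ifdef __CUDACC__
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// Hypothetical base class: one virtual function calls another virtual
// function with a deliberately similar signature.
class TransportSketch {
 public:
  SKETCH_HOST_DEVICE virtual ~TransportSketch() {}

  // Outer virtual function: fills visc[0] via the inner virtual call and
  // sets visc[1] to zero.
  SKETCH_HOST_DEVICE virtual void GetViscositiesSketch(const double *state,
                                                       double *visc) const {
    ComputeShearSketch(state, visc);
    visc[1] = 0.0;
  }

  // Inner virtual function with the same argument list as the outer one.
  SKETCH_HOST_DEVICE virtual void ComputeShearSketch(const double *state,
                                                     double *visc) const = 0;
};

// Hypothetical derived class returning a constant viscosity.
class ConstantTransportSketch : public TransportSketch {
 public:
  SKETCH_HOST_DEVICE explicit ConstantTransportSketch(double mu) : mu_(mu) {}

  SKETCH_HOST_DEVICE void ComputeShearSketch(const double *state,
                                             double *visc) const override {
    (void)state;  // unused in this constant-viscosity sketch
    visc[0] = mu_;
  }

 private:
  double mu_;
};

// Requests a per-thread device stack size in a CUDA build; does nothing
// in a host-only build. Returns true on success.
inline bool SetDeviceStackSizeSketch(std::size_t bytes) {
#ifdef __CUDACC__
  return cudaDeviceSetLimit(cudaLimitStackSize, bytes) == cudaSuccess;
#else
  (void)bytes;
  return true;
#endif
}
```

Whether a particular pair of signatures actually triggers the nvlink behavior is not something this sketch demonstrates; it only shows the shape of the virtual-calls-virtual pattern and where a manual stack-size request would go.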
