This PR converts the non-RAJA base and lambda GPU kernel variants so that all GPU variants use the same kernel-launch methodology, specifically the one used inside RAJA.
It introduces new methods to launch non-RAJA variants of GPU kernels and converts all such kernel implementations to use them.
It also addresses function argument organization and alignment issues that have been discussed by the team.
Summary
Resolves https://github.com/LLNL/RAJAPerf/issues/385
Resolves https://github.com/LLNL/RAJAPerf/issues/371
NOTE: This is a large PR touching many files. No functionality was changed, and the changes are very similar across all of the kernels.