Improve CUDA target architecture selection

This replaces the nested NV_ARCH ifs in each of the nvhpc.mk compiler makefiles with a CUDA_GEN variable. CUDA_GEN is constructed from the NV_ARCH and CUDA_GEN environment variables, which allows multiple target architectures to be specified. In addition, since CUDA_GEN is now set in compilers.mk, the duplicated nested ifs are eliminated.

With this, the old system of specifying NV_ARCH=Kepler make ... remains supported (with the addition of the Ampere target). You can now also run, for example, CUDA_GEN=50,60 NV_ARCH=Volta,Ampere make ... and have binaries emitted with PTX code for 50, 60, 70 and 80.
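As a rough illustration of the merging behaviour described above, here is a minimal Python sketch. It is not the makefile logic from this PR. The function name resolve_cuda_gen and the ARCH_TO_GEN table are hypothetical; the Volta and Ampere values follow from the example above, while the Kepler value is an assumption.

```python
# Hypothetical mapping from architecture names to compute capabilities.
# Volta -> 70 and Ampere -> 80 follow from the example in this description;
# Kepler -> 35 is an assumption made for illustration only.
ARCH_TO_GEN = {"Kepler": "35", "Volta": "70", "Ampere": "80"}


def resolve_cuda_gen(cuda_gen="", nv_arch=""):
    """Merge comma-separated CUDA_GEN and NV_ARCH values into one list."""
    # Explicit compute capabilities come first, in the order given.
    gens = [g for g in cuda_gen.split(",") if g]
    # Named architectures are translated and appended afterwards.
    gens += [ARCH_TO_GEN[a] for a in nv_arch.split(",") if a]
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(gens))


print(resolve_cuda_gen("50,60", "Volta,Ampere"))  # ['50', '60', '70', '80']
```

In this sketch, an unknown architecture name raises a KeyError rather than being silently ignored; whether the makefiles behave the same way is not stated here.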

This also allows us to properly support inheriting from CudaPackage in the Spack package, as we can pass the full list of cuda_arch.

OP-DSL / OP2-Common

Improve CUDA target architecture selection #223