CUDA 12.4+ NVRTC `-minimal`

CUDA 12.4 introduces:

> Add a new flag `-minimal` for NVRTC compilation. The `-minimal` flag omits certain language features to reduce compile time for small programs. In particular, the following are omitted:
>
> - Texture and surface functions and associated types (for example, `cudaTextureObject_t`).
> - CUDA Runtime Functions that are provided by the cudadevrt device code library, typically named with prefix “cuda”, for example, `cudaMalloc`.
> - Kernel launch from device code.
> - Types and macros associated with CUDA Runtime and Driver APIs, provided by `cuda/tools/cudart/driver_types.h`, typically named with the prefix “cuda”, for example, `cudaError_t`.

This might be worth investigating in the future (post #1150).

This will require changes to our headers so that NVRTC never sees `cudaStream_t` and similar runtime API types (see the compile log below). Adding the flag itself in `JitifyCache::compileKernel` is trivial. For CUDA 12.0-12.3 builds it could potentially be a runtime decision rather than a compile-time one, depending on how NVRTC works.
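A rough sketch of what the runtime decision could look like, assuming the NVRTC version found at runtime is what determines whether `-minimal` is accepted (not verified here). The helper names `nvrtcSupportsMinimal` and `appendMinimalIfSupported` are hypothetical, not existing FLAMEGPU or Jitify functions. In real code the version numbers would come from `nvrtcVersion(&major, &minor)`; they are plain parameters here so the sketch compiles without the CUDA toolkit.

```cpp
#include <string>
#include <vector>

// True when the NVRTC version is new enough to accept -minimal
// (CUDA 12.4 or later, per the release notes quoted above).
inline bool nvrtcSupportsMinimal(int major, int minor) {
    return major > 12 || (major == 12 && minor >= 4);
}

// Hypothetical helper: append -minimal only when the NVRTC in use supports it.
// Assumption: major/minor are obtained at runtime via nvrtcVersion().
inline void appendMinimalIfSupported(std::vector<std::string> &options,
                                     int major, int minor) {
    if (nvrtcSupportsMinimal(major, minor)) {
        options.push_back("-minimal");
    }
}
```

Something like this could be called where `JitifyCache::compileKernel` assembles its NVRTC options, so older toolkits simply never receive the flag.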

---------------------------------------------------
--- JIT compile log for outputdata_program ---
---------------------------------------------------
flamegpu/simulation/detail/CUDAScanCompaction.h(65): error: identifier "cudaStream_t" is undefined
      void zero_scan_flag_async(cudaStream_t stream);
                                ^

flamegpu/simulation/detail/CUDAScanCompaction.h(115): error: identifier "cudaStream_t" is undefined
      void zero_async(const Type& type, cudaStream_t stream, unsigned int streamId);
                                        ^

flamegpu/exception/FLAMEGPUDeviceException.cuh(26): error: identifier "cudaStream_t" is undefined
      DeviceExceptionBuffer *getDevicePtr(unsigned int streamId, cudaStream_t stream);
                                                                 ^

flamegpu/exception/FLAMEGPUDeviceException.cuh(27): error: identifier "cudaStream_t" is undefined
      void checkError(const std::string &function, unsigned int streamId, cudaStream_t stream);
                                                                          ^

4 errors detected in the compilation of "outputdata_program".
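The errors above all come from host-side declarations that mention `cudaStream_t`. One possible shape for the header change is to hide such declarations behind `__CUDACC_RTC__`, which NVRTC defines during compilation. The class below is a hypothetical stand-in loosely modelled on the declarations in the log, not the actual FLAMEGPU code, and the `cudaStream_t` typedef is only there so the sketch compiles without the CUDA toolkit.

```cpp
#ifndef __CUDACC_RTC__
// Host-only section: NVRTC defines __CUDACC_RTC__, so it skips this.
// Stand-in typedef for the sketch; real code would include <cuda_runtime.h>.
struct CUstream_st;
typedef CUstream_st *cudaStream_t;
#endif

// Hypothetical class for illustration only.
class ScanFlagExample {
 public:
    // Visible to both host and RTC code: no runtime API types involved.
    unsigned int flagCount() const { return count; }

#ifndef __CUDACC_RTC__
    // Host-only: mentions cudaStream_t, so it is hidden from NVRTC.
    void zero_scan_flag_async(cudaStream_t stream) {
        (void)stream;  // a real implementation would enqueue work on the stream
        count = 0;
    }
#endif

 private:
    unsigned int count = 4;
};
```

Whether guarding individual members or splitting host-only code into separate headers is cleaner would depend on how many declarations are affected.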
