Open Robadob opened 7 months ago
Update regarding header pre-loading with Jitify2/CUDA 12.3
Windows/CUDA 12.0
No preload
Millis: 6822.000000
Millis: 6853.000000
Preloading FLAMEGPU headers
Millis: 4045.000000
Millis: 4277.000000
Preload FLAMEGPU + CUDA headers
Millis: 1296.000000
Millis: 1667.000000
Linux/CUDA 12.3
Jitify 2 from scratch (Waimu)
Millis: 25318.000000
Millis: 24143.000000
Preload FLAMEGPU + CUDA headers
Millis: 1376.000000
Millis: 2218.000000
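For a rough sense of scale, the timings above can be reduced to a mean per configuration and a speedup ratio. The sketch below is illustrative only: `mean_millis` and `speedup` are hypothetical helper names, not part of FLAMEGPU or Jitify, and two samples per configuration is too few to treat the result as anything more than indicative.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: arithmetic mean of a set of timings in milliseconds.
double mean_millis(const std::vector<double>& samples) {
    assert(!samples.empty());
    const double total = std::accumulate(samples.begin(), samples.end(), 0.0);
    return total / static_cast<double>(samples.size());
}

// Hypothetical helper: how many times faster `improved` is than `baseline`,
// comparing the mean of each set of samples.
double speedup(const std::vector<double>& baseline,
               const std::vector<double>& improved) {
    return mean_millis(baseline) / mean_millis(improved);
}
```

Fed the figures above, `speedup({6822, 6853}, {1296, 1667})` comes out at roughly 4.6 for Windows/CUDA 12.0, and `speedup({25318, 24143}, {1376, 2218})` at roughly 13.8 for Linux/CUDA 12.3.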
CUDA 12.0 has ~30 CUDA headers to preload. CUDA 12.3 has ~257 CUDA headers to preload. (List contains some dupes)
It's not clear whether we would want to generalise this code to better handle different CUDA versions, because we could potentially need to update it with each CUDA release.
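One way the generalisation could look is sketched below, purely as an assumption about a possible design rather than what the branch does: keep one preload list per CUDA version, pick the newest list that does not exceed the running version, and drop duplicate entries. `CudaVersion`, `dedupe_headers` and `select_preload_list` are hypothetical names, and no Jitify API is used.

```cpp
#include <iterator>
#include <map>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

using CudaVersion = std::pair<int, int>;  // {major, minor}, hypothetical alias

// Hypothetical helper: removes repeated header names, keeping the first
// occurrence of each and preserving the original order.
std::vector<std::string> dedupe_headers(const std::vector<std::string>& headers) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> unique;
    unique.reserve(headers.size());
    for (const std::string& name : headers) {
        if (seen.insert(name).second) {
            unique.push_back(name);
        }
    }
    return unique;
}

// Hypothetical helper: returns the de-duplicated list registered for the
// newest CUDA version that is not newer than `running`, or an empty list
// if every registered version is newer.
std::vector<std::string> select_preload_list(
    const std::map<CudaVersion, std::vector<std::string>>& lists,
    CudaVersion running) {
    auto it = lists.upper_bound(running);  // first entry strictly newer
    if (it == lists.begin()) {
        return {};
    }
    return dedupe_headers(std::prev(it)->second);
}
```

Under this scheme an untested CUDA release falls back to the closest older list instead of failing outright, but a new list would still need to be added whenever the required headers change.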
Edit: Removed from-cache times, latest commit has these matching Jitify1.
The current issue holding back the Jitify2 preprocessor branch is that it expects our flamegpu headers to be included as system headers (<>) rather than with quotes (" "). Waiting to hear back from the dev (Ben) before I try to correct that on our side.
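To make the difference between the two include forms concrete, the sketch below rewrites quoted flamegpu includes into the angle-bracket form in a source string. This is only an illustration of what "correcting it on our side" might involve, not the fix that was adopted; `to_system_includes` is a hypothetical helper and the header path in the usage note is just an example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: turns `#include "flamegpu/..."` into
// `#include <flamegpu/...>` and leaves every other include untouched.
std::string to_system_includes(const std::string& source) {
    // Group 1: the directive and its whitespace. Group 2: the header path.
    static const std::regex quoted(
        R"re((#\s*include\s*)"(flamegpu/[^"]+)")re");
    return std::regex_replace(source, quoted, "$1<$2>");
}
```

For example, `#include "flamegpu/runtime/DeviceAPI.cuh"` becomes `#include <flamegpu/runtime/DeviceAPI.cuh>`, while `#include "other.h"` is left as it is.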
Did three full test runs last night and all passed; however, in those cases the CMake Jitify dependency was pointing at the preprocess branch. Not currently using that here, as it causes all Windows CI to fail with WError.
Linux/CUDA12.3/Seatbelts ON/GLM ON/Release
Linux/CUDA12.3/Seatbelts OFF/GLM ON/Release
Windows/CUDA12.0/Seatbelts ON/GLM OFF/Debug
In release builds kernels are taking ~1 second to compile each. As Jitify is now doing the pre-processing, this is closer to 2.5 seconds under Debug builds.
launch() method (used in CUDASimulation), says to replace with launch_raw().
cuda().ModuleUnload() during sim shutdown, when CUDAAgent map is cleared by CUDASimulation destructor.
jitify2-misc-fixes2 branch)
curve_rtc.cpp