Use a CUDA stream pool so that operators can run in parallel on the GPU. Use BlockMemoryPool to avoid calls to cudaFree(), which synchronize CPU and GPU execution.
Execute post-processing on the GPU (requires Holoscan SDK 2.6).
Reduces time per frame from 12 ms to 7.4 ms.
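
For context, the per-frame times above can be converted into a speedup factor and an approximate frame rate. This is only a rough sketch: the helper name `frame_time_summary` is hypothetical, and the frame-rate figures assume frames are processed back to back, so that frame rate is simply the reciprocal of the per-frame time.

```python
def frame_time_summary(before_ms: float, after_ms: float) -> dict:
    """Convert two per-frame times (ms) into speedup, reduction, and frame rates."""
    return {
        # How many times faster the new per-frame time is.
        "speedup": before_ms / after_ms,
        # Relative reduction in per-frame time, in percent.
        "reduction_pct": 100.0 * (before_ms - after_ms) / before_ms,
        # Assumption: one frame at a time, so fps = 1000 ms / frame time.
        "fps_before": 1000.0 / before_ms,
        "fps_after": 1000.0 / after_ms,
    }


summary = frame_time_summary(12.0, 7.4)
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```

Under those assumptions this corresponds to roughly a 1.6x speedup, or about 38% less time per frame.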