Is your feature request related to a problem? Please describe.
static_map and dynamic_map currently do not take a cudaStream_t stream parameter. Without one, callers often need additional synchronization, which limits the speedup available when running multiple cuCollections operations concurrently on multiple CUDA streams.
Describe the solution you'd like
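Add CUDA stream support: let static_map/dynamic_map operations accept a cudaStream_t stream parameter so they can run on a caller-supplied stream.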
Additional context
cuGraph needs this to run multiple graph kernels concurrently using multiple CUDA streams (for batch processing).
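To illustrate the pattern this request is after, here is a minimal sketch that uses only the CUDA runtime API. `fake_map_op` and `run_batches` are hypothetical names introduced for this example; the kernel merely stands in for a map operation and is not cuCollections code. The point is that each batch is enqueued on its own stream, with no device-wide synchronization between batches.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical stand-in for one map operation (e.g. a bulk insert).
// It only writes `value` into every slot of `data`.
__global__ void fake_map_op(int* data, int n, int value)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) { data[i] = value; }
}

// Enqueues one independent batch per stream, then checks each result.
// Assumes n > 0. Returns false on any CUDA error or wrong value.
// Note: this sketch does not release resources on the early-return error paths.
bool run_batches(int num_streams, int n)
{
  std::vector<cudaStream_t> streams(num_streams);
  std::vector<int*> buffers(num_streams, nullptr);

  for (int s = 0; s < num_streams; ++s) {
    if (cudaStreamCreate(&streams[s]) != cudaSuccess) { return false; }
    if (cudaMalloc(&buffers[s], n * sizeof(int)) != cudaSuccess) { return false; }
  }

  int const block = 256;
  int const grid  = (n + block - 1) / block;

  // Each launch goes to its own stream, so the batches may overlap on the GPU.
  for (int s = 0; s < num_streams; ++s) {
    fake_map_op<<<grid, block, 0, streams[s]>>>(buffers[s], n, s);
  }
  if (cudaGetLastError() != cudaSuccess) { return false; }

  bool ok = true;
  std::vector<int> host(n);
  for (int s = 0; s < num_streams; ++s) {
    // Copy back on the same stream and wait only for that stream.
    if (cudaMemcpyAsync(host.data(), buffers[s], n * sizeof(int),
                        cudaMemcpyDeviceToHost, streams[s]) != cudaSuccess) {
      return false;
    }
    if (cudaStreamSynchronize(streams[s]) != cudaSuccess) { return false; }
    ok = ok && (host[0] == s) && (host[n - 1] == s);
  }

  for (int s = 0; s < num_streams; ++s) {
    cudaFree(buffers[s]);
    cudaStreamDestroy(streams[s]);
  }
  return ok;
}
```

Calling `run_batches(4, 1 << 16)` would run four independent batches on four streams. With stream-aware map operations, the kernel launch in the middle loop could be replaced by the corresponding static_map/dynamic_map call on the same stream.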