GregoryKimball opened this issue 11 months ago
The other thing to think about is what happens if you are not using UVM. Putting hints in is nice, but will they slow down processing when UVM is not being used? If they do, what is the best way to mitigate that?
@revans2 I do not expect that this would affect non-managed allocations. My expectation is that cudf will need to determine (or track) if an allocation is managed before attempting prefetching or giving other advice to the driver. That should be very inexpensive to check/track. Non-UVM cases shouldn’t see regressions as a result.
It seems like the right place to offer allocation hints is the memory resource, in which case non-managed memory resources could provide no-op implementations.
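One way to keep the managed-or-not check inexpensive is to ask the driver directly via `cudaPointerGetAttributes` and skip hinting for non-managed pointers. A minimal sketch; the helper name `maybe_prefetch` is hypothetical, not an existing or proposed libcudf API (GPU code, shown for illustration):

```cpp
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical helper: prefetch only if the allocation is managed, so that
// non-UVM allocations take a cheap early-out and see no behavior change.
cudaError_t maybe_prefetch(void const* ptr, std::size_t size, cudaStream_t stream)
{
  cudaPointerAttributes attrs{};
  cudaError_t err = cudaPointerGetAttributes(&attrs, ptr);
  if (err != cudaSuccess) { return err; }

  // Only managed allocations can be prefetched or given memory advice.
  if (attrs.type != cudaMemoryTypeManaged) { return cudaSuccess; }  // no-op path

  int device{};
  cudaGetDevice(&device);
  return cudaMemPrefetchAsync(ptr, size, device, stream);
}
```

Tracking the "is managed" bit at allocation time inside the memory resource, rather than querying the driver per call, is the alternative mentioned above and avoids even this query.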
Is your feature request related to a problem? Please describe. In cuDF-python and RMM, it's easy to opt into managed memory (also known as Unified Memory, UM, or Unified Virtual Memory, UVM). However, libcudf is not optimized for use with managed memory and encounters many "just too late" page faults when the oversubscription factor is greater than 1.
Please note: Using managed memory in libcudf is in the early stages of scoping. This issue will be refined over time. Topics to cover include:

- Hinting options and strategies
- Implementation ideas for libcudf
- Useful references for cudaMemAdvise
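For concreteness, these are the `cudaMemAdvise` hints most relevant to the oversubscription scenario. A sketch only; which hints pay off (and for which buffers) is exactly what the proposed experiments would need to measure:

```cpp
#include <cuda_runtime.h>
#include <cstddef>

// Assumes ptr/size refer to a managed (cudaMallocManaged) allocation.
void advise_example(void* ptr, std::size_t size, int device)
{
  // Prefer keeping these pages resident on the given GPU.
  cudaMemAdvise(ptr, size, cudaMemAdviseSetPreferredLocation, device);
  // Establish mappings in this GPU's page tables to reduce first-touch faults.
  cudaMemAdvise(ptr, size, cudaMemAdviseSetAccessedBy, device);
  // For read-mostly data, allow read-duplicated copies across processors.
  cudaMemAdvise(ptr, size, cudaMemAdviseSetReadMostly, device);
}
```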
Describe the solution you'd like I would like to add a libcudf benchmark for studying managed memory performance, and then run some targeted experiments (with profiling) to observe the impact of different hinting strategies. Once we have identified a promising design, we will open a more targeted issue.
Describe alternatives you've considered Continue to let Dask and Spark-RAPIDS catch and retry when there are device OOM errors.
Additional context Please note that with managed memory pools, the pool allocation is lazy. This differs from unmanaged memory pools, where we allocate the full pool upfront, trading slightly longer startup time for much faster allocations inside algorithms.
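The lazy-pool setup referred to above can be sketched with RMM's existing resources: a pool suballocator layered over a managed upstream. The initial pool size here is an illustrative choice, not a recommendation (GPU code, shown for illustration):

```cpp
#include <rmm/mr/device/managed_memory_resource.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>

int main()
{
  // Upstream managed (UVM) resource; pages materialize on first touch,
  // so the pool's reservation does not pin physical device memory upfront.
  rmm::mr::managed_memory_resource upstream;

  rmm::mr::pool_memory_resource<rmm::mr::managed_memory_resource> pool{
    &upstream, 4ull << 30 /* 4 GiB initial size, illustrative */};

  void* p = pool.allocate(1 << 20);  // suballocated from the managed pool
  pool.deallocate(p, 1 << 20);
  return 0;
}
```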
Useful blog posts:

- https://developer.nvidia.com/blog/unified-memory-cuda-beginners/
- https://developer.nvidia.com/blog/improving-gpu-memory-oversubscription-performance/
- https://developer.nvidia.com/blog/maximizing-unified-memory-performance-in-cuda/
- https://developer.nvidia.com/blog/beyond-gpu-memory-limits-unified-memory-pascal/