rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[BUG] cudaMalloc and cudaFree are being called during aggregations #10080

Closed jlowe closed 2 years ago

jlowe commented 2 years ago

Describe the bug

While examining a recent trace, I noticed that within the libcudf aggregate range there are calls to cudaMalloc and cudaFree, the latter of which causes a synchronization on the default stream. I attached gdb, put a breakpoint on cudaMalloc, and found it was being triggered by cudf::detail::is_relationally_comparable<cudf::table_device_view> because it calls thrust::all_of without passing an execution policy. Without the RMM policy, Thrust uses the default CUDA allocator. Ideally it should use rmm::exec_policy(stream), but the stream is not available to this method and would need to be passed in.

Steps/Code to reproduce bug

Attach a debugger to a query that uses the RMM arena allocator and executes an aggregation. Place a breakpoint on cudaMalloc, run the query, and observe that the breakpoint is hit in a call stack that derives from cudf::detail::is_relationally_comparable.

Expected behavior

libcudf should not trigger calls to cudaMalloc or cudaFree.
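For illustration, the change suggested above could look roughly like the sketch below. It is not the actual libcudf code: the predicate `is_flagged` and the helper `all_flagged` are hypothetical stand-ins, and only `thrust::all_of`, `thrust::counting_iterator`, `rmm::cuda_stream_view`, and `rmm::exec_policy` are real library names. The point is that the stream is passed in explicitly and forwarded to Thrust through the RMM execution policy, so any temporary storage Thrust needs should come from the current RMM device memory resource.

```cuda
#include <rmm/cuda_stream_view.hpp>
#include <rmm/exec_policy.hpp>

#include <thrust/iterator/counting_iterator.h>
#include <thrust/logical.h>

// Hypothetical stand-in for the per-column check; not the libcudf predicate.
struct is_flagged {
  bool const* flags;  // device pointer, one entry per column
  __device__ bool operator()(int i) const { return flags[i]; }
};

// Hypothetical helper: the caller supplies the stream explicitly, and it is
// forwarded to Thrust via rmm::exec_policy instead of relying on Thrust's
// default dispatch (no execution policy argument).
bool all_flagged(bool const* d_flags, int num_columns, rmm::cuda_stream_view stream)
{
  return thrust::all_of(rmm::exec_policy(stream),
                        thrust::counting_iterator<int>(0),
                        thrust::counting_iterator<int>(num_columns),
                        is_flagged{d_flags});
}
```

Compiling this needs nvcc with the RMM and Thrust headers available. Every caller of such a helper would also have to pass its stream down, which is the API change the report says is required.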

harrism commented 2 years ago

Yowza. That function should take an explicit stream.

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

jlowe commented 2 years ago

Still relevant and would like to see this fixed.