rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0
8.09k stars 874 forks source link

[FEA] Expose `allocate_like` and other copying utilities that take a `rmm::stream_view` #11653

Closed harrism closed 1 year ago

harrism commented 1 year ago

Is your feature request related to a problem? Please describe. Currently utilities like allocate_like in the public cuDF interface do not take a stream. This is troublesome for non-cuDF libraries (like cuSpatial) that use these functions in stream-ordered code.

Describe the solution you'd like

Expose cudf::detail::allocate_like and other similar stream-ordered utilities in the cudf public interface.

Describe alternatives you've considered Calling cuDF detail APIs directly. This is a bit smelly for code outside of cuDF to do. Another option is to use the cudf::column copy constructor which does take a stream. But this is a lot of overhead when you don't need a copy of the data in the new column.

CC @bdice

github-actions[bot] commented 1 year ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

davidwendt commented 1 year ago

I feel like this is covered by https://github.com/rapidsai/cudf/issues/925 None of the libcudf APIs expose stream parameters. This issue just highlights a few of them. Recommend closing this in favor of #925

bdice commented 1 year ago

I agree this is a subset of #925. The current guidance I would give is to call the desired APIs and sync with cudf::default_stream_value.value() before using the result in another (user-provided) stream. See also #11853 (in progress) for stream policy guidance.

bdice commented 1 year ago

I'll close for now, but feel free to re-open if there is more to explore here @harrism.

harrism commented 1 year ago

@bdice synchronising the default stream is a big performance cost.

bdice commented 1 year ago

@harrism That is true. However, the work is being done on the default stream, necessitating that sync if the result is to be used on another non-blocking stream. Exposing user-provided stream parameters is a goal of the library, and this will be resolved as a part of #925.

vyasr commented 1 year ago

Calling cuDF detail APIs directly. This is a bit smelly for code outside of cuDF to do.

cuspatial already does this, though. We've made numerous improvements, and perhaps the situation has improved even further since I last looked (especially with the header-only APIs) but even if the C++ has largely removed such dependence (a quick search only shows uses of cudf::detail::make_counting_transform_iterator, cudf::detail::sort_by_key, and cudf::detail::allocate_like, which is pretty good) cuspatial Python definitely still relies heavily on rewriting cudf Python internals. Rather than making one-off changes in cudf to support cuspatial needs, I would rather see cuspatial's continued reliance on cudf internals be a forcing function that leads to cudf improving in the more systematic ways that it needs to, whether that means exposing streams across the board or something else. Adding a few stream-ordered methods to cudf's public API before we develop a coherent streams strategy seems like a worse outcome than cuspatial relying on cudf internals for a little while longer, especially since we're actively working towards resolving #925.

harrism commented 1 year ago

cuspatial already does this, though.

The only remaining one is cudf::detail::allocate_like. We no longer use cuDF's sort and we have a cuspatial::make_counting_transform_iterator -- I will file an issue to clean up any calls to cuDF ones. But we still need to call detail::allocate_like because of the stream issue.

I'm not suggesting cuDF shouldn't have a coherent stream solution. I am just assuming that copying utilities that accept a stream would be part of that solution.

vyasr commented 1 year ago

I think we are mostly all on the same page here. cudf needs a coherent stream solution, which will resolve #925 and enable what this issue is asking for as well. I think @davidwendt and @bdice closed this issue to indicate that there was no plan to expose a stream-ordered allocate_like API until we got around to exposing stream-ordered versions of all libcudf APIs. Until then, incurring the performance hit of a stream sync is the expected cost of using libcudf for any consumer, including libcuspatial.