[FEA] Alternate design for owning/non-owning comparators

jrhemstad commented 2 years ago

Is your feature request related to a problem? Please describe.

The new experimental row operators require non-trivial preprocessing that involves new allocations whose lifetime must be maintained while attempting to do any row-wise operations on the specified data.

To manage this, we introduced owning and non-owning comparator types.

For example, self_comparator is an owning type that handles doing row-wise operations on a single table.

self_comparator is not a binary callable object (it doesn't have an operator()). Creating the actual non-owning callable object is currently done through a member factory function of the owning type (renamed in https://github.com/rapidsai/cudf/pull/10870).

table_view input_table{...};
self_comparator s{input_table}; // "Expensive" construction that does necessary pre-processing and allocations
auto callable = s.less(); // "Cheap" factory that returns a binary callable suitable for passing to algorithms like thrust::sort
thrust::sort(..., callable);

After reviewing code using this functionality I've noticed that the callable being returned from a member of the owning type is a bit awkward. For instance the s.less() call above isn't immediately obvious that this is actually a factory returning a callable function object. One way to remedy that could be to call it s.make_less() instead, but I think there's an all together better way.

Describe the solution you'd like Inspired by std:: function objects like std::less/std::equal_to, I think we should make the callable objects currently being returned from functions like less()/less_equivalent()/equal_to() to instead be freestanding types that are constructible from the owning types instead of being returned from a factory of the owning type.

I think this would simplify the owning type as well as make the owner/viewer relationship more clear and explicit.

For example, the code above would become:

table_view input_table{...};
self_comparator s{input_table}; // "Expensive" construction that does necessary pre-processing and allocations
auto callable = cudf::row::less{s}; // "Cheap" construction that _views_ the internals of self_comparator
thrust::sort(..., callable);

Here's a high level sketch of how this idea could be implemented: https://godbolt.org/z/5fbK7PTb6

Salient points:

less is a standalone type constructible from the owning type
less is a friend of the owning type to access internals
less deletes constructions from an r-value ref of the owning type to prevent construction from a temporary of the owning type. If allowed, this would lead to dangling references.
For simplicity, this sketch has less::operator() just invoke the PhysicalComparator. The actual implementation would have more layers. It would be roughly equivalent to what the internals of the self_comparator::less factory above does today.

Additional Thoughts

I've come to believe that "comparator" is probably an inappropriate name for the owning types. It's not a comparator (not invokable), it just preprocesses and holds data needed by the actual comparators.

I don't have a good suggestion for a different name yet.

jrhemstad commented 2 years ago

@bdice @rwlee @devavret

jrhemstad commented 2 years ago

Staring at this some more and thinking about names for the "owning" comparators I realized that there is effectively no reason for them to exist.

Once you make the callables freestanding types, the only purpose of the owning comparators is to add the minor convenience of not needing to construct the preprocessed tables yourself. Otherwise we can eliminate them entirely.

// self
table_view input_table{...};
preprocessed_table processed_input{input_table}; // "Expensive" construction that does necessary pre-processing and allocations
auto callable = cudf::row::self::less{p}; // "Cheap" construction that _views_ the internals of the preprocessed table
thrust::sort(..., callable);
...
// two table
table_view lhs{...};
table_view rhs{...};
auto lhs_processed = preprocessed_table{lhs};
auto rhs_processed = preprocessed_table{rhs};
auto callable = cudf::row::two::less{lhs_processed, rhs_processed};
thrust::merge(..., callable);

I have an idea of how we can get rid of needing to be explicit about self vs two_table as well that I need to explore some more.

devavret commented 2 years ago

Staring at this some more and thinking about names for the "owning" comparators I realized that there is effectively no reason for them to exist.

This is how I was thinking of designing this originally.

ttnghia commented 2 years ago

I would prefer some other term like comparator_builder for the owning ones. So you create just one builder that provides any type of comparator that you need later on, or even to share its preprocessed data.

jrhemstad commented 2 years ago

Alright, here's how we can get rid of the owning comparator types all together (obviously keeping the preprocessed_table owning type) and get rid of the user-facing "two-table" vs "self" comparators.

https://godbolt.org/z/fz1veocdc

The main idea is to just infer "self" vs "two-table" based on how many tables are passed to the less constructor. Using deduction guides we can steer these two situations to two different partial specializations of the less template.

ttnghia commented 2 years ago

IMO, it looks more natural to do it this way:

auto builder = ...;
auto comp = builder.less(); // builder.equal();

than doing this way:

auto  lhs = preprocessed_data{...}; // <= what data? why preprocessed?
auto  rhs = preprocessed_data{...};
auto comp = less{lhs, rhs};

Maybe I missed some key insight here?

jrhemstad commented 2 years ago

Cuts both ways :)

auto builder = ...; //  what builder? Why build? 
auto comp = builder.less(); // less? Is this a predicate? builder is less than what?

Furthermore, the preprocessed_data was just a shorthand for the proof of concept. The real code would still keep the preprocessed_table nomenclature.

table_view lhs, rhs;
auto lhs_processed = preprocessed_table{lhs};
auto rhs_processed = preprocessed_table{rhs};
auto callable = less{lhs_processed, rhs_processed};

Also, if you want to reuse the same preprocessed tables with a builder it becomes even more code:

table_view lhs, rhs;
auto lhs_processed = preprocessed_table{lhs};
auto rhs_processed = preprocessed_table{rhs};
auto builder = build{lhs_processed, rhs_processed};
auto comp = builder.less();

All the "builder" does is hide the construction of the preprocessed_table in some situations. That does provide some minor convenience, but imo not enough to make it worth all the extra code and complexity of providing the builder.

Also, with a "builder" it's not clear what it's lifetime requirements are. Is it owning? How long do I need to keep it alive?

Constructing the preprocessed_table makes it explicit to the caller that this is an object that needs to be kept alive.

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

vyasr commented 2 years ago

One additional consideration that came up here during the implementation of the list lexicographic comparator is the need for separate code paths for when the compared tables contain nested data to avoid slowdowns of the non-nested data case due to the compiler's inability to sufficiently inline and optimize the complex code paths involving nested types. Whatever approach we take here will need to account for that as well.

rapidsai / cudf

[FEA] Alternate design for owning/non-owning comparators #11040