We have several adaptors and algorithms which take a comparator, and require that the comparator produces a strict weak order over the sequence elements:
- find_min / find_max / find_minmax
- min / max / minmax
- sort
- set_union / set_difference / set_symmetric_difference / set_intersection
- cmp::min / cmp::max (for a sequence of two elements)
The comparator for all of these functions follows the classic STL model of requiring a binary predicate which returns true if the first argument is "less than" the second, defaulting to std::ranges::less. We use the standard library std::strict_weak_order concept, in most cases via our own strict_weak_order_for<Seq1, Seq2> concept which checks std::strict_weak_order for all combinations of the sequences' element_t, value_t& and common_element_t (the Flux equivalent of Ranges' std::indirect_strict_weak_order).
The outlier here is flux::compare(seq1, seq2, cmp), which requires the comparator to return a value of one of the three C++20 comparison categories (std::partial_ordering, std::weak_ordering or std::strong_ordering) and performs a lexicographical comparison of the sequence elements, returning that category -- that is, the same as std::lexicographical_compare_three_way.
First of all, this is inconsistent. We shouldn't have one algorithm with a completely different interface to all the others.
A simple solution would be to change our compare to be the equivalent of std::lexicographical_compare -- that is, require just a "less than" predicate like everything else.
A potentially better solution would be to consistently use three-way comparison everywhere -- specifically, requiring the comparator for all of the above algorithms (except compare) to return at least std::weak_ordering. After all, we're a C++20 library, we can do things the modern way!
As with everything though, there are pros and cons to doing this:
Pros:
- Correctness: at the moment, strict weak ordering is basically just a semantic requirement. Although a user could theoretically supply a custom comparator that returns std::weak_ordering but "lies" about it, this seems much less likely than accidentally passing an incorrect predicate.
- Proper handling of NaNs: right now, flux::sort(vector<float>) compiles but does the wrong thing if the data contains NaN values. With the proposed change (assuming std::compare_three_way as the default comparator), this will fail to compile. This forces the user to think about how they want to handle NaNs: either by passing std::strong_order to use the IEEE total ordering, or by using a custom comparator that ignores NaN values.
Cons:
- By default, only types with a spaceship operator could be used with the Flux algorithms. Of course, defaulting the operator is very easy in C++20, but there's probably a lot of code out there that only defines the classic relational operators.
- Potentially worse codegen: these algorithms only really care about less-than, but custom comparators (or op<=> implementations) might need to do two compares rather than one. It's probably worth benchmarking this. In particular, using std::strong_order to compare floats generates a lot more code than std::less.
- Ergonomics: right now it's easy to sort in descending order by saying sort(vec, std::greater{}). We'd probably want to provide a comparator that inverts ordering::greater and ordering::less, so the same thing is possible without the user needing to write a lambda themselves.
- Unfamiliarity: the STL has used less-than predicate comparators for 30 years, and now suddenly we're asking people to do something different. Likewise: "why can't I use flux::sort on my vector of doubles? std::sort works fine".
Overall, I think this is worth investigating -- at least, benchmarking what happens if we use our current sort with a comparator defined as (roughly):