I was playing with `arrow2::compute::lexsort` and noticed in the source that `build_compare_fn` is not used initially when there is only a single sort column. It is used only as a fallback, if `sort_to_indices` fails (unsupported type).
This is sort of documented: the function documentation hints that `build_compare_fn` should be used for unsupported sort types, but it doesn't say that it is used only in that case. I would expect to be able to use this function for custom orderings in general. In my case, I don't support null keys, and I found that a custom compare fn that doesn't check nulls performs measurably better.
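To make the null-check point concrete, here is a minimal std-only sketch of the two kinds of comparator. It does not use arrow2 at all: `Column`, `compare_with_nulls`, `compare_no_nulls` and `sort_indices` are hypothetical names for illustration, and `Option<i32>` merely stands in for a nullable array slot.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a nullable column: `None` models a null slot.
type Column = Vec<Option<i32>>;

/// Null-aware comparator: nulls sort first, then values ascending.
fn compare_with_nulls(col: &Column) -> impl Fn(usize, usize) -> Ordering + '_ {
    move |i, j| match (col[i], col[j]) {
        (None, None) => Ordering::Equal,
        (None, Some(_)) => Ordering::Less,
        (Some(_), None) => Ordering::Greater,
        (Some(a), Some(b)) => a.cmp(&b),
    }
}

/// Null-free comparator: the caller guarantees there are no nulls,
/// so the values are compared directly with no validity branch.
fn compare_no_nulls(values: &[i32]) -> impl Fn(usize, usize) -> Ordering + '_ {
    move |i, j| values[i].cmp(&values[j])
}

/// Sorts the row indices `0..len` with the given index comparator (stable).
fn sort_indices(len: usize, cmp: impl Fn(usize, usize) -> Ordering) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..len).collect();
    idx.sort_by(|&a, &b| cmp(a, b));
    idx
}
```

Both comparators give the same order on null-free data. The second one just skips the per-comparison null branch, which is the kind of custom ordering I would like to be able to plug in.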
If the single-column optimization is useful (I guess so, since the implementation of `sort_to_indices` is fully generated for each type), then perhaps it should be done in `lexsort_to_indices` instead of `lexsort_to_indices_impl`?
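A rough sketch of the layering I have in mind, again std-only and not the actual arrow2 code: `lexsort_impl` plays the role of `lexsort_to_indices_impl`, `lexsort` the role of `lexsort_to_indices`, and `sort_single` the role of `sort_to_indices`. All names, types and signatures here are assumptions made up for the example.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a nullable column: `None` models a null slot.
type Column = Vec<Option<i32>>;

/// Stand-in for the type-specialised single-column sort.
fn sort_single(col: &Column) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..col.len()).collect();
    idx.sort_by_key(|&i| col[i]); // `None` orders before `Some(_)`
    idx
}

/// Default null-aware comparator builder.
fn default_compare(col: &Column) -> Box<dyn Fn(usize, usize) -> Ordering + '_> {
    Box::new(move |i, j| col[i].cmp(&col[j]))
}

/// Generic path: always uses the supplied builder, even for one column.
fn lexsort_impl<'a, F>(cols: &'a [Column], build: F) -> Vec<usize>
where
    F: Fn(&'a Column) -> Box<dyn Fn(usize, usize) -> Ordering + 'a>,
{
    let len = cols.first().map_or(0, |c| c.len());
    let cmps: Vec<_> = cols.iter().map(|c| build(c)).collect();
    let mut idx: Vec<usize> = (0..len).collect();
    idx.sort_by(|&a, &b| {
        // The first column that distinguishes the two rows decides the order.
        cmps.iter()
            .map(|cmp| cmp(a, b))
            .find(|o| *o != Ordering::Equal)
            .unwrap_or(Ordering::Equal)
    });
    idx
}

/// Public entry point: the single-column shortcut lives here.
fn lexsort(cols: &[Column]) -> Vec<usize> {
    match cols {
        [only] => sort_single(only),
        _ => lexsort_impl(cols, default_compare),
    }
}
```

With this split, callers of the default entry point keep the fast single-column path, while anyone calling the `_impl` variant with their own comparator builder gets that comparator applied regardless of the number of columns.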