Open llama90 opened 8 months ago
Dictionary Array sorting was implemented in #35280, but not for ChunkedArray, RecordBatch and Table. You can check out the SortIndicesMetaFunction class in vector_sort.cc for existing implementations, in particular the concrete RadixRecordBatchSorter and MultipleKeyRecordBatchSorter.
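As a rough illustration of what sorting a dictionary-encoded column involves, here is a minimal standard-library sketch. It does not use Arrow and is not taken from #35280 or vector_sort.cc; `DictColumn` and `SortIndices` are hypothetical names, and nulls are ignored. The idea is to rank the dictionary values once, then stable-sort the row positions by the rank of each row's index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical stand-in for a dictionary-encoded string column (not an Arrow
// type): the logical value of row i is dictionary[indices[i]].
struct DictColumn {
  std::vector<std::string> dictionary;
  std::vector<int32_t> indices;
};

// Returns the row positions that put the column in ascending order of its
// decoded values. The sort is stable, so equal values keep their row order.
std::vector<int64_t> SortIndices(const DictColumn& col) {
  const size_t dict_size = col.dictionary.size();

  // Step 1: order[k] is the dictionary slot holding the k-th smallest value.
  std::vector<int32_t> order(dict_size);
  std::iota(order.begin(), order.end(), 0);
  std::sort(order.begin(), order.end(), [&](int32_t a, int32_t b) {
    return col.dictionary[a] < col.dictionary[b];
  });

  // Step 2: rank[slot] is the sorted position of that slot. Duplicate
  // dictionary values share a rank so that they compare as equal.
  std::vector<int32_t> rank(dict_size);
  for (size_t k = 0; k < dict_size; ++k) {
    const bool same_as_previous =
        k > 0 && col.dictionary[order[k]] == col.dictionary[order[k - 1]];
    rank[order[k]] =
        same_as_previous ? rank[order[k - 1]] : static_cast<int32_t>(k);
  }

  // Step 3: stable-sort the row positions by rank, comparing integers only.
  for (int32_t index : col.indices) {
    assert(index >= 0 && static_cast<size_t>(index) < dict_size);
  }
  std::vector<int64_t> rows(col.indices.size());
  std::iota(rows.begin(), rows.end(), 0);
  std::stable_sort(rows.begin(), rows.end(), [&](int64_t a, int64_t b) {
    return rank[col.indices[a]] < rank[col.indices[b]];
  });
  return rows;
}
```

Ranking the dictionary first means the string comparisons happen once per distinct value rather than once per row comparison. A RecordBatch or Table sorter would additionally have to handle nulls, multiple sort keys, and chunks that may carry different dictionaries, none of which this sketch covers.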
Supporting equal for null types won't be hard. It should be added in scalar_compare.cc. A kernel should be added in the MakeCompareFunction function. For the sake of consistency, I'd assume the compare kernels should just return a NullArray for NullArray inputs? So I guess registering a trivial kernel that returns the first input should be enough?
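The reasoning behind "return the first input" can be sketched without Arrow. In the snippet below, `NullColumn`, `Equal`, and `EqualNullNull` are hypothetical names, not Arrow APIs. It assumes, as the comment above does, that a comparison involving a null yields null: under that assumption, comparing two all-null columns gives an all-null result of the same length, which is exactly the first input.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Element-wise equality with null propagation: if either side is null, the
// result is null; otherwise it is the boolean comparison of the two values.
std::vector<std::optional<bool>> Equal(
    const std::vector<std::optional<int>>& left,
    const std::vector<std::optional<int>>& right) {
  assert(left.size() == right.size());
  std::vector<std::optional<bool>> out(left.size());
  for (size_t i = 0; i < left.size(); ++i) {
    if (left[i].has_value() && right[i].has_value()) {
      out[i] = (*left[i] == *right[i]);
    }  // else: leave out[i] empty, i.e. null
  }
  return out;
}

// Hypothetical stand-in for an all-null array: it carries no values, only a
// length.
struct NullColumn {
  int64_t length;
};

// Trivial (null, null) case: every output slot would be null, so the result
// is another all-null column of the same length. Returning the first input
// is therefore sufficient.
NullColumn EqualNullNull(const NullColumn& left, const NullColumn& right) {
  assert(left.length == right.length);
  return left;
}
```

Whether the real kernel should produce a null-typed array or an all-null boolean array is the open question raised in the comment; this sketch only shows why the trivial version is consistent with null-propagating comparison.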
codegen_internal.h/cc contains utility functions for compute kernels, so they are probably not the real source of your problems. All the files I mentioned are also under cpp/src/arrow/compute/kernels/
Describe the enhancement requested
Hello. While implementing join operation support for the Dictionary type, I encountered the following message.
I am attempting to support the Dictionary type through the following steps:
1. non-key columns
2. key columns

I discovered the following error while taking step 2.
'_error_or_value55.status()' failed with Type error: Unsupported type for RecordBatch sorting: dictionary<values=string, indices=int32, ordered=0>
The detailed content is as follows.
It appears that this error occurs due to the absence of sorting operation implementation for the Dictionary type, which is observed in the process of verifying the result values after performing the join operation.
Additionally, I attempted to support key column operations for the Null type, but encountered a similar type of error in this case as well.
'_error_or_value45.status()' failed with NotImplemented: Function 'equal' has no kernel matching input types (null, null)
Following these two error messages led me to the files below:
Should I reference the logic in these files to implement the following functionalities, and then proceed with the join operation?
- dictionary type
- null types

The files mentioned above may not fundamentally be the problem, but it seems that type-specific operations are primarily needed to perform the join operation.
I am aiming to support sorting for the Dictionary Type to address the feature that triggers the error.
I would appreciate any advice if I am misunderstanding the problem, or input from anyone who is familiar with this part of the code.
Component(s)
C++