Describe the bug, including details regarding any error messages, version, and platform.
Grouping on empty keys aggregates over the whole table (related to #14896). However, three hash aggregate functions have no corresponding scalar aggregate function: hash_distinct, hash_list, and hash_one. Grouping on empty keys with any of these raises an ArrowKeyError.
In []: table = pa.table({'key': list('aba'), 'value': [0, 1, 2]})
In []: table.group_by(['key']).aggregate([('value', 'list')])
Out[]:
pyarrow.Table
key: string
value_list: list<item: int64>
child 0, item: int64
----
key: [["a","b"]]
value_list: [[[0,2],[1]]]
In []: table.group_by([]).aggregate([('value', 'list')])
...
ArrowKeyError: No function registered with name: list
In []: table.group_by([]).aggregate([('value', 'min')])
Out[]:
pyarrow.Table
value_min: int64
----
value_min: [[0]]
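To illustrate why only these three aggregations fail, here is a simplified pure-Python model of the name lookup. It is a sketch, not pyarrow's actual dispatch code: the two registries are abbreviated, and `resolve_aggregate` and `missing_scalar_counterparts` are hypothetical helpers. It assumes that a non-empty key list selects the `hash_`-prefixed kernel while an empty key list falls back to the un-prefixed scalar kernel.

```python
# Toy registries, NOT pyarrow's real ones: a few scalar kernels and
# their grouped ("hash_") counterparts, plus the three hash-only kernels.
SCALAR_AGGREGATES = {"min", "max", "sum", "count"}
HASH_AGGREGATES = {
    "hash_min", "hash_max", "hash_sum", "hash_count",
    "hash_distinct", "hash_list", "hash_one",
}


def resolve_aggregate(name, keys):
    """Return the kernel name a group-by would look up (hypothetical helper)."""
    # Assumption: with keys the grouped variant is used; with no keys the
    # aggregation runs as a scalar aggregate over the whole table.
    kernel = f"hash_{name}" if keys else name
    registry = HASH_AGGREGATES if keys else SCALAR_AGGREGATES
    if kernel not in registry:
        raise KeyError(f"No function registered with name: {kernel}")
    return kernel


def missing_scalar_counterparts():
    """List hash aggregates whose un-prefixed name has no scalar kernel."""
    prefix = "hash_"
    return sorted(
        h for h in HASH_AGGREGATES
        if h[len(prefix):] not in SCALAR_AGGREGATES
    )
```

In this model, `resolve_aggregate("list", ["key"])` resolves to `hash_list`, while `resolve_aggregate("list", [])` raises a key error because no scalar `list` kernel is registered, mirroring the session above.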
Component(s)
C++, Python