facebookincubator / velox

A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
https://velox-lib.io/
Apache License 2.0
3.42k stars 1.12k forks

Spark's max_by/min_by does not support a complex compare type that contains a non-orderable type #8467

Open Yohahaha opened 7 months ago

Yohahaha commented 7 months ago

Bug description

Spark's max_by/min_by require the compare type to be orderable.

https://github.com/apache/spark/blob/b07bdea3616fc582a1242d3b47b465cd406c13c4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MaxByAndMinBy.scala#L49-L50

https://github.com/apache/spark/blob/b07bdea3616fc582a1242d3b47b465cd406c13c4/sql/api/src/main/scala/org/apache/spark/sql/catalyst/expressions/OrderUtils.scala#L25-L32

However, the current implementation does not have this check, so a complex compare type that contains a non-orderable type is not rejected. We need to fix it.
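
For illustration, here is a minimal sketch of what such a check could look like. It uses a toy type model, not Velox's actual `Type` classes, and every name in it (`ToyType`, `Kind`, `makeType`, `isOrderable`, `checkCompareType`) is hypothetical. It assumes the rule from the linked `OrderUtils` code: atomic types are orderable, arrays and structs are orderable only if all their element/field types are, and maps are not. User-defined types are left out.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy type model for illustration only (NOT Velox's Type hierarchy).
enum class Kind { kInteger, kVarchar, kArray, kMap, kRow };

struct ToyType {
  Kind kind;
  // Element type for arrays, key/value for maps, field types for rows.
  std::vector<std::shared_ptr<const ToyType>> children;
};
using ToyTypePtr = std::shared_ptr<const ToyType>;

// Hypothetical helper that builds a type node.
ToyTypePtr makeType(Kind kind, std::vector<ToyTypePtr> children = {}) {
  return std::make_shared<ToyType>(ToyType{kind, std::move(children)});
}

// Recursive orderability check: a complex type is orderable only if
// every nested type is orderable; maps are never orderable.
bool isOrderable(const ToyType& type) {
  switch (type.kind) {
    case Kind::kInteger:
    case Kind::kVarchar:
      return true;
    case Kind::kArray:
    case Kind::kRow:
      for (const auto& child : type.children) {
        if (!isOrderable(*child)) {
          return false;
        }
      }
      return true;
    case Kind::kMap:
      return false;
  }
  return false;
}

// Hypothetical validation step that a max_by/min_by implementation
// could run on the compare type before doing any aggregation.
void checkCompareType(const ToyType& compareType, const std::string& name) {
  if (!isOrderable(compareType)) {
    throw std::invalid_argument(
        name + ": compare type must be orderable");
  }
}
```

With this sketch, `ARRAY(INTEGER)` passes the check, while `ROW(INTEGER, MAP(INTEGER, INTEGER))` is rejected because of the nested map.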

System information

n/a

Relevant logs

No response

acvictor commented 7 months ago

@Yohahaha are you working on this?

Yohahaha commented 7 months ago

> @Yohahaha are you working on this?

Not yet, just recording the issue for now.

Yohahaha commented 6 months ago

https://github.com/oap-project/gluten/issues/4501