apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[spark] Support push down aggregate with group by partition column #4275

Closed ulysses-you closed 1 month ago

ulysses-you commented 2 months ago

Purpose

If all of an aggregate's group-by keys are partition columns, the aggregate can be pushed down. This lets users quickly get the row count of every partition, or of only the partitions that match a partition filter.
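As a rough illustration of the idea (not the code in this PR), the sketch below computes a per-partition `COUNT(*)` from file-level metadata alone, without reading any data rows. `FileMeta`, `count_by_partition`, and the sample values are hypothetical names made up for this example. It assumes each data file records its partition values and an exact row count.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple

# Partition values in partition-column order, e.g. ("2024-01-01",) for a dt column.
Partition = Tuple[str, ...]


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical per-file metadata: which partition the file is in and how many rows it holds."""
    partition: Partition
    row_count: int


def count_by_partition(
    files: Iterable[FileMeta],
    partition_filter: Optional[Callable[[Partition], bool]] = None,
) -> Dict[Partition, int]:
    """Answer SELECT <partition cols>, COUNT(*) ... GROUP BY <partition cols> from metadata only.

    Assumption: row_count is exact for every file (no pending deletes or merges).
    """
    counts: Dict[Partition, int] = defaultdict(int)
    for f in files:
        # An optional partition filter prunes whole files before they are counted.
        if partition_filter is not None and not partition_filter(f.partition):
            continue
        counts[f.partition] += f.row_count
    return dict(counts)


# Made-up sample metadata: two files in one partition, one file in another.
files = [
    FileMeta(("2024-01-01",), 100),
    FileMeta(("2024-01-01",), 50),
    FileMeta(("2024-01-02",), 70),
]

all_counts = count_by_partition(files)
filtered_counts = count_by_partition(files, lambda p: p[0] >= "2024-01-02")
```

Because every group-by key is a partition column, each file contributes to exactly one group, so summing row counts per partition is enough. A group-by on a non-partition column would need the row data.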

Tests

Added tests.

API and Format

no

Documentation

ulysses-you commented 1 month ago

cc @JingsongLi thank you

YannByron commented 1 month ago

Link to https://github.com/apache/paimon/issues/2404.