apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[spark] Delete with deletion vectors should enable AQE #4171

Closed ulysses-you closed 2 months ago

ulysses-you commented 2 months ago

Purpose

AQE (adaptive query execution) does not break the required distribution and ordering, so there is no need to disable AQE manually.

Tests

Pass CI

API and Format

no

Documentation

ulysses-you commented 2 months ago

cc @JingsongLi @YannByron thank you

YannByron commented 2 months ago

Disabling AQE gives better performance when performing delete operations with deletion vectors (DV) on large amounts of data, but it can submit more tasks in other cases. Paimon can ignore this and leave it to users to tune.
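For anyone who wants to do that tuning by hand, here is a minimal sketch of what it could look like, not something this PR adds. It only builds the Spark SQL statements for a single delete. `spark.sql.adaptive.enabled` is the real Spark setting; the helper name `delete_statements`, the table `my_db.orders` and the predicate are hypothetical, and switching AQE back to `true` afterwards assumes it was enabled before.

```python
from typing import List

# Real Spark SQL setting that turns adaptive query execution on or off.
AQE_KEY = "spark.sql.adaptive.enabled"


def delete_statements(table: str, predicate: str, disable_aqe: bool) -> List[str]:
    """Build the Spark SQL statements for one DELETE (hypothetical helper).

    With disable_aqe=True the DELETE is wrapped in session-level SET
    statements so AQE is off only for that statement. Restoring the value
    to true is an assumption about the session's previous state.
    """
    delete = f"DELETE FROM {table} WHERE {predicate}"
    if not disable_aqe:
        return [delete]
    return [f"SET {AQE_KEY}=false", delete, f"SET {AQE_KEY}=true"]


if __name__ == "__main__":
    # Hypothetical table and predicate, for illustration only.
    for stmt in delete_statements("my_db.orders", "dt < '2024-01-01'", disable_aqe=True):
        print(stmt + ";")
```

The statements can be pasted into a `spark-sql` session or passed one at a time to `spark.sql(...)`. Whether disabling AQE actually helps depends on data volume, so it is worth measuring both variants on the real table.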

YannByron commented 2 months ago

+1

JingsongLi commented 2 months ago

+1