apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[core] Committer operator memory optimization #3592

Closed codeTai closed 1 week ago

codeTai commented 1 week ago

Purpose

Linked issue: close https://github.com/apache/paimon/issues/3590

When committing a snapshot triggers a full compaction of the manifest files, this change aims to reduce TaskManager heap memory usage.

Tests

For the comparison test, fine-grained resource management was enabled, and the committer operator ran in a separate TaskManager.
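
For anyone who wants to set up a similar comparison, here is a minimal configuration sketch. It is an assumption about how such a setup is usually done, not the exact configuration used in this test. It relies on Flink's `cluster.fine-grained-resource-management.enabled` option.

```yaml
# flink-conf.yaml (assumed setup, not taken from this PR)
# Enable Flink fine-grained resource management so that operators
# can declare their own slot resources.
cluster.fine-grained-resource-management.enabled: true
```

With this enabled, the committer's resources can be sized on their own through Paimon table options such as `'sink.committer-memory' = '300 mb'` and `'sink.committer-cpu' = '1.0'`. Both values are placeholders; the values used in this test are not stated here.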

Heap memory usage before optimization:

image

Heap memory usage after optimization:

image

API and Format

Documentation