apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0
2.43k stars 956 forks

[core] throw exception if table is recreated while it is still being read #4445

Closed LsomeYeah closed 2 weeks ago

LsomeYeah commented 2 weeks ago

Purpose

Linked issue: close #xxx

Currently, a table that is being consumed by a streaming read does not throw an exception even if the table is recreated. For example, the reader may expect snapshot id 10001, while the latest snapshot of the recreated table is null or has an id much smaller than 10001.

This happens because, when fetching the next snapshot, a snapshot that does not exist in the filesystem and whose id is greater than the earliest snapshot id is treated as still being generated, so the reader keeps waiting for it.

This change adds a check in NextSnapshotFetcher#getNextSnapshot: if the next expected snapshot id is greater than the latest snapshot id plus one, an exception is thrown.
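
The check described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this PR: the class SnapshotGapCheck, its method names, the use of IllegalStateException, and the choice to treat a table with no snapshots as if its latest id were 0 are all assumptions made for the example.

```java
/** Hypothetical helper for illustration only; not part of Paimon's API. */
final class SnapshotGapCheck {

    private SnapshotGapCheck() {}

    /**
     * Returns true when the reader expects a snapshot id beyond "latest + 1",
     * which cannot happen if the table has only moved forward.
     *
     * @param nextSnapshotId the snapshot id the streaming reader expects next
     * @param latestSnapshotId the latest snapshot id on the filesystem, or null if none exists
     */
    static boolean looksRecreated(long nextSnapshotId, Long latestSnapshotId) {
        // Assumption: a table without any snapshot is treated as if its latest id were 0,
        // so a brand-new reader waiting for snapshot 1 is not flagged.
        long latest = latestSnapshotId == null ? 0L : latestSnapshotId;
        return nextSnapshotId > latest + 1;
    }

    /** Throws instead of letting the reader wait for a snapshot that is unlikely to appear. */
    static void checkNotRecreated(long nextSnapshotId, Long latestSnapshotId) {
        if (looksRecreated(nextSnapshotId, latestSnapshotId)) {
            throw new IllegalStateException(
                    String.format(
                            "Expected next snapshot id %d, but the latest snapshot id is %s. "
                                    + "The table may have been recreated.",
                            nextSnapshotId, latestSnapshotId));
        }
    }
}
```

With the numbers from the example above, checkNotRecreated(10001L, null) and checkNotRecreated(10001L, 3L) would both throw, while checkNotRecreated(11L, 10L) returns normally because snapshot 11 may simply not have been committed yet.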

Tests

PrimaryKeyFileStoreTableITCase#testRecreateTableWithException

API and Format

Documentation