toien opened this issue 1 month ago
`expire_snapshots` only removes data files that are no longer needed by any remaining snapshot. The output of your command shows no files needed to be removed. Since the snapshot count seems to have increased in your second request, I suspect your snapshots are newer than the default age cutoff for that command. `retain_last` is a minimum, not a maximum: if your `expire_snapshots` call doesn't explicitly state an age limit, it will only expire snapshots older than 5 days (I think; check the docs to be sure).
The bin-pack command output you shared shows it only found 2 files to compact.
Most of the time when folks have this issue, it is because they don't have enough small files in a given partition to trigger compaction. By default, the command only compacts files within a partition if there are 5 or more files in that partition that need compaction. See the docs for more info.
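For example, with Spark SQL you can lower that threshold via the `min-input-files` option (the catalog and table names below are placeholders):

```sql
-- Compact even when a partition has as few as 2 small files.
-- 'my_catalog' and 'db.sample' are hypothetical names.
CALL my_catalog.system.rewrite_data_files(
  table => 'db.sample',
  strategy => 'binpack',
  options => map('min-input-files', '2')
);
```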
The snapshot count increased because the Flink job is still writing data to the table.
In my opinion, it would be better if the docs clarified the "minimum" behavior of the `retain_last` parameter: it is the number of ancestor snapshots to preserve regardless of `older_than`.
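A call that makes both knobs explicit might look like this (names and the timestamp are placeholders): snapshots newer than `older_than` are kept, and at least `retain_last` ancestors survive even if they are older.

```sql
-- Expire snapshots older than the given timestamp,
-- but always keep at least the 10 most recent ancestors.
CALL my_catalog.system.expire_snapshots(
  table => 'db.sample',
  older_than => TIMESTAMP '2024-06-01 00:00:00',
  retain_last => 10
);
```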
After doing some tests, I finally started to understand Iceberg's maintenance procedures. I hope this helps people who are new to Iceberg, like me.
`rewrite_data_files`
Rewrite data files is a procedure that reads small source files, compacts them, and writes new ones. It does not delete the old small files: data files, as the leaf level of the Iceberg table layout, belong to manifest files, and deleting the source small files would break their manifest files. This procedure optimizes data files (usually by merging them) and creates a new version (snapshot) of the table.
`rewrite_manifests`
Unlike data files, `rewrite_manifests` replaces the old manifest files. This procedure optimizes manifest files (usually by merging them) and creates a new version (snapshot) of the table.
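A minimal invocation, assuming a hypothetical catalog and table:

```sql
-- Merge small manifest files into fewer, larger ones;
-- this also creates a new snapshot of the table.
CALL my_catalog.system.rewrite_manifests(table => 'db.sample');
```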
`expire_snapshots`
Always use the `older_than` parameter. If data files you expected to be deleted still remain in S3 or HDFS, recheck the metadata tables after executing the procedure; the files may still be referenced from the `manifests` or `entries` metadata tables.
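You can inspect those references through Iceberg's metadata tables (catalog and table names are placeholders):

```sql
-- Data files currently referenced by the table.
SELECT file_path FROM my_catalog.db.sample.files;

-- Manifest files of the current snapshot.
SELECT path, added_snapshot_id FROM my_catalog.db.sample.manifests;

-- Manifest entries and their status (0=EXISTING, 1=ADDED, 2=DELETED).
SELECT status, snapshot_id FROM my_catalog.db.sample.entries;
```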
Say we have a table upserted by Flink jobs, which creates a lot of data files and metadata. Executing these procedures hourly keeps the Iceberg table optimized:
- `rewrite_data_files`
- `rewrite_manifests`
For a partitioned table, say partitioned by day, also run:
- `expire_snapshots` on old partitions (this is a one-time job).
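Put together, such a maintenance job could be sketched as follows (all names and the timestamp are placeholders; the schedule and thresholds depend on your workload):

```sql
-- Run hourly: compact small data files, then compact manifests.
CALL my_catalog.system.rewrite_data_files(table => 'db.sample');
CALL my_catalog.system.rewrite_manifests(table => 'db.sample');

-- Run periodically: drop snapshots older than the cutoff so the
-- files rewritten above actually become deletable.
CALL my_catalog.system.expire_snapshots(
  table => 'db.sample',
  older_than => TIMESTAMP '2024-06-01 00:00:00'
);
```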
> Unlike data files, `rewrite_manifests` will replace old ones.
Actually, this procedure also just creates a new snapshot and keeps the old metadata files for the original snapshot. If you want to remove the old metadata files, you have to run the `expire_snapshots` procedure.
As you are using Flink to write data to an Iceberg table, you might want to follow https://github.com/orgs/apache/projects/358. This ongoing project aims to provide a Flink-specific solution for the problems mentioned above.
Query engine
Spark SQL on AWS EMR (7.1.0)
Versions:
Question
First I create an Iceberg table like:
Flink streaming jobs calculate results and upsert into this table, so many snapshots are created by Flink checkpoints:
Here is the problem: when I run `expire_snapshots` with Spark SQL, it does take time to execute the job, but nothing gets deleted!
And the data files are still on S3.
The Spark job finished successfully:
The same problem occurs when calling `rewrite_data_files`: the small data files are not compacted (merged).