apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Bug] 'tag.num-retained-max' is unavailable #3502

Closed xyk0930 closed 3 weeks ago

xyk0930 commented 1 month ago

Paimon version

0.9

Compute Engine

Spark 3.5.1

Minimal reproduce step

1. A Paimon table is created with 'tag.num-retained-max' = '10'. The complete table configuration is:

TBLPROPERTIES (
  'bucket' = '-1',
  'changelog-producer' = 'none',
  'deletion-vectors.enabled' = 'false',
  'dynamic-bucket.initial-buckets' = '20',
  'dynamic-bucket.target-row-num' = '2000000',
  'file.compression' = 'snappy',
  'file.format' = 'parquet',
  'ignore-delete' = 'false',
  'merge-engine' = 'deduplicate',
  'path' = 'hdfs://hadoop105:8020/paimon/warehouse/paimon.db/dws_follow_record',
  'primary-key' = 'id',
  'snapshot.expire.limit' = '10',
  'snapshot.num-retained.max' = '2147483647',
  'snapshot.num-retained.min' = '10',
  'snapshot.time-retained' = '1 h',
  'tag.num-retained-max' = '10')

2. I manually wrote some data and created tags. Here is the result of SELECT * FROM `dws_follow_record$tags`:

[image]

What doesn't meet your expectations?

I set 'tag.num-retained-max' = '10', so why does the query return 12 tags?

Anything else?

No response

xyk0930 commented 1 month ago

I create the tags with the Spark procedure, building the statement as "CALL sys.create_tag(table => '" + tableName + "', tag => '" + tag + "')"

Zouxxyy commented 1 month ago

Currently, only automatically created tags (tag.automatic-creation) are controlled by tag.num-retained-max; manually created tags are not limited by it. I think we should fix it.
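
Until that is fixed, one possible workaround is to prune manually created tags yourself. Below is a minimal sketch, not Paimon's own expiration logic. It assumes you have already read (tag_name, snapshot_id) pairs from the $tags system table, that a lower snapshot id means an older tag, and that the sys.delete_tag Spark procedure is available in your version. The helper names and the table name paimon.dws_follow_record are hypothetical. The code only builds SQL strings; it does not execute anything.

```python
from typing import List, Tuple


def tags_to_delete(tags: List[Tuple[str, int]], num_retained_max: int) -> List[str]:
    """Return the names of the oldest tags that exceed num_retained_max.

    tags: (tag_name, snapshot_id) pairs. Assumption: a lower snapshot_id
    means an older tag.
    """
    if num_retained_max < 0:
        raise ValueError("num_retained_max must be non-negative")
    ordered = sorted(tags, key=lambda t: t[1])  # oldest first
    excess = len(ordered) - num_retained_max
    if excess <= 0:
        return []
    return [name for name, _ in ordered[:excess]]


def delete_statements(table: str, tag_names: List[str]) -> List[str]:
    """Build one CALL sys.delete_tag statement per tag name (strings only)."""
    return [
        f"CALL sys.delete_tag(table => '{table}', tag => '{name}')"
        for name in tag_names
    ]


if __name__ == "__main__":
    # Hypothetical data: 12 tags, as in the report, with a limit of 10.
    sample = [(f"tag-{i}", i) for i in range(1, 13)]
    for stmt in delete_statements("paimon.dws_follow_record", tags_to_delete(sample, 10)):
        print(stmt)
```

With the 12 sample tags and a limit of 10, this produces two statements, for tag-1 and tag-2. Each resulting string could then be passed to spark.sql(...) after checking that the listed tags are really the ones you want to drop.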

Zouxxyy commented 1 month ago

CC @yuzelin