apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0
2.48k stars 970 forks

[core] Delete created_from_tag and created_from_snapshot from branchTable #4159

Closed. herefree closed this 1 month ago

herefree commented 2 months ago

Purpose

created_from_snapshot should be the origin of the tag, not the earliest snapshot of the branch.

Linked issue: close #xxx

Tests

API and Format

Documentation

LinMingQiang commented 2 months ago

Is it possible that the branch's tag does not exist but the snapshot does?

herefree commented 2 months ago

> Is it possible that the branch's tag does not exist but the snapshot does?

Yes. If you create an empty branch and then insert some data into it, or if you create a branch from a tag and then delete the branch's tag, the snapshot will exist but the tag will not. In both cases created_from_tag will be null. I do not think we should keep created_from_snapshot, because we only support creating a branch from a tag. Snapshots expire quickly, so it is hard for us to know whether the earliest snapshot of the branch came from the main branch.
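The two cases above can be sketched with a toy Python model. This is only an illustration under stated assumptions, not Paimon code: the `Branch` class and the `create_branch` and `branch_row` helpers are hypothetical, and the sketch assumes that created_from_tag is read from the branch's own tag and that created_from_snapshot is the branch's earliest live snapshot, as described in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Branch:
    """Toy stand-in for a branch (hypothetical, not a Paimon class)."""
    tags: Dict[str, int] = field(default_factory=dict)   # tag name -> snapshot id
    snapshots: List[int] = field(default_factory=list)   # live snapshot ids


def create_branch(main_tags: Dict[str, int], from_tag: Optional[str] = None) -> Branch:
    """Create an empty branch, or one that copies a tag and its snapshot from main."""
    branch = Branch()
    if from_tag is not None:
        snapshot_id = main_tags[from_tag]
        branch.tags[from_tag] = snapshot_id
        branch.snapshots.append(snapshot_id)
    return branch


def branch_row(branch: Branch) -> dict:
    """Assumed derivation of the two columns discussed in this thread."""
    tag = next(iter(branch.tags), None)                    # None if no tag exists
    earliest = min(branch.snapshots) if branch.snapshots else None
    return {"created_from_tag": tag, "created_from_snapshot": earliest}


main_tags = {"tag1": 5}

# Case 1: empty branch, then data is inserted (snapshot 1 is written on the branch).
empty_branch = create_branch(main_tags)
empty_branch.snapshots.append(1)
print(branch_row(empty_branch))

# Case 2: branch created from a tag, then the branch's tag is deleted.
tagged_branch = create_branch(main_tags, "tag1")
del tagged_branch.tags["tag1"]
print(branch_row(tagged_branch))

# Later commits plus expiration of snapshot 5: the earliest snapshot is now
# one written on the branch itself, so it no longer points back to main.
tagged_branch.snapshots = [6, 7]
print(branch_row(tagged_branch))
```

In both cases the tag column is null while a snapshot is still reported, and after expiration the reported snapshot no longer says anything about where the branch came from.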

LinMingQiang commented 2 months ago

Can created_from_snapshot be the origin of the tag?

herefree commented 2 months ago

> Can created_from_snapshot be the origin of the tag?

That's okay, I'll modify it.

LinMingQiang commented 2 months ago

With the current architecture, we cannot accurately obtain the origin snapshot of the branch. Maybe we should open a discussion thread to discuss this issue.

herefree commented 2 months ago

> With the current architecture, we cannot accurately obtain the origin snapshot of the branch. Maybe we should open a discussion thread to discuss this issue.

I created an issue: #4166

JingsongLi commented 1 month ago

Can we just delete created_from_tag and created_from_snapshot first?

herefree commented 1 month ago

> Can we just delete created_from_tag and created_from_snapshot first?

I think that's OK, since we can't get the exact tag and snapshot. I'll change it.

herefree commented 1 month ago

> Can we just delete created_from_tag and created_from_snapshot first?

I have changed it.