Open ashvina opened 1 year ago
@the-other-tim-brown
Do instants uniquely identify commits in Hudi?
Looking at the code, the instants associated with `inFlightCommits` are persisted as part of `OneTableMetadata`. The on-disk representation is a comma-separated string of instants. IMO, OneTable should persist commit-ids.
This is not a high priority issue and can be picked up later.
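For illustration, the comma-separated representation described above could be round-tripped roughly as follows. This is a minimal sketch, not OneTable's actual code: the class `InstantListCodec` and its methods are hypothetical, and it assumes the instants are `java.time.Instant` values written in ISO-8601 form (which contains no commas).

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of OneTable.
final class InstantListCodec {
  private InstantListCodec() {}

  // Joins the instants into a single comma-separated ISO-8601 string.
  static String serialize(List<Instant> instants) {
    return instants.stream()
        .map(Instant::toString)
        .collect(Collectors.joining(","));
  }

  // Parses the string back into instants; null or blank input yields an empty list.
  static List<Instant> deserialize(String value) {
    if (value == null || value.trim().isEmpty()) {
      return Collections.emptyList();
    }
    return Arrays.stream(value.split(","))
        .map(String::trim)
        .map(Instant::parse)
        .collect(Collectors.toList());
  }
}
```

A string like this identifies points in time only. Persisting commit-ids instead would mean storing whatever identifier each table format uses for a commit, which is the trade-off discussed in the replies.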
We can look up an instant based on the instant. We wanted something that would be common between the formats.
@ashvina Like Tim said, the OneTable representation is format-agnostic, and an instant would best suit that. Any concerns or further thoughts?
Another related question: could we make the type of `pendingCommits` consistent with the type of `commitsToProcess`, i.e. use `COMMIT` instead of `Instant`? Would it break anything for Hudi? `COMMIT`s uniquely identify a completed or in-flight transaction and, if needed, can provide the start and end times of a commit. `Instant`, on the other hand, is a proxy for identifying a `COMMIT`.

_Originally posted by @ashvina in https://github.com/onetable-io/onetable/pull/129#discussion_r1375024492_