risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
https://go.risingwave.com/slack
Apache License 2.0

Tracking: deprecate safe epoch and generalize time travel query #18214

Open wenym1 opened 3 weeks ago

wenym1 commented 3 weeks ago

Proposal

Generalize time travel query to all batch queries, which means that every batch query will be handled as a time travel query.

In a single HummockVersion, we only provide a single view at the committed epoch rather than views at all epochs between safe_epoch and committed_epoch, and as a result, we can then deprecate safe_epoch.

Moreover, we need to deprecate support for consistent barrier reads on uncommitted epochs.

Motivation

Currently, we have safe_epoch in HummockVersion to specify that, in this HummockVersion, it is safe to query any epoch above this safe_epoch. In other words, we support querying multiple versions of data under different epochs from a single HummockVersion. The reason for this feature is that each CN holds only a single latest HummockVersion (ignoring the versions pinned by already-created iterators), while in the frontend each session pins an epoch (PinnedSnapshot), and we want to serve queries from different pinned epochs with this single latest HummockVersion.
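To make the current behavior concrete, here is a minimal sketch of the check that safe_epoch implies. `VersionWithSafeEpoch` and `can_serve` are hypothetical names for illustration, not the actual RisingWave types, and treating both bounds as inclusive is an assumption.

```rust
/// Hypothetical, simplified stand-in for a HummockVersion that still
/// carries `safe_epoch`. The real struct has many more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct VersionWithSafeEpoch {
    pub safe_epoch: u64,
    pub committed_epoch: u64,
}

impl VersionWithSafeEpoch {
    /// Returns true if this single version can serve a committed read
    /// at `query_epoch`.
    /// Assumption: the valid range is [safe_epoch, committed_epoch],
    /// with both ends inclusive.
    pub fn can_serve(&self, query_epoch: u64) -> bool {
        self.safe_epoch <= query_epoch && query_epoch <= self.committed_epoch
    }
}
```

Under this model, every session whose pinned epoch falls inside the range is served by the same version, which is why the version has to keep multiple values of a key.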

This design makes the communication between frontend and CN elegant, but it comes at a price:

Now that we support time travel in batch queries, serving queries on different epochs no longer has to rely on a single hummock version. Instead, we can rebuild a hummock version for a specific epoch. Therefore, we can generalize time travel query to all batch queries: for every batch query, we first figure out a hummock version for the provided epoch, either by using the latest version or by rebuilding a new one, and then read data from that version. Each hummock version then no longer needs to store multiple versions of a key, and safe_epoch can be deprecated.
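The lookup described above could be sketched roughly as follows. All names here (`SingleViewVersion`, `VersionStore`, `Resolved`, `resolve`) are hypothetical, and the in-memory `history` map merely stands in for whatever mechanism actually rebuilds a historical version. The rule that a query epoch maps to the newest version with `committed_epoch <= query_epoch` is also an assumption, not something the proposal spells out.

```rust
use std::collections::BTreeMap;

/// Hypothetical version that exposes exactly one view: its committed epoch.
#[derive(Debug, Clone, PartialEq)]
pub struct SingleViewVersion {
    pub committed_epoch: u64,
}

/// How the version for a batch query was obtained.
#[derive(Debug, PartialEq)]
pub enum Resolved {
    /// The latest local version already matches the query epoch.
    Latest(SingleViewVersion),
    /// A historical version had to be rebuilt for the query epoch.
    Rebuilt(SingleViewVersion),
}

pub struct VersionStore {
    latest: SingleViewVersion,
    /// Assumed archive of older versions, keyed by committed epoch.
    history: BTreeMap<u64, SingleViewVersion>,
}

impl VersionStore {
    pub fn new(latest: SingleViewVersion, history: BTreeMap<u64, SingleViewVersion>) -> Self {
        Self { latest, history }
    }

    /// Resolve the version to read from for `query_epoch`.
    /// Returns None if the epoch is not yet committed or is older than
    /// anything retained in the archive.
    pub fn resolve(&self, query_epoch: u64) -> Option<Resolved> {
        if query_epoch == self.latest.committed_epoch {
            return Some(Resolved::Latest(self.latest.clone()));
        }
        if query_epoch > self.latest.committed_epoch {
            return None; // uncommitted epoch: not a time travel query
        }
        // Newest archived version whose committed epoch is <= query_epoch.
        self.history
            .range(..=query_epoch)
            .next_back()
            .map(|(_, v)| Resolved::Rebuilt(v.clone()))
    }
}
```

Because every resolved version carries a single view, no version in this sketch needs a safe_epoch field.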

Besides, we need to deprecate support for consistent barrier reads on uncommitted epochs. Currently, for an uncommitted barrier read, we pin an uncommitted non-checkpoint current epoch and use this epoch in the batch query. However, since the pinned epoch is a non-checkpoint epoch, once the following checkpoint epoch gets committed, the pinned epoch will be below the committed epoch. To support consistent queries on this epoch, the committed version would still have to maintain values of multiple versions between the committed epoch and the previous checkpoint epoch. To make things easier, we can still support barrier read, but the batch query of barrier read won't carry any epoch information anymore. The barrier read batch query always reads the latest uncommitted data of each table, and the consistency is ignored.
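One way to picture the resulting query shape is an enum in which only committed reads carry an epoch. This is an illustrative sketch only: `BatchRead` and its methods are hypothetical names, not the actual frontend or CN API.

```rust
/// Hypothetical description of what a batch query reads after the change.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BatchRead {
    /// Committed read, handled as a time travel query at this epoch.
    AtEpoch(u64),
    /// Barrier read: carries no epoch and reads the latest uncommitted
    /// data of each table.
    LatestUncommitted,
}

impl BatchRead {
    /// The epoch sent along with the query, if any.
    pub fn epoch(&self) -> Option<u64> {
        match self {
            BatchRead::AtEpoch(e) => Some(*e),
            BatchRead::LatestUncommitted => None,
        }
    }

    /// Whether the read sees one consistent snapshot across tables.
    pub fn is_consistent(&self) -> bool {
        matches!(self, BatchRead::AtEpoch(_))
    }
}
```

With no epoch attached to `LatestUncommitted`, a committed version never has to retain values for a non-checkpoint epoch below its committed epoch.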

Tracking

wenym1 commented 3 weeks ago

cc @hzxa21 @zwang28

hzxa21 commented 2 weeks ago

LGTM for the proposal in general!

> To make things easier, we can still support barrier read, but the batch query of barrier read won't carry any epoch information anymore. The barrier read batch query always reads the latest uncommitted data of each table, and the consistency is ignored.

+1. Sacrificing consistency for simplicity in the context of read-uncommitted queries sounds reasonable to me. cc @fuyufjh