linkedin / venice

Venice, Derived Data Platform for Planet-Scale Workloads.
https://venicedb.org
BSD 2-Clause "Simplified" License

[server][dvc] support hybrid store in blob transfer with recreating snapshots #1240

Closed jingy-li closed 1 week ago

jingy-li commented 2 weeks ago

Blob transfer currently does not support hybrid stores because a snapshot is generated only once, after Kafka ingestion completes. That works for batch stores, but for hybrid stores it can result in outdated snapshots.

To resolve this issue, this PR will:

  1. Regenerate the snapshot if it is stale when a client sends a GET request.
  2. Ensure that the transferred offset record is the one captured immediately before the snapshot is recreated. If the snapshot manager instead generates the snapshot first and then sends the client the current offset record, the most recent (largest) offset in that record may not be covered by the snapshot.
  3. Throttle concurrent users. The blob snapshot manager maintains a concurrentSnapshotUsers map to limit the number of hosts that can initiate a snapshot simultaneously for each topic and partition. If too many hosts request a snapshot for the same topic and partition, the server responds with a 404 error. The maximum number of concurrent users is controlled by a config.
  4. Manage snapshot timestamps by implementing a config-controlled snapshot retention time to ensure the freshness of hybrid snapshots.
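
The four points above can be illustrated with a minimal Java sketch. This is not the code in this PR: the class `SnapshotManagerSketch`, its `acquire`/`release` methods, `SnapshotInfo`, `TooManyUsersException`, and the constructor parameters are all hypothetical names chosen for illustration. Only the `concurrentSnapshotUsers` map name comes from the description. The sketch assumes the caller supplies a way to read the current offset and that snapshot creation is a placeholder.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

/** Hypothetical sketch of the behavior described above; not Venice's actual implementation. */
final class SnapshotManagerSketch {
  /** What the requesting host receives: the offset captured before the snapshot was (re)created. */
  static final class SnapshotInfo {
    final long offset;
    final long createdAtMs;

    SnapshotInfo(long offset, long createdAtMs) {
      this.offset = offset;
      this.createdAtMs = createdAtMs;
    }
  }

  /** Signals throttling; a server could translate this into its error response. */
  static final class TooManyUsersException extends RuntimeException {
    TooManyUsersException(String key) {
      super("Too many concurrent snapshot users for " + key);
    }
  }

  private final Map<String, AtomicInteger> concurrentSnapshotUsers = new ConcurrentHashMap<>();
  private final Map<String, SnapshotInfo> snapshots = new ConcurrentHashMap<>();
  private final int maxConcurrentUsers; // assumed to come from config
  private final long retentionMs;       // assumed to come from config
  private final LongSupplier clock;     // injectable so tests can control time

  SnapshotManagerSketch(int maxConcurrentUsers, long retentionMs, LongSupplier clock) {
    this.maxConcurrentUsers = maxConcurrentUsers;
    this.retentionMs = retentionMs;
    this.clock = clock;
  }

  /** Called when a host requests a partition; returns a fresh-enough snapshot or throws if throttled. */
  SnapshotInfo acquire(String topic, int partition, LongSupplier currentOffset) {
    String key = topic + "_" + partition;
    AtomicInteger users = concurrentSnapshotUsers.computeIfAbsent(key, k -> new AtomicInteger());
    if (users.incrementAndGet() > maxConcurrentUsers) {
      users.decrementAndGet(); // undo our own increment before rejecting
      throw new TooManyUsersException(key);
    }
    try {
      // compute() runs atomically per key, so only one caller recreates a stale snapshot.
      return snapshots.compute(key, (k, existing) -> {
        long now = clock.getAsLong();
        if (existing != null && now - existing.createdAtMs < retentionMs) {
          return existing; // still within the retention time: reuse
        }
        long offset = currentOffset.getAsLong(); // read the offset BEFORE creating the snapshot
        createSnapshot(k);
        return new SnapshotInfo(offset, now);
      });
    } catch (RuntimeException e) {
      users.decrementAndGet();
      throw e;
    }
  }

  /** Called when the transfer to a host finishes or fails. */
  void release(String topic, int partition) {
    AtomicInteger users = concurrentSnapshotUsers.get(topic + "_" + partition);
    if (users != null) {
      users.decrementAndGet();
    }
  }

  /** Placeholder: a real server would checkpoint the storage partition here. */
  private void createSnapshot(String key) {
  }
}
```

Reading the offset before creating the snapshot means the snapshot may contain slightly more data than the offset claims, which is the safe direction: the receiver resumes from an offset the snapshot is guaranteed to cover. The sketch leaves out a case a real implementation would likely need to handle, namely avoiding recreation of a snapshot while another host is still reading the old one.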

How was this PR tested?

Unit tests. Integration tests are [WIP].

Does this PR introduce any user-facing changes?