apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0
5.39k stars 1.26k forks source link

[partial-upsert] On Whether To Enforce CRC Check on Commit for Partial Upsert Tables #12399

Open ankitsultana opened 7 months ago

ankitsultana commented 7 months ago

Issue Description

Partial Upsert tables merge the previous version of a record with the latest version. We can end up with a scenario where the replicas diverge but end up getting committed anyways (both servers keep their local builds).

Once the replicas diverge, that Kafka partition's segments will always be different, until some event forces a reconciliation (e.g. if you restart all servers; since CRC mismatch will trigger a download from deep-store).

If there's no reconciliation for a while, then the situation can become messier because it could be that the other server gets to commit and upload to deepstore in a subsequent segment.

Moreover, after a reconciliation, it will give an illusion that the data has been consistent since forever,

Discussion

Given the criticality of ensuring consistency across replicas for Partial Upsert, should we consider enforcing some checks during commit time itself?

In case of different CRCs across replicas, we could emit a metric, and always pick the committer's segment. The controller could ask the replicas to discard their local copy.

Jackie-Jiang commented 6 months ago

I guess it is a good feature to add to the commit protocol. When a follower (not committing) server trying to build the segment locally and detect a CRC mismatch, it should drop the local copy and download one from deep store.