Currently, for Upsert tables, using implicit partitioned replica-group assignment from the low-level consumer does not persist the instance assignment (the mapping from partition to servers) to ZooKeeper, and newly added servers are automatically included without explicitly reassigning instances (usually done through a rebalance).
As an example, we create an Upsert table with BalanceNumSegmentAssignmentStrategy (2 replicas) on a 4-node tenant. The partitions can be assigned to
When adding one extra server without rebalancing the table, we started to see
The newServer hosts primary keys of partition0, but not all of that partition's primary keys are hosted on newServer, so it fails to look up existing primary keys during ingestion, which leads to duplicate keys and incorrect query results.
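To make the failure mode concrete, here is a minimal sketch of how an assignment that is recomputed from the live server list can shift when a server joins. The modulo-based spraying formula, the function name `implicit_assignment`, and the server names are assumptions for illustration only; this is not Pinot's actual implementation, and the partition that moves in a real cluster depends on the real strategy and instance ordering.

```python
def implicit_assignment(servers, num_partitions, num_replicas):
    """Recompute partition -> servers from the current server list.

    Nothing is persisted: the result depends only on which servers
    exist at the moment of the call (hypothetical round-robin formula).
    """
    ordered = sorted(servers)
    return {
        p: [ordered[(p * num_replicas + r) % len(ordered)]
            for r in range(num_replicas)]
        for p in range(num_partitions)
    }


# Hypothetical 4-node tenant, 4 partitions, 2 replicas.
servers = ["server0", "server1", "server2", "server3"]
before = implicit_assignment(servers, num_partitions=4, num_replicas=2)

# Tenant expansion without a rebalance: the mapping changes on its own.
after = implicit_assignment(servers + ["server4"], num_partitions=4, num_replicas=2)

# Servers that start hosting a partition they never consumed before.
# They hold none of that partition's earlier primary keys.
moved = {
    p: sorted(set(after[p]) - set(before[p]))
    for p in before
    if set(after[p]) != set(before[p])
}
print(before)
print(after)
print(moved)
```

In this sketch, any server listed in `moved` begins consuming a partition without the primary-key history built from that partition's earlier segments, which is the situation described above for newServer.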
The concern with implicit partitioned replica-group assignment is that adding a new node and rebalancing the table are not atomic operations. After a tenant expansion and before the table gets rebalanced, queries on the Upsert table return incorrect results.
Is there any reason or scenario that requires the current behavior of the implicit assignment? Should we change the implicit assignment behavior to match the explicit assignment?