There are places in the code base where a single operation performs multiple write interactions with ZK. If any of these interactions fails, or if the Pinot component fails between them, we end up in an inconsistent state. For example, this can happen at the end of segment commit, when a consuming segment gets committed. To clean up after such failures, we have a periodic task (the segment validation manager job) that looks for these inconsistencies and tries to fix them.
A better approach is to use the ZK Transaction API to prevent these inconsistencies in the first place. At the beginning of the operation, we can create a ZK transaction object and then use it to interact with ZK by:
creating a new ZNode
modifying an existing ZNode
deleting an existing ZNode
When all the ZK operations have been added, we commit them at once. If the commit succeeds, all of the operations have been applied; otherwise none of them is applied.
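As a rough illustration, here is a minimal sketch of what this could look like with the plain ZooKeeper Java client. It uses `ZooKeeper.multi(...)` with `Op` objects; `ZooKeeper.transaction()` offers the same operations in builder form with a final `commit()`. The class name, method names, and all ZNode paths are hypothetical and are not taken from the Pinot code base, and the choice of which writes belong to a segment commit is only an assumption for the example.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.OpResult;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Hypothetical sketch: groups several ZNode writes into one atomic batch.
public class SegmentCommitTxnSketch {

  // Collects the writes that must succeed or fail together.
  // All paths and payloads are placeholders chosen for illustration.
  static List<Op> buildCommitOps(String committedPath, byte[] committedMetadata,
      int expectedVersion, String newConsumingPath, byte[] newMetadata,
      String obsoletePath) {
    return Arrays.asList(
        // Modify an existing ZNode; fails if its version is not expectedVersion.
        Op.setData(committedPath, committedMetadata, expectedVersion),
        // Create a new ZNode (open ACL used only to keep the sketch short).
        Op.create(newConsumingPath, newMetadata,
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT),
        // Delete an existing ZNode; -1 matches any version.
        Op.delete(obsoletePath, -1));
  }

  // Submits the batch in a single request. If any operation is rejected,
  // a KeeperException is thrown and none of the operations is applied.
  // Note: on a connection loss the outcome is unknown to the caller, so the
  // caller still has to re-read state before retrying.
  static List<OpResult> commitAtomically(ZooKeeper zk, List<Op> ops)
      throws KeeperException, InterruptedException {
    return zk.multi(ops);
  }
}
```

A caller would build the list once per operation and pass it to `commitAtomically` together with an already connected `ZooKeeper` handle. Passing the expected version to `setData` also guards against concurrent writers, since a version mismatch aborts the whole batch.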
From a brief look at the Helix APIs, it looks like Helix doesn't expose the ZK transaction APIs. Until Helix provides them, I think we should use the ZooKeeper client directly to leverage its transaction capabilities, which in turn reduces the chances of hitting the failure cases mentioned above. It will also help simplify the code that handles these edge cases, which keeps getting more complicated as new features are added.