apache / accumulo

Apache Accumulo
https://accumulo.apache.org
Apache License 2.0

Update FATE transaction ids to be globally unique across multiple stores #4044

Closed cshannon closed 6 months ago

cshannon commented 9 months ago

While working on changes for #3559 to store FATE operations inside an Accumulo table, I realized that we will need some way to track FATE transactions globally. After we update FATE to store operations in Accumulo, we will still need the ZK store for FATE operations on the root and metadata tables, for example. We might also want multiple Accumulo stores depending on the design, such as storing user operations and system operations in different tables.

Regardless of how the exact design turns out, we are going to need multiple stores. Right now FATE transaction ids are unique only within a single store, so we need some way to make them unique across stores. At the same time, we could refactor the transaction id into a better id type than just a long.

keith-turner commented 9 months ago

This can help prevent bugs like creating a FATE operation in FATE instance A and then trying to use its id in FATE instance B. It could also be useful for debugging if, when a FATE id is logged, it's easy to see which FATE instance the id came from.

kevinrr888 commented 8 months ago

I would like to work on this

kevinrr888 commented 7 months ago

The end goal is to have the stronger type FateId replace the current representation of a transaction id (which is just a long). This was prompted by the addition of the AccumuloStore class: there are now two FATE instance types that a transaction can be associated with, META (for ZooStore) or USER (for AccumuloStore). FateId is a new class that includes the FateInstanceType and the transaction id.
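To make the idea concrete, here is a minimal sketch of what a typed id pairing an instance type with a transaction id might look like. This is not the actual Accumulo code: the names `FateIdSketch` and `FateInstanceTypeSketch`, the `FATE:<TYPE>:<hex>` text form, and all method names are assumptions made for illustration only.

```java
import java.util.Objects;

// Hypothetical stand-in for FateInstanceType: META for ZooStore, USER for AccumuloStore.
enum FateInstanceTypeSketch {
  META, USER
}

// Hypothetical stand-in for FateId: an immutable (instance type, transaction id) pair.
final class FateIdSketch {
  private final FateInstanceTypeSketch type;
  private final long tid;

  private FateIdSketch(FateInstanceTypeSketch type, long tid) {
    this.type = Objects.requireNonNull(type, "type");
    this.tid = tid;
  }

  static FateIdSketch from(FateInstanceTypeSketch type, long tid) {
    return new FateIdSketch(type, tid);
  }

  FateInstanceTypeSketch getType() {
    return type;
  }

  long getTid() {
    return tid;
  }

  // Assumed text form for logs: FATE:<TYPE>:<16 hex digits>.
  String canonical() {
    return String.format("FATE:%s:%016x", type, tid);
  }

  // Parses the assumed text form back into an id.
  static FateIdSketch parse(String s) {
    String[] parts = s.split(":");
    if (parts.length != 3 || !parts[0].equals("FATE")) {
      throw new IllegalArgumentException("Not a FATE id: " + s);
    }
    return new FateIdSketch(FateInstanceTypeSketch.valueOf(parts[1]),
        Long.parseUnsignedLong(parts[2], 16));
  }

  // Two ids are equal only if both the instance type and the transaction id match,
  // so the same long in two different stores yields two distinct ids.
  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof FateIdSketch)) {
      return false;
    }
    FateIdSketch other = (FateIdSketch) o;
    return tid == other.tid && type == other.type;
  }

  @Override
  public int hashCode() {
    return Objects.hash(type, tid);
  }

  @Override
  public String toString() {
    return canonical();
  }
}
```

Because the instance type is part of equality and of the logged text, an id created in the META store can never be mistaken for a USER id that happens to carry the same long, and a log line shows at a glance which store the id came from.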

TODO list for this issue:

(the above have been completed and merged in by PR#4191)

(the above has been completed and merged in by PR#4228)

(the above has been completed and merged by PR#4247)

(the above has been completed and merged by PR#4258)

(the above has been completed and merged by PR#4350)

(the above has been completed and merged by PR#4370)

EdColeman commented 7 months ago

During this change, it would be desirable for the FATE transaction ids to natively sort by creation timestamp. That would allow for determining the order of operations solely by examining the FATE ids. This would apply to listing things in the fate store as well as looking at logs. If FATE_ID_1 < FATE_ID_2, then it can be immediately seen that FATE_ID_1 was created before FATE_ID_2 (for whatever timestamp precision is available).

One way to provide this could be to use UUIDs that conform to the emerging UUIDv7 standard. The gist of UUIDv7 is that they are 128-bit UUIDs that put the timestamp portion of the UUID first and random bits at the end, with other identifying info in the middle. There are UUIDv7-compatible variants that allow for sub-second timing information if that precision is wanted.
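As a rough illustration of the time-ordered layout described above, the sketch below builds a UUIDv7-style value by hand: a 48-bit millisecond timestamp, the 4-bit version (7), 12 random bits, the 2-bit variant, and 62 more random bits. The class and method names (`UuidV7Sketch`, `create`, `timestampMillis`) are hypothetical, and nothing in this thread says FATE adopted this scheme.

```java
import java.security.SecureRandom;
import java.util.UUID;

// Hypothetical helper that assembles a UUIDv7-style value from its bit fields.
final class UuidV7Sketch {
  private static final SecureRandom RANDOM = new SecureRandom();

  private UuidV7Sketch() {}

  // Builds an id whose leading 48 bits are the given Unix time in milliseconds.
  static UUID create(long epochMillis) {
    long randA = RANDOM.nextInt(1 << 12); // 12 random bits after the version
    long msb = ((epochMillis & 0xFFFFFFFFFFFFL) << 16) // 48-bit timestamp
        | (0x7L << 12) // version 7
        | randA;
    long randB = RANDOM.nextLong() & 0x3FFFFFFFFFFFFFFFL; // 62 random bits
    long lsb = 0x8000000000000000L | randB; // variant bits "10"
    return new UUID(msb, lsb);
  }

  static UUID create() {
    return create(System.currentTimeMillis());
  }

  // Recovers the creation time embedded in the first 48 bits.
  static long timestampMillis(UUID id) {
    return id.getMostSignificantBits() >>> 16;
  }
}
```

Since the timestamp occupies the leading bits and `UUID.toString()` is fixed-width lowercase hex, comparing the string forms orders ids by creation millisecond, which is the property wanted for listings and logs. Ids created within the same millisecond are ordered by their random bits, so this sketch gives no ordering guarantee below millisecond precision.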

kevinrr888 commented 6 months ago

All of the above TODOs have been completed. I believe this issue can be closed now.