qoollo / bob

Distributed BLOB storage
MIT License

Versioning data according to a timestamp provided by the user #735

Open ikopylov opened 1 year ago

ikopylov commented 1 year ago

In the current implementation, user timestamps are used only at the Bob level, within a single vdisk. They are completely ignored at the Pearl level and are not used between cluster nodes. For consistency, we need to implement timestamp-based version comparison everywhere. We cannot rely on the nodes' local timestamps, because in a distributed system server clocks can be out of sync and data can arrive with delays. A common practice is to use external timestamps provided by the user. In addition, timestamps can be used to optimize storage scans by filtering out BLOBs and holders that definitely contain only older data.
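A minimal sketch of the two ideas above, not Bob's or Pearl's actual API: picking the newest record by its user-supplied timestamp, and skipping holders that cannot contain anything newer. The types `Record` and `HolderInfo`, their fields, and both functions are hypothetical names introduced for illustration only.

```rust
/// Hypothetical record: a payload tagged with a user-supplied timestamp.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    /// Provided by the client, not taken from any node's local clock.
    timestamp: u64,
    data: Vec<u8>,
}

/// Hypothetical summary of a holder (or BLOB): the newest timestamp it stores.
#[derive(Debug, Clone, PartialEq)]
struct HolderInfo {
    id: u32,
    max_timestamp: u64,
}

/// Pick the record with the highest user timestamp among candidates that
/// may come from different layers (Bob, Pearl) or different cluster nodes.
fn newest(candidates: impl IntoIterator<Item = Record>) -> Option<Record> {
    candidates.into_iter().max_by_key(|r| r.timestamp)
}

/// Return the ids of holders that may still contain data at least as new as
/// `known`; holders whose newest timestamp is older than `known` are skipped.
fn holders_worth_scanning(holders: &[HolderInfo], known: u64) -> Vec<u32> {
    holders
        .iter()
        .filter(|h| h.max_timestamp >= known)
        .map(|h| h.id)
        .collect()
}
```

Because the comparison uses only the timestamp carried with the data, the result does not depend on which node answered first or on how far the nodes' clocks have drifted.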

Related issues: https://github.com/qoollo/bob/issues/694 https://github.com/qoollo/bob/issues/711 https://github.com/qoollo/bob/issues/708 https://github.com/qoollo/bob/issues/609

ikopylov commented 1 year ago

We can try to generalize versioning with the help of a new trait that selects the field, from the header or from the metadata, that is used as the version. This way we can perform versioning by any field, not only by timestamp.
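One possible shape for such a trait, as a rough sketch rather than a proposed final design. Everything here is hypothetical: `Entry`, `EntryHeader`, `EntryMeta`, the `VersionSelector` trait, and the two selectors exist only to illustrate the idea and do not match existing Bob or Pearl types.

```rust
/// Hypothetical header and metadata; field names are illustrative only.
struct EntryHeader {
    timestamp: u64,
}

struct EntryMeta {
    revision: u32,
}

struct Entry {
    header: EntryHeader,
    meta: EntryMeta,
}

/// Selects which field of an entry acts as its version.
trait VersionSelector {
    /// Any totally ordered type can serve as a version.
    type Version: Ord;

    fn version(&self, entry: &Entry) -> Self::Version;
}

/// Version taken from the header timestamp.
struct ByTimestamp;

impl VersionSelector for ByTimestamp {
    type Version = u64;

    fn version(&self, entry: &Entry) -> u64 {
        entry.header.timestamp
    }
}

/// Version taken from a metadata field instead of the timestamp.
struct ByRevision;

impl VersionSelector for ByRevision {
    type Version = u32;

    fn version(&self, entry: &Entry) -> u32 {
        entry.meta.revision
    }
}

/// Generic comparison: true if `candidate` has a strictly higher version
/// than `current` under the chosen selector.
fn is_newer<S: VersionSelector>(selector: &S, candidate: &Entry, current: &Entry) -> bool {
    selector.version(candidate) > selector.version(current)
}
```

With this layout the comparison logic is written once against `VersionSelector`, and switching from timestamps to another field only requires a different selector.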