loro-dev / loro

Reimagine state management with CRDTs. Make your app collaborative effortlessly.
https://loro.dev
MIT License
4.08k stars 76 forks source link

Question: Hybrid Logical Clock instead of Lamport timestamps? #424

Closed zamderax closed 2 months ago

zamderax commented 2 months ago

I've been reading the documentation on Maps here: https://loro.dev/docs/tutorial/map

And it states:

Loro's Map uses LWW (Last-Write-Wins) semantics. When concurrent edits conflict, it compares Lamport logic timestamps to determine the winner.

I am interested in exploring the possibility of implementing the map using Hybrid Logical Clocks (HLC) instead of Lamport timestamps. I believe HLCs could offer several benefits over Lamport timestamps, particularly in systems that require more querying on physical time.

By incorporating physical time, HLCs can reduce the chances of generating conflicting timestamps in distributed systems where operations are frequently concurrent. This might lead to fewer conflicts in the map operations, making it a more robust solution for certain use cases.

Do you have any suggestions of how to implement this?

zxch3n commented 2 months ago

Hi! Recording and querying real-world timestamps for ops are already supported in Loro.

See https://loro.dev/docs/advanced/timestamp

There are APIs such as

/**
* Set whether to record the timestamp of each change. Default is `false`.
*
* If enabled, the Unix timestamp will be recorded for each change automatically.
*
* You can also set each timestamp manually when you commit a change.
* The timestamp manually set will override the automatic one.
*
* NOTE: Timestamps are forced to be in ascending order.
* If you commit a new change with a timestamp that is less than the existing one,
* the largest existing timestamp will be used instead.
* @param {boolean} auto_record
*/
  setRecordTimestamp(auto_record: boolean): void;
/**
* If two continuous local changes are within the interval, they will be merged into one change.
*
* The default value is 1_000_000, the default unit is milliseconds.
* @param {number} interval
*/
  setChangeMergeInterval(interval: number): void;

You can get a change by an op ID and then get the timestamp from it.

/**
* Get the change of a specific ID
* @param {{ peer: PeerID, counter: number }} id
* @returns {Change}
*/
  getChangeAt(id: { peer: PeerID, counter: number }): Change;

export interface Change {
    peer: PeerID,
    counter: number,
    lamport: number,
    length: number,
    timestamp: number,
    deps: OpId[],
}

We already handle the delta compression for the timestamp internally. We group the continuous operations into a single Change that shares this timestamp, which makes it more efficient than the HLC.

It could be helpful to have some helper functions that make querying the timestamp of a certain register (like a map entry, a span of text, or a certain list item) easier. Would you like to contribute that?

zamderax commented 2 months ago

Awesome thank you and I fully follow!