Closed misha closed 2 years ago
The issue with allowing nodes with unbounded future clock drifts is that one bad node with a date that's years in the future can effectively disable the "wall clock" part of the HLC, turning it into a simple logical clock.
Consider node A, B and C. Node A and B have correct times and are within a couple of seconds drift between each other. Node C, however comes in 1 year in the future, performs one transaction and is never seen again.
If you allow Node C to participate even once, you now have set the HLC to December 2022 causing A and B to start incrementing the logical counter for each change for an entire year. This can become a problem since the HLC only reserves 4 bytes for the logical counter which can easily overflow depending on your system's size and level of activity.
You bring up a interesting point though: I arbitrarily picked a max drift of 1 minute under the assumption that in the modern world every node will be synched to NTP. I'm now considering whether that value shouldn't be revised, or made configurable.
Unfortunately, the use case I'm considering would use a CRDT as an offline-available database, so synchronizing to NTP would not be feasible.
You are absolutely correct that an excessive, future clock drift defeats the purpose of the HLC. This made me question how the application data would react, especially if it stores dates. The more realistic approach is likely as you suggestion, preventing manipulation of the data in the presence of clock drift to begin with.
Thanks again for your comments - since this library's implementation was intended I'll go ahead and close my issue.
By the way, for my application I just needed the clock on its own so I published an implementation to Pub here.
I thought about publishing the HLC implementation as a separate packaged as well but decided against it since the implementation details will be particular to the CRDT it's used in.
Still, there are some existing implementations that I tried to follow in case my system had to interoperate with others in the future (e.g. https://github.com/jlongster/crdt-example-app/blob/master/shared/timestamp.js). I noticed you're using :
as a separator between the wall clock, logical counter, and node id. The implementations I saw while implementing my solution all used -
which makes sense since the wall clock already uses :
. Using -
makes it easier to break the HLC-string into parts.
An aside: did you base your code on my implementation? If so it would be nice to get a mention :-)
Hi! Thanks for this library. I've got a question about this library's
Hlc
class.From what I understand about HLCs, they can deal with the pathological case where one node sets the system time to point in the future. I was reading this article where the author states the following about the use of HLC in a CRDT:
Using this library, I wrote the following code:
Which yields:
Upon closer inspection, the code in
Hlc.recv
does not appear to match the one in the blog post mentioned above, and thus cannot deal with the pathological case of a future clock. Is this an implementation error?Please let me know if I misunderstood CRDTs, HLCs, or this library in any way. This is brand new territory for me.