awslabs / amazon-dynamodb-lock-client

The AmazonDynamoDBLockClient is a general purpose distributed locking library built on top of DynamoDB. It supports both coarse-grained and fine-grained locking.
Other
472 stars 85 forks source link

Document limitations of locks issued #62

Open tobyhei opened 3 years ago

tobyhei commented 3 years ago

The documentation states that e.g if used for leader election the lock can be relied upon to ensure there is only one leader.

As per https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html which is referenced in Amazons own library https://aws.amazon.com/builders-library/leader-election-in-distributed-systems/ this isn’t correct without a fence using a monotonic version number.

This library seems to only use a UUID for the record version meaning clients can’t implement a fence correctly.

Could the documentation either mention thIs limitation or mention how a fence can be implemented by clients if I’ve missed something.

tobyhei commented 2 years ago

I now see this has already been discussed, but it doesn’t seem like a resolution was reached and instead the issue was locked. https://github.com/awslabs/amazon-dynamodb-lock-client/issues/1

jpo-tu commented 2 years ago

https://github.com/awslabs deleted a comment from fernomac [on Oct 12, 2017]

Ya what happened in that thread for it to just go dead.

jpo-tu commented 2 years ago

referenced in Amazons own library https://aws.amazon.com/builders-library/leader-election-in-distributed-systems/

The paper even recommends to not do what this library does: "Avoid heartbeating leases in a background thread" yet that's what createHeartbeatBackgroundThread is for.

suitianshiwx commented 5 months ago

The documentation states that e.g if used for leader election the lock can be relied upon to ensure there is only one leader.

As per https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html which is referenced in Amazons own library https://aws.amazon.com/builders-library/leader-election-in-distributed-systems/ this isn’t correct without a fence using a monotonic version number.

This library seems to only use a UUID for the record version meaning clients can’t implement a fence correctly.

Could the documentation either mention thIs limitation or mention how a fence can be implemented by clients if I’ve missed something.

Yeah, this lock client actually targets "efficiency" instead of "correctness". if during the processing the lock expires, there's no way to check that. Even with the "fencing token" mentioned by Martin, there can still be race conditions because "validating the token" and "take the writes" are not atomic operations and don't happen instantaneously.

suitianshiwx commented 5 months ago

There's a way to ensure correctness, but the supported use case is very limited (e.g. by relying on a CAS provided by db system, can work iff lock and resource access are in the same table/row)