Closed jepetko closed 4 years ago
Hi @jepetko ,
Wow! Tests \(^_^ )/! That's awesome! This module needs some of those, thank you!
With the use case you've outlined, it seems to me that the problem being solved is to guarantee a unique lock per product + product mutation combination. The way I would think of solving this problem is to concatenate product and product mutation to create the lock key. Using your example, if I want to lock product 1, mutation B, I would use `1-B` as the unique identifier for the lock.
Distributed locks are advisory only. One implication is that the interpretation of the lock name is always up to the clients and depends on an agreed-upon convention. Is there something that prohibits using the convention of "concatenate product with product mutation using the `-` character to create the lock name" in this use case?
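The concatenation convention described here could be sketched as follows (a hypothetical helper, not part of the module; the separator choice is an assumption):

```javascript
// Hypothetical helper illustrating the "concatenate with '-'" convention.
// The separator must not occur in either component, otherwise distinct
// (product, mutation) pairs could produce colliding lock names.
function lockNameFor(product, mutation) {
    return `${product}-${mutation}`;
}

// Locking product 1, mutation B:
const lockName = lockNameFor(1, "B"); // "1-B"
```

Since the lock is advisory, every client only has to agree on this helper (or an equivalent convention) for the per-mutation locking to work.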
Cheers,
Tristan
Tests! That is awesome!
When this is in, I'd love to start using node-tdd on top of mocha. I think there is a lot of value here in seeing what requests are actually being made. We can discuss all that separately though.
TL;DR: I don't need this feature, but having tests is awesome and I can't wait to be doing more with them 👌
hi @tristanls, @simlu ,
first of all, thank you for your feedback!
@tristanls
What you basically propose is that we should use a composed key to temporarily create an entirely new record in the table for locking purposes only. The actual record would still be "unlocked", since `.acquireLock` would only serve as a "programmatic" guard for operations on the table.
However, this wouldn't work because of the NOT NULL constraint on the range key. In our example, the lock attempt `1_B_LOCK` (for locking `hash_key=1` + `range_key=B`) would fail, while the attempt `1_LOCK`/`B_LOCK` would succeed:
| HASH_KEY (product) | RANGE_KEY (product mutation) | data | owner | guid | fencingToken | leaseDurationMs | (more columns) |
|---|---|---|---|---|---|---|---|
| 1 | A | {some: data, ...} | null | null | null | null | ... |
| 1 | B | {some: data, ...} | null | null | null | null | ... |
| 1 | C | {some: data, ...} | null | null | null | null | ... |
| 2 | D | {some: data, ...} | null | null | null | null | ... |
| 2 | E | {some: data, ...} | null | null | null | null | ... |
| 1_B_LOCK | null | null | root | abc | 1 | 10000 | ... |
| 1_LOCK | B_LOCK | null | root | abc | 1 | 10000 | ... |
The exception thrown by DynamoDB (in `get`) for the request

```
{
    "TableName": "my-lock-table-name",
    "Key": {
        "myPartitionKey": "lockId"
        // myRangeKey is missing here..
    },
    "ConsistentRead": true
}
```

is `The number of conditions on the keys is invalid`. So, obviously, the range key needs to be provided (I added a dedicated test for it, even though it tests behavior in the context of DynamoDB usage rather than the module's own functionality).
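On a table defined with both keys, the complete key would need to look something like the following (a sketch only; the attribute names are illustrative):

```javascript
// Sketch of a complete DocumentClient.get request for a table that has both
// a partition key and a range key. "myPartitionKey"/"myRangeKey" are
// illustrative attribute names, not the module's actual configuration.
const params = {
    TableName: "my-lock-table-name",
    Key: {
        myPartitionKey: "lockId",
        myRangeKey: "sortValue" // required on tables defined with a range key
    },
    ConsistentRead: true
};
```

Omitting `myRangeKey` on such a table is exactly what triggers the "number of conditions on the keys is invalid" error quoted above.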
PS: One possible solution could be to use a dedicated table for locking.
As far as I understand, `PutItem` (and, in the case of `DocumentClient`, `put`) does not do a partial update. On the first `FailOpen` lock refresh, the `data` will be overwritten with nothing (line 211, line 260). Additionally, the first time the application does a `PutItem` with `data`, all lock state is overwritten with nothing. I believe that to get a partial update, usage would need to change to `UpdateItem` (and, in the case of `DocumentClient`, `update`).
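A partial update along those lines might look like the following `update` parameters (a sketch only; the attribute names and values are assumptions, not the module's actual implementation):

```javascript
// Sketch: refresh only the lock attributes via an UpdateExpression, leaving
// the item's "data" attribute untouched. All names here are illustrative.
const updateParams = {
    TableName: "my-lock-table-name",
    Key: { myPartitionKey: "1_LOCK", myRangeKey: "B_LOCK" },
    UpdateExpression: "SET guid = :guid, leaseDurationMs = :lease",
    ConditionExpression: "guid = :expectedGuid", // only refresh a lock we own
    ExpressionAttributeValues: {
        ":guid": "def",
        ":lease": 10000,
        ":expectedGuid": "abc"
    }
};
```

Unlike `put`, an `update` with an explicit `UpdateExpression` only touches the named attributes, which is what makes co-locating lock state and data in one item conceivable at all.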
> PS: One possible solution could be to use a dedicated table for locking.

This has been my underlying assumption. I did not realize you wanted to maintain the locks along with the data in the same table. This complicates the design somewhat and would require the refresh-lock heartbeat to `PutItem` all of the data, not only the lock state; the same goes for the lock release logic (or a change to `UpdateItem`).
If the constraint is to maintain data and the lock state in the same table, then yes, that requires range key support.
Unless I am misunderstanding the code changes, it doesn't seem like the changes return non-lock data (the entire `Item` is present in `dataBag.lock` but never returned in the callback), in the example, the `data` column. As such, it still requires a separate round trip to return `data` once a lock is acquired:
FailOpen.acquireLock(...)
--> getItem (1 request/reply)
--> putItem (1 request/reply)
OtherCode.getDataFromDynamoDB(...) (1 request/reply)
(I am assuming that Conditional Writes are not sufficient for the use cases you have in mind.)
Exposing the original `Item` in the return callback has the benefit of saving a round trip to DynamoDB:
FailOpen.acquireLock(...)
--> (1 request/reply) getItem
--> (1 request/reply) putItem
At this point, you already have the data because of the first `getItem()`. The non-lock-related attributes of the `Item` (like `data` in the example), if they were exposed, become safe to use after `putItem()` succeeds and the callback returns the acquired lock.
Even with all of the above, this implies to me that your application code has to be aware of the details of the lock implementation and must never update your data without coordinating with any ongoing lock activity: for every `data` manipulation like `PutItem` (and maybe `DeleteItem`), you'd have to update the lock attributes correctly. Otherwise, the application would be restricted to only doing `UpdateItem` to ensure lock state is not overwritten, and you'd have to enforce reserved attribute names to prevent collisions with lock attributes.
Disclaimer: It took some time to do this write-up, but I did it mostly in a single pass. If I made a mistake somewhere and my reasoning and conclusions break down because of it, please do point it out.
As a consequence of the above, I recommend keeping locks and the data separate. However, I think I've outlined the changes necessary to get this to work if the same-table constraint is to be retained.
Hi @tristanls ,
yes, there is additional business logic for the locking on our side (sorry, it's TypeScript):
```typescript
public async retrieveProductDetails(req: GetProductRequest, retryNumber = 1): Promise<ProductResponse> {
    let lock: Lock;
    try {
        lock = await this.db.getLock(req.product, req.mutation);
        return await this._retrieveRemoteProductDetails(req);
    } catch (e) {
        if (!lock) {
            // Lock acquisition failed: wait out the lease (plus half of it as
            // a buffer), then check whether another worker refreshed the record.
            await this.db.blockingSleep(this.config.LOCK_LEASE_DURATION + this.config.LOCK_LEASE_DURATION / 2);
            const record = await this.db.read(req.product, req.mutation) as ProductRecord;
            const dateTime = moment(record.expirationTime).utc();
            if (dateTime.isAfter(moment().utc())) {
                return record.product;
            }
            if (retryNumber === 3) {
                return Promise.reject('max. number of locking trials exhausted');
            }
            return this.retrieveProductDetails(req, retryNumber + 1);
        }
        // The remote call itself failed while we held the lock: propagate.
        throw e;
    } finally {
        if (lock) {
            await this.db.releaseLock(lock);
        }
    }
}
```
Considerations:
- `this.db` is a client on top of the DynamoDB AWS SDK + dynamodb-lock-client
- `this.db.getLock` acquires a lock for the product/mutation combination. This does not mean that it overrides the actual record keeping the data of the product; it actually does the following: `.acquireLock([product + '_LOCK', mutation], callback)`
- `this._retrieveRemoteProductDetails` retrieves the product via HTTP and saves it in the DB (something like `this.db.write(req.product, req.mutation, <any data>)`)
- `this.db.blockingSleep` blocks before it tries to read the actual product record from the DB (which is not locked, because it co-exists with the dedicated lock record in the same table)

So, basically, yes, the actual data and the locks are kept in the same table (business requirement, infrastructural restrictions, etc.).
The extension of `dynamodb-lock-client` in the PR actually assumes that the user knows that the data gets overridden if `lockId === hash key of the saved entity`, because the actual data and the lock cannot be combined in one record. The added functionality only adds support for using the range key as such.
As I understand it, the forcing constraint is that the data and the lock state must live in the same table; with that being the case, and the table having both a hash and a range key, `dynamodb-lock-client` is unusable.
Thanks for helping me understand @jepetko .
Hi @tristanls,
thank you for your support! I don't know whether the `FailClosed` lock is widely used, but I could extend it accordingly as well. Of course, I can update the README too.
Further ideas:
Awesome teamwork. Thank you for that.
Hi @tristanls , the PR has already been merged; when do you plan to release it? As a temporary solution, I'm using a manipulated copy of this repo (in production) and would like to switch to the npm package. Thanks a lot. Let me know if I can contribute.
Thank you for your patience @jepetko . I integrated and published 0.7.1.
Please note that I modified the interface from an array to an object. I opted for explicit parameter naming in the style of DynamoDB.DocumentClient. It's an arbitrary change, but I wanted to call it out. Additionally, I use "sort key" instead of "range key", in line with what I think is the latest DynamoDB vocabulary.
I also used a test framework I'm comfortable with, so the quality of the tests now in the module may not be what you expect, since I made them mostly unit tests instead of relying on an in-memory db.
Hi @tristanls ,
thank you for this release!
I already migrated our stuff to the released npm package. I'm totally fine with switching from mocha to jest. I'm also fine with providing an object (range key, sort key) as configuration for the lock. It's actually much better than an array.
The reason why I was using a real database is the higher confidence when writing DynamoDB statements. I have already experienced bugs caused by bad statements despite passing tests, because mocking/stubbing/spying was used instead of the in-memory DB.
@tristanls I'd much prefer mocha, as we could use node-tdd and we could easily use recordings and hence visualize the requests that are actually being made. Would be best combined with an in-memory DynamoDB instance for "integration" style tests. Any particular reason you prefer jest?
I created #27 and #28 to track the need to reintroduce the in-memory db and to make recording available.
@jepetko are you experiencing bugs with v0.7.1 presently? or were you relaying your experience regarding why you prefer in-memory db?
Hi @tristanls, so far no suspicious behavior.
This will add support for range keys as part of the lock.
Use case: The use case is to design a 1:n relationship between entities. Let's say you want to use the product identifier as the hash key; the mutation of the product serves as the range key.
Since updating a product mutation requires a remote HTTP call, you will want to lock it for this period of time in order to avoid race conditions. Let's say the mutation update is related to mutation 1-B. In this case, neither 1-A nor 1-C needs to be locked.
The PR includes:
`acquireLock` still accepts a single lock ID as the first argument and a callback as the second (regression tests were written prior to the changes). If a range key needs to be passed, the first argument becomes an array with two elements.
Usage without range key:
Usage with range key:
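For illustration, the two call shapes described above might look like this (a sketch using a stub; `failOpen` stands in for a configured `FailOpen` client instance and only echoes its input):

```javascript
// Minimal stub standing in for a configured lock client, used only to show
// the two call shapes; a real client would issue getItem/putItem requests.
const failOpen = {
    acquireLock(id, callback) {
        callback(undefined, { id, release(cb) { cb(); } });
    }
};

// Without a range key: a single lock ID string.
failOpen.acquireLock("1_LOCK", (error, lock) => {
    // lock.id === "1_LOCK"
});

// With a range key: a two-element array [hashKeyValue, rangeKeyValue],
// per the PR description above.
failOpen.acquireLock(["1_LOCK", "B"], (error, lock) => {
    // lock.id deep-equals ["1_LOCK", "B"]
});
```

Note that in the released 0.7.1 this array interface was replaced by an object with explicit parameter naming, as discussed earlier in the thread.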
Missing points: