nayaverdier / dyntastic

A DynamoDB library on top of Pydantic and boto3.
MIT License
55 stars 12 forks source link

Clarifications on item updates #21

Closed whitfin closed 3 months ago

whitfin commented 3 months ago

Hi! Awesome library, been trying it out and have a couple of questions.

I made a simple class:

import uuid
from datetime import datetime

class Counter(Dyntastic):
    __table_name__ = "counters"
    __hash_key__ = "id"

    id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    value: int = 0
    created: datetime = Field(default_factory=datetime.now)
    updated: datetime = Field(default_factory=datetime.now)

And I made a Lambda handler that does this (the intent is to increment the value by 1):

def run(event: EventBridgeEvent, context: LambdaContext):
    counter = Counter.safe_get(event.detail["id"])

    if not counter:
        counter = Counter(id=event.detail["id"])
        counter.save()

    counter.update(
        A.value.plus(1),
        A.updated.set(datetime.now()),
    )

Although I see the counter being initialized in Dynamo, the updates don't seem to take effect. For reference this does work:

def run(event: EventBridgeEvent, context: LambdaContext):
    counter = Counter.safe_get(event.detail["id"])

    if not counter:
        counter = Counter(id=event.detail["id"])
        counter.save()

    counter.value += 1
    counter.updated = datetime.now()
    counter.save()

So I'm just wondering if maybe I'm misunderstanding something, or if I've stumbled on a bug. I'm also not clear on the difference between these two ways to update, I guess counter.update() is more efficient?

Any help you can provide here would be awesome, thank you for taking the time to read this over 🙏

nayaverdier commented 3 months ago

Looks like it's a mistake in the README, sorry about that. The update should use A.set(A.value.plus(1)) (equivalent to A.set(A.value + 1)).

As is, it should be throwing an exception like '_Plus' object has no attribute 'update_action', can you confirm whether you're getting that from your lambda?

When using .update(), the individiual updates get sent to dynamo as an UpdateExpression like the following:

{'UpdateExpression': 'SET #0 = #1 + :0, #2 = :1',
 'ExpressionAttributeNames': {'#0': 'value', '#1': 'value', '#2': 'updated'},
 'ExpressionAttributeValues': {':0': 1, ':1': '2024-06-07T19:35:24'}}

This is then evaluated by DynamoDB, which ensures the increment is done atomically. If you're curious to see how other updates translate to an UpdateExpression, you can use:

from dyntastic.attr import translate_updates
translate_updates(A.value.set(A.value.plus(1)), A.updated.set("..."))

When updating the attributes on the row instance and calling .save(), it would not take into account and changes between the get and the save. This includes any attributes that were added or removed from the row in DynamoDB, since all we have in memory is the row exactly as it was retrieved. So, if you are not guaranteed to have your updates coming from a single spot, it can cause data to be overwritten.

whitfin commented 3 months ago

@nayaverdier that makes perfect sense, I can try that out and report back shortly. I don't recall seeing any errors, but maybe I just overlooked them in CloudWatch.

Thanks for the explanation too, that's kinda what I figured would be the case but wanted to double check for future reference :)

Also thank you for documenting the multiple actions in the README too, I found that by accident :D

whitfin commented 3 months ago

Yep, this works perfectly :) going to close this out now. Thank you!

whitfin commented 3 months ago

@nayaverdier one follow up (didn't want to open up a new issue); is there a good way to update an item without fetching the full model? e.g. from my example say I have the counter identifier, but I don't want to fetch the counter before updating it (assume I know that the primary key exists in the table)

nayaverdier commented 3 months ago

Right now there's not a great way to do that, but it is possible.

# Only the primary key fields will be utilized in this example,
# but all non-default fields are necessary
row = MyTable(my_primary_key="...", ...)

# without refresh=False, this would trigger a `get` after updating
row.update(A.my_counter.set(A.my_counter + 1), refresh=False)

A bit annoying that all attributes need to be specified since they don't actually reflect the existing row. Feel free to open another issue if you have ideas for API changes to better support this use case.