kitar / laravel-dynamodb

A DynamoDB based Eloquent model and Query builder for Laravel.
MIT License
179 stars 27 forks source link

where -> delete #32

Closed ryanhungate closed 1 year ago

ryanhungate commented 1 year ago

Is there support to delete more than one record at a time? You can scan the dataset using the where, but not delete?

kitar commented 1 year ago

In a short answer, we don't support it at this time.

In DynamoDB, we need to specify the entire primary key to delete an item, so we can't delete multiple items with a condition.

Alternatively, DynamoDB has BatchWriteItem or Transaction to handle multiple item deletions at once, but our library still needs to be implemented. (#33, #8)

ryanhungate commented 1 year ago

@kitar yeah i'm working on this right now on my end. I'll post the working code snippet once I figure that out ;)

kitar commented 1 year ago

@ryanhungate Cool! Thanks for working on it :)

ryanhungate commented 1 year ago

@kitar i'm not sure the best way to implement this in your library, but what I ended up doing was create a trait that can be used in the dynamo model which acts more as a helper for MY use case of needing to delete a set of events. In our case we have a primaryKey and a sortKey - where we're using a "store_id" for the primary... but then an "event_id" for the sort key. This is what I came up with.

<?php

namespace App\Traits;

/**
 * Trait HasDynamoConnection
 * @package App\Traits
 */
trait HasDynamoConnection {
    /**
     * @return array[]
     */
    public function getModelKeyForBatch()
    {
        $key = [];
        $key[$this->primaryKey] = ['S' => $this->getAttribute($this->primaryKey)];
        if ($this->sortKey) {
            $key[$this->sortKey] = ['S' => $this->getAttribute($this->sortKey)];
        }
        return ['Key' => $key];
    }

    /**
     * @return array[]
     */
    public function getDeleteRequestForBatch()
    {
        return [
            'DeleteRequest' => $this->getModelKeyForBatch()
        ];
    }

    /**
     * @param array $operations
     */
    public function batchDynamoOperations(array $operations)
    {
        $model = new static;
        $table = $model->getTable();
        $client = $model->getConnection()->getClient();
        // collect the operations, and reject anything that doesn't have a proper request type.
        collect($operations)->reject(function($operation) {
            return !is_array($operation) ||
                   (
                       !array_key_exists('DeleteRequest', $operation) &&
                       !array_key_exists('PutRequest', $operation)
                   );
        })->values()
            // chunk 25 records at a time since this seems to be the limit for batching.
          ->chunk(25)
          ->map(function($chunk) use ($table, $client) {
            return $client->batchWriteItem([
                'RequestItems' => [
                    $table => $chunk->all(),
                ],
            ]);
        });
    }

    /**
     * @return \Aws\DynamoDb\DynamoDbClient
     */
    public function getDynamoDbConnection()
    {
        return $this->getConnection()->getClient();
    }
}

And then use somewhere:


$operations = DynamoModelHere::keyCondition('identity_id', '=', 'ryan.test')->query()->map->getDeleteRequestForBatch()->all();

return (new DynamoModelHere)->batchDynamoOperations($operations);
kitar commented 1 year ago

@ryanhungate Thank you! I'll look into your trait to see what we can do with our library.

ryanhungate commented 1 year ago

@kitar this is great, thanks so much for adding support for this feature. For us, it's mandatory since our operations rely on larger datasets that we use Dynamo for a time based ( save to db, queue up an atomic lock job every minute, push to remote services, then delete ) which is done very regularly, and could easily have hundreds if not thousands of events. Batching the deletes are kind of important for rate limits alone.

That being said, would you consider adding the methods in the model to getDeleteRequestForBatch so from within a "collection" of results that are returned, we can just do this:

// get a collection of models
$operations = DynamoModelHere::keyCondition('identity_id', '=', 'ryan.test')->query();

... do whatever ...

// delete the models in bulk
DynamoModelHere::batchDeleteItem($operations->map->getDeleteRequestForBatch()->all());

It's not the end of the world but I would think having this sort of short option which could be kept in the model code might be beneficial so you're not reinventing the wheel on every delete request trying to map arrays to the proper index/sort names. Just an idea. :)

kitar commented 1 year ago

Thanks for looking! It's very helpful to know how real-world applications using this library.

Also, thanks for the idea. I think it might be useful, but I'm not sure yet if we should implement it into this library. The premise is that we want to keep this library simple, and the following way doesn't seem like much of a hassle.

DynamoModelHere::batchDeleteItem(
    $items->map(fn ($item) => $item->only(['PK', 'SK']))
);

Let me know your opinion :)

ryanhungate commented 1 year ago

@kitar sure that does reduce it to something manageable, i just personally like wrapping everything like this into a function so you don't have to type array keys or remember schemas etc. I find the Dynamo syntax to be a bit ugly :) No worries on this at all, I can just add a macro to the collection itself to make that easier for me.