brefphp / dynamap

DynamoDB object mapper. Like Doctrine or Eloquent, but for DynamoDB.
MIT License

RFC: Supporting advanced design patterns with Dynamap #1

Open nealio82 opened 5 years ago

nealio82 commented 5 years ago

I've spent a little bit of time looking at Dynamap and thinking about how to adapt it to support more advanced access patterns than a simple multi-table key/value store (e.g. encapsulating an entire application's data store within a single table, as AWS recommends).

Adding GSIs and LSIs to support m:m and o:m 'relationships' is probably the most daunting part of getting to grips with DynamoDB, and something which I think we can support relatively easily.

However, I think this will necessitate some deep changes in the way Dynamap currently works. Documentation will also need to be clear about how best to use / implement the design patterns with Dynamap / DynamoDB, including defining your access patterns before you start creating tables where possible, and also recommending best practices such as using UUIDs for identifiers.

I was thinking that we could automate the generation of GSIs and LSIs by using something similar to Doctrine association mapping, using annotations (or perhaps PHP array-based config similar to what the current version of Dynamap uses).

To do this, we'll probably need to create a simple schema generation tool and compare it against the table definition. The knock-on of this is that we'll also need to be able to create migrations of some sort (or maybe just a schema update tool).
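
As a rough illustration of the comparison step, here is a minimal sketch that diffs the index names the mapping wants against the ones already on the table. The function name `diffIndexes` and the index names are hypothetical and not part of Dynamap; in a real tool the existing list would presumably come from DynamoDB's DescribeTable response.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Dynamap): compare the index names
 * implied by the mapping with those present on the existing table.
 * Both inputs are plain lists of index names; fetching them is out of scope.
 */
function diffIndexes(array $desired, array $existing): array
{
    return [
        // Wanted by the mapping but missing from the table.
        'create' => array_values(array_diff($desired, $existing)),
        // Present on the table but no longer referenced by the mapping.
        'drop'   => array_values(array_diff($existing, $desired)),
    ];
}

// Example index names are made up for illustration.
$plan = diffIndexes(
    ['article_tag_gsi', 'author_article_lsi'],
    ['author_article_lsi', 'legacy_gsi']
);
```

One thing a migration tool would have to account for: DynamoDB only allows LSIs to be defined when the table is created, so an LSI in the 'create' list cannot be applied as an in-place update the way a GSI can.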

At the moment I think the design constraints mean we'd probably need to define one or more 'root' entities (eg Article), and allow relationship mapping between those and linked entities by creating a GSI or LSI depending on if the relationship is m:m or o:m. However, I'm not 100% certain on this. I need to do more reading / experimenting / talking to my rubber duck to see if it's actually the case.

I propose that we alter the hierarchy of the mapping to move the table element higher, in order to reinforce the notion that multiple entities can live inside the same wide-column table:

eg, to have:

$mapping = [
    'tables' => [
        'name' => 'articles',
        'mappings' => [
            Article::class => [
                'keys' => [
                    'id',
                ],
            ],
            Author::class => [
                'keys' => [
                    'id',
                ],
            ],
        ],
    ],
];

rather than:

$mapping = [
    Article::class => [
        'table' => 'articles',
        'keys' => [
            'id',
        ],
    ],
    Author::class => [
        'table' => 'articles',
        'keys' => [
            'id',
        ],
    ],
];

Adding relationships in (assuming a PHP array config rather than annotations at this moment in time), we could end up with something like this:

$config = [
    'tables' => [
        'name' => 'articles',
        'config' => [
            // Set AWS table options, eg: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ProvisionedThroughput.html
            'ProvisionedThroughput' => [
                'ReadCapacityUnits' => ...,
                'WriteCapacityUnits' => ...,
            ],
        ],
        'mappings' => [
            Article::class => [
                'keys' => [
                    'id',
                ],
                'relationships' => [
                    Author::class => 'manyToOne', // inverse of 'oneToMany' on Author below
                    Tag::class => 'manyToMany', // will create a GSI
                ],
            ],
            Author::class => [
                'keys' => [
                    'id',
                ],
                'relationships' => [
                    Article::class => 'oneToMany', // will create an LSI
                ],
            ],
            Tag::class => // ...
        ],
    ],
];
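
To make the idea concrete, here is a minimal sketch of how a tool might walk the 'relationships' entries and work out which indexes to generate. It follows the rule from the comments in the config above (manyToMany gives a GSI, oneToMany gives an LSI, manyToOne is the inverse side and adds nothing). The function `indexesFor`, the index naming scheme and the Tag mapping are all assumptions made for illustration; none of this exists in Dynamap today.

```php
<?php
declare(strict_types=1);

// Empty stand-in entities so the example is self-contained.
final class Article {}
final class Author {}
final class Tag {}

/**
 * Hypothetical helper: list the secondary indexes implied by the mapping.
 * Returns [index name => 'GSI'|'LSI']; the names are an invented convention.
 */
function indexesFor(array $mappings): array
{
    $indexes = [];
    foreach ($mappings as $owner => $entityConfig) {
        foreach ($entityConfig['relationships'] ?? [] as $target => $type) {
            if ($type === 'manyToMany') {
                // Sort the pair so both sides of a many-to-many yield one index.
                $pair = [$owner, $target];
                sort($pair);
                $indexes[implode('_', $pair) . '_gsi'] = 'GSI';
            } elseif ($type === 'oneToMany') {
                $indexes[$owner . '_' . $target . '_lsi'] = 'LSI';
            }
            // 'manyToOne' is the inverse of a oneToMany and needs no index of its own.
        }
    }
    ksort($indexes);

    return $indexes;
}

$mappings = [
    Article::class => [
        'keys' => ['id'],
        'relationships' => [Author::class => 'manyToOne', Tag::class => 'manyToMany'],
    ],
    Author::class => [
        'keys' => ['id'],
        'relationships' => [Article::class => 'oneToMany'],
    ],
    // Assumed shape for Tag, which the config above leaves elided.
    Tag::class => [
        'keys' => ['id'],
        'relationships' => [Article::class => 'manyToMany'],
    ],
];

$indexes = indexesFor($mappings);
```

Declaring the many-to-many on both Article and Tag still produces a single GSI here, because the pair of class names is sorted before the index name is built.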

Note: to support multiple entities in the same table, PK & SK fields will probably need to be prefixed with the entity name, e.g. Article-1319f90c-d8b2-46d9-ac2b-0255eee97374
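
A minimal sketch of what that prefixing could look like, assuming the short class name is used as the prefix and a hyphen as the separator (as in the example key above). `prefixKey` and `splitKey` are hypothetical names, not existing Dynamap functions, and the `App\Model` namespace is made up.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: build a stored key such as "Article-1319f90c-...". */
function prefixKey(string $entityClass, string $id): string
{
    // Use the short class name so namespaces do not leak into stored keys.
    $pos = strrpos($entityClass, '\\');
    $shortName = $pos === false ? $entityClass : substr($entityClass, $pos + 1);

    return $shortName . '-' . $id;
}

/** Hypothetical helper: split a stored key back into [entity name, id]. */
function splitKey(string $key): array
{
    // Limit to 2 parts: UUIDs contain hyphens, PHP class names cannot.
    $parts = explode('-', $key, 2);
    if (count($parts) !== 2 || $parts[0] === '' || $parts[1] === '') {
        throw new InvalidArgumentException("Malformed key: $key");
    }

    return $parts;
}

$key = prefixKey('App\\Model\\Article', '1319f90c-d8b2-46d9-ac2b-0255eee97374');
[$entity, $id] = splitKey($key);
```

Splitting on the first hyphen only is what lets the UUID survive the round trip. Two entities with the same short name in different namespaces would collide under this scheme, so a real implementation might want a configurable prefix instead.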

I was planning on looking at roave/better-reflection to reconstitute the objects after fetching from DynamoDB & to inspect properties for mapping, but at the moment I'm not sure if it's necessary or not.

Edit: for the time being I think I'll stick to PHP's own reflection API until I need better-reflection.

For reference, these videos talk about modelling relationships inside a single table with DynamoDB (both are well worth watching): https://www.youtube.com/watch?v=ziqm6q-JsGQ https://www.youtube.com/watch?v=HaEPXoXVf2k

WDYT?

mnapoli commented 5 years ago

First of all I'm very sorry to catch up only now. That sounds very interesting!

The way you present it makes total sense. One thing I would like to keep though (and maybe that's supported by what you offer): DynamoDB should stay usable as a "dumb" key-value store as well. Some small projects just need one table to store simple values. And learning about GSIs or LSIs can be a step too high at first.

$mapping = [
    'tables' => [
        'name' => 'articles',
        'mappings' => [
            Article::class => [
                'keys' => [
                    'id',
                ],
            ],
        ],
    ],
];

☝️ that seems doable with what you suggest so I guess it's fine!

I really like how you abstract the complexity away. I have to admit I should spend much more time learning about GSI/LSI and mapping entity relations…

nealio82 commented 4 years ago

It's my turn to apologise for leaving this for so long! I still plan to implement the above, but owing to external commitments I've completely run out of time to focus on this. I'll keep going as-and-when I can (maybe I'll find some time over Christmas) but I just can't give it as much attention as I'd like to at the moment. I just didn't want you to think I'd forgotten about / abandoned it... :D

mnapoli commented 4 years ago

No problem, it's the same for me! These things come and go when we have a need related to this. For the moment what already exists works, but I'm sure we'll move these things further in the future.

M1ke commented 4 years ago

Just stumbled across this and wondering if it's gone any further in your head/IDE @nealio82? I've played around with a similar concept after reading some of Alex DeBrie's work on single table DDB design; I also very much like to take advantage of language features in defining the rules of the data store, so I've written some abstract classes that require functions created for defining the object keys. However I've not had a real world project to actually push me to develop the single table concept properly.

I've looked around and basically every DDB mapper treats it like "oh it's Eloquent" (ActiveRecord really isn't a good fit when you're charged per write) or "oh it's Doctrine" (which I feel misses Doctrine's advantages of associations, and UnitOfWork is overkill when you have to write back the whole object anyway). Whereas really we need to admit DDB is intended as something different and make GSI/LSI a first class citizen of whatever library becomes the main one in this space.

That is to say, I'm interested in where you or @mnapoli might take this in future and would be up for contributing if it went in a more "real DDB" direction.

mnapoli commented 4 years ago

My current state of mind is to actually avoid making a project that tries to do it all. It's a mistake I've made too often.

So it may make more sense to have 2 (or more) projects:

That way each project can be optimized for its use case. And nothing prevents us from sharing a few bits if appropriate.