awslabs / aws-sdk-rust

AWS SDK for the Rust Programming Language
https://awslabs.github.io/aws-sdk-rust/
Apache License 2.0

High level DynamoDB client (or ODM) #70

Open Veetaha opened 3 years ago

Veetaha commented 3 years ago

Terminology

ODM - object-document mapper

Problem

The SDK currently implements only a low-level DynamoDB client. It works with opaque AttributeValue types and the direct DynamoDB APIs: https://github.com/awslabs/aws-sdk-rust/blob/7e43b19fd6fcc753bf5ceff4b2f5d13f6db799d8/sdk/dynamodb/src/model.rs#L6206-L6277

This low-level SDK crate provides no convenience APIs ("batteries") that simplify regular, idiomatic usage of DynamoDB (for example, following best practices or single table design).

Working with raw AttributeValues and implementing common workflows ad hoc is very inconvenient and error-prone.

Solution

I propose we add a new crate that wraps the low-level aws-sdk-dynamodb library and exposes the following "batteries-included" APIs (the list can be extended):

ODM

Implement serialization and deserialization (i.e. object-document mapping) of strongly-typed structs and enums (both plain and discriminated unions) into aws_sdk_dynamodb::AttributeValue via proc macros.

We can use serde to do the bulk of the job. I recommend taking over the work done in the serde_dynamo crate, and also learning from the approaches the dynomite crate takes.

The latter crate is more popular, but it is not very actively maintained. However, from my viewpoint, abusing serde as much as possible would be a better approach than implementing proc macros for converting between AttributeValue and strongly-typed structs and enums by hand, but that's debatable.
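To make the ODM idea concrete, here is a minimal sketch of the mapping that a derive macro or a serde-based serializer would produce. It relies on several assumptions: the `AttributeValue` enum below is a simplified local stand-in (not the SDK type), and the `DynamoItem` trait, the `get_s` helper, and the `age` field are hypothetical names used only for illustration. The implementation is written by hand so the generated logic is visible.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the SDK's `AttributeValue` (assumption: only a few
/// variants are modeled; the real type has more).
#[derive(Debug, Clone, PartialEq)]
pub enum AttributeValue {
    S(String),
    N(String), // DynamoDB represents numbers as strings on the wire
    L(Vec<AttributeValue>),
    M(HashMap<String, AttributeValue>),
}

pub type Item = HashMap<String, AttributeValue>;

/// Hypothetical trait that a proc macro or serde-based layer would implement.
pub trait DynamoItem: Sized {
    fn to_item(&self) -> Item;
    fn from_item(item: &Item) -> Result<Self, String>;
}

#[derive(Debug, Clone, PartialEq)]
pub struct UserRecord {
    pub partition_key: String,
    pub sort_key: String,
    pub age: u32,
    pub departments: Vec<String>,
}

/// Reads a string attribute, reporting a missing key or a wrong type.
fn get_s(item: &Item, key: &str) -> Result<String, String> {
    match item.get(key) {
        Some(AttributeValue::S(s)) => Ok(s.clone()),
        Some(other) => Err(format!("attribute `{}` is not a string: {:?}", key, other)),
        None => Err(format!("missing attribute `{}`", key)),
    }
}

impl DynamoItem for UserRecord {
    fn to_item(&self) -> Item {
        let departments: Vec<AttributeValue> =
            self.departments.iter().cloned().map(AttributeValue::S).collect();
        let mut item = Item::new();
        item.insert("partition_key".to_string(), AttributeValue::S(self.partition_key.clone()));
        item.insert("sort_key".to_string(), AttributeValue::S(self.sort_key.clone()));
        item.insert("age".to_string(), AttributeValue::N(self.age.to_string()));
        item.insert("departments".to_string(), AttributeValue::L(departments));
        item
    }

    fn from_item(item: &Item) -> Result<Self, String> {
        let age = match item.get("age") {
            Some(AttributeValue::N(n)) => n
                .parse::<u32>()
                .map_err(|e| format!("invalid `age`: {}", e))?,
            _ => return Err("missing or non-numeric attribute `age`".to_string()),
        };
        let departments = match item.get("departments") {
            Some(AttributeValue::L(values)) => values
                .iter()
                .map(|v| match v {
                    AttributeValue::S(s) => Ok(s.clone()),
                    other => Err(format!("non-string department: {:?}", other)),
                })
                .collect::<Result<Vec<_>, String>>()?,
            _ => return Err("missing or non-list attribute `departments`".to_string()),
        };
        Ok(UserRecord {
            partition_key: get_s(item, "partition_key")?,
            sort_key: get_s(item, "sort_key")?,
            age,
            departments,
        })
    }
}
```

With a serde-based approach, most of this boilerplate would presumably collapse into `#[derive(Serialize, Deserialize)]` plus a generic conversion function, which is the main argument for reusing serde here.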

Condition and update expression builder

Condition expressions are used in queries and scans, and update expressions are used in update operations; both use custom DynamoDB syntax. Raw strings built with the standard Rust format!() macro are fine for simple cases, but some expressions are highly dynamic and depend on many different variables and conditions.

Building raw condition expression syntax dynamically is very error-prone, so the high-level wrapper crate should expose builders for expressions. See TypeScript's implementation of this concept in the @aws/dynamodb-expressions package on npm.
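As a rough illustration of what such a builder could look like, the sketch below assembles a condition expression together with the `#name` and `:value` placeholder maps that DynamoDB expressions use. Everything here is hypothetical: `ConditionBuilder` and its methods are not an existing API, and values are kept as plain strings instead of `AttributeValue` to keep the example self-contained.

```rust
use std::collections::BTreeMap;

/// Hypothetical builder for DynamoDB condition expressions.
#[derive(Debug, Default)]
pub struct ConditionBuilder {
    clauses: Vec<String>,
    names: BTreeMap<String, String>,  // e.g. "#n0" -> "status"
    values: BTreeMap<String, String>, // e.g. ":v0" -> "active"
}

impl ConditionBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Registers an attribute name and a value, returning their placeholders.
    fn placeholders(&mut self, attribute: &str, value: &str) -> (String, String) {
        let index = self.values.len();
        let name_key = format!("#n{}", index);
        let value_key = format!(":v{}", index);
        self.names.insert(name_key.clone(), attribute.to_string());
        self.values.insert(value_key.clone(), value.to_string());
        (name_key, value_key)
    }

    /// Adds `attribute = value`.
    pub fn equals(mut self, attribute: &str, value: &str) -> Self {
        let (name, val) = self.placeholders(attribute, value);
        self.clauses.push(format!("{} = {}", name, val));
        self
    }

    /// Adds `begins_with(attribute, prefix)`.
    pub fn begins_with(mut self, attribute: &str, prefix: &str) -> Self {
        let (name, val) = self.placeholders(attribute, prefix);
        self.clauses.push(format!("begins_with({}, {})", name, val));
        self
    }

    /// Adds an equality clause only when a value is present, which models
    /// the "dynamic" case where a filter depends on optional input.
    pub fn equals_if_some(self, attribute: &str, value: Option<&str>) -> Self {
        match value {
            Some(v) => self.equals(attribute, v),
            None => self,
        }
    }

    /// Returns the expression string plus the name and value placeholder maps.
    pub fn build(self) -> (String, BTreeMap<String, String>, BTreeMap<String, String>) {
        (self.clauses.join(" AND "), self.names, self.values)
    }
}
```

Because every attribute name and value goes through a placeholder, callers never interpolate user input or reserved words into the expression string by hand, which is where most mistakes with `format!()` tend to come from.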

Projection expression utilities

Add some methods, or maybe proc macros, to generate the types that represent projections of different combinations of attributes. This requires some more thorough design, but the core problems to solve here:

The API might look something like:

#[derive(Serialize, Deserialize, Projections)]
struct UserRecord {
    partition_key: String,
    sort_key: String,

    #[project(Projection1)]
    name: String,

    // Come up with some syntax for nested properties projection (e.g. `ProjectionName = "<prop access>"`)
    #[project(Projection1, Projection2, Projection3 = "[0]")]
    departments: Vec<String>,

    #[project(Projection1, Projection2)]
    birth_date: chrono::NaiveDateTime,
}

// such that `#[derive(Projections)]` generates the following code:

#[derive(Serialize, Deserialize, Projection)]
struct Projection1 {
    name: String,
    departments: Vec<String>,
    birth_date: chrono::NaiveDateTime,
}

#[derive(Serialize, Deserialize, Projection)]
struct Projection2 {
    departments: Vec<String>,
    birth_date: chrono::NaiveDateTime,
}

#[derive(Serialize, Deserialize, Projection)]
struct Projection3 {
    // 0-th element of the original projected departments array
    departments: String,
}

// where #[derive(Projection)] implements the `Projection` trait

impl Projection for Projection1 {
    const PROJECTION_EXPRESSION: &'static str = "name, departments, birth_date";
}

impl Projection for Projection2 {
    const PROJECTION_EXPRESSION: &'static str = "departments, birth_date";
}

impl Projection for Projection3 {
    const PROJECTION_EXPRESSION: &'static str = "departments[0]";
}
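One way the generated `Projection` trait could then be consumed is through a generic helper that takes the projection type as a type parameter. The sketch below is only an assumption about how this might be wired together: `GetRequest` and `get_projected` are hypothetical names, and the trait and `Projection2` are written by hand here instead of being derived.

```rust
/// Trait shape taken from the proposal above, written by hand for this sketch.
pub trait Projection {
    const PROJECTION_EXPRESSION: &'static str;
}

/// Hand-written equivalent of one derived projection type.
pub struct Projection2;

impl Projection for Projection2 {
    const PROJECTION_EXPRESSION: &'static str = "departments, birth_date";
}

/// Hypothetical description of a read request (not an SDK type).
#[derive(Debug, PartialEq)]
pub struct GetRequest {
    pub table: String,
    pub projection_expression: String,
}

/// Builds a request whose projection expression comes from the type `P`,
/// so the expression and the deserialization target cannot drift apart.
pub fn get_projected<P: Projection>(table: &str) -> GetRequest {
    GetRequest {
        table: table.to_string(),
        projection_expression: P::PROJECTION_EXPRESSION.to_string(),
    }
}
```

A call such as `get_projected::<Projection2>("users")` would then carry the expression `departments, birth_date` without the caller spelling it out.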

Pagination utilities

Implement methods for streaming pagination (see dynomite::DynamoDbExt to learn about existing implementations).
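The core of such a pagination helper is a loop that keeps passing the last evaluated key back into the next request until no key is returned. The sketch below shows that loop as a synchronous `Iterator`, which is a deliberate simplification: a real implementation would most likely be an async stream. `Page`, `Paginator`, and `fake_query` are hypothetical names, and `fake_query` stands in for a real Query or Scan call.

```rust
use std::collections::VecDeque;

/// Hypothetical page shape: items plus an optional continuation key, modeled
/// on the way DynamoDB returns `LastEvaluatedKey`.
pub struct Page<T> {
    pub items: Vec<T>,
    pub last_evaluated_key: Option<String>,
}

/// Lazily fetches pages and yields their items one by one.
pub struct Paginator<T, F>
where
    F: FnMut(Option<String>) -> Page<T>,
{
    fetch: F,
    buffer: VecDeque<T>,
    next_key: Option<String>,
    done: bool,
}

impl<T, F> Paginator<T, F>
where
    F: FnMut(Option<String>) -> Page<T>,
{
    pub fn new(fetch: F) -> Self {
        Paginator { fetch, buffer: VecDeque::new(), next_key: None, done: false }
    }
}

impl<T, F> Iterator for Paginator<T, F>
where
    F: FnMut(Option<String>) -> Page<T>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        loop {
            if let Some(item) = self.buffer.pop_front() {
                return Some(item);
            }
            if self.done {
                return None;
            }
            // Fetch the next page; an empty page with a key just loops again.
            let page = (self.fetch)(self.next_key.take());
            self.done = page.last_evaluated_key.is_none();
            self.next_key = page.last_evaluated_key;
            self.buffer.extend(page.items);
        }
    }
}

/// Stand-in for a real query: serves five numbers in pages of two.
pub fn fake_query(start_key: Option<String>) -> Page<u32> {
    let data = [1u32, 2, 3, 4, 5];
    let start = start_key
        .and_then(|k| k.parse::<usize>().ok())
        .unwrap_or(0)
        .min(data.len());
    let end = (start + 2).min(data.len());
    Page {
        items: data[start..end].to_vec(),
        last_evaluated_key: if end < data.len() { Some(end.to_string()) } else { None },
    }
}
```

Because pages are fetched lazily, a caller that only takes the first few items (for example `Paginator::new(fake_query).take(3)`) does not trigger requests for the remaining pages.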

Other utilities for best practices and common workflows

Implement helper utilities according to AWS docs for DynamoDB best practices and single table design.

For example, we've implemented some proc macros for segmented identifiers (described in single table design) in our private repo. We plan to open-source this code, and we may do it earlier to facilitate the development of the high-level DynamoDB crate.

So the list might also include:

Additional context

This issue is nowhere near an exhaustive description of the desired design for the high-level wrapper crate for aws_sdk_dynamodb; the ideas should be refined and probably extended. However, I think it is a good starting point for the discussion and for initiating work on an MVP subset of the planned high-level APIs (e.g. start with only the ODM feature and iterate from there). We may decide to move the described crate to another repo and split the planned features described here into more fine-grained issues if this makes sense to the maintainers.

Waiting for your feedback!

rcoh commented 3 years ago

This is an awesome deep dive into what's needed in a DynamoDB high level library. Thanks so much for pulling this together!

bryanburgers commented 3 years ago

:wave: Hey, primary maintainer of serde_dynamo here.

We started a new branch that supports aws-sdk-rust: serde_dynamo@3.0.0-alpha.0 and it seems to work fine in the limited testing I've done. As aws-sdk-rust gets closer to being release-ready, I'll put more effort into it, too.

Anyway, I'm just chiming in to say hey, we're still here, we'll help out as we can, etc. etc.

Our business currently uses 5–10 rusoto libraries and lambda_runtime. DynamoDB is our core data storage, which is why we created and maintain serde_dynamo. I'm pretty sure we'll take a nice, long, favorable look at switching to aws-sdk-rust once it's appropriate. So whether serde_dynamo gets rolled in or continues to be independent, I'm pretty sure it will support aws-sdk-rust.

rcoh commented 3 years ago

Awesome, took a quick look through the implementation. I suspect that https://github.com/awslabs/smithy-rs/discussions/387 might be relevant? I wonder if we should also add the into_xyz methods you have (or impl TryInto?). Anyway, if there are methods the SDK could add to make your life easier, please let us know :-)

nuvacore commented 2 years ago

Is anything being done on this?

Velfi commented 2 years ago

Hey @nuvacore, not currently. We are considering proposals for high-level libraries like this, but not until after we GA (which we haven't set a date for).