nodecosmos / charybdis

Rust ORM for ScyllaDB and Apache Cassandra
MIT License

⚠️ This project is currently in an early stage of development. Feedback and contributions are welcome!

Charybdis is an ORM layer on top of the ScyllaDB Rust Driver, focused on ease of use and performance.

Usage considerations:

Performance consideration:

Charybdis Models

Define Tables

  use charybdis::macros::charybdis_model;
  use charybdis::types::{Text, Timestamp, Uuid};

  #[charybdis_model(
      table_name = users,
      partition_keys = [id],
      clustering_keys = [],
      global_secondary_indexes = [username],
      local_secondary_indexes = [],
      static_columns = []
  )]
  pub struct User {
      pub id: Uuid,
      pub username: Text,
      pub email: Text,
      pub created_at: Timestamp,
      pub updated_at: Timestamp,
      pub address: Address,
  }

Define UDT

  use charybdis::macros::charybdis_udt_model;
  use charybdis::types::Text;

  #[charybdis_udt_model(type_name = address)]
  pub struct Address {
      pub street: Text,
      pub city: Text,
      pub state: Option<Text>,
      pub zip: Text,
      pub country: Text,
  }

🚨 UDT fields must be in the same order as they are in the database.

Note that for the migration to correctly detect changes on each run, type_name has to match the struct name, lowercased and without underscores. So for a struct ReorderData we have to use #[charybdis_udt_model(type_name = reorderdata)], not reorder_data.
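
The naming rule can be sketched as a small helper. `udt_type_name` is a hypothetical function written for illustration and is not part of Charybdis; it only assumes, based on the ReorderData example, that the expected name is the struct name lowercased with no separators.

```rust
/// Hypothetical helper (not part of Charybdis): derives the `type_name`
/// that matches a struct name, i.e. the struct name lowercased with
/// no underscores between words.
fn udt_type_name(struct_name: &str) -> String {
    struct_name
        .chars()
        .filter(|c| *c != '_')
        .flat_map(|c| c.to_lowercase())
        .collect()
}

fn main() {
    // `ReorderData` maps to `reorderdata`, not `reorder_data`.
    println!("{}", udt_type_name("ReorderData"));
    println!("{}", udt_type_name("Address"));
}
```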

Define Materialized Views

  use charybdis::macros::charybdis_view_model;
  use charybdis::types::{Text, Timestamp, Uuid};

  #[charybdis_view_model(
      table_name=users_by_username,
      base_table=users,
      partition_keys=[username],
      clustering_keys=[id]
  )]
  pub struct UsersByUsername {
      pub username: Text,
      pub id: Uuid,
      pub email: Text,
      pub created_at: Timestamp,
      pub updated_at: Timestamp,
  }

The resulting auto-generated migration query will be:

  CREATE MATERIALIZED VIEW IF NOT EXISTS users_by_username
  AS SELECT created_at, updated_at, username, email, id
  FROM users
  WHERE username IS NOT NULL AND id IS NOT NULL
  PRIMARY KEY (username, id)
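
To make the mapping from view-model arguments to CQL more concrete, here is a minimal sketch of how such a statement could be assembled. `create_view_query` and its parameters are hypothetical and do not reflect Charybdis' actual migration code; the sketch assumes every key column gets an `IS NOT NULL` restriction, as in the statement above.

```rust
/// Hypothetical sketch (not Charybdis' migration code): builds a
/// CREATE MATERIALIZED VIEW statement from view-model style arguments.
fn create_view_query(
    view: &str,
    base_table: &str,
    columns: &[&str],
    partition_keys: &[&str],
    clustering_keys: &[&str],
) -> String {
    // Restrict each key column with IS NOT NULL, partition keys first.
    let key_columns: Vec<&str> = partition_keys
        .iter()
        .chain(clustering_keys.iter())
        .copied()
        .collect();
    let not_null: Vec<String> = key_columns
        .iter()
        .map(|key| format!("{key} IS NOT NULL"))
        .collect();

    // A composite partition key is wrapped in its own parentheses.
    let partition = if partition_keys.len() > 1 {
        format!("({})", partition_keys.join(", "))
    } else {
        partition_keys.join(", ")
    };
    let mut primary_key = vec![partition];
    primary_key.extend(clustering_keys.iter().map(|key| key.to_string()));

    format!(
        "CREATE MATERIALIZED VIEW IF NOT EXISTS {view} AS SELECT {} FROM {base_table} WHERE {} PRIMARY KEY ({})",
        columns.join(", "),
        not_null.join(" AND "),
        primary_key.join(", "),
    )
}

fn main() {
    // Arguments taken from the UsersByUsername view model.
    let query = create_view_query(
        "users_by_username",
        "users",
        &["created_at", "updated_at", "username", "email", "id"],
        &["username"],
        &["id"],
    );
    println!("{query}");
}
```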

Automatic migration

Basic Operations:

For each operation you need to bring the respective trait into scope. The traits are defined in the charybdis::operations module.

Insert

Find

Update

Delete

Configuration

Every operation returns a CharybdisQuery that can be configured before execution through method chaining.

let user: User = User::find_by_id(id)
    .consistency(Consistency::One)
    .timeout(Some(Duration::from_secs(5)))
    .execute(&app.session)
    .await?;

let result: QueryResult = user.update().consistency(Consistency::One).execute(&session).await?;
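
The chaining style used above follows the usual builder pattern, where each setter takes the query by value and returns it. The sketch below illustrates the idea with simplified stand-in types; `Query` and this `Consistency` enum are hypothetical and are not the Charybdis or scylla types, and the default values are arbitrary.

```rust
use std::time::Duration;

// Simplified stand-ins for illustration only; these are NOT the
// Charybdis or scylla types of the same name.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Consistency {
    One,
    LocalQuorum,
}

struct Query {
    cql: &'static str,
    consistency: Consistency,
    timeout: Option<Duration>,
}

impl Query {
    fn new(cql: &'static str) -> Self {
        // Defaults chosen arbitrarily for this sketch.
        Query { cql, consistency: Consistency::LocalQuorum, timeout: None }
    }

    // Each setter consumes the query and returns it, which enables chaining.
    fn consistency(mut self, consistency: Consistency) -> Self {
        self.consistency = consistency;
        self
    }

    fn timeout(mut self, timeout: Option<Duration>) -> Self {
        self.timeout = timeout;
        self
    }
}

fn main() {
    let query = Query::new("SELECT * FROM users WHERE id = ?")
        .consistency(Consistency::One)
        .timeout(Some(Duration::from_secs(5)));
    println!("{} {:?} {:?}", query.cql, query.consistency, query.timeout);
}
```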

Supported configuration options:

Batch

CharybdisModelBatch operations are used to perform multiple operations in a single batch.

Partial Model:

Callbacks

Callbacks are a convenient way to run additional logic on a model before or after certain operations.

Implementation:

1) Let's say we define a custom extension that will be used to update an elastic document on every post update:

    pub struct AppExtensions {
        pub elastic_client: ElasticClient,
    }

2) Now we can implement callbacks that use this extension:

    #[charybdis_model(...)]
    pub struct Post {}

    impl ExtCallbacks for Post {
        type Extension = AppExtensions;
        type Error = AppError; // must implement From<CharybdisError>

        // use before_insert to set default values
        async fn before_insert(
            &mut self,
            _session: &CachingSession,
            _extension: &AppExtensions,
        ) -> Result<(), AppError> {
            self.id = Uuid::new_v4();
            self.created_at = Utc::now();

            Ok(())
        }

        // use before_update to set updated_at
        async fn before_update(
            &mut self,
            _session: &CachingSession,
            _extension: &AppExtensions,
        ) -> Result<(), AppError> {
            self.updated_at = Utc::now();

            Ok(())
        }

        // use after_update to update elastic document
        async fn after_update(
            &mut self,
            _session: &CachingSession,
            extension: &AppExtensions,
        ) -> Result<(), AppError> {
            extension.elastic_client.update(...).await?;

            Ok(())
        }

        // use after_delete to delete elastic document
        async fn after_delete(
            &mut self,
            _session: &CachingSession,
            extension: &AppExtensions,
        ) -> Result<(), AppError> {
            extension.elastic_client.delete(...).await?;

            Ok(())
        }
    }
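
The before/after flow can be illustrated with a simplified, synchronous stand-in. Everything in this sketch is hypothetical: the `Callbacks` trait here is not the Charybdis trait (whose methods are async and receive a session), `insert_with_callbacks` is an invented driver, and stopping at the first callback error is an assumption made for the example.

```rust
// Simplified, synchronous stand-in for illustration only.
trait Callbacks {
    fn before_insert(&mut self) -> Result<(), String> {
        Ok(())
    }
    fn after_insert(&mut self) -> Result<(), String> {
        Ok(())
    }
}

#[derive(Debug, Default)]
struct Post {
    title: String,
    slug: String,
    log: Vec<&'static str>,
}

impl Callbacks for Post {
    // Validate the model and derive a value before it is written.
    fn before_insert(&mut self) -> Result<(), String> {
        if self.title.is_empty() {
            return Err("title is required".to_string());
        }
        self.slug = self.title.to_lowercase().replace(' ', "-");
        self.log.push("before_insert");
        Ok(())
    }

    fn after_insert(&mut self) -> Result<(), String> {
        self.log.push("after_insert");
        Ok(())
    }
}

// Hypothetical driver: runs the hooks around a write and stops at the
// first error (an assumption of this sketch).
fn insert_with_callbacks<M: Callbacks>(
    model: &mut M,
    write: impl FnOnce(&M),
) -> Result<(), String> {
    model.before_insert()?;
    write(&*model);
    model.after_insert()?;
    Ok(())
}

fn main() {
    let mut stored: Vec<String> = Vec::new();
    let mut post = Post { title: "Hello World".to_string(), ..Default::default() };
    insert_with_callbacks(&mut post, |p| stored.push(p.slug.clone()))
        .expect("callbacks should succeed");
    println!("stored = {stored:?}, log = {:?}", post.log);
}
```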

Collections

For each collection field, we get the following:

1) Model:

    #[charybdis_model(
        table_name = users,
        partition_keys = [id],
        clustering_keys = []
    )]
    pub struct User {
        id: Uuid,
        tags: Set<Text>,
        post_ids: List<Uuid>,
        books_by_genre: Map<Text, Frozen<List<Text>>>,
    }

2) Generated Collection Queries:

Each generated query expects the collection value as the first bind value, followed by the primary key fields as the next bind values.

```rust
impl User {
    const PUSH_TAGS_QUERY: &'static str = "UPDATE users SET tags = tags + ? WHERE id = ?";
    const PUSH_TAGS_IF_EXISTS_QUERY: &'static str = "UPDATE users SET tags = tags + ? WHERE id = ? IF EXISTS";

    const PULL_TAGS_QUERY: &'static str = "UPDATE users SET tags = tags - ? WHERE id = ?";
    const PULL_TAGS_IF_EXISTS_QUERY: &'static str = "UPDATE users SET tags = tags - ? WHERE id = ? IF EXISTS";

    const PUSH_POST_IDS_QUERY: &'static str = "UPDATE users SET post_ids = post_ids + ? WHERE id = ?";
    const PUSH_POST_IDS_IF_EXISTS_QUERY: &'static str = "UPDATE users SET post_ids = post_ids + ? WHERE id = ? IF EXISTS";

    const PULL_POST_IDS_QUERY: &'static str = "UPDATE users SET post_ids = post_ids - ? WHERE id = ?";
    const PULL_POST_IDS_IF_EXISTS_QUERY: &'static str = "UPDATE users SET post_ids = post_ids - ? WHERE id = ? IF EXISTS";

    const PUSH_BOOKS_BY_GENRE_QUERY: &'static str = "UPDATE users SET books_by_genre = books_by_genre + ? WHERE id = ?";
    const PUSH_BOOKS_BY_GENRE_IF_EXISTS_QUERY: &'static str = "UPDATE users SET books_by_genre = books_by_genre + ? WHERE id = ? IF EXISTS";

    const PULL_BOOKS_BY_GENRE_QUERY: &'static str = "UPDATE users SET books_by_genre = books_by_genre - ? WHERE id = ?";
    const PULL_BOOKS_BY_GENRE_IF_EXISTS_QUERY: &'static str = "UPDATE users SET books_by_genre = books_by_genre - ? WHERE id = ? IF EXISTS";
}
```

Now we can use these constants within batch operations.

```rust
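// Assumes `users: Vec<User>`, `tag` and `session` are already in scope.
let mut batch = User::batch();

for user in users {
    // Bind order: collection value first, then the primary key.
    batch.append_statement(User::PUSH_TAGS_QUERY, (vec![tag.clone()], user.id));
}

batch.execute(&session).await?;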
```
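
The query constants listed above follow a regular pattern, which the sketch below reproduces with plain string formatting. `collection_query` and `CollectionOp` are hypothetical names for illustration; this is not how the macro is implemented, only a way to see how table, field, and primary key map onto the generated text.

```rust
#[derive(Clone, Copy)]
enum CollectionOp {
    Push,
    Pull,
}

/// Hypothetical sketch (not the macro's implementation): builds the
/// push/pull statement for one collection field.
fn collection_query(
    table: &str,
    field: &str,
    op: CollectionOp,
    primary_key: &[&str],
    if_exists: bool,
) -> String {
    let sign = match op {
        CollectionOp::Push => '+',
        CollectionOp::Pull => '-',
    };
    // One `column = ?` condition per primary key field, in order.
    let conditions: Vec<String> = primary_key
        .iter()
        .map(|key| format!("{key} = ?"))
        .collect();

    let mut query = format!(
        "UPDATE {table} SET {field} = {field} {sign} ? WHERE {}",
        conditions.join(" AND ")
    );
    if if_exists {
        query.push_str(" IF EXISTS");
    }
    query
}

fn main() {
    println!("{}", collection_query("users", "tags", CollectionOp::Push, &["id"], false));
    println!("{}", collection_query("users", "post_ids", CollectionOp::Pull, &["id"], true));
}
```

The first placeholder is the collection value and the remaining ones are the primary key fields, which matches the bind order described above.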

3) Generated Collection Methods:

push_<field_name> and pull_<field_name> methods, each with an _if_exists variant, are generated for each collection field.

```rust
let user = User::new();

user.push_tags(tags: HashSet<T>).execute(&session).await;
user.push_tags_if_exists(tags: HashSet<T>).execute(&session).await;

user.pull_tags(tags: HashSet<T>).execute(&session).await;
user.pull_tags_if_exists(tags: HashSet<T>).execute(&session).await;

user.push_post_ids(ids: Vec<T>).execute(&session).await;
user.push_post_ids_if_exists(ids: Vec<T>).execute(&session).await;

user.pull_post_ids(ids: Vec<T>).execute(&session).await;
user.pull_post_ids_if_exists(ids: Vec<T>).execute(&session).await;

user.push_books_by_genre(map: HashMap<K, V>).execute(&session).await;
user.push_books_by_genre_if_exists(map: HashMap<K, V>).execute(&session).await;

user.pull_books_by_genre(map: HashMap<K, V>).execute(&session).await;
user.pull_books_by_genre_if_exists(map: HashMap<K, V>).execute(&session).await;
```

Ignored fields

We can ignore fields by using the #[charybdis(ignore)] attribute:

#[charybdis_model(...)]
pub struct User {
    id: Uuid,
    #[charybdis(ignore)]
    organization: Option<Organization>,
}

The organization field will then be ignored in all operations, and its default value will be used when deserializing from other data sources. This makes it suitable for holding data that is not persisted in the database.
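
The effect on deserialization can be sketched with plain Rust types. `UserRow` and the `From` conversion below are hypothetical stand-ins for a database row and are not Charybdis APIs; the point is only that an ignored field is filled with `Default::default()` rather than read from the row.

```rust
#[derive(Debug, Default, PartialEq)]
struct Organization {
    name: String,
}

#[derive(Debug, PartialEq)]
struct User {
    id: u64,
    // Not persisted: never read from or written to the database.
    organization: Option<Organization>,
}

// Hypothetical stand-in for a database row; it has no `organization` column.
struct UserRow {
    id: u64,
}

impl From<UserRow> for User {
    fn from(row: UserRow) -> Self {
        User {
            id: row.id,
            // The ignored field falls back to its default value (`None`).
            organization: Default::default(),
        }
    }
}

fn main() {
    let mut user = User::from(UserRow { id: 7 });
    assert!(user.organization.is_none());

    // The field can still carry in-memory data that is never stored.
    user.organization = Some(Organization { name: "Acme".to_string() });
    if let Some(org) = &user.organization {
        println!("user {} belongs to {}", user.id, org.name);
    }
}
```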