sam-goodwin / eventual

Build scalable and durable micro-services with APIs, Messaging and Workflows
https://docs.eventual.ai
MIT License

Design: Entity Indexes (Dynamo GSI and LSI) #334

Closed thantos closed 1 year ago

thantos commented 1 year ago

Add support for DynamoDB-index-like primitives on entities: the ability to list items in alternate orders and to partition them in alternate ways.

Use cases:

  1. alternate query patterns using keys or key prefixes
  2. alternate partitions/namespaces and orders within the namespaces
  3. alternate ordering for items in the entity with their existing namespaces

NamespaceIndex

An index that keeps the entity's existing namespaces and changes only the order of items within them. Equivalent to a DynamoDB LSI (LocalSecondaryIndex).

const myIndex = entity.namespaceIndex("Timestamp");

Index

An index that has a different namespace and, optionally, a different key. Equivalent to a DynamoDB GSI (GlobalSecondaryIndex).

const myIndex = entity.index("Topic", "Timestamp");
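To make the difference between the two index kinds concrete, here is a minimal in-memory sketch. It is not the eventual implementation or API: `buildIndex`, `UserPost`, and the sample data are hypothetical names used only for illustration. The model is simply "group items by a namespace attribute, then sort each group by a sort attribute".

```typescript
// Hypothetical in-memory model of the two index kinds; not the eventual API.
interface UserPost {
  UserID: string;
  PostID: string;
  Topic: string;
  Timestamp: string; // ISO-8601 in UTC, so string order equals time order
}

// Groups items by `namespaceAttr`, then sorts each group by `sortAttr`.
function buildIndex(
  items: UserPost[],
  namespaceAttr: keyof UserPost,
  sortAttr: keyof UserPost
): Map<string, UserPost[]> {
  const groups = new Map<string, UserPost[]>();
  for (const item of items) {
    const group = groups.get(item[namespaceAttr]) ?? [];
    group.push(item);
    groups.set(item[namespaceAttr], group);
  }
  groups.forEach((group) =>
    group.sort((a, b) =>
      a[sortAttr] < b[sortAttr] ? -1 : a[sortAttr] > b[sortAttr] ? 1 : 0
    )
  );
  return groups;
}

const samplePosts: UserPost[] = [
  { UserID: "u1", PostID: "p1", Topic: "cats", Timestamp: "2023-03-02T10:00:00.000Z" },
  { UserID: "u1", PostID: "p2", Topic: "dogs", Timestamp: "2023-03-01T09:00:00.000Z" },
  { UserID: "u2", PostID: "p3", Topic: "cats", Timestamp: "2023-03-01T08:00:00.000Z" },
];

// Like namespaceIndex("Timestamp"): the namespace stays UserID, only the order changes.
const byUser = buildIndex(samplePosts, "UserID", "Timestamp");
// Like index("Topic", "Timestamp"): both the namespace and the order change.
const byTopic = buildIndex(samplePosts, "Topic", "Timestamp");

const postIdsOf = (items: UserPost[] | undefined): string[] =>
  (items ?? []).map((item) => item.PostID);
```

In this model the namespace index never moves an item to a different group than the base entity would, which is the property that makes it LSI-like; the plain index regroups items across users.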

Example:

UserID - namespace (partition)
PostID - key (sort key)
Timestamp (ISO)
Topic
Message
Status (Active, Inactive)
Status-Timestamp (Status+Timestamp)

const userPosts = entity<UserPost>("userPosts");

Want to

  1. Get by UserID and PostID
  2. Update post by PostID and UserID
  3. List all posts per UserID in timestamp order (or reverse)
  4. List all of today's posts for a user
  5. List posts by topic in timestamp order
  6. List all of today's posts for a topic
  7. List all active posts in timestamp order
  8. List all active posts for a user in timestamp order

Can Currently

  1. Get by UserID and PostID (1)
  2. Update post by PostID and UserID (2)
  3. List all posts per UserID in post id order (or reverse)
  4. List by PostID prefix in post id order (or reverse)

Workarounds

  1. Posts in timestamp order - The PostID can be a ULID (a time-ordered, lexicographically sortable ID), so that post id order matches creation order
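The workaround relies on ids whose string order follows creation time. The sketch below is a simplified, ULID-inspired id and not an implementation of the ULID spec: `timeOrderedId` is a hypothetical helper that prepends a fixed-width base-36 millisecond timestamp to a random suffix.

```typescript
// Simplified, ULID-inspired id (NOT spec-compliant): a fixed-width
// millisecond timestamp followed by a random suffix.
function timeOrderedId(timestampMs: number): string {
  // Nine base-36 digits cover several thousand years of millisecond timestamps.
  // Fixed width matters: without padding, "10" would sort before "9".
  const timePart = timestampMs.toString(36).padStart(9, "0");
  // The suffix only disambiguates ids created in the same millisecond.
  const randomPart = Math.random().toString(36).slice(2, 10).padEnd(8, "0");
  return `${timePart}-${randomPart}`;
}

const earlierId = timeOrderedId(Date.UTC(2023, 0, 1));
const laterId = timeOrderedId(Date.UTC(2023, 0, 2));

// Plain string sorting now reflects creation order.
const sortedIds = [laterId, earlierId].sort();
```

Ids created within the same millisecond are ordered by the random suffix in this sketch, so their relative order is arbitrary. The workaround also only covers creation time; ordering by any other attribute still needs an index.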

With Index

List by user in timestamp order

const timestampIndex = userPosts.namespaceIndex("Timestamp");
// gets all of the posts of a user in the timestamp order
timestampIndex.list({ namespace: userId });

List all of today's posts for a user

const timestampIndex = userPosts.namespaceIndex("Timestamp");
// gets all of the posts of a user in timestamp order, filtered by the date part (YYYY-MM-DD, UTC) of the ISO timestamp
timestampIndex.list({ namespace: userId, prefix: new Date().toISOString().slice(0, 10) });
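A prefix query works for "today's posts" because ISO-8601 timestamps begin with the date, so every timestamp from one day shares the same `YYYY-MM-DD` prefix. A minimal sketch, assuming timestamps are stored as UTC ISO strings; `isoDatePrefix` and `filterByPrefix` are hypothetical helpers, with `filterByPrefix` standing in for the index's prefix query:

```typescript
// Date portion (YYYY-MM-DD, in UTC) of an ISO-8601 timestamp.
function isoDatePrefix(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Hypothetical stand-in for a prefix query over an already sorted index.
function filterByPrefix(sortedTimestamps: string[], prefix: string): string[] {
  return sortedTimestamps.filter((timestamp) => timestamp.startsWith(prefix));
}

const userTimestamps = [
  "2023-02-28T23:59:59.000Z",
  "2023-03-01T00:00:00.000Z",
  "2023-03-01T18:30:00.000Z",
  "2023-03-02T07:15:00.000Z",
];

// For "today", pass `new Date()` instead of a fixed date.
const dayPrefix = isoDatePrefix(new Date("2023-03-01T12:00:00.000Z"));
const postsThatDay = filterByPrefix(userTimestamps, dayPrefix);
```

Because the prefix is computed in UTC, "today" here means the UTC calendar day; a local-time day would need a range query rather than a single prefix.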

List posts by topic in timestamp order

// creates an index with a namespace of "Topic" ordered by "Timestamp".
const topicIndex = userPosts.index("Topic", "Timestamp");
// gets all of the posts of a topic in the timestamp order
topicIndex.list({ namespace: "topic" });

List all of today's posts for a topic

// creates an index with a namespace of "Topic" ordered by "Timestamp".
const topicIndex = userPosts.index("Topic", "Timestamp");
// gets all of the posts of a topic in timestamp order, filtered by the date part (YYYY-MM-DD, UTC) of the ISO timestamp
topicIndex.list({ namespace: "topic", prefix: new Date().toISOString().slice(0, 10) });

List all active posts in timestamp order

// creates an index with a namespace of "Status" ordered by "Timestamp".
const activePostsIndex = userPosts.index("Status", "Timestamp");
// gets all of the active posts in timestamp order
activePostsIndex.list({ namespace: "Active" });

List all active posts for a user in timestamp order

// creates a namespace index that keeps the "UserID" namespace, ordered by "Status-Timestamp".
const statusTimestampIndex = userPosts.namespaceIndex("Status-Timestamp");
// gets all of the active posts for a user in the timestamp order
statusTimestampIndex.list({ namespace: userId, prefix: "Active" });
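The last case depends on a derived attribute that concatenates Status and Timestamp, so that a status prefix selects a contiguous, time-ordered range within a user's namespace. A minimal in-memory sketch follows. The "#" delimiter, `statusTimestampKey`, and `listByStatus` are assumptions for illustration; the proposal above only says the attribute is Status+Timestamp.

```typescript
interface StatusPost {
  UserID: string;
  PostID: string;
  Status: "Active" | "Inactive";
  Timestamp: string; // ISO-8601 in UTC
}

// Assumed encoding of the derived attribute; the "#" delimiter is an
// assumption that keeps one status from being a prefix of another.
function statusTimestampKey(post: StatusPost): string {
  return `${post.Status}#${post.Timestamp}`;
}

// Hypothetical stand-in for statusTimestampIndex.list({ namespace, prefix }).
function listByStatus(
  posts: StatusPost[],
  userId: string,
  statusPrefix: string
): StatusPost[] {
  return posts
    .filter((post) => post.UserID === userId)
    .map((post) => ({ post, key: statusTimestampKey(post) }))
    .filter(({ key }) => key.startsWith(statusPrefix))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0))
    .map(({ post }) => post);
}

const statusPosts: StatusPost[] = [
  { UserID: "u1", PostID: "p1", Status: "Active", Timestamp: "2023-03-02T10:00:00.000Z" },
  { UserID: "u1", PostID: "p2", Status: "Inactive", Timestamp: "2023-03-01T09:00:00.000Z" },
  { UserID: "u1", PostID: "p3", Status: "Active", Timestamp: "2023-03-01T08:00:00.000Z" },
  { UserID: "u2", PostID: "p4", Status: "Active", Timestamp: "2023-02-28T07:00:00.000Z" },
];

const activeForUser = listByStatus(statusPosts, "u1", "Active#");
```

One consequence of this design is that the derived attribute must be rewritten whenever Status or Timestamp changes, since the index orders by the stored composite value.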

sam-goodwin commented 1 year ago

The term "namespaceIndex" is very confusing to me. Is this a local index?

thantos commented 1 year ago

Been trying to find the right terms that are not coupled to dynamo and make sense with the terms we use.

Some options

  1. Namespace index (global) and sort index (local)
    1. Aka the namespace index creates new namespaces
  2. Index (global) and namespace index (local)
    1. Aka namespace index is within each namespace
  3. Only index, type is determined by a flag
    1. Separating seems better because the behavior is different and the limits are different

I started with number 2, but a discussion with ChatGPT resulted in number 1 (though it used "partition" instead of "namespace"). I'm not attached to any of the naming; this is just a starting point.

thantos commented 1 year ago

Closed by #348