ryan-mars / stochastic

TypeScript framework for building event-driven services. Easily go from Event Storming → Code.
MIT License

AWS: Serverless runtime & CDK infrastructure #12

Closed ryan-mars closed 3 years ago

ryan-mars commented 3 years ago

Thoughts on key sharding for the event store in DynamoDB. Access patterns are sorted by frequency.

Given a service "Operations" with an aggregate named "Flight" with an id of "PA576" and an event of type "FlightDeparted"...

{
  id: "1s6DrF0P9U6z1cQplUINwLlyQG3", // KSUID
  time: "2021-05-05T03:12:22.000Z",
  source: "Flight",
  source_id: "PA576",
  bounded_context: "Operations",
  event_type: "FlightDeparted",
  payload: {
    from: "SFO", 
    departed_at: "2021-05-04T12:30:00-07:00"
  }
}
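For reference, the envelope above could be typed roughly like this. It is only a sketch: the field names are copied from the example, while the type names `DomainEvent` and `FlightDeparted` are made up for illustration and are not part of the framework.

```typescript
// Hypothetical typing of the event envelope shown above.
// Field names come from the example; the type names are assumptions.
interface DomainEvent<TType extends string, TPayload> {
  id: string;              // KSUID, sortable by creation time
  time: string;            // ISO 8601 timestamp of when the event was recorded
  source: string;          // aggregate type, e.g. "Flight"
  source_id: string;       // aggregate id, e.g. "PA576"
  bounded_context: string; // owning service, e.g. "Operations"
  event_type: TType;
  payload: TPayload;
}

type FlightDeparted = DomainEvent<
  "FlightDeparted",
  { from: string; departed_at: string }
>;

const flightDeparted: FlightDeparted = {
  id: "1s6DrF0P9U6z1cQplUINwLlyQG3",
  time: "2021-05-05T03:12:22.000Z",
  source: "Flight",
  source_id: "PA576",
  bounded_context: "Operations",
  event_type: "FlightDeparted",
  payload: { from: "SFO", departed_at: "2021-05-04T12:30:00-07:00" },
};
```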

All events for a single aggregate id, in order (sk BEGINS_WITH "EVENT")

pk: Flight#PA576                       // source#source_id
sk: EVENT#1s6DrF0P9U6z1cQplUINwLlyQG3  // EVENT#id 

Latest snapshot (sk EQ "SNAPSHOT")

pk: Flight#PA576                     // source#source_id
sk: SNAPSHOT 
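A minimal sketch of how these two key layouts might be built and queried. The helper names are hypothetical and the table name is left out. The parameter objects use the same low-level attribute-value shape as the query further down, and the prefix is written as "EVENT#" (slightly stricter than "EVENT") on the assumption that no other sort key starts that way. KSUIDs sort by their timestamp first, so ascending sk order is creation order to one-second granularity.

```typescript
// Hypothetical key helpers for the base table layout described above.
const aggregatePk = (source: string, sourceId: string): string =>
  `${source}#${sourceId}`;
const eventSk = (eventId: string): string => `EVENT#${eventId}`;
const SNAPSHOT_SK = "SNAPSHOT";

// Query parameters for "all events for a single aggregate id, in order".
// TableName is omitted on purpose; add it before sending the request.
function eventsForAggregateQuery(source: string, sourceId: string) {
  return {
    KeyConditionExpression: "#pk = :pk AND begins_with(#sk, :prefix)",
    ExpressionAttributeNames: { "#pk": "pk", "#sk": "sk" },
    ExpressionAttributeValues: {
      ":pk": { S: aggregatePk(source, sourceId) },
      ":prefix": { S: "EVENT#" },
    },
    ScanIndexForward: true, // ascending sk, i.e. oldest event first
  };
}

// Primary key for a GetItem on the latest snapshot of one aggregate.
function latestSnapshotKey(source: string, sourceId: string) {
  return {
    pk: { S: aggregatePk(source, sourceId) },
    sk: { S: SNAPSHOT_SK },
  };
}
```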

All events for an aggregate type within specified period

gsi1pk: Flight                       // source 
gsi1sk: 1s6DrF0P9U6z1cQplUINwLlyQG3  // id  KSUID from "2021-05-05T03:12:22.000Z"
KeyConditionExpression="#p = :p AND #s BETWEEN :start and :end"
ExpressionAttributeNames={
    "#p": "gsi1pk",
    "#s": "gsi1sk" 
},
ExpressionAttributeValues={
    ":p": { "S": "Flight" },
    ":start": { "S": "1s6DoW0E203SU12dTtKVGOiZtgD" }, // KSUID from "2021-05-05T03:12:00.000Z"
    ":end": { "S": "1s6Dw1V4QI7MR5CuC2SDOzw6d7g" }    // KSUID from "2021-05-05T03:13:00.000Z"               
}
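The :start and :end values have to be derived from timestamps somehow. One way, sketched here without a KSUID library, is to build the smallest or largest possible KSUID for a given second: a 4-byte big-endian count of seconds since the KSUID epoch (1400000000), followed by 16 payload bytes, base62-encoded to 27 characters. The function name `ksuidBound` is made up for this example. Bounds built this way will not match the literal strings above, which carry random payloads, but they bracket the same period.

```typescript
// Sketch: lowest/highest KSUID string for a given second, for use as
// BETWEEN bounds. ksuidBound is a hypothetical helper, not a library call.
const KSUID_EPOCH_SECONDS = 1400000000;
const BASE62 =
  "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

function ksuidBound(time: Date, fill: "min" | "max"): string {
  const seconds = Math.floor(time.getTime() / 1000) - KSUID_EPOCH_SECONDS;
  // 20 bytes: 4-byte big-endian timestamp, then 16 payload bytes.
  let num: number[] = [
    (seconds >>> 24) & 0xff,
    (seconds >>> 16) & 0xff,
    (seconds >>> 8) & 0xff,
    seconds & 0xff,
  ];
  const payloadByte = fill === "min" ? 0x00 : 0xff;
  for (let i = 0; i < 16; i++) num.push(payloadByte);

  // Base62-encode by repeated long division of the 160-bit integer by 62.
  let digits = "";
  while (num.some((b) => b !== 0)) {
    let remainder = 0;
    const quotient: number[] = [];
    for (const b of num) {
      const acc = remainder * 256 + b;
      quotient.push(Math.floor(acc / 62));
      remainder = acc % 62;
    }
    digits = BASE62.charAt(remainder) + digits;
    num = quotient;
  }
  // KSUID strings are always 27 characters, left-padded with "0".
  while (digits.length < 27) digits = "0" + digits;
  return digits;
}

const start = ksuidBound(new Date("2021-05-05T03:12:00.000Z"), "min");
const end = ksuidBound(new Date("2021-05-05T03:13:00.000Z"), "min");
```

Because BETWEEN is inclusive, using the "min" bound of the end second gives an effectively half-open range. Use "max" on the last second instead if the end of the period should be fully included.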

🤔

sam-goodwin commented 3 years ago

What's a snapshot?

sam-goodwin commented 3 years ago

> All events for an aggregate type within specified period
>
> gsi1pk: Flight
> gsi1sk: 1s6DrF0P9U6z1cQplUINwLlyQG3

Does this create a hot partition in dynamo? All flight events will be in one dynamo partition.

ryan-mars commented 3 years ago

> gsi1pk: Flight
> gsi1sk: 1s6DrF0P9U6z1cQplUINwLlyQG3
>
> Does this create a hot partition in dynamo? All flight events will be in one dynamo partition.

Yes 🤦🏻‍♂️ I wasn't thinking. Maybe it could be randomly key sharded depending on volume. For instance Flight#01 - Flight#20 (up to the BatchGetItem max of 100)
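A rough sketch of what random key sharding could look like, assuming 20 shards and a two-digit suffix as in the Flight#01 to Flight#20 example. All names here are hypothetical. One caveat to check: as far as I know BatchGetItem only reads base-table items by primary key, so reading a period back from the GSI would probably mean one Query per shard and a merge on the client.

```typescript
// Sketch of write-side random sharding for gsi1pk. SHARD_COUNT and the
// two-digit suffix format are assumptions taken from the example above.
const SHARD_COUNT = 20;

function shardKey(source: string, shard: number): string {
  return source + "#" + (shard < 10 ? "0" : "") + shard;
}

// On write: pick a shard at random so events spread across partitions.
function writeShardKey(
  source: string,
  random: () => number = Math.random
): string {
  const shard = 1 + Math.floor(random() * SHARD_COUNT);
  return shardKey(source, shard);
}

// On read: every shard has to be queried, since events land anywhere.
function allShardKeys(source: string): string[] {
  const keys: string[] = [];
  for (let shard = 1; shard <= SHARD_COUNT; shard++) {
    keys.push(shardKey(source, shard));
  }
  return keys;
}

// Merge per-shard query results back into one KSUID-ordered list.
function mergeShardResults<T extends { id: string }>(pages: T[][]): T[] {
  const merged: T[] = [];
  for (const page of pages) merged.push(...page);
  return merged.sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}
```

The trade-off is that writes spread evenly, while a period query costs SHARD_COUNT requests instead of one, so the shard count is worth tuning to actual volume.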

ryan-mars commented 3 years ago

> What's a snapshot?

A snapshot is the saved state of an aggregate. It should only be used when old events must be deleted for data privacy reasons, or when an aggregate has so many events that it hurts write (command) performance.

Snapshots are added "as of" an event #. For our purposes the aggregate reducer would take the latest snapshot (if one exists) as initialValue and process all subsequent events.
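To make the reducer part concrete, here is a minimal sketch. The state shape, the `as_of` field name, and the function names are all assumptions for illustration. It also assumes the events arrive sorted by id and that "as of" is recorded as the KSUID of the last event folded into the snapshot.

```typescript
// Hypothetical shapes; only the event fields mirror the example event above.
interface FlightState {
  status: "scheduled" | "departed";
  from?: string;
  departed_at?: string;
}
interface FlightEvent {
  id: string; // KSUID
  event_type: string;
  payload: { from: string; departed_at: string };
}
interface FlightSnapshot {
  as_of: string; // id of the last event included in the snapshot (assumed)
  state: FlightState;
}

const emptyFlight: FlightState = { status: "scheduled" };

function reduceFlight(state: FlightState, event: FlightEvent): FlightState {
  switch (event.event_type) {
    case "FlightDeparted":
      return {
        ...state,
        status: "departed",
        from: event.payload.from,
        departed_at: event.payload.departed_at,
      };
    default:
      return state; // unknown events leave the state untouched
  }
}

// Use the snapshot (if any) as initialValue and fold only later events.
function rehydrate(
  events: FlightEvent[],
  snapshot?: FlightSnapshot
): FlightState {
  if (!snapshot) return events.reduce(reduceFlight, emptyFlight);
  const asOf = snapshot.as_of;
  return events.filter((e) => e.id > asOf).reduce(reduceFlight, snapshot.state);
}
```

In practice the filter would likely be pushed into the query itself (sk greater than "EVENT#" plus the snapshot's as_of id), so events already covered by the snapshot are never read.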

Snapshots are not necessary for the demo milestone.

ryan-mars commented 3 years ago

Event replay should be the least frequent of all access patterns. Perhaps it would be better off done from S3 or EFS where it's easy to name files with a sortable event ID. See the "Going Plaid in S3" section of this article.