sam-goodwin / eventual

Build scalable and durable micro-services with APIs, Messaging and Workflows
https://docs.eventual.ai
MIT License

fix: support sparse indices #358

Closed thantos closed 1 year ago

thantos commented 1 year ago

Sparse indices allow optional fields to be used in global or local index keys.

export const counter = entity("counter5", {
  attributes: {
    n: z.number(),
    namespace: z.union([
      z.literal("different"),
      z.literal("default"),
      z.literal("another"),
    ]),
    id: z.string(),
    optional: z.string().optional(),
  },
  partition: ["namespace", "id"],
});

export const countersByOptional = counter.index("countersOrderedByNamespace", {
  partition: ["id"],
  sort: ["optional"],
});

thantos commented 1 year ago

Good question. My plan was to not generate the attribute when any part is undefined. That part is missing from the PR...

The other option is to allow undefined in the query keys, but then the behavior would differ depending on whether all of the key or only part of it is missing.
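
A minimal sketch of the first option, where the generated attribute is skipped whenever any part is undefined. The helper name `computeKeyAttribute`, the `Item` type, and the `#` delimiter are assumptions for illustration only, not Eventual's actual implementation:

```typescript
// Hypothetical helper and types, not Eventual's actual implementation.
type Item = Record<string, string | number | undefined>;

const DELIMITER = "#"; // assumed delimiter between key parts

// Returns the generated key attribute, or undefined when any part is missing.
function computeKeyAttribute(
  item: Item,
  parts: readonly string[]
): string | undefined {
  const values: string[] = [];
  for (const part of parts) {
    const value = item[part];
    if (value === undefined) {
      // A missing part means no key attribute is written,
      // so the item never shows up in the (sparse) index.
      return undefined;
    }
    values.push(String(value));
  }
  return values.join(DELIMITER);
}

const complete = computeKeyAttribute(
  { orgName: "acme", deploymentId: "123" },
  ["orgName", "deploymentId"]
);
const partial = computeKeyAttribute({ orgName: "acme" }, [
  "orgName",
  "deploymentId",
]);

console.log(complete); // "acme#123"
console.log(partial); // undefined
```

Under this approach a single-attribute key and a multi-attribute key behave the same way: the item is indexed only when every part is present.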

sam-goodwin commented 1 year ago

How about empty string? So two back to back delimiters? Let's look up what they recommend in the dynamodb guide? https://www.dynamodbguide.com/

thantos commented 1 year ago

> How about empty string? So two back to back delimiters? Let's look up what they recommend in the dynamodb guide? https://www.dynamodbguide.com/

Creating a non-empty value would defeat the purpose of a sparse index.

With a single-part key, an item that is missing the attribute would not show up in the index.

ex:

I have deploymentId as the partition key, but not all items have a deployment ID. I can query by deploymentId:

await deploymentIdIndex.query({ deploymentId: "123" });

I wouldn't want a partition of all things without a deploymentId. (If I did, I would just make a special value.)

Or let's say I want to sort by endTime, but not all items are complete yet.

const endTimeIndex = myEntity.index("endTimeIndex", {
   sort: ["endTime"]
});

await endTimeIndex.query({ somePartition });

I would not want items with no end time to be inserted somewhere in that ordering.

So what about a multi-attribute key part with an optional attribute?

const deploymentIdIndex = myEntity.index("deploymentIdIndex", {
   partition: ["orgName", "deploymentId"]
});

await deploymentIdIndex.query({ orgName: "something", deploymentId: "123" });

Here it doesn't make sense to have an "OrgName#undefined" partition, because we don't support undefined parts in partition keys, and a query by prefix would be ambiguous with a query for an undefined key. It would also force all items without a deploymentId into a single partition, making it impossible to leverage sparse indices.

An undefined sort key part would be worse, because the undefined values would always be at the top.
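
A small sketch of that ordering problem, assuming the empty-string placeholder suggested earlier in the thread and plain ascending string comparison (which matches byte order for ASCII keys). The timestamps are made-up sample data:

```typescript
// Hypothetical sort keys; "" stands in for a missing endTime (an assumption for illustration).
const sortKeys: string[] = [
  "2023-05-02T10:00:00Z",
  "",
  "2023-05-01T09:00:00Z",
  "",
];

// Plain code-unit comparison, which matches byte order for ASCII strings.
const ascending = [...sortKeys].sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));

// The two "" placeholders come first, ahead of every real timestamp.
console.log(ascending);

// With a sparse index, items without an endTime would simply be absent.
const sparse = ascending.filter((key) => key !== "");
console.log(sparse);
```

Every ascending query over the placeholder-filled index would have to page past the incomplete items first, whereas the sparse version only contains items that actually have an endTime.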

Creating a value would also make single-attribute and multi-attribute keys behave differently: a single-attribute key could be sparse, but a multi-attribute key could not.

I argue that we should not generate a key attribute at all when any part of it is missing.

  1. We can always add an option to fill in a default value later.
  2. Users can put default values in place if they want this behavior (deploymentId: "").

sam-goodwin commented 1 year ago

I don't understand. Don't we use delimiters to query? What about the case where the sort key is a composite and the first value is optional?

sort: ["a", "b"]

.query({ a: undefined, })

thantos commented 1 year ago

> I don't understand. Don't we use delimiters to query? What about the case where the sort key is a composite and the first value is optional?
>
> sort: ["a", "b"]
>
> .query({ a: undefined, })

Updated so that the query key type and the result type of a query/scan reflect that the index's key attributes are no longer optional.
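
A rough type-level sketch of what "no longer optional" could look like. The `Counter` interface, the `IndexedItem` mapped type, and the `isIndexed` guard are hypothetical names for illustration and are not Eventual's actual definitions:

```typescript
// Hypothetical types, not Eventual's actual definitions.
interface Counter {
  namespace: string;
  id: string;
  n: number;
  optional?: string;
}

// Marks the given key attributes as required and non-undefined.
type IndexedItem<T, K extends keyof T> = T & {
  [P in K]-?: NonNullable<T[P]>;
};

// Items returned from an index keyed on `id` and `optional`.
type CounterByOptional = IndexedItem<Counter, "id" | "optional">;

// Runtime guard mirroring the type: only items with `optional` set are in the index.
function isIndexed(item: Counter): item is CounterByOptional {
  return item.optional !== undefined;
}

const items: Counter[] = [
  { namespace: "default", id: "a", n: 1, optional: "x" },
  { namespace: "default", id: "b", n: 2 },
];

const indexed: CounterByOptional[] = items.filter(isIndexed);

// `optional` is typed as string here, so no undefined check is needed.
const lengths = indexed.map((item) => item.optional.length);
console.log(indexed.map((item) => item.id), lengths);
```

The point of the sketch is that anything read back through the sparse index is known to carry its key attributes, so the result type can drop the `?` that the base entity needs.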
