tywalch / electrodb

A DynamoDB library to ease the use of modeling complex hierarchical relationships and implementing a Single Table Design while keeping your query code readable.
MIT License

`Set` within property of type `any` serialized to object instead of array #430

Open aqaengineering opened 1 week ago

aqaengineering commented 1 week ago

Describe the bug

I have an entity schema (call it A) that includes an attribute (call it "b") of type any. The attribute is replicated from another entity with a different schema (call it B). Schema B includes, among others, an attribute of type set with string items. When retrieving entities of schema A, attribute "b" correctly includes all of the properties of entity B, but the attribute of type set is an empty object (even if the set isn't empty in the DynamoDB records of both entity A and entity B). Does ElectroDB support this schema, or is this a bug?

Schema

      model: {
        entity: "A",
        version: "1",
        service: "service",
      },
      attributes: {
        b: {
          type: "any",
        },
      }

      model: {
        entity: "B",
        version: "1",
        service: "service",
      },
      attributes: {
        s: {
          type: "set",
          items: "string"
        },
      }

ElectroDB Version 2.15.0

aqaengineering commented 1 week ago

Did some more investigating; this looks to be a bug in Set serialization.

Given the above,

assigning b the following,

{ x: 1, s: new Set("y") }

results in the following output Dynamo expression (trimmed for clarity),

"b": {
    "x": 1,
    "s": {}
}

If I update the schema of attribute b to the following,

b: {
    type: "map",
    properties: {
        x: {
          type: "number"
        },
        s: { 
          type: "set",
          items: "string"
         }
    } 
}

assigning b the same as above results in the following output Dynamo expression (trimmed for clarity),

"b": {
    "x": 1,
    "s": [
        "y"
     ]
}

So the second example with the more explicit schema serializes to an array as expected, while the first incorrectly serializes to an object.
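For reference, the same `{}` shape appears whenever a JavaScript `Set` goes through plain `JSON.stringify`, because a `Set` has no enumerable own properties. The sketch below only illustrates that language behavior. Whether this is the actual code path inside ElectroDB is an assumption, not something confirmed in this thread, and the variable names are made up for the example.

```typescript
// Illustration only: plain JavaScript/TypeScript behavior, no ElectroDB involved.
// A Set has no enumerable own properties, so JSON.stringify renders it as {}.
const withSet = { x: 1, s: new Set<string>(["y"]) };
const naive: string = JSON.stringify(withSet);

// Copying the Set into an array first preserves its elements.
const withArray = { x: 1, s: Array.from(withSet.s) };
const expanded: string = JSON.stringify(withArray);

console.log(naive);    // {"x":1,"s":{}}
console.log(expanded); // {"x":1,"s":["y"]}
```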

tywalch commented 1 week ago

Could you demonstrate this with a more complete code example? I want to make sure I fully understand, but it is a bit difficult for me without all the pieces. An example in the playground would be best but a snippet would also help 👍

aqaengineering commented 1 week ago

No problem, below is an ElectroDB playground example that shows s of type set incorrectly serialized to an empty object in the resulting Dynamo expression.

import { Entity, Service } from "electrodb";

const table = "your_table_name";

/* Tasks Entity */
const tasks = new Entity(
  {
    model: {
      entity: "tasks",
      version: "1",
      service: "taskapp"
    },
    attributes: {
      taskId: {
        type: "string",
        required: true
      },
      userStuff: {
        type: "any",
      },
    },
    indexes: {
      projects: {
        pk: {
          field: "pk",
          composite: ["taskId"]
        },
        sk: {
          field: "sk",
          composite: ["taskId"]
        }
      },
    }
  },
  { table }
);

/* Users Entity */
const users = new Entity(
  {
    model: {
      entity: "users",
      version: "1",
      service: "taskapp"
    },
    attributes: {
      userId: {
        type: "string",
        required: true
      },
      x: {
        type: "number",
      },
      s: {
        type: "set",
        items: "string"
      }
    },
    indexes: {
      projects: {
        pk: {
          field: "pk",
          composite: ["userId"]
        },
        sk: {
          field: "sk",
          composite: ["userId"]
        }
      },
    }
  },
  { table }
);
const app = new Service({ users, tasks });

const userStuff = {
  x: 1,
  s: new Set("y")
};

tasks
  .patch({ taskId: "abc123" })
  .set({ userStuff })
  .go();

Generated Dynamo expression,


{
    "UpdateExpression": "SET #userStuff = :userStuff_u0, #taskId = :taskId_u0, #__edb_e__ = :__edb_e___u0, #__edb_v__ = :__edb_v___u0",
    "ExpressionAttributeNames": {
        "#pk": "pk",
        "#sk": "sk",
        "#userStuff": "userStuff",
        "#taskId": "taskId",
        "#__edb_e__": "__edb_e__",
        "#__edb_v__": "__edb_v__"
    },
    "ExpressionAttributeValues": {
        ":userStuff_u0": {
            "x": 1,
            "s": {}
        },
        ":taskId_u0": "abc123",
        ":__edb_e___u0": "tasks",
        ":__edb_v___u0": "1"
    },
    "TableName": "your_table_name",
    "Key": {
        "pk": "$taskapp#taskid_abc123",
        "sk": "$tasks_1#taskid_abc123"
    },
    "ConditionExpression": "attribute_exists(#pk) AND attribute_exists(#sk)"
}

tywalch commented 6 days ago

I see -- you're right that this is unexpected, thank you for submitting this!

aqaengineering commented 6 days ago

Don't mention it! Let us know if there's anything we can do to help patch and/or test.

tywalch commented 1 day ago

@aqaengineering I took a look at this, and I have a question to clarify: What is your desired outcome here?

aqaengineering commented 1 day ago

@tywalch thanks for the follow-up. The desired outcome is to have the set property serialized to an equivalent representation when it's cast to any. In this case I would expect an array containing the set's elements as a property in the JSON representation of userStuff, e.g.

{
  x: 1,
  s: ["y"]
}

If helpful, the use case where this has come up is duplicating a Dynamo record into another record. The schema of the first record is verbose and changes quite frequently, so it isn't desirable to model it explicitly in the schema of the entity to which it's being duplicated. any is a lot easier and more flexible for us in this case, as we wouldn't benefit from explicit schema modeling.

This approach has worked well for us in the past. The serialization issue only arose recently, after we added a set property to an entity.
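In the meantime, one possible stopgap for the desired outcome above is to convert any nested `Set` into an array before assigning the value to the `any` attribute. This is a minimal sketch, not ElectroDB behavior or API: `setsToArrays` is a hypothetical helper, and it assumes the value contains only plain objects, arrays, Sets, and primitives (a `Date` or class instance would be flattened to a plain object).

```typescript
// Hypothetical helper (not part of ElectroDB): recursively replaces every Set
// with an array of its elements so the value survives plain serialization.
// Assumes plain objects, arrays, Sets, and primitives only.
function setsToArrays(value: unknown): unknown {
  if (value instanceof Set) {
    return Array.from(value, (item) => setsToArrays(item));
  }
  if (Array.isArray(value)) {
    return value.map((item) => setsToArrays(item));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const result: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      result[key] = setsToArrays(source[key]);
    }
    return result;
  }
  return value; // primitives pass through unchanged
}

const original = { x: 1, s: new Set<string>(["y"]) };
const converted = setsToArrays(original);

console.log(JSON.stringify(converted)); // {"x":1,"s":["y"]}
```

The converted object could then be passed where `userStuff` is set in the playground example. The trade-off is that the value is stored and read back as a list, so any code that expects a `Set` would need to rebuild it on read.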