ArsenyYankovsky / nova-odm

A schema-based data mapper for Amazon DynamoDB.
https://github.com/ArsenyYankovsky/nova-odm
Apache License 2.0
34 stars 3 forks source link

`NumberValue` type is incompatible with v3 Document Client marshaller logic #12

Open hmmmsausages opened 8 months ago

hmmmsausages commented 8 months ago

Hi,

I recently encountered perhaps a bit of a niche issue, which may only affect users that read from the DynamoDB via @nova-odm but write to it via the v3 DynamoDB document client, for example in order to do transactional writes, which aren't supported by @nova-odm, nor were supported by the v2 predecessor @aws/dynamodb-data-mapper.

For example in scenarios where someone is feeding data unmarshalled by @nova-odm into the official v3 DynamoDB document client, which then marshals it again for DynamoDB, one will end up with wrongly converted number values if these were unmarshalled to the type NumberValue.

So instead of:

{
  "id": { "S": "123" },
  "amount": { "N": "4000" }
}

one would end up with a record like this in the DynamoDB:

{
  "id": { "S": "123" },
  "amount": { "M": { "value": { "S": "4000" } } }
}

This cross-compatibility between the marshalling logic of the @aws/dynamodb-data-mapper and the aws-sdk (v2) used to work without problems, as NumberValue objects from the data mapper were correctly interpreted by the marshaller of the DynamoDB document client in the v2 SDK.

Working example in v2 ecoystem:

import { NumberValue } from "@aws/dynamodb-auto-marshaller";
import AWS from "aws-sdk";

const unmarshalledRecord = {
  id: "123",
  amount: new NumberValue("4000"),
};

const marshalledRecord = AWS.DynamoDB.Converter.marshall(unmarshalledRecord);
// { "id": { "S": "123" }, "amount": { "N": "4000" } }

Failing example in v3 ecosystem:

import { NumberValue } from "@nova-odm/auto-marshaller";
import { marshall } from "@aws-sdk/util-dynamodb";

const unmarshalledRecord = {
  id: "123",
  amount: new NumberValue("4000"),
};

// marshall(unmarshalledRecord);
// throws "Error: Unsupported type passed: 4000. Pass options.convertClassInstanceToMap=true to marshall typeof object as map attribute."
//
// as NumberValue instance cannot be mapped to any known type so is treated as random object
// which be default results in an error if the marshaller is not explicitly told to convert it to a map

const marshalledRecord = marshall(unmarshalledRecord, {
  convertClassInstanceToMap: true,
}); // {"id":{"S":"123"},"amount":{"M":{"value":{"S":"4000"}}}}

Possible Cause

After looking into the differences of the v2 and v3 marshalling logic, it seems that v3 previously established the type of a variable via a utility function typeOf (lib/dynamodb/types.js#L3) which is used before the conversion takes place (ib/dynamodb/converter.js#L29C1-L29C1) and which basically only really looks at the name of the constructor. So if two different classes have the same name, then they would be treated equally, i.e. in this case the NumberValue instance constructed by the @aws/dynamodb-data-mapper would be treated the same way as an instance of the NumberValue class defined in the v2 SDK (ib/dynamodb/numberValue.js#L11).

The new v3 SDK instead only performs an instanceof comparison (util-dynamodb/src/convertToAttr.ts#L41) against the NumberValue class defined in the @aws-sdk/util-dynamodb module (util-dynamodb/src/NumberValue.ts#L16). As @nova-odm's own NumberValue class does not inherit from that class though, the instanceof check fails and the NumberValue object is treated like a random class which depending on the marshaller configuration of the v3 SDK ends up being converted to a map instead.

Possible Solution

I'm not entirely sure what's the best way to reestablish this cross-compatibility, but perhaps the NumberValue implementation of @nova-odm could inherit from the v3 NumberValue definition as provided by the @aws-sdk/util-dynamodb module.

My personal workaround for the moment is to refrain from using the v3 document client and instead use the v3 generic DynamoDB client and marshal the records with the marshaller from @nova-odm.

Any thoughts or ideas?

ArsenyYankovsky commented 7 months ago

One possible workaround (if you're sure your numbers will not lose prevision when converted into a number JS type) is to enable automatic unwrapping like this for your attribute:

@attribute({ unwrapNumbers: true })

I will investigate ways of ensuring compatibility going further. I think we should consider using the NumberValue from @aws-sdk/util-dynamodb and also support BigInt.

hmmmsausages commented 7 months ago

One possible workaround (if you're sure your numbers will not lose prevision when converted into a number JS type) is to enable automatic unwrapping like this for your attribute.

Yes good point. I was also thinking of that initially but then I didn't want to risk unexpected behaviour in cases where I may have to deal with numbers that are outside of the primitive number min-max range. Hence for the moment just relying on the @nova-odm marshaller when reading / writing with the official v3 SDK to play it safe.

Thanks for planning to look into it. 👍

ArsenyYankovsky commented 7 months ago

After looking a bit on the code, could you please provide a bigger example of how to reproduce getting NumberValue back from nova-odm? At least in my test it seems to go to this to unmarshall a number: https://github.com/ArsenyYankovsky/nova-odm/blob/5b37c5016dd26fcfdf42612359b7d13daf97c667/packages/marshaller/src/unmarshallItem.ts#L84.

hmmmsausages commented 7 months ago

I think the trick here is to get the Marshaller from the @nova-odm/auto-marshaller module involved.

So the unmarshalValue function needs to deal with a schema type of Any, Collection or Hash, so it enters the following case in the switch statement:

https://github.com/ArsenyYankovsky/nova-odm/blob/5b37c5016dd26fcfdf42612359b7d13daf97c667/packages/marshaller/src/unmarshallItem.ts#L46-L57

Because then it will end up using the unmarshalValue function of the Marshaller class which seems to do the following for numbers:

https://github.com/ArsenyYankovsky/nova-odm/blob/5b37c5016dd26fcfdf42612359b7d13daf97c667/packages/auto-marshaller/src/Marshaller.ts#L180-L184

I don't really know why NumberValue objects are only created in those circumstances. I could be wrong but it seems to me that DynamoDB numbers will not be unmarshalled to NumberValue objects if the properties in question are top-level properties. Don't know if that's an oversight by the original developers or if it has some actual meaning. Perhaps that's another thesis that could be verified by testing if the data mapper unmarshals top-level number props from DynamoDB records incorrectly if they're outside the number primitive ranges.

I've found the following unit-test which uses a to-be unmarshalled test input which will end up with NumberValue objects in its unmarshalled state. I think that test input can be used to reproduce getting a NumberValue back from nova-odm: https://github.com/ArsenyYankovsky/nova-odm/blob/5b37c5016dd26fcfdf42612359b7d13daf97c667/packages/marshaller/src/unmarshallItem.spec.ts#L63-L92