aws / aws-sdk-js-v3

Modularized AWS SDK for JavaScript.
Apache License 2.0
3.13k stars 580 forks source link

DynamoDBDocument can't unmarshall null #3846

Closed vbanderov closed 1 year ago

vbanderov commented 2 years ago

Checkboxes for prior research

Describe the bug

When trying to query for a record containing null value through DynamoDBDocument an error is thrown:

TypeError: Cannot read properties of null (reading 'S')

SDK version number

@aws-sdk/client-dynamodb@3.142.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v16.16.0

Reproduction Steps

import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocument } from '@aws-sdk/lib-dynamodb';

const client = new DynamoDBClient({
  // ...
});

const document = DynamoDBDocument.from(client);

await document.put({
  TableName: 'table',
  Item: { 
    HashKey: 'hashkey',
    foo: null
   }
});

await document.query({
  TableName: 'table',
  ExpressionAttributeValues: { ':hk': 'hashkey' },
  KeyConditionExpression: `HashKey = :hk`,
});

Observed Behavior

An uncaught error is thrown:

error: TypeError: Cannot read properties of null (reading 'S')

Expected Behavior

The item is successfully returned by the query command.

Possible Solution

Probably need to check if value is null before trying to read property 'S' on it. Not sure where that happens though.

Additional Information/Context

No response

RanVaknin commented 2 years ago

@vbanderov ,

Im not sure what example you used for this. But this looks like a mix of V2 and V3?

From what I can see you are trying to:

  1. put an item in your table
  2. retrieve that item from your table. (You used query but it seems like you're asking for a single object, so you can use get instead)

Put your Item in the DynamoDB table:

import { PutCommand, GetCommand } from "@aws-sdk/lib-dynamodb";
import { ddbDocClient } from "./libs/ddbDocClient.js"; // I initialized mine in a spearate file.

const putParams = {
  TableName: "TEST_TABLE",
  Item: {
    Season: 15,
    Episode: 301,
    Title: {NULL: true},
  },
};

const put = async () => {
  try {
    const data = await ddbDocClient.send(new PutCommand(putParams));
    return data;
  } catch (err) {
    console.error(err);
  }
};
put();

Outputs:

{
  '$metadata': {
    httpStatusCode: 200,
    requestId: 'REDACTED',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  Attributes: undefined,
  ConsumedCapacity: undefined,
  ItemCollectionMetrics: undefined
}

To Retrieve your item, instead of using query for a single item, I'd use get:

import { PutCommand, GetCommand } from "@aws-sdk/lib-dynamodb";
import { ddbDocClient } from "./libs/ddbDocClient.js";

/*
// Set the parameters
const putParams = {
  TableName: "TEST_TABLE",
  Item: {
    Season: 15,
    Episode: 301,
    Title: {NULL: true},
  },
};

const put = async () => {
  try {
    const data = await ddbDocClient.send(new PutCommand(putParams));
    console.log(data)
    return data;
  } catch (err) {
    console.error(err);
  }
};
put();
*/

const getParams = {
  TableName: "TEST_TABLE",

  Key: {
    Season: 15,
    Episode: 301,
  },
};

const get = async () => {
  try {
    const response = await ddbDocClient.send(new GetCommand(getParams));
    if (!response || !response.Item) {
      return null;
    } else {
      console.log('Code Found:', response.Item);
      return response.Item;
    }
  } catch (err) {
    console.error(err);
    return null
  }
}
get();

Outputs:

Code Found: { Season: 15, Title: { NULL: true }, Episode: 301 }
github-actions[bot] commented 2 years ago

This issue has not received a response in 1 week. If you still think there is a problem, please leave a comment to avoid the issue from automatically closing.

trobert2 commented 2 years ago

@RanVaknin I just hit this error and I am using the V3 of the document client ("3.154.0").

export const getArticleNew = async (articleId: string, language: string): Promise<Record<string, any>> => {
  const response = await getDynamoDbDocumentClient().send(
    new GetCommand({
      Key: {
        id: `article:${articleId}`,
        sort: `language:${language}`
      },
      TableName: DYNAMODB_TABLE_NAME
    })
  )
  if (!response || !response.Item) {
    Logger.debug('no items')
    throw new Error(
      `no items found for article "${articleId}" in the "${language}" language`
    )
  }

  return response.Item
}

error:

TypeError: Cannot read properties of null (reading 'S')

after changing the Null field to a string, the code works fine. Failing document item:

{
  "id": {
    "S": "article:2074847"
  },
  "sort": {
    "S": "language:en"
  },
  "page_image": {
    "NULL": true
  }
}

EDIT: when reverting to version 3.131.0 the code works again:

"@aws-sdk/client-dynamodb": "3.137.0",
"@aws-sdk/lib-dynamodb": "3.137.0",

It appears to start failing with 3.141.0

RanVaknin commented 2 years ago

@trobert2 Im not sure I understand what seems to be the issue. Are you getting this erron on getCommand or putCommand?

Can you add your putCommand code?

trobert2 commented 2 years ago

@RanVaknin

Can you add your putCommand code?

I already provided the code that issues the error and what is in the database (see "Failing document item"), exported as json from the console web view.

Why would the put command be relevant? This fails regardless if you write the null field with boto3, web ui or typescript. Just copy paste the object I provided in the guy or click "add field" and select null under the type. image

Partition key

This is an error that occurs when reading from the database.

RanVaknin commented 2 years ago

@trobert2 ,

The reason I was asking for your putCommand code is to see how you are storing your items. (I was hoping to see your js putRequest, wasnt sure if you were using the console to store it)

Something is indeed odd.

I'm able to Put and Get your items without any issue:

  "dependencies": {
    "@aws-sdk/client-dynamodb": "^3.154.0",
    "@aws-sdk/lib-dynamodb": "^3.154.0"
  }
const putParams = {
  TableName: "trobert2",
  Item: {
    id: "article:2074847",
    sort: "language:en",
    page_image: null ,
  },
};

const put = async () => {
  try {
    const data = await ddbDocClient.send(new PutCommand(putParams));
    console.log(data)
    return data;
  } catch (err) {
    console.error(err);
  }
};
put();

response:

{
  '$metadata': {
    httpStatusCode: 200,
    requestId: 'REDACTED',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  Attributes: undefined,
  ConsumedCapacity: undefined,
  ItemCollectionMetrics: undefined
}
image

GetItem:


const getParams = {
  TableName: "trobert2",

  Key: {
    id: "article:2074847",
    sort: "language:en",
  },
};

const get = async () => {
  try {
    const response = await ddbDocClient.send(new GetCommand(getParams));
    if (!response || !response.Item) {
      return null;
    } else {
      console.log('Code Found:', response.Item);
      return response.Item;
    }
  } catch (err) {
    console.error(err);
    return null
  }
}
get();
Code Found: { sort: 'language:en', id: 'article:2074847', page_image: null }

Maybe I am missing something? Let me know your thoughts.

Thanks, Ran~

github-actions[bot] commented 2 years ago

This issue has not received a response in 1 week. If you still think there is a problem, please leave a comment to avoid the issue from automatically closing.

trobert2 commented 2 years ago

hello everyone,

I'm proposing this be re-opened based on the new information I am about to share. this seems to still be an issue in version 3.172.0 which I have used for replication.

@RanVaknin I'm sorry, but I don't really think there was actually new information requested in order to mark the ticket with response-requested.

That being said, I have identified the crux of the problem. It comes from ddbDocClient (or dynamodbDocClient in my case) and how that object has been created. All participants to this conversation (including myself) omitted this information so far.

  const dynamoDb = new DynamoDBClient({ region: AWS_REGION, logger: Logger })

  const dynamodbDocClient = DynamoDBDocumentClient.from(dynamoDb)

removing logger: Logger enables the code to run successfully. So I am assuming there are breaking changes added to that implementation (referring to the logger) that have been submitted since version higher than 3.154.0.

The logger implementation I am using is based on "winston": "3.8.2" I was able to replicate the issue with this simple logger:

import winston from 'winston'

const localLogger = winston.createLogger({})

@vbanderov, would you be so kind to also confirm that you also used a custom logger?

I would like to kindly ask for this issue to be re-opened. I think there should either be more information around the logger object that can be passed as configuration or more restrictions should be added to that type to ensure that this functions as it is expected.

Thank you and kind regards

jeremyjannotta commented 2 years ago

We are experiencing this same issue, using DynamoDB + DynamoDBDocument and the logger option. aws-sdk version 3.190.0. It appears to return the error on get of a property that is stored as NULL.

TypeError: Cannot read property 'S' of null
    at AttributeValueFilterSensitiveLog (/data/servers/project/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1144:13)
    at /data/servers/project/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1250:65
    at Array.reduce (<anonymous>)
    at GetItemOutputFilterSensitiveLog (/data/servers/project/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1248:40)
    at /data/servers/project/node_modules/@aws-sdk/client-dynamodb/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:16:21

When I omit the logger option, or specifically the info child of the logger object I pass in, then the error disappears. So effectively we cannot enable the logger for DynamoDB queries.

Looking at AttributeValueFilterSensitiveLog, looks like it attempts to check for type properties of an object, without first checking if the object exists.

dambaron commented 2 years ago

Can confirm this issue. My team and I are getting the same trace using the DynamoDBClient with version 3.209.0

loganhersh commented 2 years ago

We are also experiencing this issue though only as of 3.209.0, however it does seem to be the same problem that started this thread.

The core problem is that DynamoDBDocumentClient does not define its own outputFilterSensitiveLog function and so it uses the one defined in DynamoDBClient. That function is designed to handle marshalled records but when using DynamoDBDocumentClient, the records are already unmarshalled when this function is called. So it is essentially expecting a record like:

{
  "field1": { "S": "value1" },
  "field2": { "S": "value2" },
}

but it is getting:

{
  "field1": "value1",
  "field2": "value2",
}

Then when it tries to extract the value here it throws the error because it first tries to access obj.S where obj is now just the unmarshalled value of the field.

The error only occurs when the value is null because for any other value type, take a string for example: "value1".S is semantically valid. The result of this "double-unmarshalling" though is that all values for the dynamodb record appear as undefined. You can see this by providing a logger in the DynamoDBClient config and then executing a QueryCommand ensuring the records in the response have no null values (to prevent the error from occurring). The output looks like this:

{
  ...,
  output: {
    ...,
    Items: [
      {
        field1: undefined,
        field2: undefined,
      }
    ]
  }
}

The reason this has begun happening for certain people after 3.209.0 is because previously if you did not provide a logger, then the logger was set to an empty object. The logger middleware only invokes outputFilterSensitiveLog when typeof logger.info === "function", which previously wasn't true because logger was an empty object. However, after this commit the logger now defaults to the NoOpLogger which defines the info function. So anyone who doesn't provide their own logger will now be invoking the function that throws the error (i.e. the pre-existing bug originally reported in this issue).

TLDR: DynamoDBDocumentClient does not define it's own logging filter functions and so it uses those that are defined by DynamoDBClient. Those filter functions effectively unmarshall the records and so they expect the records to be marshalled. The logging middleware which invokes those functions does so AFTER the document client has already unmarshalled the records and so the filter functions try to unmarshall AGAIN causing the error.

It seems like the document client needs to define its own filter functions that do not unmarshall the records since they will already be unmarshalled.

vydhi-v commented 2 years ago

Commenting for visibility. Noticing same issue as recent comments while using latest version of DynamoDBDocumentClient without a logger.

JamesKyburz commented 2 years ago

This issue has occurred since the nooplogger was added.

Previously the logger middleware didn't run because logger.info wasn't set, now that it is it crashes. That said the actual bug is when attempting to filter sensitive log information no check for null is in place.

This code works with 3.204.0 but crashes in 3.209.0

cat package.json
/tmp/dynamdb-null-issue ᐅ cat package.json
{
  "name": "dynamdb-null-issue",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "Apache-2.0",
  "dependencies": {
    "@aws-sdk/client-dynamodb": "^3.210.0",
    "@aws-sdk/lib-dynamodb": "^3.210.0"
  }
}

cat index.mjs
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import {
  DynamoDBDocumentClient,
  GetCommand,
  PutCommand
} from '@aws-sdk/lib-dynamodb'

const client = new DynamoDBClient({
  endpoint: 'http://localhost:8000',
  region: 'us-east-1',
  credentials: {
    accessKeyId: 'x',
    secretAccessKey: 'x'
  },
  // logger: null
})

const dynamodb = DynamoDBDocumentClient.from(client, {
  marshallOptions: {
    convertEmptyValues: false,
    removeUndefinedValues: true,
    convertClassInstanceToMap: false
  },
  unmarshallOptions: {
    wrapNumbers: false
  }
})

await dynamodb.send(
  new PutCommand({
    TableName: 'dynamodbStackTable',
    Item: {
      pk: 'null-issue',
      sk: 'null-issue',
      value: 'ok',
      null: null
    }
  })
)

const result = await dynamodb.send(
  new GetCommand({
    TableName: 'dynamodbStackTable',
    Key: {
      pk: 'null-issue',
      sk: 'null-issue'
    }
  })
)

console.log('result %s', JSON.stringify(result, null, 2))

/tmp/dynamdb-null-issue ᐅ node index.mjs
/private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1144
    if (obj.S !== undefined)
            ^

TypeError: Cannot read properties of null (reading 'S')
    at AttributeValueFilterSensitiveLog (/private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1144:13)
    at /private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1250:65
    at Array.reduce (<anonymous>)
    at GetItemOutputFilterSensitiveLog (/private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1248:40)
    at /private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:16:21
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async file:///private/tmp/dynamdb-null-issue/index.mjs:41:16

/tmp/dynamdb-null-issue ᐅ node index.mjs /private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1144 if (obj.S !== undefined) ^

TypeError: Cannot read properties of null (reading 'S') at AttributeValueFilterSensitiveLog (/private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1144:13) at /private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1250:65 at Array.reduce () at GetItemOutputFilterSensitiveLog (/private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1248:40) at /private/tmp/dynamdb-null-issue/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:16:21 at processTicksAndRejections (node:internal/process/task_queues:96:5) at async file:///private/tmp/dynamdb-null-issue/index.mjs:41:16



Either downgrading to 3.204.0 or uncommenting the `logger: null` fixes the issue.
loganhersh commented 2 years ago

@JamesKyburz I wouldn't necessarily consider that a fix, but instead just a workaround. Also this is exactly what I said a few comments above.

This issue certainly needs re-opened and a proper fix applied since it shouldn't be required to explicitly define a null logger and looking at the code in the clients, it doesn't seem like that is a reliable workaround. Also these workarounds don't help anyone that actually use the logger and can't set it to null.

I would work on a PR to fix it but I don't have the time at the moment. I'll work on it in my free time, though I'm not sure when I will be able to finish it.

trobert2 commented 2 years ago

I also don't think this is the actual solution, but a dirty workaround. @JamesKyburz @loganhersh, I think you would have to set logger: undefined if you use typescript with strict checks. null won't work. image

Victor-ARC commented 2 years ago

Hello!

Just adding some info here. We have @aws-sdk/client-dynamodb version 3.212.0 and this error persists. We're also using @aws-sdk/lib-dynamodb version 3.212.0. Here's a look at an extract of our code in case it's useful to anyone.

// File.ts

// We're getting a DynamoInstance already initialized called Dynamo....
// Not shown for simplicity.
...
const { Item } = await Dynamo.get<Settings>(`${email}#SETTINGS`, 'USER');
...
// Dynamo.ts

import {
  DynamoDBDocumentClient,
  GetCommand,
  GetCommandOutput
} from '@aws-sdk/lib-dynamodb';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';

...

Class DynamoInstance {
#DynamoClient: DynamoDBDocumentClient;
#TableName: string;

constructor(
  accessKeyId: string,  //Comes from ENV
  secretAccessKey: string,  //Comes from ENV
  region: string,  //Comes from ENV
  TableName: string,  //Comes from ENV
  removeUndefined?: boolean
){
this.#DynamoClient = DynamoDBDocumentClient.from(
      new DynamoDBClient({
        credentials: { accessKeyId, secretAccessKey },
        region
      }),
      { marshallOptions: { removeUndefinedValues: removeUndefined || false } }
    );

    this.#TableName = TableName;
}

...

async get<T>(PartitionKey, SortKey){
  return this.#DynamoClient.send(new GetCommand({
    tableName: this.#TableName,
    key: { PK: PartitionKey, SK: SortKey }
  })) as Promise<Omit<GetCommandOutput, 'Item'> & { Item: T | undefined }>;
}
...
}

I know it may be a bit confusing, but basically, what my team is doing here is to create a simple class whose methods accept the info as parameters and return a typed output based on the desired TS interface. The point is, we're passing ${email}#SETTINGS as our Partition Key, and it looks a lot like it is trying to parse ExpressionAttributeNames into the value of PK. An example of this PK would be "example@some.com#SETTINGS".

I hope this helps someone to debug this issue or at least as a reference for version 3.212.0.

loganhersh commented 2 years ago

@trobert2 @Victor-ARC Not sure if it helps in your cases, but you can also set the logger to an empty object: logger: {} in the DynamoDBClient config. The function that throws the error only gets called when logger.info is defined so in this case an empty object would work as well. Still just a workaround but it may be more compatible with strict ts usage than null.

Since DynamoDBClient and DynamoDBDocumentClient share the same sensitive log filter function, any actual solution would need to either change that function to conditionally unmarshall records, or they should not share that function and DynamoDBDocumentClient should define its own.

RanVaknin commented 2 years ago

Hi everyone.

I apologize for the long wait times. We don't comb through closed issues to see activity on it, the activity here got caught by some automation we had. If an issue persists after closing please open a new one

I'm still not able to reproduce this issue even when specifying a logger:

const ddbClient = new DynamoDBClient({
    region: "us-east-1",
    logger: {debug: (data)=>{console.log(data)}}
})

Can someone please upload a gist or a minimal repo with complete code for me to try and replicate this? Once we have this reproduced I can have the devs take a look.

Thanks, Ran~

loganhersh commented 2 years ago

@RanVaknin thanks for taking a look. You should be able to reproduce the issue by:

  1. Remove the logger from DynamoDBClient config
  2. Create a DynamoDBDocumentClient from the DynamoDBClient: const ddbDocClient = DynamoDBDocumentClient.from(ddbClient, translateConfig);
  3. Send a QueryCommand where one of the return records has a null value

The error occurs when the following are true:

RanVaknin commented 2 years ago

Hi @loganhersh ,

Thanks for the detailed explanation. Im finally able to verify that I'm getting the same exception.

Will discuss it with the dev team and will have someone take a look ASAP.

Repro Code:

import { GetCommand,DynamoDBDocumentClient } from "@aws-sdk/lib-dynamodb";
import { DynamoDBClient } from "@aws-sdk/client-dynamodb"

const ddbClient = new DynamoDBClient({region: "us-east-1"})

const marshallOptions = {
    convertEmptyValues: false,
    removeUndefinedValues: false,
    convertClassInstanceToMap: false,
};

const unmarshallOptions = {
    wrapNumbers: false,
};

const translateConfig = { marshallOptions, unmarshallOptions };

const ddbDocClient = DynamoDBDocumentClient.from(ddbClient, translateConfig);

// const putParams = {
//   TableName: "trobert2",
//   Item: {
//     id: "article:2074847",
//     sort: "language:en",
//     page_image: null ,
//   },
// };

// const put = async () => {
//   try {
//     const data = await ddbDocClient.send(new PutCommand(putParams));
//     console.log(data)
//     return data;
//   } catch (err) {
//     console.error(err);
//   }
// };
// put();

const getParams = {
  TableName: "trobert2",

  Key: {
    id: "article:2074847",
    sort: "language:en",
  },
};

const get = async () => {
  try {
    const response = await ddbDocClient.send(new GetCommand(getParams));
    if (!response || !response.Item) {
      return null;
    } else {
      console.log('Code Found:', response.Item);
      return response.Item;
    }
  } catch (err) {
    console.error(err);
    return null
  }
}
get();

Output

TypeError: Cannot read properties of null (reading 'S')
at AttributeValueFilterSensitiveLog (/test_folder/3846_dynamo_putItem_null/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1144:13)
    at /test_folder/3846_dynamo_putItem_null/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1230:128
    at Array.reduce (<anonymous>)
    at GetItemOutputFilterSensitiveLog (/test_folder/3846_dynamo_putItem_null/node_modules/@aws-sdk/client-dynamodb/dist-cjs/models/models_0.js:1230:40)
    at /test_folder/3846_dynamo_putItem_null/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:16:21
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async get (/test_folder/3846_dynamo_putItem_null/sample.js:32:22)

npm list:

test_folder/3846_dynamo_putItem_null
├── @aws-sdk/client-dynamodb@3.213.0
└── @aws-sdk/lib-dynamodb@3.213.0
trobert2 commented 2 years ago

@trobert2 @Victor-ARC Not sure if it helps in your cases, but you can also set the logger to an empty object: logger: {} in the DynamoDBClient config. The function that throws the error only gets called when logger.info is defined so in this case an empty object would work as well. Still just a workaround but it may be more compatible with strict ts usage than null.

Since DynamoDBClient and DynamoDBDocumentClient share the same sensitive log filter function, any actual solution would need to either change that function to conditionally unmarshall records, or they should not share that function and DynamoDBDocumentClient should define its own.

actually you cannot. The empty object does not satisfy the interface. image

Please note the "if you use typescript with strict checks." in my previous comment. You would have to provide an object implementing all the functions required to satisfy the interface OR undefined (as you can see in my previous comment's screen shot).

tushar-sg commented 1 year ago

facing the same issue in the latest "3.216.0".

Similar to https://github.com/aws/aws-sdk-js-v3/issues/4222

kuhe commented 1 year ago

@trobert2 thanks for researching the issue and revealing the logger interaction, I have submitted #4249 as a fix, pending review.

kuhe commented 1 year ago

The fix is in https://github.com/aws/aws-sdk-js-v3/releases/tag/v3.224.0.

loganhersh commented 1 year ago

Thanks @RanVaknin and @kuhe!

github-actions[bot] commented 1 year ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.