pynamodb / PynamoDB

A pythonic interface to Amazon's DynamoDB
http://pynamodb.readthedocs.io
MIT License

JSONAttribute(null=True) field TypeError: the JSON object must be str, bytes or bytearray, not 'NoneType' #627

Open monkut opened 5 years ago

monkut commented 5 years ago

pynamodb 3.3.3. Local testing was done with Localstack (image: localstack/localstack).

I have the following model:

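from pynamodb.attributes import JSONAttribute, NumberAttribute, UnicodeAttribute
from pynamodb.indexes import AllProjection, GlobalSecondaryIndex
from pynamodb.models import Model

# `settings` is the project's own configuration module (not shown here).

class CollectionStateIndex(GlobalSecondaryIndex):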

    class Meta:
        index_name = settings.DYNAMODB_COLLECTION_STATE_INDEXNAME
        read_capacity_units = 2
        write_capacity_units = 1

        projection = AllProjection()
        host = settings.DYNAMODB_ENDPOINT

    collection_id = UnicodeAttribute(hash_key=True)
    state = UnicodeAttribute(range_key=True)

class S3ItemModel(Model):
    s3_uri = UnicodeAttribute()
    collection_id = UnicodeAttribute()
    request_id = UnicodeAttribute(hash_key=True)
    state = UnicodeAttribute()
    created_at_timestamp = NumberAttribute()
    updated_at_timestamp = NumberAttribute()
    result = JSONAttribute(null=True)
    errors = JSONAttribute(null=True)

    collections_state_index = CollectionStateIndex()

    class Meta:
        table_name = settings.DYNAMODB_S3ITEMMODEL_TABLENAME
        region = settings.AWS_REGION
        host = settings.DYNAMODB_ENDPOINT
        read_capacity_units = 2
        write_capacity_units = 1

It appears that when a query returns an item whose errors value is NULL, the item fails to deserialize, raising this exception:

TypeError: the JSON object must be str, bytes or bytearray, not 'NoneType'

    item = next(results)
  File "/var/task/pynamodb/pagination.py", line 183, in __next__
    item = self._map_fn(item)
  File "/var/task/pynamodb/models.py", line 529, in from_raw_data
    kwargs[attr_name] = attr.deserialize(attr.get_value(value))
  File "/var/task/pynamodb/attributes.py", line 439, in deserialize
    return json.loads(value, strict=False)
  File "/var/lang/lib/python3.6/json/__init__.py", line 348, in loads
    'not {!r}'.format(s.__class__.__name__))
TypeError: the JSON object must be str, bytes or bytearray, not 'NoneType'

Query:

results = S3ItemModel.query(
    hash_key=request_id,
)
item = next(results)

DynamoDB Record:

{
  "collection_id": {
    "S": "collection:11"
  },
  "created_at_timestamp": {
    "N": "1558593743"
  },
  "errors": {
    "NULL": true
  },
  "request_id": {
    "S": "a0cddfe0-aa9f-5dex-88b5-bf808f249edc"
  },
  "result": {
    "S": "[]"
  },
  "s3_uri": {
    "S": "s3://bucketname/key"
  },
  "state": {
    "S": "processed"
  },
  "updated_at_timestamp": {
    "N": "1558593743"
  }
}

Why is deserialization failing for the errors field?
With JSONAttribute(null=True), shouldn't it be able to handle null values for the errors field?
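
For reference, the failure can be reproduced with the standard library alone. This is only a sketch: get_value below is a hypothetical stand-in for the lookup step seen in the traceback, assumed to read the "S" key of the raw attribute, so an attribute stored as {"NULL": true} yields None.

```python
import json


def get_value(raw_attribute):
    """Hypothetical stand-in: read the string payload of a raw DynamoDB attribute."""
    return raw_attribute.get("S")


raw_result = {"S": "[]"}       # the "result" attribute from the record above
raw_errors = {"NULL": True}    # the "errors" attribute from the record above

# A string payload decodes normally.
decoded_result = json.loads(get_value(raw_result), strict=False)

# A NULL attribute has no "S" key, so json.loads receives None and raises.
error_message = None
try:
    json.loads(get_value(raw_errors), strict=False)
except TypeError as exc:
    error_message = str(exc)

print(decoded_result)
print(error_message)
```

The exact wording of the TypeError message varies slightly between Python versions, but in each case it names NoneType as the offending input.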

ikonst commented 5 years ago

Indeed, there's something wrong with our handling of the DynamoDB NULL type.

hoIIer commented 4 years ago

When I have attr = NumberAttribute(null=True), or the same attribute without null=True, and the value is null (None), I get this same error. Inspecting the code, I see that deserialize calls json.loads(value), which raises this error when value is None.

removed-account commented 4 years ago

Any update?

garrettheel commented 4 years ago

To help me understand the issue here, were these items (containing {"NULL": true} attributes) written by a client other than pynamodb? pynamodb omits an attribute altogether when its value is null, which is why this usually works fine.
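
To make the distinction concrete, here is a rough sketch using plain dictionaries rather than either client library. Both helper names are hypothetical, and for simplicity every non-null value is assumed to be a string. The two functions only illustrate the two stored shapes described above: leaving a null attribute out of the item versus writing it with the DynamoDB NULL type.

```python
def to_item_omitting_nulls(attributes):
    """Hypothetical helper: leave out attributes whose value is None."""
    return {
        name: {"S": value}
        for name, value in attributes.items()
        if value is not None
    }


def to_item_with_explicit_nulls(attributes):
    """Hypothetical helper: store None using the DynamoDB NULL type."""
    return {
        name: {"NULL": True} if value is None else {"S": value}
        for name, value in attributes.items()
    }


attributes = {"state": "processed", "errors": None}

omitted = to_item_omitting_nulls(attributes)
explicit = to_item_with_explicit_nulls(attributes)

print(omitted)
print(explicit)
```

An item in the first shape never reaches the deserialize step for errors, while an item in the second shape does, which matches the record posted at the top of this issue.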

cjh79 commented 3 years ago

@garrettheel I just came across this problem as well. Yes, in our case, the offending items were added using boto3 directly, including some NULL fields. Now we are moving to PynamoDB, but looking up older pre-Pynamo items is causing this stack trace.

I think PynamoDB ought to be able to handle this -- even if it requires a special setting or something. In the meantime, is there any workaround you can suggest? Even a monkey patch?
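
One possible shape for such a workaround, offered as an untested sketch rather than a confirmed fix: subclass the attribute and return None before json.loads is reached. To keep the example runnable without PynamoDB installed, JSONAttributeStandIn below is a hypothetical stand-in that copies only the deserialize call shown in the traceback; in a real model the base class would presumably be pynamodb.attributes.JSONAttribute, and NullTolerantJSONAttribute is an illustrative name.

```python
import json


class JSONAttributeStandIn:
    """Hypothetical stand-in that mirrors the deserialize call from the traceback."""

    def deserialize(self, value):
        return json.loads(value, strict=False)


class NullTolerantJSONAttribute(JSONAttributeStandIn):
    """Illustrative subclass: treat a missing (None) payload as a null value."""

    def deserialize(self, value):
        # A {"NULL": true} attribute arrives here as None, so skip JSON decoding.
        if value is None:
            return None
        return super().deserialize(value)


attr = NullTolerantJSONAttribute()
print(attr.deserialize("[]"))
print(attr.deserialize(None))
```

Whether this is sufficient depends on how the installed PynamoDB version passes NULL attributes to deserialize, so it should be checked against real items before relying on it.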

garrettheel commented 3 years ago

@cjh79 can you try this against pynamodb==5.0.0b1? I believe it should address this.

cjh79 commented 3 years ago

@garrettheel thank you, yes, that version fixes it. Would you recommend against putting 5.0.0b1 into production? I've read the release notes, and none of the breaking changes should affect us, but how stable is this release?

garrettheel commented 3 years ago

@cjh79 It hasn't been widely tested yet, so you may want to wait until the stable release if you're risk averse. That said, there are no changes in this release that I see as particularly risky; the major version bump was mostly due to the Python 3+ requirement.

cjh79 commented 3 years ago

@garrettheel is there any timeline on getting 5.0.0 out of beta?