boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0
8.93k stars 1.86k forks

Numbers such as 1e100 cannot be retrieved from DynamoDB #2500

Open nyh opened 4 years ago

nyh commented 4 years ago

DynamoDB has an unusual number type, allowing numbers with up to 38 decimal digits of precision and exponent between -128 and 126. In particular the number 1e100 (i.e., 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000) is fine - it has the exponent 100, and just one digit of precision.
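To make "one digit of precision" concrete, here is a small sketch that uses only Python's standard decimal module (no boto3 or DynamoDB involved). The variable names are made up for illustration. It compares the compact and the fully written-out forms of the same value:

```python
from decimal import Decimal

compact = Decimal("1e100")           # coefficient 1, exponent 100
expanded = Decimal("1" + "0" * 100)  # same value, written out in full

print(compact.as_tuple().digits)         # (1,)
print(len(expanded.as_tuple().digits))   # 101
print(compact == expanded)               # True
```

Both objects are numerically equal, but the written-out form carries 101 coefficient digits, which matters for anything that counts digits rather than significant digits.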

However, while boto3 successfully stores such a number in DynamoDB, it later fails to retrieve it:

    from decimal import Decimal

    table.update_item(Key={'p': p}, UpdateExpression='SET a = :val',
                      ExpressionAttributeValues={':val': Decimal("1e100")})
    # Reading the same item back is what fails:
    print(table.get_item(Key={'p': p}, ConsistentRead=True)['Item']['a'])

The get_item function fails, with the following exception:

self = <boto3.dynamodb.types.TypeDeserializer object at 0x7f3b0cbfc550>
value = '10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000'

    def _deserialize_n(self, value):
>       return DYNAMODB_CONTEXT.create_decimal(value)
E       decimal.Rounded: [<class 'decimal.Rounded'>]
/usr/lib/python3.8/site-packages/boto3/dynamodb/types.py:277: Rounded

We can see the number "10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" was successfully retrieved from DynamoDB, but then boto3 failed to parse it!

I think I know what causes this bug: DYNAMODB_CONTEXT in boto3/dynamodb/types.py asks the "Decimal" library to verify that the number being parsed (from DynamoDB's response) has at most 38 digits of precision, and "10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" looks like it has 101 digits of precision. But this is not true - because all the digits after the first are zero, this number actually has just one digit of precision, and it is fine. Moreover, not only does "Decimal" check the precision of the number incorrectly, the entire check is pointless, because here we are parsing a response from DynamoDB - it cannot have an error in its format.
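The failure can be reproduced with the standard decimal module alone. The sketch below uses stand-in contexts (the names strict and lenient are made up here, and neither is boto3's own DYNAMODB_CONTEXT object). It relies on the fact that decimal signals Rounded whenever digits are discarded, even if they are all zeros, while Inexact is signaled only when non-zero digits are lost:

```python
import decimal

# The string DynamoDB returns for 1e100: a 1 followed by 100 zeros.
raw = "1" + "0" * 100

# Stand-in for a strict 38-digit context (an assumption, not boto3's object).
strict = decimal.Context(prec=38, traps=[decimal.Inexact, decimal.Rounded])
try:
    strict.create_decimal(raw)
    strict_error = None
except decimal.Rounded as exc:
    strict_error = exc

# Same precision, but only a real loss of information is an error.
lenient = decimal.Context(prec=38, traps=[decimal.Inexact])
value = lenient.create_decimal(raw)

# A number that truly needs more than 38 digits is still rejected.
try:
    lenient.create_decimal("1" + "0" * 99 + "1")
    lossy_rejected = False
except decimal.Inexact:
    lossy_rejected = True

print(type(strict_error).__name__, value == decimal.Decimal("1e100"), lossy_rejected)
```

Under these assumptions, trapping only Inexact would accept 1e100 while still refusing values that cannot be represented in 38 digits. Whether that is the right fix for boto3 itself is a separate question.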

nyh commented 4 years ago

Adding the following hack to the application works around the bug:

    import boto3.dynamodb.types
    import decimal
    boto3.dynamodb.types.DYNAMODB_CONTEXT = decimal.Context(prec=100)

(note that this monkey-patch will also disable checking of the number being written, but in any case DynamoDB will refuse numbers which are too big, so it's not a big loss).

swetashre commented 4 years ago

@nyh - Thank you for your post. I am able to reproduce the issue. Marking this as a bug. In the meantime, you can use boto3.client instead of the resource.

nyh commented 4 years ago

Thanks. By the way, I tried to bypass the Python type serialization using boto3.client but couldn't get it to work - client.update_item() for example seemed to modify its parameters just like resource. Can you please point me to an example of such a workaround? Thanks.

swetashre commented 4 years ago

@nyh - Are you not able to use client.update_item() ?

In [1]: import boto3

In [2]: client = boto3.client('dynamodb')

In [3]: res = client.update_item( TableName='test',Key={'id':{'N':'10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000'}})

nyh commented 4 years ago

@swetashre again, this is unrelated to the original issue, but I still can't get the "client" workaround to work when I take the client from an already existing resource. For example, I tried:

    client = test_table.meta.client
    client.update_item(TableName=test_table_s.name,
                       Key={'p': {'S': p}},
                       UpdateExpression='SET a = :val',
                       ExpressionAttributeValues={':val': {'N': s}})

The "Key={'p': {'S': p}}" isn't taken verbatim - the {'S': p} is actually translated into a map ({"M": ...}) exactly as a resource would do it. What am I doing wrong? How do I disable this layer of translation? Thanks.

JustinTArthur commented 3 years ago

Submitted a pull request to allow truncation of insignificant digits during persistence. #2913

JustinTArthur commented 3 years ago

While the PR is being reviewed, I have an alternate deserializer, called ddbcereal, that anyone can use with the low-level boto3 client or botocore.

aviadpriel commented 1 year ago

Following. Is there any update on this issue?