Open MacHu-GWU opened 3 months ago
This isn't really a bug with ion-python.
In the Ion text format, `1` is an Integer and `1.` is a Decimal; see: https://amazon-ion.github.io/ion-docs/docs/spec.html
Per the DynamoDB import docs (https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.Format.html#S3DataImport.Requesting.Formats.Ion), they import an Ion Decimal as a DynamoDB Number. I do not know why they don't also map an Ion Integer to a DynamoDB Number, but they don't. That's a possible feature request for DynamoDB.
Assuming that it's faster to change your code than to get DynamoDB to change, and that your code block is your Python code:
To serialize an Ion Decimal from Python you need to create a `decimal.Decimal`. That will be emitted in your Ion stream as an Ion Decimal.
See https://github.com/amazon-ion/ion-python/blob/master/amazon/ion/simpleion.py#L33
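As a rough sketch of that suggestion, the helper below walks a nested item and replaces every `int` with a `decimal.Decimal` before serializing. The name `ints_to_decimals` is hypothetical and not part of ion-python. The sketch assumes the item consists only of dicts, lists, tuples, and scalars, and that `simpleion.dumps(..., binary=False)` writes `decimal.Decimal` values as Ion Decimals, as described above. The exact text it emits for a decimal is not shown here.

```python
from decimal import Decimal


def ints_to_decimals(value):
    """Hypothetical helper: recursively replace int values with Decimal.

    bool is a subclass of int in Python, so it is checked first and left
    untouched; otherwise True/False would turn into Decimal(1)/Decimal(0).
    """
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        return Decimal(value)
    if isinstance(value, dict):
        return {key: ints_to_decimals(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [ints_to_decimals(item) for item in value]
    return value


item = {"Item": {"id": 1, "name": "Alice"}}
converted = ints_to_decimals(item)

# ion-python is optional here so the conversion can be tried without it.
try:
    from amazon.ion import simpleion
except ImportError:
    simpleion = None

if simpleion is not None:
    # binary=False requests the Ion text format.
    print(simpleion.dumps(converted, binary=False))
```

Converting at the data level keeps the serializer untouched, which seems safer than patching the text output after the fact.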
I would further advise that for your production code you serialize your Ion as Binary for the imports. Obviously the text format is great for debugging and developing, but the binary format has improved data density and will be faster to import.
Please check out the pydoc in `simpleion` and let us know how we can improve that if needed.
Thanks @rmarrowstone.
I believe it is still a bug, though not in ion-python; it is in DynamoDB import.
The `simpleion.dumps()` method gives the correct value, `$ion_1_0 {Item:{id:1,name:"Alice"}}` (I expect the id to be an integer). However, the DynamoDB import table feature doesn't recognize it. In my TableCreationParam, I defined the attribute type as `N`, but the DynamoDB import table feature raises an error for it.
@rmarrowstone another issue is that the DynamoDB import table documentation doesn't mention how to use the Ion binary format to prepare the data. The documentation says that "Items in an Ion file are delimited by newlines. Each line begins with an Ion version marker, followed by an item in Ion format.", which implies that I should use the text format to encode my data. Then how can I do this?
> I would further advise that for your production code you serialize your Ion as Binary for the imports.
Sadly it does look like they only support the Text format, so that was some bad advice, sorry. It would be better if they supported the binary format, but alas...
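For the text format, a minimal sketch of the newline-delimited layout quoted from the DynamoDB docs could look like the following. `write_ion_lines` is a hypothetical helper, not an ion-python API. It takes the serializer as a parameter and assumes that each call returns one item on a single line starting with the Ion version marker, which matches the `simpleion.dumps` text output shown in this thread.

```python
import io


def write_ion_lines(items, out, dumps):
    """Hypothetical helper: write one serialized item per line to `out`.

    `dumps` is any callable that turns one item into Ion text.
    Returns the number of lines written.
    """
    count = 0
    for item in items:
        line = dumps(item)
        if isinstance(line, bytes):
            # Tolerate serializers that return UTF-8 bytes instead of str.
            line = line.decode("utf-8")
        if "\n" in line:
            raise ValueError("each serialized item must fit on a single line")
        out.write(line + "\n")
        count += 1
    return count


# ion-python is optional here so the helper can be tried without it.
try:
    from amazon.ion import simpleion
except ImportError:
    simpleion = None

if simpleion is not None:
    from decimal import Decimal

    buf = io.StringIO()
    write_ion_lines(
        [{"Item": {"id": Decimal(1), "name": "Alice"}}],
        buf,
        lambda obj: simpleion.dumps(obj, binary=False),
    )
    print(buf.getvalue(), end="")
```

Writing to a file instead of `io.StringIO` only requires passing an object opened with `open(path, "w", encoding="utf-8")`.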
I am trying to generate an Ion data file manually with this library so that I can use it for DynamoDB import table. This is my DynamoDB item as a Python dictionary.
The `amazon.ion.simpleion.dumps` method gives me `$ion_1_0 {Item:{id:1,name:"Alice"}}`; note that there's no dot after the number 1. The import_table API then fails. However, if I manually add a dot after the number 1, making it `$ion_1_0 {Item:{id:1.,name:"Alice"}}`, it works. I also exported a manually created DynamoDB table and found that the exported Ion file has the dot after integer numbers.
I also tried the `loads` method, and I think an integer without the dot is a valid value for deserialization. However, it doesn't work with DynamoDB table import. How do I ensure that there's a dot after every integer in the text view of my data?