aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
https://aws-sdk-pandas.readthedocs.io
Apache License 2.0

dynamodb: read_items support for ScanIndexForward #3017

Open mccauleyp opened 1 week ago

mccauleyp commented 1 week ago

Is your idea related to a problem? Please describe.

Thanks for the great resource! It'd be great if the dynamodb.read_items method could be enhanced slightly to support changing the ScanIndexForward option on the DynamoDB query call.

https://aws-sdk-pandas.readthedocs.io/en/3.7.2/stubs/awswrangler.dynamodb.read_items.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb/client/query.html

Describe the solution you'd like

I'd suggest exposing a scan_index_forward boolean kwarg on the method and explicitly propagating it down into the boto3 call. Alternatively, you could expose boto3_additional_kwargs, similar to the existing pyarrow_additional_kwargs, to allow for arbitrary additional kwargs.
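
To make the two options concrete, here is a rough sketch of how the arguments might be assembled before the boto3 Query call. The helper name `build_query_kwargs` and its signature are hypothetical and not part of awswrangler; `scan_index_forward` and `boto3_additional_kwargs` are the names proposed above, and only `ScanIndexForward` and the other capitalized keys are actual DynamoDB Query parameters.

```python
from typing import Any, Dict, Optional


def build_query_kwargs(
    table_name: str,
    key_condition_expression: str,
    expression_attribute_values: Dict[str, Any],
    scan_index_forward: Optional[bool] = None,
    boto3_additional_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for a DynamoDB Query call."""
    kwargs: Dict[str, Any] = {
        "TableName": table_name,
        "KeyConditionExpression": key_condition_expression,
        "ExpressionAttributeValues": expression_attribute_values,
    }
    # Arbitrary pass-through arguments are merged first so the explicit flag wins.
    if boto3_additional_kwargs:
        kwargs.update(boto3_additional_kwargs)
    # Only set ScanIndexForward when the caller asked for it;
    # DynamoDB itself defaults to ascending order (True).
    if scan_index_forward is not None:
        kwargs["ScanIndexForward"] = scan_index_forward
    return kwargs


# Example: newest-first ordering on the sort key (table and key names are made up).
query_kwargs = build_query_kwargs(
    table_name="my-table",
    key_condition_expression="pk = :pk",
    expression_attribute_values={":pk": {"S": "user#1"}},
    scan_index_forward=False,
)
# With a low-level client this could be passed as:
#   boto3.client("dynamodb").query(**query_kwargs)
```

Leaving `scan_index_forward` as `None` by default keeps the current behaviour unchanged for existing callers.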

jaidisido commented 1 week ago

Adding ScanIndexForward as an argument would be relatively simple. My concern here is that the dynamodb.read_items API is not specific to Query: it abstracts the GetItem, Query and Scan operations. If we were to support every individual argument, the API would quickly become overloaded. A dynamodb_kwargs argument would help, but it would then require complex logic to filter the right arguments for each underlying operation.
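
For illustration only, the filtering step could look something like the sketch below. The `ALLOWED_KWARGS` table and the `filter_dynamodb_kwargs` helper are hypothetical, and the allow-lists are deliberately partial; a real implementation would need the full parameter set of each operation and would have to decide whether to drop or reject arguments that do not apply.

```python
from typing import Any, Dict

# Illustrative, non-exhaustive allow-lists of optional parameters per operation.
ALLOWED_KWARGS = {
    "get_item": {"ConsistentRead", "ProjectionExpression", "ExpressionAttributeNames"},
    "query": {"ConsistentRead", "IndexName", "Limit", "ScanIndexForward"},
    "scan": {"ConsistentRead", "IndexName", "Limit", "Segment", "TotalSegments"},
}


def filter_dynamodb_kwargs(operation: str, dynamodb_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: keep only the kwargs that the chosen operation accepts."""
    allowed = ALLOWED_KWARGS[operation]
    return {key: value for key, value in dynamodb_kwargs.items() if key in allowed}


user_kwargs = {"ScanIndexForward": False, "Limit": 10}

# ScanIndexForward only applies to Query, so it is dropped for Scan.
query_only = filter_dynamodb_kwargs("query", user_kwargs)
scan_only = filter_dynamodb_kwargs("scan", user_kwargs)
```

Silently dropping arguments, as this sketch does, is itself a design choice: a caller who asked for descending order would get no error if read_items ended up choosing Scan.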