Closed — twelch closed this 1 month ago
Might also be able to use PartiQL batch commands - https://github.com/awsdocs/aws-doc-sdk-examples/blob/main/javascriptv3/example_code/dynamodb/actions/partiql/partiql-batch-delete.js
another option might be to re-synth the stack without the table and re-deploy (to delete the table), then re-synth the stack again with the table back (to create it again).
There may also be resources relying on/connected to this table, so CloudFormation wouldn't easily let you just delete it. Manually deleting it would create drift, which may cause an error.
Scan, Query, BatchGetItem. Which to use?
It looks like a scan is required to get all of the items in a table.
A query is faster than a scan, but it requires you to provide a hash key (aka partition key). We don't store multiple items under a single hash key, so this doesn't help us for the use case of deleting all tasks, or deleting all tasks for a given service name (the service name is used as the range key, not the partition key).
The Paginator wrapper is a neat way of querying with paging, but it only works if you have a single partition key.
Since we could have 1000, even 10,000 task results in a table for a given project, and each db item probably averages 50-100KB, this could be up to 1GB in size.
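As a rough sanity check on that estimate (DynamoDB Scan returns at most 1 MB of data per page), the numbers above imply on the order of a thousand paged Scan calls:

```typescript
// Back-of-envelope: how many Scan pages to read the whole table?
// Numbers are the estimates from above (10,000 items @ ~100 KB each).
const itemCount = 10_000;
const avgItemBytes = 100 * 1024;        // ~100 KB per task result
const scanPageLimitBytes = 1024 * 1024; // Scan returns at most 1 MB per page

const totalBytes = itemCount * avgItemBytes;              // ~1 GB
const pages = Math.ceil(totalBytes / scanPageLimitBytes); // ~1000 paged Scan calls

console.log(`${(totalBytes / 1024 ** 3).toFixed(1)} GB across ~${pages} pages`);
```

So any "read everything, then delete" approach needs to stream page by page rather than hold the full result set in memory, which fits the out-of-memory crashes described below.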
Another option to just "delete" all items would be to set a unique prefix on table names at creation (see GeoprocessingStack.ts):
```ts
tableName: `gp-${stack.props.projectName}-tasks`
```
The table prefix could be stored as a CfnOutput value, which can then be read at deploy time using the `describe-stacks` command, which outputs JSON.
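Pulling the prefix out of that JSON is a small lookup over `Stacks[0].Outputs`. A minimal sketch — the `TablePrefix` output key here is hypothetical, substitute whatever key the stack actually exports:

```typescript
// Sketch: read a CfnOutput value out of `describe-stacks` JSON.
// The "TablePrefix" output key is a placeholder, not the real export name.
interface StackOutput {
  OutputKey?: string;
  OutputValue?: string;
}
interface DescribeStacksResult {
  Stacks?: { Outputs?: StackOutput[] }[];
}

function getStackOutput(
  result: DescribeStacksResult,
  key: string
): string | undefined {
  return result.Stacks?.[0]?.Outputs?.find((o) => o.OutputKey === key)
    ?.OutputValue;
}

// Example shaped like `aws cloudformation describe-stacks` output:
const json: DescribeStacksResult = {
  Stacks: [
    {
      Outputs: [
        { OutputKey: "TablePrefix", OutputValue: "gp-california-reports-tim" },
      ],
    },
  ],
};
console.log(getStackOutput(json, "TablePrefix")); // gp-california-reports-tim
```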
The prefix value can be re-used, unless the user indicates at deploy time that they would like to regenerate the tables, clearing all values.
```sh
aws cloudformation describe-stacks --stack-name gp-california-reports-tim --region us-west-1
```
The aws-sdk v3 version of this is here - https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/cloudformation/command/DescribeStacksCommand/
`paginate*` functions are now offered by AWS for handling pagination of query/scan results, e.g. `paginateScan`.
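Under the hood this is just the `LastEvaluatedKey` loop. A minimal sketch of what `paginateScan` wraps, using a stubbed `scan` function in place of `DynamoDBClient.send(new ScanCommand(...))` so it runs standalone:

```typescript
// Sketch of the LastEvaluatedKey loop that paginateScan wraps.
// `scan` is a stub standing in for client.send(new ScanCommand(...)).
type Key = Record<string, unknown> | undefined;
interface ScanPage {
  Items: { id: string }[];
  LastEvaluatedKey?: Key;
}

async function* paginate(scan: (startKey: Key) => Promise<ScanPage>) {
  let startKey: Key = undefined;
  do {
    const page = await scan(startKey);
    yield page.Items;
    startKey = page.LastEvaluatedKey; // undefined once the table is exhausted
  } while (startKey);
}

// Stub: two pages of results, then done.
const scanPages: ScanPage[] = [
  { Items: [{ id: "a" }, { id: "b" }], LastEvaluatedKey: { id: "b" } },
  { Items: [{ id: "c" }] },
];
let call = 0;
const fakeScan = async (_key: Key) => scanPages[call++];

const ids: string[] = [];
for await (const items of paginate(fakeScan)) {
  ids.push(...items.map((i) => i.id)); // process one page at a time
}
console.log(ids); // [ 'a', 'b', 'c' ]
```

Processing one page at a time like this (delete as you go, rather than accumulating all items first) is what keeps memory bounded for the 1GB-scale tables described above.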
3rd party version available with parallelization - https://github.com/shelfio/dynamodb-parallel-scan
`gp clear` commands are for clearing gp function results cached in the `tasks` dynamodb table. The problem is it's very slow, running it often crashes with an out of memory error, and at least for the `clearResults` command it doesn't always clear all of the records.

In aws-sdk v3, to increase performance, dynamodb `batchWriteItem` is supposed to support delete. Should be able to batch delete up to 25 items at a time (the per-request maximum). Suggest unit testing this using the local dynamodb service.
https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/dynamodb/command/BatchWriteItemCommand/
https://stackoverflow.com/a/9159431/4159809
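The batch delete might be sketched like this — the table name and key attribute (`id`) are placeholders, and in a real run each `BatchWriteItemCommand` response's `UnprocessedItems` should be checked and retried:

```typescript
// Sketch: shape keys from a Scan into BatchWriteItemCommand inputs,
// 25 DeleteRequests per call (the BatchWriteItem per-request maximum).
// Table name and key attribute "id" are example placeholders.
interface DeleteRequestEntry {
  DeleteRequest: { Key: Record<string, { S: string }> };
}

function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

function buildDeleteBatches(
  tableName: string,
  keys: Record<string, { S: string }>[]
) {
  return chunk(keys, 25).map((batch) => ({
    RequestItems: {
      [tableName]: batch.map(
        (Key): DeleteRequestEntry => ({ DeleteRequest: { Key } })
      ),
    },
  }));
}

// 60 task keys -> 3 BatchWriteItem calls (25 + 25 + 10)
const keys = Array.from({ length: 60 }, (_, i) => ({ id: { S: `task-${i}` } }));
const batches = buildDeleteBatches("gp-project-tasks", keys);
console.log(batches.length); // 3
```

Each element of `batches` is shaped as a `BatchWriteItemCommand` input, so a page of scanned keys can be deleted in a handful of calls instead of one `DeleteItem` per record — and the chunking logic itself is pure, so it unit tests easily alongside the local dynamodb service tests suggested above.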