Closed pdbossman closed 2 months ago
This feature could be implemented by calling `DescribeTimeToLive` to retrieve the name of the attribute that contains the expiration timestamp, and then filtering out items that have expired. I wonder whether this behavior should be optional at all. Are there cases where we would want to preserve expired items?
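As a rough illustration of the idea above, here is a minimal Python sketch. It assumes the `DescribeTimeToLive` response has already been fetched (for example with boto3's `describe_time_to_live`) and that items are in DynamoDB's low-level JSON form. The helper names `ttl_attribute_name` and `is_expired` are hypothetical, and treating a timestamp equal to the current time as expired is an assumption.

```python
import time
from typing import Any, Dict, Optional


def ttl_attribute_name(describe_ttl_response: Dict[str, Any]) -> Optional[str]:
    """Return the TTL attribute name if TTL is enabled, else None."""
    description = describe_ttl_response.get("TimeToLiveDescription", {})
    if description.get("TimeToLiveStatus") == "ENABLED":
        return description.get("AttributeName")
    return None


def is_expired(
    item: Dict[str, Dict[str, Any]],
    ttl_attr: Optional[str],
    now_epoch_seconds: Optional[int] = None,
) -> bool:
    """Tell whether an item (low-level DynamoDB JSON) has already expired.

    Items without the attribute, or whose attribute is not a Number,
    are treated as not expired, since TTL only acts on Number values.
    """
    if ttl_attr is None:
        return False
    value = item.get(ttl_attr)
    if not value or "N" not in value:
        return False
    now = int(time.time()) if now_epoch_seconds is None else now_epoch_seconds
    try:
        expires_at = float(value["N"])
    except ValueError:
        return False
    # Assumption: a timestamp equal to "now" counts as expired.
    return expires_at <= now


# Example with a hand-written response and two items.
response = {
    "TimeToLiveDescription": {
        "TimeToLiveStatus": "ENABLED",
        "AttributeName": "expires_at",  # hypothetical attribute name
    }
}
attr = ttl_attribute_name(response)
items = [
    {"id": {"S": "a"}, "expires_at": {"N": "100"}},
    {"id": {"S": "b"}, "expires_at": {"N": "300"}},
]
kept = [item for item in items if not is_expired(item, attr, now_epoch_seconds=200)]
```

With `now_epoch_seconds=200`, only item `b` is kept. Filtering on the client like this still transfers the expired items first, so it is more of a fallback than a replacement for filtering at the source.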
Validation may temporarily fail, since DynamoDB would still return the expired records while we would not. Practically speaking, though, I'd always want to discard expired items...
I don’t think validation would see those records, since we also exclude them when reading the source table.
The way this is implemented in PR #206 is by configuring the `Scan` operation on the source table to filter out the expired items. This means these items are filtered out by DynamoDB and never returned to the client. This is the case both when we perform the migration and when we perform the validation.
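For readers unfamiliar with scan filters, the following is a minimal sketch of what such a `Scan` configuration could look like with the low-level DynamoDB API. It is not the code from PR #206. The helper name `scan_kwargs_excluding_expired` and the table and attribute names are hypothetical, and the sketch only considers Number-typed TTL attributes.

```python
from typing import Any, Dict, Optional


def scan_kwargs_excluding_expired(
    table_name: str,
    ttl_attr: Optional[str],
    now_epoch_seconds: int,
) -> Dict[str, Any]:
    """Build keyword arguments for a low-level Scan call that skips expired items.

    Items are kept when they have no TTL attribute at all, or when the
    attribute holds a timestamp strictly greater than `now_epoch_seconds`.
    """
    kwargs: Dict[str, Any] = {"TableName": table_name}
    if ttl_attr is not None:
        kwargs["FilterExpression"] = "attribute_not_exists(#ttl) OR #ttl > :now"
        # A placeholder is used in case the attribute name is a reserved word.
        kwargs["ExpressionAttributeNames"] = {"#ttl": ttl_attr}
        kwargs["ExpressionAttributeValues"] = {":now": {"N": str(now_epoch_seconds)}}
    return kwargs


# Hypothetical table and attribute names, fixed "now" for reproducibility.
kwargs = scan_kwargs_excluding_expired("source_table", "expires_at", 1700000000)
# With boto3 these could be passed as: boto3.client("dynamodb").scan(**kwargs)
```

Note that a filter expression is applied after DynamoDB has read the items, so the scan consumes the same read capacity as an unfiltered one. The saving is in the data transferred and written to the target.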
Is this fine?
With DynamoDB, the user can store an expiration date-time in an attribute, and TTL can be enabled or disabled on that attribute. When TTL is enabled, DynamoDB periodically scans the table and deletes the expired items.
This means it's possible to be streaming a significant amount of data that has already expired. This by itself can slow the migration down. It can also leave a large backlog of expired items to be scanned and deleted after the migration.
Due to all of the above, it'd be desirable to have an option to discard items being copied from source to target that have already expired.