The keys for the feed table on DynamoDB are:
Partition Key: facebook_id
Sort Key: created_time
And for the GSI are:
Partition Key: facebook_id
Sort Key: is_parsed
This makes the GSI a sparse index: is_parsed is initially false, and R2-D2 deletes the attribute once a post is parsed, so only unparsed posts remain in the index. However, a DynamoDB scan returns items grouped by facebook_id, with no guaranteed order across partition keys. The result looks haphazard, whereas a latest-first order would be better.
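The attribute removal that keeps the index sparse could look roughly like the sketch below. It only builds the parameters for a DynamoDB UpdateItem call. The table name "feed" and the string types for both key attributes are assumptions, and build_mark_parsed_update is a hypothetical helper, not R2-D2's actual code.

```python
def build_mark_parsed_update(facebook_id, created_time, table_name="feed"):
    """Build UpdateItem parameters that remove is_parsed from one post.

    Assumption: both key attributes are stored as strings ("S").
    """
    return {
        "TableName": table_name,
        "Key": {
            "facebook_id": {"S": facebook_id},
            "created_time": {"S": created_time},
        },
        # Removing the attribute drops the item from any index keyed on it.
        "UpdateExpression": "REMOVE is_parsed",
    }


# Usage with a low-level boto3 client (not executed here):
#   client = boto3.client("dynamodb")
#   client.update_item(**build_mark_parsed_update("123", "2020-01-01T00:00:00Z"))
```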
A GSI like the one below might help:
Partition Key: is_parsed
Sort Key: created_time
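The latest posts could then be fetched with a query rather than a scan: the query would match the single is_parsed value, and the query API accepts a ScanIndexForward flag that, when set to false, returns items in descending order of the sort key (created_time).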
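A minimal sketch of such a query is shown below. The index name is hypothetical, and the sketch assumes is_parsed is stored as the string "false", since DynamoDB key attributes must be strings, numbers, or binary rather than booleans. The client is passed in so the function works with any object exposing a query method, such as a low-level boto3 DynamoDB client.

```python
def build_latest_unparsed_query(limit=25, table_name="feed",
                                index_name="is_parsed-created_time-index"):
    """Build Query parameters for the newest unparsed posts.

    Assumptions: the index name is made up for illustration, and
    is_parsed is stored as the string "false".
    """
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "is_parsed = :unparsed",
        "ExpressionAttributeValues": {":unparsed": {"S": "false"}},
        # False walks the sort key (created_time) in descending order.
        "ScanIndexForward": False,
        "Limit": limit,
    }


def fetch_latest_unparsed(client, limit=25):
    """Run the query and return the items from the first page of results."""
    response = client.query(**build_latest_unparsed_query(limit))
    return response.get("Items", [])


# Usage (not executed here):
#   import boto3
#   posts = fetch_latest_unparsed(boto3.client("dynamodb"), limit=10)
```

Only the first page is read here; a caller that needs more than one page would have to follow LastEvaluatedKey.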
Possible issues:
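Hot partitions - every unparsed post shares the same is_parsed value, so all index traffic lands on one partition key. Likely not a big issue considering the low volume of data.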
A single partition can hold at most 10 GB of data. This is also not an issue for now; it could only matter far in the future, if we seed the DB from scratch again and every post starts out unparsed. Even then, hitting the 10 GB cap is unlikely.