Closed jrconlin closed 4 years ago
This presumes use of the AWS CLI v2.
draft 1:
TABLE="$1"
aws dynamodb scan --table-name "$TABLE" --filter-expression 'chidmessageid = :chid and attribute_not_exists(chids)' --select COUNT --expression-attribute-values='{":chid":{"S":" "}}'
The TABLE name can be found via the aws dynamodb list-tables command.
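As a rough sketch, the draft could be wrapped into reusable shell functions. This assumes AWS CLI v2 with credentials and region already configured; the function names list_tables and count_no_channel_uaids are made up for illustration and are not part of any existing tooling.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative only.

# Print one DynamoDB table name per line.
list_tables() {
    aws dynamodb list-tables --query 'TableNames[]' --output text | tr '\t' '\n'
}

# Count records in table $1 whose chidmessageid is a single space
# (as in the draft above) and that have no chids attribute.
count_no_channel_uaids() {
    aws dynamodb scan \
        --table-name "$1" \
        --filter-expression 'chidmessageid = :chid and attribute_not_exists(chids)' \
        --select COUNT \
        --expression-attribute-values '{":chid":{"S":" "}}'
}
```

After sourcing the file, something like count_no_channel_uaids "$TABLE" would run the same scan as the draft. Note that this is still a full table scan, so it consumes read capacity proportional to the table size.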
From above bug:
aws dynamodb scan --table-name $TABLE --filter-expression 'chidmessageid = :chid and attribute_not_exists(chids)' --select COUNT --expression-attribute-values='{":chid":{"S":" "}}'
{
    "Count": 5713425,
    "ScannedCount": 571685091,
    "ConsumedCapacity": null
}
So of 571,685,091 records, 5,713,425 contain no ChannelIDs (meaning no subscriptions are present), or approximately 1% of our user base.
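As a quick sanity check on that ratio, the percentage can be computed directly from the Count and ScannedCount values in the scan output. A minimal sketch using awk, with the two numbers copied by hand (the variable names are arbitrary):

```shell
#!/bin/sh
# Values copied from the scan output above.
count=5713425
scanned=571685091

# Share of scanned records that matched the filter, as a percentage.
pct=$(awk -v c="$count" -v s="$scanned" 'BEGIN { printf "%.2f", c / s * 100 }')
echo "${pct}% of scanned records have no ChannelIDs"
```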
I'm not sure this query covers the users (I'll call them "broadcast only") in question.
"Broadcast only" users would have only HELLO'd for the sake of receiving broadcasts, but never subscribed to any Push channels. In that case they wouldn't have any entry at all in the message table, just a router entry.
Fair points. The problem is that finding the actual number of clients would involve a table scan plus an iteration of queries, which I think we should never, ever do. Thanks for the discussion we had about this, where we determined that we would not get any additional writes, but would have more junk data in the router table. We should add a TTL to webpush to handle that, as well as consider purging older records from the router db.
Due to https://bugzilla.mozilla.org/show_bug.cgi?id=1617136, we wanted to find out how much costs might increase. This requires finding how many UAIDs do not have channels associated with them.
Need to compose an AWS DynamoDB query to find these unbridled UAID records, so the count can be compared to existing data.