guardian / grid

The Guardian’s image management system
https://www.theguardian.com/info/developer-blog/2015/aug/12/open-sourcing-grid-image-service
Apache License 2.0
1.44k stars, 120 forks

script to purge stuff from S3 following 'reaping migration' etc. #4111

twrichards closed this 1 year ago

twrichards commented 1 year ago

script to purge stuff from S3 following https://github.com/guardian/grid/pull/4066 (plus clearing other detritus)

twrichards commented 1 year ago

We're going with an AWS Athena approach, as the volume of inventory information (50+ gzipped CSVs per day, ~50MB each) is too much to deal with locally.
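For illustration only, here is a rough sketch of the kind of query the Athena approach could run: an anti-join that lists inventory keys with no matching image ID. The table names (`s3_inventory`, `es_ids`), the column names (`key`, `id`), and the idea that the image ID is the last path segment of the S3 key are all assumptions made for this example, not something confirmed in this PR.

```python
def build_orphan_query(inventory_table: str, es_ids_table: str) -> str:
    """Build Athena SQL listing inventory keys that have no matching ES ID.

    Both table names are hypothetical; the inventory table is assumed to
    expose a `key` column and the ID table an `id` column.
    """
    for name in (inventory_table, es_ids_table):
        # Table names are interpolated into the SQL, so only allow
        # letters, digits, underscores and dots.
        if not name.replace("_", "").replace(".", "").isalnum():
            raise ValueError(f"unexpected table name: {name!r}")

    return (
        f"SELECT inv.key\n"
        f"FROM {inventory_table} AS inv\n"
        f"LEFT JOIN {es_ids_table} AS es\n"
        # Assumption: the image ID is the final path segment of the key.
        f"  ON regexp_extract(inv.key, '[^/]+$') = es.id\n"
        f"WHERE es.id IS NULL"
    )


if __name__ == "__main__":
    print(build_orphan_query("s3_inventory", "es_ids"))
```

The resulting key list would be the candidate set for purging; it would still need reviewing before anything is actually deleted.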

I may open a similar PR with just a script to extract all the IDs from Elasticsearch (ES), which we can then load into Athena.
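A minimal sketch of what the output side of such an extraction script might look like, assuming the hits arrive as dicts with an `_id` field (the shape of Elasticsearch search hits). How the hits are fetched (scroll, `search_after`, etc.) is deliberately left out, and the function name and one-ID-per-line gzipped format are hypothetical choices for this example.

```python
import gzip
from typing import Any, Iterable, Mapping


def write_ids(hits: Iterable[Mapping[str, Any]], path: str) -> int:
    """Write one document ID per line to a gzipped text file.

    `hits` is assumed to be an iterable of ES-style hits, each with an
    `_id` key. Returns the number of IDs written.
    """
    count = 0
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as out:
        for hit in hits:
            out.write(f"{hit['_id']}\n")
            count += 1
    return count


if __name__ == "__main__":
    sample = [{"_id": "abc123"}, {"_id": "def456"}]
    print(write_ids(sample, "es_ids.csv.gz"), "IDs written")
```

A single-column gzipped file like this can be uploaded to S3 and exposed to Athena as an external table, since hits are streamed rather than held in memory.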