rahulbot closed 8 months ago
The "Idle public IPv4 address per hour" $4 item is almost certainly an unused Elastic (static) IP address.
500GB for each new month seems to be a good estimate for WARC files:
root@ramos:/srv/data/docker/indexer/worker_data/archiver# du -hs 202?/[01]*
291G 2023/11
473G 2023/12
455G 2024/01
480G 2024/02
98G 2024/03
Multiplying that out: 12 * 500G/month / 1024 => 5.9T/year.
From web-search, historic data looks like it has higher volume, so those numbers may be low, but given that the currently loaded data (and associated WARC files) cover almost 12 months, I think 6TB/year is a reasonable figure for WARC growth.
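For anyone who wants to re-run the projection, here is a rough sketch of the same arithmetic in Python. The sizes are copied from the `du` output above; treating 2023/11 and 2024/03 as partial months is my assumption, and the names (`monthly_gib`, `yearly_tib`) are made up for illustration.

```python
# Monthly WARC directory sizes in GiB, copied from the du output above.
monthly_gib = {
    "2023/11": 291,
    "2023/12": 473,
    "2024/01": 455,
    "2024/02": 480,
    "2024/03": 98,
}

# Assumption: the first and last months are incomplete, so they are
# left out of the average.
partial_months = {"2023/11", "2024/03"}
full_months = [size for month, size in monthly_gib.items()
               if month not in partial_months]
avg_gib_per_month = sum(full_months) / len(full_months)


def yearly_tib(gib_per_month):
    """Project a monthly volume in GiB to a yearly volume in TiB."""
    return 12 * gib_per_month / 1024


rounded_estimate_tib = yearly_tib(500)              # the 500G/month rule of thumb
observed_estimate_tib = yearly_tib(avg_gib_per_month)  # from the full months only

print(f"500G/month rule of thumb: {rounded_estimate_tib:.1f} TiB/year")
print(f"average of full months:   {observed_estimate_tib:.1f} TiB/year")
```

The full-month average comes out a little under the 500G/month figure, so 6TB/year leaves some headroom.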
3.222.XXX.232: The address with allocation id [eipalloc-01f7f53e7bb256fe3] cannot be released because it is locked to your account. Please contact AWS Support to unlock it
mediacloud-icinga
- this was attached to the icinga instance
mediacloud-herewegoagain-frontend
- was attached to the frontend EC2 instance, frontend apps were long migrated
mediacloud-herewegoagain-data-dokku
mediacloud-herewegoagain-misc-v1
- was root volume of misc instance (docker swarm manager)
The mediacloud-corrupt-postgresql-XXX snapshots are from the Postgres chunks (Database B, C, D). The required stories from these had been extracted to the respective S3 buckets. Unless there's a foreseeable use/need to look at these database sections, these can be deleted.
Deleted S3 buckets as per the Excel file.
@thepsalmist to do cost analysis on Backblaze
Backblaze storage cost $6 per TB/month
Total storage cost for the S3 data on Backblaze = 109.2TB * $6/TB = $655.2 per month
Current AWS cost S3 = $2438.60 (Mar 2024)
Based on the transfer requests (Tier 1: 397,127 & Tier 2: 109,969,437), request cost = $43.69 vs AWS $45.98
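To get monthly and yearly numbers out of the figures above, a quick sketch in Python. It only uses the values quoted in this thread ($6 per TB/month, 109.2TB, and the Mar 2024 AWS S3 figure); it assumes the stored volume stays flat and ignores request and egress charges, so treat the output as a ballpark. The constant names are mine.

```python
# Figures quoted in this thread.
BACKBLAZE_USD_PER_TB_MONTH = 6.00
STORED_TB = 109.2
AWS_S3_USD_MONTH = 2438.60  # Mar 2024 bill

# Assumption: stored volume stays at 109.2TB; requests and egress are ignored.
backblaze_usd_month = STORED_TB * BACKBLAZE_USD_PER_TB_MONTH
savings_usd_month = AWS_S3_USD_MONTH - backblaze_usd_month

rows = [
    ("Backblaze storage", backblaze_usd_month),
    ("AWS S3 (Mar 2024)", AWS_S3_USD_MONTH),
    ("Difference", savings_usd_month),
]

# Print each figure per month and multiplied out to a year.
for label, monthly in rows:
    print(f"{label:<20} ${monthly:>10,.2f}/month  ${monthly * 12:>11,.2f}/year")
```

Adding the roughly 6TB/year of WARC growth on top would raise the Backblaze figure over time, so the difference shrinks slightly each month.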
FYI: AWS responded on the DTO (data transfer out) request saying it is all-or-nothing. To get the transfer for free you have to take everything out of AWS, which is confusing. The result is that in the short term we won't get any credits to support re-indexing costs.
Closing this issue as no longer active because we've taken actions or split off to new to-do items.
We want to reduce the ongoing monthly off-site storage costs as much as possible. Two main tasks here: (1) audit AWS bill/use to reduce cost as much as possible and (2) consider alternative off-site storage services.
Notes on the first task (audit AWS):
Notes on the second task (consider alternatives):
Sharing monthly and/or yearly numbers would be great.