mediacloud / story-indexer

The core pipeline used to ingest online news stories in the Media Cloud archive.
https://mediacloud.org
Apache License 2.0

Migrate ILM Backups to Backblaze #309

Closed pgulley closed 3 months ago

pgulley commented 4 months ago

Currently, ILM backups are going to S3.

thepsalmist commented 3 months ago

Elasticsearch ILM backups are already in Backblaze:

  1. Same bucket size in S3 and B2 ~ 3.2 TB


  2. Elasticsearch data directory from Ramos, Woodward, Bradley

2.0T ./elasticsearch

thepsalmist commented 3 months ago

Last B2 ILM snapshot

"snapshots": [
  {
    "snapshot": "snapshot-2024.08.01-nozfnxqaqyim8mmev6knyq",
    "uuid": "eIt7BZVXTTeBjto24kZ2Tw",
    "repository": "mc_story_indexer",
    "version_id": 8500008,
    "version": "8500008",
    "indices": [
      "mc_search-000001",
      "mc_search-000003",
      "mc_search-000002"
    ],
    "data_streams": [],
    "include_global_state": true,
    "metadata": { "policy": "bi_weekly_slm" },
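
For anyone who wants to repeat this check, here is a minimal Python sketch of how the snapshot listing above could be fetched and summarized. It assumes the output came from Elasticsearch's get-snapshot API (`GET _snapshot/<repository>/_all`). The node address `http://localhost:9200` and the helper names `fetch_snapshots` and `summarize_snapshots` are hypothetical and not part of story-indexer. The sample data contains only fields visible in the output above.

```python
import json
from urllib.request import urlopen

ES_URL = "http://localhost:9200"  # assumption: address of an Elasticsearch node
REPOSITORY = "mc_story_indexer"   # repository name as shown in the snapshot above


def fetch_snapshots(es_url=ES_URL, repository=REPOSITORY):
    """Fetch all snapshots in a repository via GET _snapshot/<repository>/_all."""
    with urlopen(f"{es_url}/_snapshot/{repository}/_all") as resp:
        return json.load(resp)


def summarize_snapshots(response):
    """Return (snapshot name, policy, sorted index names) for each snapshot."""
    summary = []
    for snap in response.get("snapshots", []):
        policy = snap.get("metadata", {}).get("policy")
        indices = sorted(snap.get("indices", []))
        summary.append((snap["snapshot"], policy, indices))
    return summary


# Sample limited to the fields visible in the truncated output above.
sample = {
    "snapshots": [
        {
            "snapshot": "snapshot-2024.08.01-nozfnxqaqyim8mmev6knyq",
            "repository": "mc_story_indexer",
            "indices": ["mc_search-000001", "mc_search-000003", "mc_search-000002"],
            "metadata": {"policy": "bi_weekly_slm"},
        }
    ]
}

for name, policy, indices in summarize_snapshots(sample):
    print(name, policy, indices)
```

Against a live cluster, `summarize_snapshots(fetch_snapshots())` would produce the same kind of summary, provided the node is reachable without authentication; a secured cluster would need credentials added to the request.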