thepsalmist closed this 7 months ago
... API to automate snapshot creation every 91 days (Rollover should be every 90 days)
My immediate thoughts/questions (may not be accurate):
- Are we 100% sure that the two policies will be scheduled at the exact same time? Otherwise we can't guarantee that the 91st day (for snapshot) will be a day after the 90th day (for rollover).
- The two schedules, even if they start at the same time, will drift over time:
Rollover: 90th day, 180th day, ... 900, 990, ...
Snapshot: 91st day, 182nd day, ... 910, 1001, ...
I think you want them both to happen every 90 days, but one should start a day earlier/later.
- Are the snapshots incremental?
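The drift described in the second point can be checked with a quick calculation. This is only a sketch: the helper name `schedule_days` and the `offset_days` parameter are made up for illustration, and it assumes both schedules start counting from the same day 0.

```python
def schedule_days(interval_days, count, offset_days=0):
    """Return the day numbers on which a job with a fixed interval fires."""
    return [offset_days + interval_days * k for k in range(1, count + 1)]


# Rollover every 90 days vs. snapshot every 91 days, both starting at day 0.
rollover = schedule_days(90, 11)
snapshot_91 = schedule_days(91, 11)

# Alternative: snapshot also every 90 days, but started one day later.
snapshot_offset = schedule_days(90, 11, offset_days=1)

# Gap (in days) between each rollover and the snapshot that follows it.
drift = [s - r for r, s in zip(rollover, snapshot_91)]
fixed_gap = [s - r for r, s in zip(rollover, snapshot_offset)]

print("91-day interval gaps:", drift)
print("90-day interval, 1-day offset gaps:", fixed_gap)
```

With a 91-day interval the gap grows by one day per cycle, while the offset variant keeps the snapshot exactly one day behind each rollover.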
Yes the snapshots are incremental
Yes, the scheduling is a little bit tricky, since the count starts on the day of deployment to prod. These would actually overlap, since the initial rollover would have happened by then. Looking at this!
Seeing this comment:
Are we 100% sure that the two policies will be scheduled at the exact same time? Otherwise we can't guarantee that 91st day (for snapshot) will be a day after the 90th day (for rollover).
I had thought there was some way to trigger actions (like snapshotting) as part of ILM, and that scheduling wouldn't be an issue...
Yes. Ideally, had we gone with the full ILM phase rollovers from hot->warm->cold/frozen, snapshotting would automatically be done once we roll over into the cold/frozen phase.
So right now, with all our indices in the hot phase, we're trying to achieve this using the Snapshot Lifecycle Management (SLM) APIs.
@thepsalmist Can you add a small description (in this PR or docs) of what this change means vs. what we were aiming for with the 90-day rollover?
Resolved
This PR implements taking Elasticsearch snapshots to S3. The implementation uses ES's Snapshot Lifecycle Management (SLM) API to automate snapshot creation every 14 days. The snapshots are incremental and should therefore give us a point-in-time restore point for every 2 weeks of ES data.
PS: Removed the dependency on the ILM 90-day rollover.
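For reference, an SLM policy body for this kind of schedule might look roughly like the sketch below. It is not the policy from this PR: the repository name `my_s3_repository`, the index pattern, and the helper `build_slm_policy` are hypothetical. The cron expression is also an assumption; it fires on the 1st and 15th of each month, which only approximates "every 14 days".

```python
import json


def build_slm_policy(repository, indices, schedule="0 30 1 1,15 * ?"):
    """Build the JSON body for an Elasticsearch SLM policy.

    The default schedule is an Elasticsearch cron expression
    (seconds minutes hours day-of-month month day-of-week) that fires
    at 01:30 on the 1st and 15th of each month. This is an assumed
    approximation of a two-week cadence, not an exact 14-day interval.
    """
    return {
        "schedule": schedule,
        # Date-math snapshot name, resolved by Elasticsearch at run time.
        "name": "<snap-{now/d}>",
        # Hypothetical name of a snapshot repository registered for S3.
        "repository": repository,
        "config": {
            "indices": indices,
            "include_global_state": False,
        },
    }


policy = build_slm_policy("my_s3_repository", ["*"])
print(json.dumps(policy, indent=2))
```

A body like this would be sent with `PUT _slm/policy/<policy_id>`; the S3 repository itself has to be registered separately beforehand.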