Open kotwanikunal opened 1 year ago
@kotwanikunal maybe we can use the _aliases API to atomically replace the original index.
Would love this as well. I opened an issue on the forums but missed this one. Going to script it myself for the time being.
@sandervandegeijn Could you please share your script, or at least describe the logic you used? Should I rotate the index daily, snapshot it, delete it, and restore it under the same name with remote_snapshot configured? Or would I need to use an alias?
I really am looking forward to this being built in.
https://github.com/sandervandegeijn/opensearch-searchable-snapshot-management
Sure, it's by no means abstracted enough to be a general library, but it serves our purposes. I should move some of the config to command-line parameters, but it's good enough to get the idea.
It gathers all indices; every index older than 7 days is snapshotted (the snapshot gets the same name as the index), removed, and restored as a searchable snapshot index with the same name as the original. The data stays available as is, so everything remains searchable through the dashboards, visualisations and so on, with the same index patterns the users were used to.
After 185 days everything gets removed.
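A minimal sketch of the flow described above, for anyone who wants to script it themselves. It is not the code from the linked repository: the function names and the repository name are made up for illustration, and it only builds the requests rather than sending them. The paths and the storage_type value follow the OpenSearch snapshot and restore APIs as far as I know. A real script would also have to skip indices that are already searchable snapshots and clean up old snapshots.

```python
from datetime import datetime, timedelta, timezone

SNAPSHOT_AFTER = timedelta(days=7)   # threshold taken from the description above
DELETE_AFTER = timedelta(days=185)   # threshold taken from the description above


def classify(created_at, now):
    """Decide what to do with an index based on its age."""
    age = now - created_at
    if age > DELETE_AFTER:
        return "delete"
    if age > SNAPSHOT_AFTER:
        return "convert"
    return "keep"


def conversion_requests(repo, index):
    """Return the ordered (method, path, body) requests that turn one local
    index into a searchable snapshot index with the same name.

    Each request has to complete successfully before the next one is sent.
    """
    snapshot_path = f"/_snapshot/{repo}/{index}"
    return [
        # 1. Snapshot the index; the snapshot gets the same name as the index.
        ("PUT", snapshot_path, {"indices": index, "include_global_state": False}),
        # 2. Remove the local copy.
        ("DELETE", f"/{index}", None),
        # 3. Restore it from the snapshot as a searchable snapshot index.
        ("POST", f"{snapshot_path}/_restore",
         {"indices": index, "storage_type": "remote_snapshot"}),
    ]


if __name__ == "__main__":
    # Hypothetical index and repository names, fixed dates for the example.
    now = datetime(2024, 7, 1, tzinfo=timezone.utc)
    created = datetime(2024, 6, 1, tzinfo=timezone.utc)
    if classify(created, now) == "convert":
        for method, path, body in conversion_requests("my-repo", "logs-2024.06.01"):
            print(method, path, body)
```

Note that between steps 2 and 3 the index is briefly unavailable, which is what the alias approach discussed in this thread avoids.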
I like the aliases idea btw; it makes it possible to distinguish between local and remote indices by name while keeping everything working. The script will be less complex as well. I might implement this later this week.
At the moment there is no visible distinction between the two in dashboards / index management.
@sandervandegeijn Thanks for this. Looks good. One question: it looks like one should use this in addition to an ISM policy that performs backups. Can you back up a remote snapshot? Would a backup policy just back up all indices (including remote ones) and prefix the backup names with "backup" (to avoid backups being marked as remote snapshots and picked up by your script)?
I assume you just run this as a cron job every hour or so? It probably needs to run more often than daily, since the snapshot has to complete before it can be restored as a remote index?
Regarding aliases, would there be any impact on performance? I thought I read a while back that using aliases affects performance.
I don't know, we would store that on the same storage so there is no benefit for us to do that. If you want to do it, maybe it's better to save an index to two repos before deleting it.
I'm going to restore the snapshots under a different index name, something like remote-* and create an alias. This thread gave me that idea.
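A rough sketch of what that could look like, under some assumptions: the remote- prefix comes from the comment above, the function names are hypothetical, and the request bodies are only built, not sent. The rename_pattern / rename_replacement restore options and the remove_index alias action are, to my knowledge, part of the OpenSearch restore and _aliases APIs, but check them against the version you run.

```python
def restore_renamed_body(index, prefix="remote-"):
    """Body for POST /_snapshot/<repo>/<snapshot>/_restore that restores
    `index` as a searchable snapshot under a prefixed name."""
    return {
        "indices": index,
        "storage_type": "remote_snapshot",
        # Capture the whole original name and prepend the prefix.
        "rename_pattern": "(.+)",
        "rename_replacement": f"{prefix}$1",
    }


def alias_swap_body(index, prefix="remote-"):
    """Body for POST /_aliases that swaps the local index for an alias.

    Both actions are applied in a single request, so the original name never
    stops resolving: it goes from being a local index to being an alias that
    points at the restored remote index.
    """
    return {
        "actions": [
            {"add": {"index": f"{prefix}{index}", "alias": index}},
            {"remove_index": {"index": index}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical index name, for illustration only.
    print(restore_renamed_body("logs-2024.06.01"))
    print(alias_swap_body("logs-2024.06.01"))
```

The order would be: snapshot, wait for the snapshot to succeed, restore under the prefixed name, then send the alias swap. Existing index patterns keep working because the alias carries the original index name.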
We run it every hour, yes, but you can run it less frequently. The index will stay in place until the next run, whenever that happens.
We are using aliases extensively in our other clusters, never had any issues.
We find this feature quite crucial for enabling extensive use of searchable snapshots, which are otherwise a great way to support "cold storage". It's a pity this one is not implemented yet. We all have to implement our own scripting to automate this process, which can be way more prone to errors. Is there an interest in moving this feature forward?
I am very much interested in having this feature implemented.
What should I do to contribute to this? I think I have implemented the solution.